As academics, I like to think that we’ve become fairly used to editing text documents. Whether we’re handwriting on printed documents (fairly old school, but cool), adding comments on PDFs, or using some form of “track changes”, I think we’ve learned how to do the editing, and how to incorporate those edits into a finished draft. Large collaborative projects are still often a source of difficulty (how do you deal with ten simultaneous edits of the same draft?!) but we deal.
I’m working on several projects that use R as a central component of the analysis, so now we’re not just editing the text documents, we’re editing the code as well.
People are beginning to migrate to version control software and the literature is increasingly discussing the utility of software programming practices (e.g., Scheller et al., 2010), but given that scientific adoption of programming tools is still in its early stages, we can’t expect people to immediately pick up all the associated tools that go along with them. Yes, it would be great if people would start using GitHub or BitBucket (or other version control tools) right away, but they’re still getting used to basic programming concepts (btw Tim Poisot has some great tips for Learning to Code in Ecology).
The other issue is that collaborating with graduate students is still a murky area. How much editing of code can you do before you’ve started doing their work for them? I think we generally have a sense of where the boundaries are for written work, but if code is part of ‘doing the experiment’, how much can you do? Editing is an opportunity to teach good coding practice, and to teach new tools to improve reproducibility and ease of use, but give the student too much and you’ve programmed everything for them.
I’m learning as I go here, and I’d appreciate tips from others (in the comments, or on Twitter), but this is what I’ve started doing when working with graduate students:
- Commenting using a ‘special’ tag: Comments in R are just an octothorp (#), so I use #* to differentiate what I’m saying from a collaborator’s comments. This is fairly extensible: someone else could comment with ‘#s’ or ‘#a’ if you have multiple collaborators.
- Where there are major structural changes (sticking things in functions) I’ll comment heavily at the top, then build the function once. Inside the function I’ll explain what else needs to be done so that I haven’t done it all for them.
- If similar things need to be done further down the code I’ll comment “This needs to be done as above” in a bit more detail, so they have a template and they know where they’re going.
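To make the list above a bit more concrete, here’s a minimal sketch of what a marked-up script might look like. Everything in it is made up for illustration: the `site_data` data frame, the `standardize()` function and the `reviewer_comments()` helper are hypothetical names of my own, not part of any real project or package. The helper at the end is just one possible way to pull the `#*` lines back out of a script so a student can see all of the edits in one place.

```r
# A (hypothetical) student script after one round of editing.
# Lines starting with "#*" are the reviewer's comments.

site_data <- data.frame(site   = c("A", "B", "C"),
                        temp   = c(10.2, 12.5, 9.8),
                        precip = c(800, 650, 920))

#* The same centre-and-scale step is written out by hand for each
#* column. I've wrapped it in a function once; see the notes inside.
standardize <- function(x) {
  #* Still to do: decide how missing values should be handled.
  #* Still to do: check that x is numeric before doing any maths.
  (x - mean(x)) / sd(x)
}

site_data$temp_z <- standardize(site_data$temp)

#* This needs to be done as above: replace the hand-written
#* calculation below with a call to standardize().
site_data$precip_z <- (site_data$precip - mean(site_data$precip)) /
  sd(site_data$precip)

# Hypothetical helper: list the tagged comments in a script, given
# its lines as a character vector (e.g. from readLines()).
reviewer_comments <- function(lines, tag = "#*") {
  trimmed <- trimws(lines)
  hits <- startsWith(trimmed, tag)
  data.frame(line    = which(hits),
             comment = trimws(sub(tag, "", trimmed[hits], fixed = TRUE)),
             stringsAsFactors = FALSE)
}
```

Something like `reviewer_comments(readLines("analysis.R"))` (with a file name of your own) would then give a quick list of what the editor flagged and where. Note that the script still runs as it is: the hand-written line is left in place for the student to replace, so the half-edit doesn’t break anything.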
The tricky part about editing code is that it needs to work, so it can be frustratingly difficult to do half-edits without introducing all sorts of bugs or errors. So if code review is part of your editing process, how do you accomplish it?