How do you edit someone else’s code?

As academics, I like to think we’ve become fairly used to editing text documents. Whether we’re handwriting on printed documents (fairly old school, but cool), adding comments on PDFs, or using some form of “track changes”, I think we’ve learned how to do the editing and how to incorporate those edits into a finished draft. Large collaborative projects are still often a source of difficulty (how do you deal with ten simultaneous edits of the same draft?!) but we deal.

Figure 1. If your revisions look like this you should strongly question your choice of (code) reviewer.


I’m working on several projects now that use R as a central component in analysis, and now we’re not just editing the text documents, we’re editing the code as well.

People are beginning to migrate to version control software, and the literature is increasingly discussing the utility of software programming practices (e.g., Scheller et al., 2010). But scientific adoption of programming tools is still in its early stages, so we can’t expect people to immediately pick up all the tools that go along with them. Yes, it would be great if people would start using GitHub or BitBucket (or other version control tools) right away, but they’re still getting used to basic programming concepts (btw Tim Poisot has some great tips for Learning to Code in Ecology).

The other issue is that collaborating with graduate students is still a murky area. How much editing of code can you do before you’ve started doing their work for them? I think we generally have a sense of where the boundaries are for written work, but if code is part of ‘doing the experiment’, how much can you do? Editing is an opportunity to teach good coding practice, and to teach new tools to improve reproducibility and ease of use, but give the student too much and you’ve programmed everything for them.

I’m learning as I go here, and I’d appreciate tips from others (in the comments, or on twitter), but this is what I’ve started doing when working with graduate students:

  • Commenting using a ‘special’ tag:  Comments in R start with an octothorpe (#); I use #* to differentiate what I’m saying from a collaborator’s comments.  This is fairly extensible: someone else could comment with ‘#s’ or ‘#a’ if you have multiple collaborators.
  • Where there are major structural changes (sticking things in functions) I’ll comment heavily at the top, then build the function once.  Inside the function I’ll explain what else needs to be done so that I haven’t done it all for them.
  • If similar things need to be done further down the code I’ll comment “This needs to be done as above” in a bit more detail, so they have a template and they know where they’re going.
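
To make that concrete, here is a rough sketch of what a lightly edited script might look like. The data, the variable names and the standardize() function are all made up for illustration (they aren’t from any real project), and the little grep() at the end is just one possible way to pull the reviewer’s tagged comments back out of a script.

```r
# Student's original code (hypothetical data and names, for illustration only)
site_a <- c(2.1, 3.4, NA, 5.0)
site_b <- c(1.0, NA, 4.2, 3.3)

#* The same standardization is repeated for every site below, so I've
#* pulled it into a function. I've built it once; see the notes inside
#* for what still needs doing.
standardize <- function(x) {
  #* Still to do: decide how to handle a site where every value is NA.
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

site_a_std <- standardize(site_a)

#* This needs to be done as above: replace the hand-written version
#* with a call to standardize(), as I did for site_a.
site_b_std <- (site_b - mean(site_b, na.rm = TRUE)) / sd(site_b, na.rm = TRUE)

# Because the tag is searchable, the reviewer's notes can be listed on
# their own. Here the script is a small character vector; for a real file
# you would pass readLines() on its path instead.
script_lines <- c(
  "x <- 1  # student's own note",
  "#* reviewer: wrap this in a function",
  "y <- x + 1"
)
reviewer_notes <- grep("^[[:space:]]*#\\*", script_lines, value = TRUE)
print(reviewer_notes)
```

The script still runs from top to bottom, which matters: the student gets one worked example and a pointer to where the same change is needed, without the whole thing being rewritten for them.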

The tricky part about editing code is that it needs to work, so it can be frustratingly difficult to do half-edits without introducing all sorts of bugs or errors.  So if code review is part of your editing process, how do you accomplish it?


4 thoughts on “How do you edit someone else’s code?”

  1. I strongly agree about having a special (i.e. searchable) tag on comments. Would also note that automated code testing is critical to make sure that edits don’t unexpectedly break things; this has gotten a lot easier with lightweight packages such as Hadley Wickham’s ‘testthat’.

    • Thanks Ben! I agree in principle about testing, especially if multiple people are editing code. My concern there is that if you are a supervisor editing a mentee’s code you don’t want to get into a situation where you are re-writing everything. In that case I’d probably take a pass at making sure things run so that the student can fix the problems themselves. Pushing a student to make tests is a great idea though!

  2. Writing tests can demonstrate what the code should do without writing the actual code (see test driven development). But the github and bitbucket pull request code commenting options are very useful for this.

  3. Use version control despite your comments about having to teach that too. Anyone who is programming anything nontrivial needs to know about version control. Git doesn’t need github.
    Do not edit student code (similarly do not edit the thesis). Make comments.
    I know it is at least twice as hard to help someone else through the process as to just do it yourself. But you teach a lot more doing it the hard way.
