Reproducibility and R – Better results, better code, better science.

I made a short presentation for our most recent weekly lab meeting about best practices for reproducible research.  There are a few key points.  The first is that the benefits of reproducible research are not just for the community.  Producing reproducible code helps you, both after publication (higher citation rates: Piwowar and Vision, 2013) and in the long run, through your ability to tackle bigger projects.

Let's face it: if you intend to pursue a career inside or outside of academia, your success is going to depend on tackling progressively larger or more complex projects.  If programming is going to be a part of that, then developing good coding practice should be a priority.  One way to get into the habit of good practice is to practice.  In the presentation (PDF, figShare) I point to a hierarchy (of sorts) of good scientific coding practice, and reproducible programming helps support each level of it:

  1. An integrated development environment (IDE) helps you organize your code in a logical manner, makes some repeatable tasks easier, and provides tools and views that make the flow of code easier to read (helping you keep track of what you’re doing).
  2. Version control helps you make incremental changes to your code, comment those changes clearly, and fix mistakes if you break something.  It also helps you learn from your old mistakes: you can go back through your commit history and see how you fixed problems in the past.
  3. Embedded code helps you produce clean and concise code with a specific purpose, and it helps you in the long run by reducing the need to “find and replace” values throughout your manuscript.  It helps reviewers as well.  Your results are simply a summary of the analysis you perform; the code is the analysis.  If you can point readers and reviewers to the code, you save everyone time.
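
To make the third point concrete, here is a minimal sketch in R of what building a results sentence from computed values might look like.  The data and variable names (`pollen_counts`, `n_samples`, `mean_count`) are invented for illustration and are not taken from the presentation.

```r
# Hypothetical data: these values and names are invented for illustration.
pollen_counts <- c(12, 18, 25, 31, 40)

# Compute the summary values once, directly from the data.
n_samples  <- length(pollen_counts)
mean_count <- mean(pollen_counts)

# Build the results sentence from the computed values instead of typing
# the numbers by hand, so it updates whenever the data change.
result_sentence <- sprintf(
  "We counted %d samples with a mean of %.1f grains per sample.",
  n_samples, mean_count
)
print(result_sentence)
```

If the data change, re-running the script updates the sentence, so there is nothing to find and replace.  In an R Markdown document the same values could be dropped straight into the text with inline code such as `r n_samples`.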

So, take a look at the presentation and let me know what you think.  And, if you are an early-career researcher, now is the time to start building good coding practice.

Published by

downwithtime

Assistant scientist in the Department of Geography at the University of Wisconsin, Madison. Studying paleoecology and the challenges of large data synthesis.
