Reproducible Research

Leighton Pritchard

Motivation

Irreproducibility hurts!

…all I can hope is that future historians note that one of the core empirical points providing the intellectual foundation for the global move to austerity in the early 2010s was based on someone accidentally not updating a row formula in Excel. (Mike Konczal)

A colleague asks…

Can you show me how you made the figure in that 2014 paper?

A colleague keeps asking…

I’d like to build on your analysis, can you give me the details?

  • Where is the data?
  • What was the analysis (model, parameters)?
  • Where is the code?
  • Can you still run the code?
  • Is it readable/understandable?

Papers are adverts

An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures. (David Donoho)

  • Most bioinformatics methods sections are incomplete! (in my experience)

Reproducible research

What is it?

  • Provide documented code used to produce:
    • results
    • figures
    • tables
  • Provide the configuration of the machine(s) used
    • and/or provide instructions for replicating the environment
  • Provide contact details
    • and/or a database/framework for reporting bugs/issues
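
One way to capture "the configuration of the machine(s) used" is to have the analysis write it out automatically. The sketch below is only an illustration, using the Python standard library: the function names (`describe_environment`, `write_environment`) and the output filename `environment.json` are assumptions made for this example, not part of any established tool.

```python
import json
import platform
import sys
from importlib import metadata


def describe_environment():
    """Return a dictionary describing the machine and Python environment."""
    # List every installed distribution as "name==version", sorted for stable diffs.
    packages = sorted(
        f"{dist.metadata.get('Name')}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata.get("Name")  # skip distributions with unreadable metadata
    )
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.platform(),
        "machine": platform.machine(),
        "executable": sys.executable,
        "packages": packages,
    }


def write_environment(path="environment.json"):
    """Write the environment description to a JSON file (hypothetical filename)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(describe_environment(), handle, indent=2)
        handle.write("\n")


if __name__ == "__main__":
    write_environment()
```

Committing the resulting file alongside the results gives a colleague a starting point for rebuilding a comparable environment. It records what was installed, not how to install it.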

Advantages

  • No ambiguity in what was done
  • Work can be reproduced exactly
  • Colleagues can pick up where you left off, which is more efficient
  • People can learn directly from your work
  • People can point out “improvements”
  • Makes your work more attractive
  • Positive advert for your skills and competence

How to do it? (I)

  • Automate - write your analyses as scripts/code in a high-level language
  • Use literate programming (notebooks)
  • Use version control
  • Share your code (get a code buddy!)
  • Get some training
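
As a minimal sketch of what "automate" can look like in practice, the script below performs a small analysis that gives the same answer on every run. It assumes a bootstrap estimate of a mean as a stand-in analysis. The function names, the default seed and the resample count are all hypothetical choices for illustration. Two habits do the work here: the input file is fingerprinted with a checksum, and the random number generator is seeded explicitly.

```python
import hashlib
import random
import statistics


def sha256_of_file(path):
    """Return the SHA-256 checksum of a file, so the exact input can be verified."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bootstrap_mean(values, n_resamples=1000, seed=42):
    """Return (mean of bootstrap means, their standard deviation).

    The seed is an explicit parameter, so rerunning with the same
    arguments reproduces the same numbers.
    """
    rng = random.Random(seed)  # private generator: no hidden global state
    means = [
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.fmean(means), statistics.stdev(means)


if __name__ == "__main__":
    # Made-up example values, not real data.
    example = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3]
    estimate, spread = bootstrap_mean(example)
    print(f"bootstrap mean = {estimate:.3f}, spread = {spread:.3f}")
```

Kept under version control, a script like this documents the parameters of the analysis as well as the result, which answers most of the colleague's questions from the Motivation slides.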

11th Grade

How to do it? (II)

Benefits

  • It’s the right thing to do
  • Others will use, debug and enhance your work
  • Others will reproduce and cite your work
  • More opportunities to collaborate

Resources

Online resources