Friday, March 6, 2020

Why it is important to share your code and make your paper accessible

In a recently published paper, "Making simulation results reproducible - Survey, guidelines, and examples based on Gradle and Docker", we asked researchers at all levels about their willingness to share the code of their simulations together with the paper. Somewhat to our surprise, the answers were mostly very positive about sharing.
Still, we are currently far from a situation where every published paper is accessible and has its code shared alongside it. Apart from the obvious cases where an industrial project requires some details to remain confidential, the most frequent reason is probably a degree of laziness, or simply that properly refactoring code for publication is given a low priority. At the moment, not publishing the code is the norm, while providing it is still the exception.
This needs to change for several reasons:
  • From the perspective of the researcher who reads the publication, having access to the code makes the approach easier to understand and allows building upon the work of others. The frequent argument that the code would be provided on request is mostly lip service. First, it leaves the reader uncertain whether and when they would get the code. Second, the provider of the code might not have it prepared. Imagine the effort of digging up code you wrote ten years ago and cleaning it up so that you can pass it on.
  • From the perspective of the researcher who publishes a paper, the chance of getting their work read, appreciated and cited is much higher if they provide code and materials with it. Considering the time, money and effort that are put into publishing a paper, the additional effort of publishing the code is well justified.
  • From a system's perspective, it is of utmost importance that we support each other. It does not make sense that brilliant minds spend time recreating implementations that already exist. Reproducing research results is, of course, an important factor in science, but the overall ability to reproduce results and check an approach for errors increases when the code is available for inspection.
For the same reasons, it is important to have our papers available online instead of locking them behind a paywall. Even if your university pays for access to some literature databases, many potential readers of your work do not enjoy such a service, either because their university does not provide such access or because they are working from a different network at the moment they want to read your paper. My recommendation: go for open access! This could be gold open access, where the journal provides open access to your paper on its website; however, this "gold" option is usually expensive. Alternatively, several publishers offer a green open access model where you are allowed to keep a pre- or post-print version of your paper online on your private or your institution's webpage. To check if a particular publisher offers such a policy, look them up at this page about Publisher copyright policies & self-archiving.

Further reading:
Wilfried Elmenreich, Philipp Moll, Sebastian Theuermann, and Mathias Lux. Making simulation results reproducible - Survey, guidelines, and examples based on Gradle and Docker. PeerJ Computer Science, 5(e240):1–27, December 2019. (doi:10.7717/peerj-cs.240)