The importance of being (R)eproducible
Reproducible Research (RR), or reproducible data analysis, is the practice of complementing scholarly journal articles with all the information needed to reproduce the results they present.
Scientific studies often rely on complex textual explanations of how the data was analyzed. These can overwhelm readers, who are left to accept them as an act of faith.
A better way to show what has been done is to provide the raw data together with an unambiguous description of the procedures used to analyze it. This allows the scientific community to reproduce the results, work with the data and assess the validity of the conclusions.
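As a rough illustration of what "raw data plus an unambiguous description of the procedure" can look like, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the inlined dataset, the names `RAW_CSV`, `fingerprint` and `analyze`, and the choice of a bootstrap standard error as the analysis. The point is only that the data is identified by a checksum and that the random step uses a fixed seed, so anyone running the script gets the same numbers.

```python
import csv
import hashlib
import io
import random
import statistics

# Hypothetical raw data, inlined so the sketch is self-contained.
RAW_CSV = "sample,value\na,4.1\nb,5.3\nc,6.2\nd,5.0\n"


def fingerprint(text: str) -> str:
    """Return a SHA-256 digest so readers can check they hold the same data."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def analyze(text: str, seed: int = 42, n_resamples: int = 1000) -> dict:
    """Compute the mean of `value` and a seeded bootstrap standard error."""
    values = [float(row["value"]) for row in csv.DictReader(io.StringIO(text))]
    rng = random.Random(seed)  # fixed seed: the resampling is repeatable
    boot_means = [
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return {
        "data_sha256": fingerprint(text),
        "n": len(values),
        "mean": statistics.mean(values),
        "bootstrap_se": statistics.stdev(boot_means),
    }


if __name__ == "__main__":
    print(analyze(RAW_CSV))
```

Publishing a script like this next to the data replaces a paragraph of prose about the method: the reader can rerun it, compare the checksum and the results, and change the seed or the data to see what happens.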
Many tools have appeared with the advent of Big Data and the need to analyze large datasets, especially around R, a language and environment for statistical computing and graphics that is becoming a de facto standard in open science.
Here are some tools from the R ecosystem that let you publish your results along with the methods and the data.
Tools for Reproducibility
- RStudio IDE is a powerful and productive user interface for R. It is free, open source and cross-platform.
- rOpenSci is a collection of tools, accessible through the R statistical programming environment, that allow analyses and methods to be easily shared, replicated, and extended by other researchers.
- KnitR is an elegant, flexible and fast tool for dynamic report generation with R. It lets you embed live R, Python and other code snippets in a document, and comes packaged with RStudio.
- Slidify is a tool to write slides in R Markdown, a format that combines the core syntax of Markdown with embedded code chunks that are executed.
- RPubs is a tool to publish and share directly from R Markdown.
About this post
This post was first published on LibTechNotes, a blog from the Library team at the Universitat Oberta de Catalunya to share our everyday findings, solutions and inspirations.