author: Caleb Kibet
date: 10 August, 2018
autosize: true
"Really reproducible research" in computational sciences means:
the data and code used to make a finding are available, and they are sufficient for an independent researcher to recreate the finding.
- It provides a standard to judge scientific claims
- Reproducibility enhances replicability.
- Helps avoid duplication of effort and encourages cumulative knowledge development
- Higher research impact for the researcher
- Instils better work habits and teamwork
![collaborator](https://image.slidesharecdn.com/layton-repro-research-talk-2015-05-06-150507202403-lva1-app6892/95/reproducible-research-first-steps-6-638.jpg?cb=1431031632)
- Document everything!
- Everything is a (text) file.
- All files should be human readable.
- Explicitly tie your files together.
- Have a plan to organize, store, and make your files available.
- Report your research transparently
- Use version control
- Use proper documentation: README
- Use literate programming: R Markdown, Jupyter Notebooks
- Data: Adhere to FAIR data principles
- Software: GitHub and other repositories
- Choose a file structure that works for you
- Use relative paths when possible and organize your files
  - Relative paths make your code less dependent on a particular file system or directory structure.
- Avoid putting spaces in your file and directory names
- Include a README.md that describes the purpose and structure of your project
- Create the project within a folder on your computer
- Create a folder for your code
- Create a folder for data
  - Raw: downloaded or gathered from the field
  - Derived: processed through your analysis
- Create a folder for figures generated from your analysis
- NB: Keep raw data, derived data, code, and figures separate
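The folder layout and relative-path advice above can be sketched in a few lines of base R. This is only an illustration: the project name `my_project`, the folder names, the file names, and the three made-up measurements are all assumptions, and the project is created under `tempdir()` so that nothing on your computer is overwritten.

```r
# Hypothetical project root, placed in a temporary directory for this sketch.
project_root <- file.path(tempdir(), "my_project")

# One folder per kind of information: code, raw data, derived data, figures.
folders <- c(
  "code",
  file.path("data", "raw"),
  file.path("data", "derived"),
  "figures"
)
for (folder in folders) {
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# A README.md that describes the purpose and structure of the project.
writeLines(
  c(
    "# my_project",
    "",
    "- code/: analysis scripts",
    "- data/raw/: data as downloaded or gathered",
    "- data/derived/: data produced by the analysis",
    "- figures/: figures generated by the analysis"
  ),
  file.path(project_root, "README.md")
)

# Paths are written relative to the project root, with no spaces in names.
raw_file <- file.path("data", "raw", "measurements.csv")
derived_file <- file.path("data", "derived", "summary.csv")

# Work from the project root so the relative paths resolve.
old_wd <- setwd(project_root)

# Made-up raw data, only so the example has something to read.
write.csv(
  data.frame(sample = c("a", "b", "c"), value = c(1.5, 2.5, 3.5)),
  raw_file,
  row.names = FALSE
)

# Derived data is written to its own folder, never over the raw file.
raw <- read.csv(raw_file)
summary_table <- data.frame(n = nrow(raw), mean_value = mean(raw$value))
write.csv(summary_table, derived_file, row.names = FALSE)

# Restore the previous working directory.
setwd(old_wd)
```

When the project is opened as an RStudio project, the working directory is already the project root, so the `setwd()` calls would normally be unnecessary.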
Literate programming is a crucial part of reproducible quantitative research. Being able to directly link your analyses, your results, and the code you used to produce the results makes tracing your steps much easier.
A demo on:
- Quick tour of RStudio
- Creating a project
- Setting the working directory
- Creating folders
- RStudio has built-in version control support
- Learn more here
R Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents.
- Uses knitr to execute the embedded code and pandoc to convert the result to the output format
- The flow is:
![RMarkdown](https://d33wubrfki0l68.cloudfront.net/61d189fd9cdf955058415d3e1b28dd60e1bd7c9b/9791d/images/rmarkdownflow.png)
![Output formats](https://d33wubrfki0l68.cloudfront.net/00ed9c32053cbc805efa51b66be570558480a4c8/7a292/images/rmarkdownoutputformats.png)
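As a rough sketch of that flow, the R code below writes a tiny R Markdown file and, where possible, renders it to HTML. The file name `example.Rmd` and its contents are made up for illustration. The render step is guarded because it assumes the `rmarkdown` package and pandoc are installed, which may not be true on every machine.

```r
# Write a minimal R Markdown file (hypothetical content) to a temporary folder.
rmd_file <- file.path(tempdir(), "example.Rmd")

# Three backticks, built here so this example does not nest code fences.
chunk_fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  "title: \"A minimal example\"",
  "output: html_document",
  "---",
  "",
  "The mean speed in the built-in `cars` data set is computed below.",
  "",
  paste0(chunk_fence, "{r}"),
  "mean(cars$speed)",
  chunk_fence
)
writeLines(rmd_lines, rmd_file)

# rmarkdown::render() has knitr execute the chunk, then pandoc build the HTML.
# The guard skips rendering when rmarkdown or pandoc is not available.
output_file <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  output_file <- rmarkdown::render(
    rmd_file,
    output_format = "html_document",
    quiet = TRUE
  )
}
```

In RStudio, the Knit button performs the same render step on the open document.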
An R Notebook is an R Markdown document with chunks that can be executed independently and interactively, with output visible immediately beneath the input.
Shiny is an R package that makes it easy to build interactive web applications (apps) straight from R.
- *Reproducible Research with R and RStudio* (Second Edition) is a great reference text.
- https://rmarkdown.rstudio.com/lesson-11.html
- https://rmarkdown.rstudio.com/articles_intro.html
- Organizing Projects: http://kbroman.org/Tools4RR/assets/lectures/06_org_eda.pdf
- Research reproducibility: http://stm.sciencemag.org/content/8/341/341ps12.full