Start | End | Dur. | Topic |
---|---|---|---|
9H 0M | 9H 20M | 0:20 | Pres.: Workflow for Open Reproducible Computational Social Science |
9H 20M | 9H 30M | 0:10 | Discussion and Q&A |
9H 30M | 10H 0M | 0:30 | DIY: Setting up a WORCS project with a FAIR Theory |
10H 0M | 10H 20M | 0:20 | Pres.: WORCS with targets: Sustainable reproducibility |
10H 20M | 10H 30M | 0:10 | Discussion and Q&A |
11H 00M | 11H 30M | 0:30 | DIY: Using worcs with targets |
11H 30M | 11H 50M | 0:20 | Pres.: Integration testing, or: catching errors early |
11H 50M | 12H 00M | 0:10 | Discussion and Q&A |
12H 00M | 12H 30M | 0:30 | DIY: Integration Testing OR Parallel Computing |
“Open science is just good science” (Jonathan Tennant, 2018)
Formal definitions:
Relevant to openness and reproducibility:
Sterling, 1959:
NO MORE manuscript_final_final_SERIOUSLYFINAL.doc
“Track Changes” on steroids: record entire project history
If something breaks, you can figure out what happened.
Facilitates collaboration and experimentation!
Tracks changes to (text-based) files line by line:
One command in `worcs`: `git_update("Describe your changes")`
Image credit: Software Carpentries
A `worcs` repository is backed up in a remote repository like GitHub;
GitHub is a “cloud backup” with “social networking” features
GitHub can be used to ‘tag’ specific states of the repository, e.g. a preregistration.
R-packages
`renv` installs all dependencies from the list
`worcs`
check_worcs_installation()
git_update("Commit message")
`rticles`, `papaja`, and `prereg`
`@essential` and `@@nonessential`
targets
`open_data()`: `.csv` (text based, human / machine readable)
`closed_data()`: `synthetic()` (`.csv`)
`load_data()`: `.worcs` file; default `read.csv()`
Entry point (`manuscript.Rmd`)
Recipe (`rmarkdown::render("manuscript.Rmd")`)
Endpoints (`manuscript.pdf`, `table1.csv`)
`worcs::reproduce()` generates the endpoints from the entry point via the recipe
`worcs::check_endpoints()` verifies that the results are identical
`worcs` is a good starting point for new R-users
check_worcs_installation()
`worcs` with `targets`
Making projects reproducible often involves frequently re-running code to ensure results are still valid.
targets
A “pipeline tool” for (computationally demanding) R-projects
This does not overlap with, but perfectly complements, `worcs` workflows
Any pipeline tool does the following:
The most famous pipeline tool is the general purpose GNU Make
`targets` is an R-exclusive pipeline tool
`worcs`
To use `targets` in a WORCS project:
`_targets.R` script.
`manuscript.rmd` file
`worcs::add_targets()`
A `targets` workflow is executed by running `targets::tar_make()`
`worcs` sets the recipe to `targets::tar_make()`, so `worcs::reproduce()` also executes the pipeline
`worcs` makes sure that the last step of the pipeline is to render an Rmarkdown to report the results
Results from the pipeline can be loaded into an Rmarkdown document using:
targets::tar_load(result_name)
targets::tar_load_everything()
This integrates the pipeline results into your dynamic document.
Often, rendering the Rmarkdown document will be the final step of your pipeline
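A pipeline of this shape could be sketched as follows. This is a minimal illustration, not the file that `worcs` generates: the target names `dat` and `fit` are hypothetical, the built-in `mtcars` data stands in for real project data, and using `tarchetypes::tar_render()` for the final rendering step is an assumption about how the last step is wired up. It also assumes that `manuscript.Rmd` exists in the project directory.

```r
# _targets.R: minimal pipeline sketch (target names are hypothetical)
library(targets)
library(tarchetypes)

list(
  # Step 1: prepare the data (mtcars is a stand-in for project data)
  tar_target(dat, subset(mtcars, select = c(mpg, wt))),
  # Step 2: fit a model; this step reruns only if `dat` or this code changes
  tar_target(fit, lm(mpg ~ wt, data = dat)),
  # Final step: render the manuscript that reports the results
  tar_render(manuscript, "manuscript.Rmd")
)
```

Inside `manuscript.Rmd`, a code chunk could then call `targets::tar_load(fit)` to make the fitted model available for reporting.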
`targets` Markdown
You can run `targets` directly within an Rmarkdown file by:
Warning: Running code interactively in combination with `tar_make()` may introduce bugs. It is safer to only use `tar_make()`.
Definition: Software engineering practice where (new) code is subjected to tests to ensure correct functioning and catch mistakes.
We can apply integration testing to ensure that:
`adding(2, 2) == 4` before `adding()` two unknown numbers
WORCS facilitates making analyses reproducible; integration tests verify reproducibility
Why: Increase trust in scientific findings by verifying that the study yields the reported results
Some journals now have “reproducibility editors” who perform these checks (e.g., Research Synthesis Methods)
`worcs` provides functionality for integration testing research code
Based on `testthat`: Test that functions behave according to your expectations
`testthat` easily integrates into existing workflows
`worcs::add_testthat()`: Set up `testthat`
`usethis::use_test()`: initialize a basic test file and open it for editing.
`worcs::github_action_testthat()`: add a GitHub action that evaluates the integration tests.
`worcs::test_worcs()`: Run all tests in `worcs` project
`testthat::test_file()`: Test a single file
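As a rough illustration of what such a test file might contain, the sketch below tests the `adding(2, 2) == 4` example with `testthat`. The function `adding()` is a hypothetical helper defined here only for the example; in a real project it would live in the analysis code, and the tests would sit in a file such as `tests/testthat/test-adding.R`, following the `testthat` naming convention.

```r
library(testthat)

# Hypothetical helper for illustration; not part of worcs or testthat.
adding <- function(a, b) {
  stopifnot(is.numeric(a), is.numeric(b))
  a + b
}

# Check a known answer before trusting the function with unknown numbers.
test_that("adding() returns the known answer for known inputs", {
  expect_equal(adding(2, 2), 4)
})

# Check that invalid input is caught early instead of silently propagating.
test_that("adding() rejects non-numeric input", {
  expect_error(adding("2", 2))
})
```

If either expectation fails, the test run reports an error, which is the signal that a GitHub action evaluating the tests would pick up.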
Define an entry point, endpoint(s), and a recipe to get from entry- to endpoint(s).
Basic `worcs`:
Entry point: `manuscript.Rmd`
Endpoint: `manuscript.pdf`
Recipe: `render("manuscript.Rmd")`
With `targets`:
Recipe: `targets::tar_make()`
Entry point: `_targets.R`
`add_recipe()` to customize recipe
`add_endpoint()` to start tracking files as endpoints
`snapshot_endpoints()` to update state of endpoints
`reproduce()` runs the recipe to reproduce the project
`check_endpoints()` verifies that endpoints remain unchanged after `reproduce()`
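The idea behind snapshotting and checking endpoints can be sketched in base R. This is an illustration of the concept only, assuming endpoints are compared by file checksum; it does not show how `worcs` implements these functions. The `recipe()` function and the `table1.csv` endpoint in a temporary directory are hypothetical stand-ins.

```r
# Hypothetical stand-ins for a project's recipe and endpoint; not worcs code.
endpoint <- file.path(tempdir(), "table1.csv")

recipe <- function() {
  # Deterministic "analysis": summarise a built-in dataset, write the result.
  result <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
  write.csv(result, endpoint, row.names = FALSE)
}

# 1. Run the recipe once and record the endpoint's checksum (the snapshot).
recipe()
snapshot <- unname(tools::md5sum(endpoint))

# 2. Later: rerun the recipe and compare the new checksum to the snapshot.
recipe()
reproduced <- identical(unname(tools::md5sum(endpoint)), snapshot)

# 3. If the endpoint's content changes, the comparison no longer holds.
cat("extra line\n", file = endpoint, append = TRUE)
changed <- !identical(unname(tools::md5sum(endpoint)), snapshot)
```

Here `reproduced` indicates that rerunning the recipe gave a byte-identical file, and `changed` indicates that a modified endpoint is detected. After an intentional change to the analysis, the snapshot would need to be updated.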
It works on my machine ¯\_(ツ)_/¯
‘GitHub’ allows you to run code in the cloud (= on their servers) to:
`reproduce()` or `check_endpoints()`
You can add a reproducibility status badge to your README.md
`worcs` GitHub Actions
`github_action_testthat()`: `worcs` project’s `testthat` integration test suite
`github_action_reproduce()`:
`github_action_check_endpoints()`
`worcs::reproduce()` on GitHub via GitHub Actions:
Sometimes it SHOULD fail: if your function/analysis pipeline has changed
Sometimes it should NOT fail
(`renv::snapshot()` and `renv::restore()`)