
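This tutorial walks you through the steps of creating a reproducible project with the worcs package: checking the installation, creating a new worcs project, preparing a dataset and adding it to the repository, adding some demo analyses, and reproducing the project.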

Checking the Installation

Open RStudio. Load the worcs package, and run the installation check:
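A minimal sketch of this step, assuming worcs is already installed and that the check is its exported function check_worcs_installation():

```r
# Load the package (assumes it was installed beforehand, e.g. from CRAN)
library(worcs)

# Run the installation check; it reports the outcome of each test
check_worcs_installation()
```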

You should see all green checkmarks, possibly with some “information” messages. If any tests fail, instructions on how to remedy the issue should be printed. Please follow these instructions. If you are in a “worcshop” with a live instructor, ask for help after you’ve tried to remedy the issues.

Creating a New worcs Project

In RStudio, click File > New Project > New Directory > WORCS Project Template.

Type an appropriate name for the remote Repository in its textbox. This name will be used to create a new GitHub repository on your account. For example, you could name it “demo_worcs_project”.

Keep the checkbox for renv checked if you want to use dependency management (recommended).

For this tutorial, select “none” in the preregistration template dropdown menu. You can always add a preregistration later using add_preregistration().

For this tutorial, select the manuscript template “github_document”, which has few dependencies. Optionally, you can choose a different template.

Select a license for your project (we recommend a CC-BY license, which allows free use of the licensed material as long as the creator is credited).

When you click “Create Project”, the new project should open in RStudio (either in a new session or in the current one).

Verify that you see a README.md file, which is the welcoming page for users of your repository. Edit this template to explain how users should interact with the project.

Prepare a Dataset Using prepare_data.R

The data preparation script should turn your source data into an analysis-ready data.frame (or other data object, but you will need to specify custom functions for reading and loading the data in that case).

Two important steps usually occur before data is added to a repository:

  • Removing any and all potentially identifying information in the case of sensitive data
  • Minimal data cleaning required to store the data to a file. The remainder of the data cleaning will be done reproducibly.

You can use your own data. If you don’t have your own data, you can use some demo data:
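One possibility, assuming R's built-in iris dataset (the same data used in the script in this section) serves as the demo data, is to write it to a CSV file so there is a source file to load. The file name iris.csv is a hypothetical choice for illustration:

```r
# Assumption: R's built-in iris data stands in for your own source data
data("iris")

# Write it to the project directory to mimic a raw source data file
# ("iris.csv" is a hypothetical file name)
write.csv(iris, "iris.csv", row.names = FALSE)

# Read it back into memory, as you would with your own data file
df <- read.csv("iris.csv", stringsAsFactors = FALSE)

# Quick look at the first rows
head(df)
```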

Below is a minimal prepare_data.R script. Adapt it for your own data (which will require you to copy the data file to your worcs project directory and load it into memory).

# Inside prepare_data.R:
library(worcs)

# Example methods of loading a data file
# df <- readxl::read_xlsx("penguins.xlsx", 1)
# df <- foreign::read.spss("penguins.sav", to.data.frame = TRUE)
# df <- read.csv("penguins.csv", stringsAsFactors = FALSE)

# Example data
df <- iris
# Remove a column containing "potentially identifying information"
df[["Species"]] <- NULL

# Inspect the prepared data
descriptives(df)

Add the Dataset to the Repository

End the file prepare_data.R with the following command to save the prepared dataset and publish it on GitHub. If you do not want to publish your data on GitHub, use closed_data() instead. This tutorial assumes you use open_data().
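A sketch of that final command, assuming the prepared data are stored in the object df as in the script above, and that the code is run inside the worcs project:

```r
# Last lines of prepare_data.R:
library(worcs)

# Save the prepared data and mark it as open (to be published on GitHub)
open_data(df)

# Alternative if the data should not be published:
# closed_data(df)
```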

To confirm that the project now knows how to load the dataset, remove df from the environment, then run load_data() in the console:
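For example, assuming the dataset was saved from an object named df, so that load_data() restores it under the same name:

```r
# Remove the prepared data from the environment
rm(df)

# Reload all data files registered in the worcs project
library(worcs)
load_data()

# The object should be available again (assumed to be named df)
head(df)
```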

Add Some Demo Analyses

Open your manuscript.Rmd file. There, edit an existing code chunk, or remove the existing chunks and create a new one. First, load worcs and the data we just created:

# Inside manuscript.Rmd:
library(worcs)
load_data()

Now, add some mock analyses.

You can insert your own analysis code, or play around with the following functions:

# Descriptive statistics
res_desc <- descriptives(df)
write.csv(res_desc, "res_desc.csv", row.names = FALSE)

# Simple model
mod <- lm(Sepal.Length ~ Sepal.Width, data = df)
res_mod <- summary(mod)
# Keep the row names, which hold the coefficient names
write.csv(res_mod$coefficients, "res_coef.csv")

Note that, in this code, we write the results to spreadsheet files. You can also print them in the document, for example using:

knitr::kable(res_mod$coefficients, caption = paste0("My regression model coefficients, for a model with $R^2 ", report(res_mod[["r.squared"]]), "$."))

Reproduce the Project

Run the following code in the terminal:
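A minimal sketch, assuming the intended command is the worcs function reproduce(), run from the R console with the project directory as the working directory:

```r
# Re-run the reproducibility recipe stored in the worcs project
# (assumption: the project was created from the WORCS template)
worcs::reproduce()
```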

This should render the manuscript that you’ve prepared.