3.4 Data manipulation (optional)

Now that we have the Meta-Analysis data in RStudio, let’s do a few manipulations with the data. These functions might come in handy when we’re conducting analyses later on.

Going back to the output of the str() function, we see that it also tells us what type of data is stored in each column. There are different abbreviations signifying different types of data.

Abbreviation	Type	Description
num	Numerical	This is all data stored as numbers (e.g. 1.02)
chr	Character	This is all data stored as words
logi	Logical	These are variables which are binary, meaning that they signify that a condition is either TRUE or FALSE
factor	Factor	Factors are stored as integers, with each integer signifying a different level of a variable. For example, the levels of a factor might be 1 = low, 2 = medium, 3 = high
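To see these abbreviations side by side, here is a minimal sketch using a small toy data frame. The data frame toy and all of its values are made up for illustration and are not part of the Meta-Analysis data.

```r
# Toy data frame (hypothetical values) with one column of each type
toy <- data.frame(
  d        = c(0.38, 0.44, 0.77),             # numeric
  location = c("USA", "Canada", "USA"),       # character
  peer_rev = c(TRUE, TRUE, FALSE),            # logical
  risk     = factor(c("low", "high", "medium"),
                    levels = c("low", "medium", "high")),  # factor
  stringsAsFactors = FALSE                    # keep text columns as character
)

# str() prints one line per column, including the type abbreviation
str(toy)

# class() gives the full type name of each column
sapply(toy, class)

# A factor stores integer codes behind its labels
as.integer(toy$risk)
```

The last line shows the integer codes R uses internally for the factor levels.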

3.4.1 Converting to factors

Let’s look at the different kinds of interventions, df$interventioncode. We can have a look at this variable by typing the name of our dataset, then adding the selector $, and then adding the name of the variable we want to look at. This variable is currently a character vector (text). We want it to be a factor, that is, a categorical variable.

To convert this to a factor variable now, we use the factor() function.

df$interventioncode <- factor(df$interventioncode)
df$interventioncode

We now see that the variable has been converted to a factor with the levels “Acts of Kindness”, “Other”, and “Prosocial Spending”.
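By default, factor() sorts the levels alphabetically. If a different order is preferable (for example, because many modelling functions treat the first level as the reference category), the levels argument can be used to set it. The sketch below uses a short made-up vector rather than the real df, and the chosen order is only an illustration.

```r
# Hypothetical intervention codes, standing in for df$interventioncode
interventioncode <- c("Prosocial Spending", "Other",
                      "Acts of Kindness", "Other")

# Default: levels are sorted alphabetically
f_default <- factor(interventioncode)
levels(f_default)

# Custom order: "Other" becomes the first level
f_custom <- factor(interventioncode,
                   levels = c("Other", "Acts of Kindness",
                              "Prosocial Spending"))
levels(f_custom)

# Count how often each level occurs
table(f_custom)
```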

3.4.2 Selecting specific studies

It may often come in handy to select certain studies for further analyses, or to exclude some studies in further analyses (e.g., if they are outliers).

To do this, we can use the filter function in the dplyr package, which is part of the tidyverse package we installed before.

So, let’s load the package first.

library(dplyr)

Let’s say we want to do a Meta-Analysis only with studies conducted, or partly conducted, in the USA. To do this, we need to create a new dataset containing only these studies using the dplyr::filter() function. The dplyr:: part is necessary because there is more than one filter function in R (for example, stats::filter()), and we want to use the one from the dplyr package.

The R code to store these studies in a new dataset called df_usa looks like this:

df_usa <- dplyr::filter(df, location %in% c("USA", "USA/Korea"))

Note that the %in% operator tells the filter function to keep only those cases whose location is included in the vector c("USA", "USA/Korea"). Now, let’s have a look at the new data df_usa we just created.

study_id	effect_id	d	vi	n1i	n2c	sex	age	location	donor	donorcode	interventioniv	interventioncode	control	controlcode	recipients	outcomedv	outcomecode
Aknin, Fleerackers, et al. (2014)	6	0.38	0.0342225	60	59	41	19.90	USA	Typical	Typical	Prosocial Purchase	Prosocial Spending	Personal Purchase	Self Help	Anonymous Sick Children	PANAS	PN Affect
Aknin, Fleerackers, et al. (2014)	7	0.44	0.0344293	60	59	41	19.90	USA	Typical	Typical	Prosocial Purchase	Prosocial Spending	Personal Purchase	Self Help	Anonymous Sick Children	ORH	Other
Donnelly, Grant, et al. (2017) Study 1	21	0.77	0.0373841	59	56	52	22.57	USA	Typical	Typical	Social recycling	Other	Trash/Recycling	Neutral Activity	Unknown lab workers	H	Happiness
Donnelly, Grant, et al. (2017) Study 1	22	0.85	0.0369597	59	59	52	22.57	USA	Typical	Typical	Social recycling	Other	Take item	Self Help	Unknown lab workers	H	Happiness
Donnelly, Grant, et al. (2017) Study 2b	23	1.25	0.0222388	107	108	50	37.77	USA	Typical	Typical	Social recycling	Other	Trash	Neutral Activity	Unknown lab workers	PA	PN Affect
Layous, Kurtz, J, et al. (under review) Study 1	27	0.08	0.0288015	70	69	16	18.55	USA	Typical	Typical	AK	Acts of Kindness	Track daily activity	Neutral Activity	Someone known	SHS	Happiness
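To make the role of %in% more concrete, the following sketch applies it to a short made-up vector of locations (not the real location column). The operator returns one TRUE or FALSE per element, and filter keeps the rows for which the result is TRUE.

```r
# Hypothetical locations, standing in for df$location
location <- c("USA", "Canada", "USA/Korea", "Germany")

# TRUE where the location is one of the listed values
keep <- location %in% c("USA", "USA/Korea")
keep

# Using the logical vector to pick the matching elements
location[keep]
```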

Note that the function can be used with any other type of data and variable. For example, we can use it to select only studies where the donors were “typical”:

df_typical <- dplyr::filter(df, donorcode == "Typical")
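The same function can also exclude studies, for instance outliers, by negating a condition with ! or using !=. The following is a minimal sketch on a made-up data frame; the study names, effect sizes, and the choice of "Study C" as an outlier are all hypothetical. It assumes the dplyr package is installed.

```r
library(dplyr)

# Toy data (hypothetical studies and effect sizes)
toy <- data.frame(
  study_id = c("Study A", "Study B", "Study C", "Study D"),
  d        = c(0.38, 0.44, 2.90, 0.08),
  stringsAsFactors = FALSE
)

# Names of the studies we want to drop (assumed outliers)
outliers <- c("Study C")

# Keep every row whose study_id is NOT in the outlier vector
toy_no_outliers <- dplyr::filter(toy, !(study_id %in% outliers))
toy_no_outliers

# Several conditions separated by commas must all be TRUE
toy_small <- dplyr::filter(toy, d < 0.40, study_id != "Study D")
toy_small
```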

3.4.3 Changing cell values

Sometimes, even when you have prepared your data in Excel, you might want to change values in RStudio once you have imported your data.

To do this, we have to select a cell in our data frame in RStudio. This can be done by adding [x,y] to our dataset name, where x signifies the number of the row we want to select, and y signifies the number of the column.

To see how this works, let’s select a single cell using this command first:

df[8,1]

## [1] "Aknin, Hamlin, et al. (2012) "

We now see the value stored in row 8 of our data frame for column 1 (the study name). Let’s say we had a typo in this name and want to have it changed. In this case, we have to give this exact cell a new value.

df[8,1] <- "Aknin, et al. (2012)"

Let’s check if the name has changed.

df[8,1]

## [1] "Aknin, et al. (2012)"

You can use the same approach to change any other type of data, including numeric and logical values. Only character values have to be put in quotation marks ("").
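Counting column positions can be error-prone in wide data frames, so a column can also be addressed by its name inside the brackets. The sketch below shows this for a numeric, a logical, and a character value; the data frame toy and its contents are made up for illustration.

```r
# Toy data (hypothetical values)
toy <- data.frame(
  study_id  = c("Study A", "Study B"),
  d         = c(0.38, 0.44),
  published = c(TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Numeric value: no quotation marks
toy[2, "d"] <- 0.41

# Logical value: no quotation marks either
toy[2, "published"] <- FALSE

# Character value: quotation marks are required.
# Here the row is selected by a condition instead of its number.
toy[toy$study_id == "Study A", "study_id"] <- "Study A (2014)"

toy
```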