Caspar J. Van Lissa
Utrecht University | Open Science Community Utrecht
Tilburg University
Funded by NWO Veni grant VI.Veni.191G.090
Machine learning can help advance theory formation in developmental psychology
Many preregistered hypotheses are not supported (Scheel, Schijen, and Lakens 2021)
Theory: A model that describes the nature of relations between several phenomena in sufficient detail that it allows for the derivation of quantifiable hypotheses that can be subject to (severe) testing.
Why should statisticians care about theory?
De Groot’s empirical cycle: a model of knowledge production through scientific research (de Groot 1961)
How can we improve exploratory research?
Role of flexibility:
“unguided exploration” is effortful and inflates the risk of spurious results
\[ FDR = \frac{FP}{TP + FP} \]
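As a worked illustration of the formula above, the following minimal sketch computes the expected false discovery rate for a batch of tested hypotheses. The function name, the number of hypotheses, the share of true hypotheses, the significance level, and the power are all hypothetical values chosen for illustration; none of them come from the slides.

```python
def false_discovery_rate(n_hypotheses, prior_true, alpha, power):
    """Expected FDR = FP / (TP + FP) for a batch of tested hypotheses."""
    n_true = n_hypotheses * prior_true   # hypotheses where an effect exists
    n_null = n_hypotheses - n_true       # hypotheses where no effect exists
    tp = power * n_true                  # expected true positives
    fp = alpha * n_null                  # expected false positives
    return fp / (tp + fp)


# Hypothetical scenarios (assumed numbers): exploration where half of the
# tested effects are real versus exploration where only 5% are real.
guided = false_discovery_rate(1000, prior_true=0.5, alpha=0.05, power=0.8)
unguided = false_discovery_rate(1000, prior_true=0.05, alpha=0.05, power=0.8)

print(f"guided:   FDR = {guided:.3f}")    # 25 / 425
print(f"unguided: FDR = {unguided:.3f}")  # 47.5 / 87.5
```

Under these assumed numbers, lowering the share of true hypotheses from 50% to 5% raises the expected FDR from roughly 6% to roughly 54%, which is one way to see why unguided exploration inflates the risk of spurious results.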
Machine learning:
Psychology can benefit from its superior predictive performance (Yarkoni and Westfall 2017)
NEW: Implications for the theory crisis have not yet been discussed
First step in Theory Construction Methods is identifying relevant phenomena (Borsboom et al. 2020)
Text mining may be suitable for phenomena detection (van Lissa 2021)
Every study examines only a piece of the puzzle; we never see the complete picture
Machine learning accommodates more predictors than classical methods
Including all relevant predictors is important:
Many machine learning methods accommodate:
Psychological theories rarely account for complex effects
Some machine learning methods incorporate theoretical elements
When the theory is a (nomological) network:
All of these methods allow for theory-guided exploration using machine learning
Explain heterogeneity at a more fine-grained level than the whole sample
When using data models to guide theory formation, generalizability is essential
Checks and balances to curtail overfitting
Naive interpretative approach
Compare the predictive performance of a simpler parametric model that represents these ‘insights’ with that of the machine learning model
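The comparison described above could be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the data are simulated with a quadratic effect, k-nearest-neighbour regression stands in for "the machine learning model", and a one-predictor least-squares model on x squared stands in for the simpler parametric model that encodes the insight. All function names are hypothetical.

```python
import random

random.seed(1)

# Simulated data (assumption): a quadratic effect plus Gaussian noise.
xs = [random.uniform(-3, 3) for _ in range(200)]
ys = [x ** 2 + random.gauss(0, 0.5) for x in xs]


def fit_quadratic(train_x, train_y):
    """Least squares for y = a + b * x^2: the simple model encoding the insight."""
    z = [x ** 2 for x in train_x]
    mean_z = sum(z) / len(z)
    mean_y = sum(train_y) / len(train_y)
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, train_y))
    szz = sum((zi - mean_z) ** 2 for zi in z)
    b = szy / szz
    a = mean_y - b * mean_z
    return lambda x: a + b * x ** 2


def fit_knn(train_x, train_y, k=5):
    """k-nearest-neighbour regression: a stand-in for a flexible ML model."""
    pairs = list(zip(train_x, train_y))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / k

    return predict


def cv_mse(fit, x, y, n_folds=5):
    """Mean squared prediction error on held-out folds (k-fold cross-validation)."""
    idx = list(range(len(x)))
    sq_errors = []
    for fold in range(n_folds):
        test = idx[fold::n_folds]          # every n_folds-th case is held out
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = fit([x[i] for i in train], [y[i] for i in train])
        sq_errors += [(y[i] - model(x[i])) ** 2 for i in test]
    return sum(sq_errors) / len(sq_errors)


mse_param = cv_mse(fit_quadratic, xs, ys)
mse_knn = cv_mse(fit_knn, xs, ys)
print(f"parametric model, cross-validated MSE: {mse_param:.3f}")
print(f"k-NN model,       cross-validated MSE: {mse_knn:.3f}")
```

If the parametric model predicts held-out cases about as well as the flexible model, the insight it encodes plausibly captures most of the generalizable pattern; a large gap suggests the flexible model is picking up structure the insight misses.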
Data models -> formal theory (Haslbeck et al. 2021)
Developmentally sensitive period (Zimmermann and Iwanski 2014)
20% develop psychopathology (Lee et al. 2014)
Potentially lifelong implications for mental health and well-being
Substantial empirical research, but no overarching theoretical framework (Buss, Cole, and Zhou 2019)
First step: identifying relevant phenomena (Borsboom et al. 2020)
We conducted a text mining systematic review (TMSR) (van Lissa 2021)
6653 papers addressing emotion regulation in populations overlapping with adolescence (ages 10-24)
<doi.org/10.1007/s40894-021-00160-7>
All code and data available at https://github.com/cjvanlissa/veni_sysrev
Workflow for Open Reproducible Code in Science (WORCS) used to make analyses reproducible (Van Lissa et al. 2020)
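To give a sense of how text mining can surface candidate phenomena, here is a deliberately simplified sketch that counts how often terms occur, and co-occur, across abstracts. It is not the pipeline used in the review (that code is in the repository linked above); the mini-corpus, the term dictionary, and the naive substring matching are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus and term dictionary (not data from the review).
abstracts = [
    "emotion regulation and parenting in early adolescence",
    "peer relationships predict emotion regulation and depression",
    "parenting, depression, and emotion regulation in adolescents",
]
terms = ["emotion regulation", "parenting", "depression", "peer"]

term_freq = Counter()      # number of abstracts mentioning each term
cooccurrence = Counter()   # number of abstracts mentioning both terms of a pair

for text in abstracts:
    # Naive substring matching; sorting keeps each pair in a fixed order.
    present = sorted(t for t in terms if t in text.lower())
    term_freq.update(present)
    cooccurrence.update(combinations(present, 2))

for term, n in term_freq.most_common():
    print(f"{term}: {n}")
for (a, b), n in cooccurrence.most_common():
    print(f"{a} -- {b}: {n}")
```

Frequent terms are candidate phenomena, and frequent pairs are candidate edges in a network of phenomena. A real analysis would need proper tokenization, synonym handling, and a principled threshold for which terms and pairs to retain.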
Phenomena relevant to adolescents’ emotion regulation according to theory (a) and narrative reviews (b; transparent nodes indicate constructs also present in the theory).