Caspar J. Van Lissa

Utrecht University | Open Science Community Utrecht
Tilburg University
Funded by NWO Veni grant VI.Veni.191G.090

Conceptual introduction

Central thesis

Machine learning can help advance theory formation in developmental psychology

  • Replication crisis (Scheel 2022; Lavelle 2021)
  • Solutions improve deductive (theory-testing) research practices
    • questionable research practices (QRPs)
    • replication
    • preregistration

Many preregistered hypotheses are not supported (Scheel, Schijen, and Lakens 2021)

  • Lack of “good theories” is another explanation for the replication crisis

Defining theory

Theory: A model that describes the nature of relations between several phenomena in sufficient detail that it allows for the derivation of quantifiable hypotheses that can be subject to (severe) testing.

  • most psychological theories fall short of this definition
    • insufficiently precise
    • too flexible
  • Compare to Walasek, Frankenhuis, and Panchanathan (2022)

Why should statisticians care about theory?

  • It dictates what models are sensible
  • Useful as an instrument of cumulative knowledge acquisition
  • Communicating social scientific research to its consumers

Deduction vs induction

  • Lack of theory cannot be overcome by improving deductive research
  • Theory formation requires inductive (exploratory) research (Creswell and Clark 2017)

Empirical cycle

De Groot’s empirical cycle: a model of knowledge production through scientific research (de Groot 1961)

Rigorous exploration

How can we improve exploratory research?

Role of flexibility:

  • Confirmatory methods poorly suited to exploratory research
    • p-values: Type I error is 5%, but false discovery rate (FDR) is 14-50% (Vidgen and Yasseri 2016).
    • Model fit indices: manually specify models, no guarantee that best model is included

“unguided exploration” is effortful and inflates the risk of spurious results

\[ FDR = \frac{FP}{TP + FP} \]
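
As a worked illustration of the formula above, the sketch below computes the expected FDR for a batch of significance tests. The power and the proportion of tested hypotheses that are true (`prior_true`) are hypothetical inputs chosen for illustration; they are not taken from Vidgen and Yasseri (2016).

```python
def false_discovery_rate(alpha: float, power: float, prior_true: float) -> float:
    """Expected FDR = FP / (TP + FP) among significant results.

    alpha:      Type I error rate per test
    power:      probability of detecting a true effect (assumed value)
    prior_true: proportion of tested hypotheses that are true (assumed value)
    """
    false_positives = alpha * (1.0 - prior_true)
    true_positives = power * prior_true
    return false_positives / (true_positives + false_positives)


# Hypothetical scenarios: the same 5% Type I error rate gives very
# different FDRs depending on how many tested hypotheses are true.
for prior_true in (0.5, 0.1):
    fdr = false_discovery_rate(alpha=0.05, power=0.8, prior_true=prior_true)
    print(f"prior_true={prior_true:.1f}  FDR={fdr:.2f}")
```

Under these assumed inputs, the FDR is about 0.06 when half of the tested hypotheses are true and 0.36 when only one in ten is true, which is why exploration across many candidate effects inflates the share of spurious findings.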

Machine learning

Machine learning:

  • automated model building
  • checks and balances to prevent overfitting
  • maximize predictive performance and generalizability of results

Psychology can benefit from its superior predictive performance (Yarkoni and Westfall 2017)

  • E.g., personalized (mental) health care, automated assessment aids

NEW: Implications for the theory crisis have not yet been discussed

Unguided vs rigorous exploration

Phenomena detection

First step in Theory Construction Methods is identifying relevant phenomena (Borsboom et al. 2020)

  • Phenomena: stable and general features of the world
  • Few quantitative methods for phenomena detection (aside from expert opinion)

Text mining may be suitable for phenomena detection (van Lissa 2021)

  • Unsupervised learning method
  • Clusters in keywords and abstracts
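
To make "clusters in keywords" concrete, the minimal sketch below counts how often pairs of keywords co-occur across papers; frequently co-occurring pairs are candidate edges in a keyword network. The keyword lists are invented for illustration, and this is not the pipeline used in van Lissa (2021).

```python
from collections import Counter
from itertools import combinations

# Hypothetical author keywords for four papers (invented for illustration).
papers = [
    ["emotion regulation", "parenting", "adolescence"],
    ["emotion regulation", "parenting"],
    ["emotion regulation", "peers", "adolescence"],
    ["parenting", "adolescence"],
]


def cooccurrence_counts(keyword_lists):
    """Count how many papers mention each unordered pair of keywords."""
    counts = Counter()
    for keywords in keyword_lists:
        # sorted(set(...)) removes duplicates and gives each pair a stable order
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


counts = cooccurrence_counts(papers)
for (first, second), n_papers in counts.most_common(3):
    print(f"{first} -- {second}: {n_papers}")
```

A real analysis would additionally need text cleaning and a principled threshold for which co-occurrences to retain; both are omitted here.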

Holistic approach

Every study examines only a piece of the puzzle; we never see the complete picture

Machine learning accommodates more predictors than classical methods

  • Regularization
  • Variable selection
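
The sketch below illustrates how regularization can double as variable selection, using a LASSO (L1) penalty fitted by coordinate descent. It is a minimal stdlib-only illustration under stated assumptions: the data are simulated, predictors and outcome are taken to be roughly centered (no intercept is fitted), and the penalty value is arbitrary rather than tuned.

```python
import random


def soft_threshold(value: float, penalty: float) -> float:
    """Shrink value toward zero; values within +/- penalty become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso(X, y, penalty, n_sweeps=50):
    """Coordinate descent for (1/2n) * RSS + penalty * sum(|b_j|), no intercept."""
    n, p = len(X), len(X[0])
    coefs = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # covariance of predictor j with the residual excluding j
            norm = 0.0  # sum of squares of predictor j
            for i in range(n):
                partial = sum(X[i][k] * coefs[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            coefs[j] = soft_threshold(rho / n, penalty) / (norm / n)
    return coefs


# Simulated data (assumption): only the first of five predictors matters.
rng = random.Random(0)
n, p = 200, 5
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] + rng.gauss(0, 0.5) for row in X]

coefs = lasso(X, y, penalty=0.3)
print([round(c, 2) for c in coefs])
```

The penalty shrinks the coefficient of the relevant predictor and sets the coefficients of the irrelevant predictors to exactly zero. In practice the penalty would be chosen by cross-validation rather than fixed in advance.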

Including all relevant predictors is important:

  • Assumption for causal interpretation
    • Multicollinearity
  • Good theory incorporates most important causes
    • Cannot assess the relative utility across studies
  • Include potential predictors from various theories, and undertheorized factors

Complex effects

Many machine learning methods accommodate:

  • non-linear effects
  • higher-order interactions, without having to specify the nature of these effects a priori
  • Automatic: tree-based methods
  • Manual: penalized methods

Psychological theories rarely account for complex effects

  • Machine learning can provide nuance by revealing them

Theoretical elements

Some machine learning methods incorporate theoretical elements

  • E.g.: assumption that development follows a latent growth curve (LGC).
  • SEM forests (Brandmaier et al. 2016)
  • regularized SEM (Jacobucci, Grimm, and McArdle 2016)

When the theory is a (nomological) network:

  • LASSO-penalized Gaussian graphical model (GGM) (Epskamp, Rhemtulla, and Borsboom 2017)
  • E.g.: network theory of major depression (Cramer et al. 2016)

All of these methods allow for theory-guided exploration using machine learning

Person-centered approaches

Explain heterogeneity at a more fine-grained level than the whole sample

  • E.g., latent class analyses
    • “Which individuals are similar?”
    • Other unsupervised learning methods exist
  • RI-CLPM and DSEM
    • Rarely explain heterogeneity in within-person effects
    • Regularization
  • Tree-based models
    • Group individuals based on predictors to maximize homogeneity of the outcome
    • “Why are these individuals similar?”
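
The core step of a tree-based model, grouping individuals on a predictor so that the outcome is as homogeneous as possible within groups, can be sketched as a search for the single best split. The data and variable names (`age`, `support`, `regulation`) are hypothetical, and real tree algorithms apply this step recursively with additional stopping rules.

```python
def sum_of_squares(values):
    """Within-group heterogeneity: squared deviations from the group mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, outcome, predictors):
    """Return (predictor, threshold, heterogeneity) of the best two-group split."""
    best = None
    for name in predictors:
        # Every observed value except the largest is a candidate threshold.
        for threshold in sorted({row[name] for row in rows})[:-1]:
            left = [row[outcome] for row in rows if row[name] <= threshold]
            right = [row[outcome] for row in rows if row[name] > threshold]
            heterogeneity = sum_of_squares(left) + sum_of_squares(right)
            if best is None or heterogeneity < best[2]:
                best = (name, threshold, heterogeneity)
    return best


# Hypothetical toy data: the outcome depends on parental support, not on age.
rows = [
    {"age": 12, "support": 1, "regulation": 2.0},
    {"age": 16, "support": 1, "regulation": 2.2},
    {"age": 13, "support": 2, "regulation": 2.1},
    {"age": 17, "support": 4, "regulation": 4.0},
    {"age": 12, "support": 5, "regulation": 4.1},
    {"age": 15, "support": 5, "regulation": 3.9},
]
split = best_split(rows, "regulation", ["age", "support"])
print(split)
```

For these toy data the search selects `support` with a threshold of 2, so the resulting groups are defined by a predictor, which is what allows the "why" question to be asked.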


When using data models to guide theory formation, generalizability is essential

Checks and balances to curtail overfitting

  • cross-validation to estimate predictive accuracy
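
A minimal sketch of k-fold cross-validation follows, assuming a simple linear regression as the model and simulated data. Each observation is predicted by a model that was fitted without it, so the resulting error estimates out-of-sample predictive accuracy.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_mse(xs, ys, k=5):
    """Mean squared prediction error on folds withheld from model fitting."""
    indices = list(range(len(xs)))
    random.Random(1).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]
    squared_errors = []
    for test_fold in folds:
        held_out = set(test_fold)
        train = [i for i in indices if i not in held_out]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors += [
            (ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_fold
        ]
    return sum(squared_errors) / len(squared_errors)


# Simulated data (assumption): a weak linear effect plus unit-variance noise.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]
cv_mse = cross_validated_mse(xs, ys)
print(round(cv_mse, 2))
```

The same loop applies to any model with a fit and a predict step; only `fit_line` and the prediction expression would change.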

From models to theory

Naive interpretative approach

  • Variable importance metrics
    • congruent with theoretical assumptions about important predictors?
    • any theoretically important predictors rank low?
    • any undertheorized factors rank high?
  • Marginal associations
    • non-linear effects?
    • high importance but flat marginal association?
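
One common, model-agnostic variable importance metric is permutation importance: the increase in prediction error after one predictor is shuffled. The sketch below is an illustration under stated assumptions; `predict` is a hypothetical stand-in for a fitted machine learning model, and the data are simulated.

```python
import random


def mean_squared_error(predict, X, y):
    """Average squared difference between predictions and observed outcomes."""
    return sum((predict(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, column, seed=0):
    """Increase in prediction error after shuffling one predictor column."""
    baseline = mean_squared_error(predict, X, y)
    shuffled = [row[column] for row in X]
    random.Random(seed).shuffle(shuffled)
    X_permuted = [
        row[:column] + [value] + row[column + 1:]
        for row, value in zip(X, shuffled)
    ]
    return mean_squared_error(predict, X_permuted, y) - baseline


# Simulated data (assumption): the outcome is a non-linear function of the
# first predictor; the second predictor is irrelevant.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
y = [row[0] ** 2 for row in X]


def predict(row):
    """Hypothetical stand-in for a fitted model that uses only predictor 0."""
    return row[0] ** 2


importance_relevant = permutation_importance(predict, X, y, column=0)
importance_irrelevant = permutation_importance(predict, X, y, column=1)
print(importance_relevant, importance_irrelevant)
```

Shuffling the irrelevant predictor leaves the error unchanged, whereas shuffling the relevant one increases it. To guard against overfitting, the metric is best computed on data that were not used to fit the model.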

Compare the predictive performance of a simpler parametric model that represents these ‘insights’ to that of the machine learning model

From models to theory 2

Data models -> formal theory (Haslbeck et al. 2021)

  • abductive formal theory construction (AFTC) framework
    • construct formal theory, generate data, mathematically compare to empirical data
    • Discrepancies: amend formal theory


  • There is a paucity of good theory
  • Need for exploratory research for theory formation
  • Machine learning for rigorous exploration
    • automates model building
    • incorporates checks and balances for generalizable results
  • Unsupervised learning can assist in phenomena detection
  • Supervised learning to identify important predictors
  • Some algorithms incorporate basic theoretical elements

Applied examples

Emotion regulation in adolescence

Developmentally sensitive period (Zimmermann and Iwanski 2014)

20% develop psychopathology (Lee et al. 2014)

Potentially lifelong implications for mental health and well-being

Substantial empirical research, but no overarching theoretical framework (Buss, Cole, and Zhou 2019)

Towards integrative theory

First step: identifying relevant phenomena (Borsboom et al. 2020)

We conducted a text mining systematic review (TMSR) (van Lissa 2021)

  • Narrative reviews: small samples, confirmation bias, emphasize positive results (Littell 2008)
  • TMSR: Unlimited sample size, transparent, objective, reproducible

6653 papers addressing emotion regulation in a population whose age range overlaps with adolescence [10-24]


Open science

Baseline network

Phenomena relevant to adolescents' emotion regulation according to theory (a) and narrative reviews (b; transparent nodes indicate constructs also present in the theory).


TMSR results