In my Stats 1 class, many exercises describe situations like:
Grades are normally distributed, with \(\mu = 6, \sigma = 2\)
A student said: This is a stupid question, but are grades normally distributed?
Utrecht University, dept. Methodology & Statistics
Actual grades of the 347 students:
Latent variable | Continuous indicators | Categorical indicators |
---|---|---|
Continuous | Factor analysis | IRT |
Categorical | Mixture model | Latent class analysis |
Analysis goals:
Approach:
Pitfalls:
library(tidyLPA)
grades %>%
  estimate_profiles(1:2)
## tidyLPA analysis using mclust:
##
##  Model Classes AIC     BIC     Entropy prob_min prob_max n_min n_max
##  1     1       1022.35 1029.59 1.00    1.00     1.00     1.00  1.00
##  1     2       993.73  1008.21 0.74    0.91     0.94     0.43  0.57
##  BLRT_p
##
##  0.01
grades %>%
  estimate_profiles(1:2) %>%
  compare_solutions()
## Compare tidyLPA solutions:
##
##  Model Classes BIC
##  1     1       1029.590
##  1     2       1008.211
##
## Best model according to BIC is Model 1 with 2 classes.
##
## An analytic hierarchy process, based on the fit indices AIC, AWE, BIC, CLC, and KIC (Akogul & Erisoglu, 2017), suggests the best solution is Model 1 with 2 classes.
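The BIC that drives this comparison is the log-likelihood penalized by the number of free parameters k and the sample size n: BIC = -2 logLik + k log(n), where lower is better. The sketch below computes it by hand for a one-class model (a single normal distribution, k = 2) and cross-checks the result against R's built-in `BIC()`. The `scores` vector is simulated for illustration and is not the grades data; the helper name `bic` is likewise an assumption. Under tidyLPA's Model 1 (equal variances, zero covariances), a two-class model for one variable would have k = 4: two means, one shared variance, and one class proportion.

```r
# BIC = -2 * logLik + k * log(n); lower values are preferred
bic <- function(loglik, k, n) -2 * loglik + k * log(n)

set.seed(42)
# Hypothetical scores drawn from two groups (NOT the real grades)
scores <- c(rnorm(120, mean = 4, sd = 1), rnorm(160, mean = 7.5, sd = 1))
n <- length(scores)

# One-class model: a single normal distribution, k = 2 (mean and variance)
mu_hat <- mean(scores)
sd_hat <- sqrt(mean((scores - mu_hat)^2))  # maximum likelihood SD (divides by n)
loglik_1 <- sum(dnorm(scores, mean = mu_hat, sd = sd_hat, log = TRUE))
bic_1 <- bic(loglik_1, k = 2, n = n)

# Cross-check: an intercept-only linear model is the same one-class model
all.equal(bic_1, BIC(lm(scores ~ 1)))
```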
Show model parameters
Show parameter uncertainty
Show classification uncertainty
If the number of classes is exploratory…
These visualizations…
id_edu[, c("com3", "exp3")] %>% estimate_profiles(1:4) %>% plot_profiles()
id_edu[, c("com3", "exp3")] %>% estimate_profiles(1:4) %>% plot_density()
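Classification uncertainty can also be summarized in a single number. The Entropy column in the tidyLPA output is a relative entropy: 1 means every case is assigned to a class with certainty, 0 means the posterior class probabilities are no better than chance. The sketch below uses the commonly reported definition, 1 minus the summed -p log(p) terms divided by n log(K). Whether tidyLPA computes its value in exactly this way is an assumption, and `relative_entropy` and the two small probability matrices are hypothetical.

```r
# Relative entropy from an n x K matrix of posterior class probabilities
relative_entropy <- function(post) {
  n <- nrow(post)
  K <- ncol(post)
  if (K == 1) return(1)  # a single class is always classified with certainty
  # Treat 0 * log(0) as 0
  terms <- ifelse(post > 0, -post * log(post), 0)
  1 - sum(terms) / (n * log(K))
}

# Hypothetical posteriors: perfectly certain vs. completely uncertain
certain <- rbind(c(1, 0), c(0, 1), c(1, 0))
unsure  <- matrix(0.5, nrow = 3, ncol = 2)

relative_entropy(certain)            # 1
round(relative_entropy(unsure), 10)  # 0
```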
Describe heterogeneity in developmental trajectories
Analysis goals:
# Load MplusAutomation
library(MplusAutomation)
# Do the latent class growth analysis
createMixtures(classes = 1:4,
               filename_stem = "growth",
               model_overall = "i s q | ec1@0 ec2@1 ec3@2 ec4@3 ec5@4 ec6@5; i@0; s@0; q@0",
               rdata = empathy[, 1:6],
               ANALYSIS = "PROCESSORS = 2;")
runModels(filefilter = "growth")
results_growth <- readModels(filefilter = "growth")
mixtureSummaryTable(results_growth)
##       Title Classes      AIC      BIC     aBIC Entropy T11_VLMR_PValue
## 1 1 classes       1 4951.896 4989.213 4960.649      NA              NA
## 2 2 classes       2 4172.444 4226.347 4185.088   0.819          0.0000
## 3 3 classes       3 3938.068 4008.556 3954.601   0.811          0.0015
## 4 4 classes       4 3851.840 3938.913 3872.264   0.838          0.0167
##   T11_LMR_PValue BLRT_PValue min_N max_N min_prob max_prob
## 1             NA          NA   467   467    1.000    1.000
## 2         0.0000           0   200   267    0.931    0.961
## 3         0.0019           0    93   238    0.880    0.926
## 4         0.0189           0    18   221    0.888    0.909
plotGrowthMixtures(results_growth, rawdata = TRUE)
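To read such a plot, it may help to see how the growth factors translate into a curve. In the syntax `i s q | ec1@0 ... ec6@5`, the time scores 0 to 5 imply an expected score of i + s·t + q·t² at wave t, and fixing the variances to zero (`i@0; s@0; q@0`) means everyone in a class shares that class's curve. The sketch below illustrates this with made-up growth factor means; the class labels, the values, and the helper `expected_trajectory` are assumptions, not estimates from the empathy data.

```r
# Expected score at each wave for a quadratic growth curve.
# Time scores 0:5 mirror the loadings ec1@0 ... ec6@5 in the Mplus syntax.
expected_trajectory <- function(i, s, q, time = 0:5) {
  i + s * time + q * time^2
}

# Hypothetical growth factor means for two classes (NOT estimated values)
class_means <- data.frame(
  class = c("increasing", "stable"),
  i = c(2.5, 3.5),
  s = c(0.30, 0.00),
  q = c(-0.02, 0.00)
)

# One row per class, one column per measurement wave
trajectories <- t(mapply(expected_trajectory,
                         class_means$i, class_means$s, class_means$q))
dimnames(trajectories) <- list(class_means$class, paste0("ec", 1:6))
trajectories
```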
Describe state changes
createMixtures(
  classes = 4,
  filename_stem = "lta",
  model_overall = "c2 ON c1;",
  model_class_specific = c(
    "[com3] (com{C}); [exp3] (exp{C});",
    "[com5] (com{C}); [exp5] (exp{C});"
  ),
  rdata = id_edu[, c("com3", "exp3", "com5", "exp5")]
)
runModels(filefilter = "lta")
# createMixtures() names files <stem>_<classes>_class, so the 4-class model above is lta_4_class
lta_id <- readModels("lta_4_class.out")
plotLTA(lta_id)
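The statement `c2 ON c1` regresses class membership at the second wave on membership at the first, which amounts to a table of transition probabilities: given the class someone starts in, how likely is each class at the next wave? As a rough intuition only, the sketch below builds such a table from hard class assignments. The assignments are invented, and a real latent transition analysis estimates these probabilities within the model rather than from most-likely class membership.

```r
# Hypothetical most-likely class memberships at two waves (NOT real data)
wave1 <- factor(c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3), levels = 1:3)
wave2 <- factor(c(1, 1, 2, 2, 2, 2, 3, 3, 3, 1), levels = 1:3)

# Cross-tabulate and convert each row to proportions:
# row = class at wave 1, column = class at wave 2
counts <- table(from = wave1, to = wave2)
transitions <- prop.table(counts, margin = 1)
round(transitions, 2)
```

Each row sums to 1, and the diagonal shows the proportion who stay in the same class.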
Good practices:
Mixture model estimated using EM algorithm
Two things must be estimated:
The problem is that these two are interdependent
So, we go back and forth:
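The back-and-forth can be sketched in base R for the simplest case, one variable and two classes. This is a minimal illustration on simulated data, not how mclust or Mplus implement estimation; the function name `em_two_normals`, the median-split starting values, and the simulated scores are all assumptions made for the example.

```r
# Minimal EM sketch for a two-class univariate normal mixture
em_two_normals <- function(x, max_iter = 500, tol = 1e-8) {
  # Crude starting values: split the sample at the median
  mu    <- c(mean(x[x <= median(x)]), mean(x[x > median(x)]))
  sigma <- rep(sd(x), 2)
  prop  <- c(0.5, 0.5)
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E-step: given the parameters, compute each person's posterior
    # probability of belonging to each class
    dens <- cbind(prop[1] * dnorm(x, mu[1], sigma[1]),
                  prop[2] * dnorm(x, mu[2], sigma[2]))
    post <- dens / rowSums(dens)
    # Log-likelihood under the parameters used in this E-step
    loglik <- sum(log(rowSums(dens)))
    # M-step: given the class probabilities, update the parameters
    n_k   <- colSums(post)
    prop  <- n_k / length(x)
    mu    <- colSums(post * x) / n_k
    sigma <- sqrt(colSums(post * cbind(x - mu[1], x - mu[2])^2) / n_k)
    # Stop when the log-likelihood no longer improves
    if (loglik - loglik_old < tol) break
    loglik_old <- loglik
  }
  list(prop = prop, mu = mu, sigma = sigma, posterior = post,
       loglik = loglik, iterations = iter)
}

set.seed(1)
# Hypothetical scores from two groups (NOT the real grades)
x <- c(rnorm(150, mean = 4, sd = 1), rnorm(200, mean = 7.5, sd = 1))
fit <- em_two_normals(x)
round(fit$mu, 2)    # estimated class means
round(fit$prop, 2)  # estimated class proportions
```

The `posterior` element holds the class probabilities per person, which is the same information that the classification uncertainty plots and the entropy statistic are based on.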