In my Stats 1 class, many exercises describe situations like:
Grades are normally distributed, with \(\mu = 6, \sigma = 2\)
A student said: This is a stupid question, but are grades normally distributed?
Utrecht University, dept. Methodology & Statistics
Actual grades of the 347 students:
| Latent variable | Observed: continuous | Observed: categorical |
|---|---|---|
| Continuous | Factor analysis | IRT |
| Categorical | Mixture model | Latent class analysis |
Analysis goals:
Approach:
Pitfalls:
library(tidyLPA)
grades %>% estimate_profiles(1:2)
## tidyLPA analysis using mclust:
##
##   Model Classes     AIC     BIC Entropy prob_min prob_max n_min n_max BLRT_p
## 1     1       1 1585.41 1593.11    1.00     1.00     1.00  1.00  1.00
## 2     1       2 1545.82 1561.22    0.74     0.88     0.96  0.49  0.51   0.01
grades %>% estimate_profiles(1:2) %>% compare_solutions()
## Warning: The solution with the minimum number of classes under consideration
## was considered to be the best solution according to one or more fit indices.
## Examine your results with care; consider adding a smaller number of classes.

## Warning: The solution with the maximum number of classes under consideration
## was considered to be the best solution according to one or more fit indices.
## Examine your results with care and consider estimating more classes.

## Compare tidyLPA solutions:
##
##  Model Classes      BIC
##      1       1 1593.111
##      1       2 1561.222
##
## Best model according to BIC is Model 1 with 2 classes.
##
## An analytic hierarchy process, based on the fit indices AIC, AWE, BIC, CLC,
## and KIC (Akogul & Erisoglu, 2017), suggests the best solution is Model 1
## with 2 classes.
grades %>%
estimate_profiles(1:2,
variances = c("equal", "varying"),
covariances = c("zero", "zero"))
estimate_profiles() calls Mclust() or mplusModeler().
Additional arguments are passed on to these functions.
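For instance, a back-end-specific argument can be supplied directly; a minimal sketch, assuming the default mclust back-end (here `prior = priorControl()` is an argument of mclust's Mclust(), and `grades` is the data frame used above):

```r
library(tidyLPA)
library(mclust)

# Assumption: arguments not used by estimate_profiles() itself are
# forwarded to the back-end, here mclust's Mclust(). priorControl()
# adds a regularizing prior, which can stabilize variance estimates.
grades %>%
  estimate_profiles(1:2, prior = priorControl())
```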
grades %>%
estimate_profiles(1:2) %>%
compare_solutions(statistics = c("AIC", "BIC", "KIC"))
Akogul & Erisoglu, 2017
Applied Saaty’s (1978) AHP to a range of model fit indices
Relative importance based on performance recovering true number of classes in simulations
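The weighting step above can be sketched in base R. Saaty's eigenvector method turns a pairwise-comparison matrix (how much more important index i is than index j) into normalized priority weights; the matrix below is purely illustrative, not the one estimated by Akogul & Erisoglu (2017):

```r
# Hypothetical pairwise-comparison matrix for three fit indices
# (e.g. BIC vs AIC vs KIC): entry [i, j] = importance of i over j,
# with reciprocals below the diagonal.
A <- matrix(c(1,   3,   5,
              1/3, 1,   3,
              1/5, 1/3, 1),
            nrow = 3, byrow = TRUE)

# Saaty's method: the principal eigenvector of A, normalized to
# sum to 1, gives the priority weight of each index.
w <- Re(eigen(A)$vectors[, 1])
w <- w / sum(w)
round(w, 2)
```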
Show model parameters
Show parameter uncertainty
Show classification uncertainty
If the number of classes is exploratory…
These visualizations…
grades %>% estimate_profiles(1:2) %>% plot_profiles()
grades %>% estimate_profiles(1:2) %>% plot_density()
id_edu[, c("com3", "exp3")] %>%
estimate_profiles(1:4) %>%
plot_profiles()
id_edu[, c("com3", "exp3")] %>%
estimate_profiles(1:4) %>%
plot_density()
Describe heterogeneity in developmental trajectories
Analysis goals:
# Load MplusAutomation
library(MplusAutomation)
# Do the latent class growth analysis
createMixtures(classes = 1:4,
filename_stem = "growth",
model_overall = "i s q | ec1@0 ec2@1 ec3@2 ec4@3 ec5@4 ec6@5;
i@0; s@0; q@0",
rdata = empathy[, 1:6],
ANALYSIS = "PROCESSORS = 2;")
runModels(filefilter = "growth")
results_growth <- readModels(filefilter = "growth")
mixtureSummaryTable(results_growth)
DOI: 10.1007/s10964-014-0152-5
plotGrowthMixtures(results_growth, rawdata = TRUE)
Describe state changes
results_id <- estimate_profiles(id_edu[, c("com3", "exp3")],
n_profiles = 1:4)
results_id
results_id2 <- estimate_profiles(id_edu[, c("com5", "exp5")],
n_profiles = 1:4)
results_id2
createMixtures(
classes = 4,
filename_stem = "lta",
model_overall = "c2 ON c1;",
model_class_specific = c(
"[com3] (com{C}); [exp3] (exp{C});",
"[com5] (com{C}); [exp5] (exp{C});"
),
rdata = id_edu[, c("com3", "exp3", "com5", "exp5")]
)
runModels(filefilter = "lta")
lta_id <- readModels("lta_4_class.out")
plotLTA(lta_id)
Good practices:
Mixture models are estimated using the EM (expectation-maximization) algorithm
Two things must be estimated: the class parameters, and each case's class membership
The problem is that these two are interdependent
So, we go back and forth between them until the estimates converge:
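The back-and-forth can be sketched in base R for a two-class Gaussian mixture on one variable. This is a minimal illustration with toy data (not the grades data), not what Mclust() does internally in full:

```r
# Toy data: two overlapping normal classes
set.seed(1)
x <- c(rnorm(100, mean = 4, sd = 1), rnorm(100, mean = 8, sd = 1))

# Initial guesses for the class parameters
p  <- 0.5            # mixing proportion of class 1
mu <- c(3, 9)
s  <- c(1, 1)

for (i in 1:50) {
  # E step: posterior class probabilities, given current parameters
  d1 <- p       * dnorm(x, mu[1], s[1])
  d2 <- (1 - p) * dnorm(x, mu[2], s[2])
  r  <- d1 / (d1 + d2)

  # M step: update parameters, given current class probabilities
  p     <- mean(r)
  mu[1] <- sum(r * x) / sum(r)
  mu[2] <- sum((1 - r) * x) / sum(1 - r)
  s[1]  <- sqrt(sum(r * (x - mu[1])^2) / sum(r))
  s[2]  <- sqrt(sum((1 - r) * (x - mu[2])^2) / sum(1 - r))
}

round(c(p = p, mu = mu, sd = s), 2)  # recovers roughly p = .5, means near 4 and 8
```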