-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path03-Results.Rmd
2693 lines (2356 loc) · 187 KB
/
03-Results.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
output: html_document
editor_options:
chunk_output_type: console
---
# Results
In this chapter, the relevant findings from every step performed are analyzed following the analytical strategy indicated in the previous chapter. First, latent class analysis for each country included in the sample selected is performed in order to identify the optimal number of classes for each country separately. Secondly, a global analysis is performed to identify how many latent classes are identified including all different countries, with this information it is possible to establish the number of classes that can be compared across countries. Finally, a multigroup latent class analysis is performed considering all the previous information. The multigroup analysis is constructed in multiple steps, the most restricted model until the less restricted model is evaluated. As a final step, a confirmatory latent class analysis is performed using some theoretical hypotheses that were defined based on the previous results.
This procedure is performed for the two scales that were used to create the Students' endorsement of equal rights and opportunities indicators separately.
```{r}
ISC_lvRlca <- data_model %>%
dplyr::select(all_of(Id), all_of(sample),
all_of(Scales)
#all_of(Schl_cate),
#all_of(Stud_cate),
#all_of(Scores)
)
cnt <- unique(ISC_lvRlca$IDCNTRY)
cbPalette <- brewer.pal(n = 8, name = "Dark2")
```
```{r}
load("data/MplusModels_LCA.RData")
load("data/MplusModels_ByCountry.RData")
load("data/MplusModels_MGCntry.RData")
```
For every analysis in this chapter, the model fit statistics table includes all the statistics that are retrieved by MPLUS software. Here is a brief description of the meaning behind every column that will be shown. The first table with the model fit statistics for all different models indicates first the number of classes used in the model, the total number of parameters estimated, the final and best Log-likelihood, the values for information criteria AIC, BIC, aBIC. The entropy indicated in each table corresponds to the relative entropy, where a perfect classification is 1, the table also indicates the log likelihood reduction (LL Reduction) from adding one class into the model. Two tests for model fit are indicated as well, the value of the statistic and the p-value associated with the Vuong-Lo-Mendell-Rubin likelihood ratio test (VLMR) and Lo-Mendell-Rubin adjusted LRT Test (LMR).
Conditional response probabilities plots are included as a summary for every independent model, the x-axis includes all the items included in the model, the y axis corresponds to the values of the probability to agree with that item, these values range from 0 to 1, where values close to 1 indicate that is highly likely to agree with that item, in contrast, values close to 0 indicate unlikely to agree to that item. On the other hand, values around 0.5 indicate randomness in the response, for this reason, is not possible to indicate a clear tendency to agree or to disagree. The different classes identified in the plots are colored differently but the colors remain the same when the response pattern is similar across countries. When a new class was identified a new color was used. The sample size for each class appears at the end of the x-axis colored with the same color as the class.
## Students' endorsement of gender equality scale
As mentioned in the previous chapter, this scale is composed of 6 items, in the following tables and plots, these items were ordered from positive to negative items for an easier interpretation of the results. This ordering consider first all the items that were positive worded in the instrument *Men and women should have equal opportunities to take part in government* (GND1), *Men and women should have the same rights in every way* (GND2) and *Men and women should get equal pay when they are doing the same jobs* (GND5), followed by the three other items that are negatively worded *Women should stay out of politics (r)* (GND3), *Not many jobs available, men should have more right to a job than women (r)* (GND4), *Men are better qualified to be political leaders than women (r)* (GND6). As mentioned before all these variables were recoded in two categories, Agree and Disagree. All 14 countries were analyzed independently and then pooled in the same dataset.
### Analysis by country
Multiple latent class models with 1 to 6-classes\footnote{Summary with all models can be found in Appendix, table @ref(tab:detailed1).} were performed in each country in order to evaluate the model fit of each one of them. The results are summarized in table \@ref(tab:summodelfitcntry1). In most European countries, the best model fit based on the different criteria indicated previously are by including 3 or 4 latent classes.
For Belgium, Croatia, Denmark, Latvia, and The Netherlands there is no significant improvement in the log-likelihood from two to three latent classes. In this sense, BIC and aBIC simultaneously have the lowest values in the 3-class model.
On the other hand, in Bulgaria, Estonia, Malta, Slovenia, and Sweden according to the statistical tests, BIC, and aBIC criteria, the best model is a 4-class model. In Finland, Italy and Lithuania models, the BIC, aBIC differ from the statistical test indicating a better fit for the 3-class model.
Norway is the only country from the sample where the best model fit is the one with 5 latent classes according to the statistical tests and BIC and aBIC.
It is a common tendency in all the evaluated countries that the AIC value is lower in the models with one more class than indicated by the statistical tests and BIC and aBIC statistics. That is consistent with the indication that this criterion tends to overfit the data.
Values of Entropy are higher when the tests are significant but consistent with a better fit of the data, the lower entropy found in the 4-class model is in Latvia (73.7\%) and the highest value in Norway (96\%). The log-likelihood reduction is consistent in all countries, where having more than 3 latent classes reduces the log-likelihood around 0.2\% and 1\%.
```{r modelfitcnty1, echo=FALSE, results='hide'}
#check order of country names
#resultsbyall %>% arrange(Country) %>% select(Country) %>% unique()
ModelfitByContry("ByCountry_GND", title = "Model fit statistics LCA by country Students' endorsement of gender equality", fontn = 10) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 10) %>%
column_spec(c(2,3,11), width = "3em") %>%
column_spec(c(4:10,12), width = "4em") %>%
column_spec(c(1), width = "6em") %>%
pack_rows("Belgium (Flemish)",1,6) %>%
# row_spec(2:3, background = "lightgray") %>%
row_spec(c(3), bold = TRUE) %>%
pack_rows("Bulgaria",7,12) %>%
# row_spec(9:10, background = "lightgray") %>%
row_spec(c(10), bold = TRUE) %>%
pack_rows("Croatia",13,18) %>%
# row_spec(14:15, background = "lightgray") %>%
row_spec(c(16), bold = TRUE) %>%
pack_rows("Denmark",19,24) %>%
# row_spec(21:22, background = "lightgray") %>%
row_spec(c(22), bold = TRUE) %>%
pack_rows("Estonia",25,30) %>%
# row_spec(27:28, background = "lightgray") %>%
row_spec(c(27), bold = TRUE) %>%
pack_rows("Finland",31,36) %>%
# row_spec(32:33, background = "lightgray") %>%
row_spec(c(33), bold = TRUE) %>%
pack_rows("Italy",37,42) %>%
# row_spec(39:40, background = "lightgray") %>%
row_spec(c(39), bold = TRUE) %>%
pack_rows("Latvia",43,48) %>%
# row_spec(45:46, background = "lightgray") %>%
row_spec(c(45), bold = TRUE) %>%
pack_rows("Lithuania",49,54) %>%
# row_spec(51:52, background = "lightgray") %>%
row_spec(c(51), bold = TRUE) %>%
pack_rows("Malta",55,60) %>%
# row_spec(57:58, background = "lightgray") %>%
row_spec(c(58), bold = TRUE) %>%
pack_rows("The Netherlands",61,66) %>%
# row_spec(62:63, background = "lightgray") %>%
row_spec(c(63), bold = TRUE) %>%
pack_rows("Norway",67,72) %>%
# row_spec(70:71, background = "lightgray") %>%
row_spec(c(71), bold = TRUE) %>%
pack_rows("Slovenia",73,78) %>%
# row_spec(75:76, background = "lightgray") %>%
row_spec(c(76), bold = TRUE) %>%
pack_rows("Sweden",79,84) %>%
# row_spec(81:82, background = "lightgray") %>%
row_spec(c(82), bold = TRUE) %>%
collapse_rows(1, valign = "top") %>%
footnote(general = "The best loglikelihood value was not replicated for the following models: ",
number = "Croatia, 6 classes model") %>%
print()
cat('\n')
cat('\n')
```
All models selected accomplish at least one or more of the criteria established for a good fit. The bivariate residuals were also analyzed, and all countries have residuals around the range of acceptable [-2 ; 2] as shown in the figure \@ref(fig:resid1cnt). There is just one value is outside the ranges in Malta with a 4-class model.
\blandscape
```{r summodelfitcntry1}
ModelfitByContry("ByCountry_GND",
title = "Best model, fit statistics individual country model Students' endorsement of gender equality",
filterval = TRUE, fontn = 10) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 10) %>%
column_spec(c(2,3,9:10,12), width = "3em") %>%
column_spec(c(4:8,11,13), width = "4em") %>%
column_spec(c(1), width = "8em") %>%
footnote(general = "Best model based on the lowest value of BIC") %>%
print()
cat('\n')
cat('\n')
```
\elandscape
```{r resid1cnt, fig.width=6, fig.height=5, fig.cap="Bivariate standardized residuals individual country models for Students' endorsement of gender equality", fig.pos='H'}
residuals_ByCntryGND <- data.frame(`BFL 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BFL_C3cl3.out$tech10$bivar_model_fit_info$z,
`BGR 4cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BGR_C3cl4.out$tech10$bivar_model_fit_info$z,
`DNK 4cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_DNK_C3cl4.out$tech10$bivar_model_fit_info$z,
`EST 4cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_EST_C3cl4.out$tech10$bivar_model_fit_info$z,
`FIN 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_FIN_C3cl3.out$tech10$bivar_model_fit_info$z,
`HRV 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_HRV_C3cl3.out$tech10$bivar_model_fit_info$z,
`ITA 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_ITA_C3cl3.out$tech10$bivar_model_fit_info$z,
`LTU 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LTU_C3cl3.out$tech10$bivar_model_fit_info$z,
`LVA 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LVA_C3cl3.out$tech10$bivar_model_fit_info$z,
`MLT 4cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_MLT_C3cl4.out$tech10$bivar_model_fit_info$z,
`NLD 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NLD_C3cl3.out$tech10$bivar_model_fit_info$z,
`NOR 5cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NOR_C3cl5.out$tech10$bivar_model_fit_info$z,
`SVN 4cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SVN_C3cl4.out$tech10$bivar_model_fit_info$z,
`SWE 4cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SWE_C3cl4.out$tech10$bivar_model_fit_info$z)
residuals_ByCntryGND <- residuals_ByCntryGND %>% mutate(x = 1:nrow(residuals_ByCntryGND)) %>%
reshape2::melt(id.vars = c("x"))
pByCntryGND <- residuals_ByCntryGND %>% ggplot() +
geom_point(aes(x = x, y = value), size = 1) +
geom_hline(yintercept=c(-1.96,1.96), linetype = "dashed", size = 0.9, color = "black") +
scale_fill_grey() + theme_bw() +
facet_wrap(variable~., scales = "free_y") +
#ggtitle("Bivariate residuals for individual best country model fit") +
labs(y="Standardized residuals") +
theme(legend.position = "none", legend.box="vertical",
strip.text.y = element_text(size = 8),
legend.spacing.y = unit(-0.2, 'cm'),
title = element_text(size = 10),
axis.title = element_text(size = 10),
axis.title.x = element_blank(),
axis.text.x = element_text(size = 9),
axis.text.y = element_text(size = 9))
pByCntryGND
cat("\n")
cat("\n")
```
In figure \@ref(fig:classes1cnt), the classes of each independent model can be identified by looking at the conditional probabilities. In the figure, the conditional probabilities to agree to each item are shown and plotted for each class estimated in each country. From all the models, two classes that are similar across countries are identified in the figure, Fully egalitarian and Competition-driven sexism, green and purple line respectively.
A brief explanation of each class is described below.
- **Fully egalitarian:** Most likely to agree to all items (green line). This class can be observed in all countries. Conditional probabilities greater than 0.75 to agree, class sizes around 60% (Bulgaria) and 90% (Denmark).
- **Competition-driven sexism:** Most likely to disagree with gender competitive items in favor of women (purple line). This class can be observed in all countries. Conditional probabilities greater than 0.75 to agree to positive views of gender equality and generally lower than 0.5 to agree to reversed negative views, class sizes around 3.6% (Denmark) and 22.5% (Bulgaria).
- **Non-egalitarian:** Not likely to agree to any item (orange line). This class can be observed in four countries. Conditional probabilities lower to 0.5 to agree to any item, class sizes around 0.9% (Norway) and 2.6% (Italy).
- **Reverse competition-driven sexism:** Most likely to agree to gender competitive items in favor of women (pink line) . This class can be observed in five countries. Conditional probabilities lower than 0.25 to agree to positive views of gender equality and generally greater than 0.75 to agree to reversed negative views, class sizes around 0.6% (Norway) and 1.6% (The Netherlands).
- **Political egalitarian:** Likely to agree to politically related items (light-green line). This class can be observed in five countries. Conditional probabilities are greater than 0.75 in political equality items, class sizes around 3.2% (Belgium) and 1.4% (Estonia).
- **Random response:** Not defined attitude (yellow line). This class can be observed in four countries. Conditional probabilities between 0.25 and 0.75 to agree all items, class sizes around 2.7% (Slovenia) and 16.8% (Bulgaria).
The classes described before are not present in all countries, for this reason, a global model will be tested in the pooled sample considering not only the classes that are similar across more than one country, but additional classes will be added in order to absorb the remaining different classes that the global model will identify.
In the following section, the global model will be tested using the pooled dataset.
```{r classes1cnt, fig.width=6, fig.height=5, fig.cap="Classes for best individual country model for Students' endorsement of gender equality", fig.pos='H'}
classes_ByCntryGND <- rbind(data.frame(Country = "Belgium (Flemish)",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BFL_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BFL_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 5)),
data.frame(Country = "Bulgaria",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BGR_C3cl4.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BGR_C3cl4.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 6,
LatentClass == 3 ~ 4,
LatentClass == 4 ~ 1)),
data.frame(Country = "Denmark",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_DNK_C3cl4.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_DNK_C3cl4.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 5,
LatentClass == 3 ~ 1,
LatentClass == 4 ~ 3)),
data.frame(Country = "Estonia",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_EST_C3cl4.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_EST_C3cl4.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 5,
LatentClass == 2 ~ 2,
LatentClass == 3 ~ 1,
LatentClass == 4 ~ 3)),
data.frame(Country = "Finland",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_FIN_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_FIN_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 1)),
data.frame(Country = "Croatia",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_HRV_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_HRV_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 2,
LatentClass == 3 ~ 1)),
data.frame(Country = "Italy",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_ITA_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_ITA_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 1)),
data.frame(Country = "Lithuania",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LTU_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LTU_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 6)),
data.frame(Country = "Latvia",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LVA_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LVA_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 6)),
data.frame(Country = "Malta",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_MLT_C3cl4.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_MLT_C3cl4.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 4,
LatentClass == 2 ~ 6,
LatentClass == 3 ~ 3,
LatentClass == 4 ~ 1)),
data.frame(Country = "Netherlands",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NLD_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NLD_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 1,
LatentClass == 2 ~ 4,
LatentClass == 3 ~ 3)),
data.frame(Country = "Norway",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NOR_C3cl5.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NOR_C3cl5.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 4,
LatentClass == 3 ~ 1,
LatentClass == 4 ~ 2,
LatentClass == 5 ~ 5)),
data.frame(Country = "Slovenia",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SVN_C3cl4.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SVN_C3cl4.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 4,
LatentClass == 4 ~ 1)),
data.frame(Country = "Sweden",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SWE_C3cl4.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SWE_C3cl4.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 5,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 2,
LatentClass == 4 ~ 3)))
pByCntryGND <- classes_ByCntryGND %>% filter(category == 1) %>% plyr::rbind.fill(data.frame(param = "Size", Country = "Sweden")) %>%
mutate(LatentClass = factor(LatentClass),
param = factor(param, levels = c(unique(classes_ByCntryGND$param), "Size")[c(1,2,5,3,4,6,7)])) %>%
ggplot() +
geom_point(aes(x = param, y = est, group = LatentClass, color = LatentClass), size = 1) +
geom_line(aes(param, est, group = LatentClass, linetype = LatentClass, color = LatentClass)) +
geom_text(aes(x = param, y = est, label = scales::percent(proportion, accuracy = 0.1), color = LatentClass), size = 2,
nudge_x = 0.8, nudge_y = 0) +
facet_wrap(Country ~ .) +
scale_fill_grey() + theme_bw() +
#ggtitle("Classes for individual country models for attitudes towards gender \nequality scale") +
theme(legend.position = "none",
title = element_text(size = 9),
axis.title.x = element_blank(),
axis.text.y = element_text(size = 7),
panel.grid.minor.y = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
axis.text.x = element_text(angle = 90, size = 7, vjust = 0.5, hjust = 0)) +
scale_y_continuous(breaks = c(0.25,0.5,0.75), limits = c(0,1)) +
geom_hline(yintercept = c(0.25,0.5,0.75), color = "gray", size = 0.2) +
labs(y="Response probabilities", linetype = "Latent Classes", color = "Latent Classes") +
scale_color_brewer(type = "qual", palette = "Dark2")
pByCntryGND
cat("\n")
cat("\n")
```
```{r resid1cnt2, fig.width=6, fig.height=5, fig.cap="Bivariate model fit standardized residuals", fig.pos='H', eval=FALSE}
residuals_ByCntryGND <- data.frame(`BFL 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BFL_C3cl3.out$tech10$bivar_model_fit_info$z,
`BGR 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BGR_C3cl3.out$tech10$bivar_model_fit_info$z,
`DNK 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_DNK_C3cl3.out$tech10$bivar_model_fit_info$z,
`EST 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_EST_C3cl3.out$tech10$bivar_model_fit_info$z,
`FIN 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_FIN_C3cl3.out$tech10$bivar_model_fit_info$z,
`HRV 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_HRV_C3cl3.out$tech10$bivar_model_fit_info$z,
`ITA 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_ITA_C3cl3.out$tech10$bivar_model_fit_info$z,
`LTU 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LTU_C3cl3.out$tech10$bivar_model_fit_info$z,
`LVA 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LVA_C3cl3.out$tech10$bivar_model_fit_info$z,
`MLT 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_MLT_C3cl3.out$tech10$bivar_model_fit_info$z,
`NLD 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NLD_C3cl3.out$tech10$bivar_model_fit_info$z,
`NOR 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NOR_C3cl3.out$tech10$bivar_model_fit_info$z,
`SVN 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SVN_C3cl3.out$tech10$bivar_model_fit_info$z,
`SWE 3cl` = ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SWE_C3cl3.out$tech10$bivar_model_fit_info$z)
residuals_ByCntryGND <- residuals_ByCntryGND %>% mutate(x = 1:nrow(residuals_ByCntryGND)) %>%
reshape2::melt(id.vars = c("x"))
pByCntryGND <- residuals_ByCntryGND %>% ggplot() +
geom_point(aes(x = x, y = value), size = 1) +
geom_hline(yintercept=c(-1.96,1.96), linetype = "dashed", size = 0.9, color = "black") +
scale_fill_grey() + theme_bw() +
facet_wrap(variable~., scales = "free_y") +
ggtitle("Residuals by country models") +
labs(y="Standardized residuals") +
theme(legend.position = "none", legend.box="vertical",
strip.text.y = element_text(size = 8),
legend.spacing.y = unit(-0.2, 'cm'),
title = element_text(size = 9),
axis.title = element_text(size = 8),
axis.title.x = element_blank(),
axis.text.x = element_text(size = 8),
axis.text.y = element_text(size = 8))
pByCntryGND
cat("\n")
cat("\n")
```
```{r classes1cnt2, fig.width=6, fig.height=5, fig.cap="By country 3-Classes model", fig.pos='H', eval=FALSE}
classes_ByCntryGND <- rbind(data.frame(Country = "Belgium (Flemish)",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BFL_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BFL_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 2)),
data.frame(Country = "Bulgaria",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BGR_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_BGR_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 3)),
data.frame(Country = "Denmark",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_DNK_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_DNK_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 1,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 2)),
data.frame(Country = "Estonia",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_EST_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_EST_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 1)),
data.frame(Country = "Finland",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_FIN_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_FIN_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 1)),
data.frame(Country = "Croatia",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_HRV_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_HRV_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 2,
LatentClass == 3 ~ 1)),
data.frame(Country = "Italy",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_ITA_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_ITA_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 1)),
data.frame(Country = "Lithuania",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LTU_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LTU_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 2)),
data.frame(Country = "Latvia",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LVA_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_LVA_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 3,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 2)),
data.frame(Country = "Malta",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_MLT_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_MLT_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 1)),
data.frame(Country = "Netherlands",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NLD_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NLD_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 1,
LatentClass == 2 ~ 2,
LatentClass == 3 ~ 3)),
data.frame(Country = "Norway",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NOR_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_NOR_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 1,
LatentClass == 2 ~ 3,
LatentClass == 3 ~ 2)),
data.frame(Country = "Slovenia",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SVN_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SVN_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 1,
LatentClass == 2 ~ 2,
LatentClass == 3 ~ 3)),
data.frame(Country = "Sweden",
left_join(ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SWE_C3cl3.out$parameters$probability.scale,
ByCountry_GND$data.MplusModels.ByCountry.GNDlca_SWE_C3cl3.out$class_counts$modelEstimated %>%
mutate(LatentClass = as.character(class), param = "GND6") %>%
select(LatentClass, param, proportion), by = c("LatentClass", "param"))) %>%
mutate(LatentClass = case_when(LatentClass == 1 ~ 2,
LatentClass == 2 ~ 1,
LatentClass == 3 ~ 3)))
pByCntryGND <- classes_ByCntryGND %>% filter(category == 1) %>%
mutate(LatentClass = factor(LatentClass),
param = factor(param, levels = unique(classes_ByCntryGND$param)[c(1,2,5,3,4,6)])) %>%
ggplot() +
geom_point(aes(x = param, y = est, group = LatentClass, color = LatentClass), size = 1) +
geom_line(aes(param, est, group = LatentClass, linetype = LatentClass, color = LatentClass)) +
geom_text(aes(x = param, y = est, label = scales::percent(proportion, accuracy = 0.1), color = LatentClass), size = 2,
nudge_x = -0.15, nudge_y = 0.1) +
facet_wrap(Country ~ .) +
scale_fill_grey() + theme_bw() +
ggtitle("Classes by country for Students' endorsement of gender equality") +
theme(legend.position = "none",
title = element_text(size = 9),
axis.title.x = element_blank(),
axis.text.y = element_text(size = 7),
axis.text.x = element_text(angle = 90, size = 7, vjust = 0.5, hjust = 0)) +
scale_y_continuous(breaks = c(0.25,0.5,0.75)) +
labs(y="Response probabilities", linetype = "Latent Classes", color = "Latent Classes") +
scale_color_brewer(type = "qual", palette = "Dark2")
pByCntryGND
cat("\n")
cat("\n")
```
### General model
Table \@ref(tab:modelfitlca1) shows the results of each model using the pooled sample with all the countries. Models with 1 to 7 classes were computed and the model fit statistics were summarized in the table.
The model that includes a single class has the largest AIC (192,838), BIC (192,891), and ABIC (192,872) values for the pooled sample, indicating that this model fits data worse than all other models. In addition, the P-values for the VLMR test, and LMR in the 2-class model are all < 0.0001; this means that both tests reject the single-class model in favor of a model with at least two latent classes. In other words, there exists heterogeneity in the target population regarding attitudes towards gender equality.
On the other hand, the LMR LR and VLMR tests for the 6-class model are not statistically significant (P > 0.05). That is, the two tests are in favor of at most 5 classes.
In contrast, BIC and aBIC values are all smaller in the 5-class model than those in the 6-class model; thus, consider that the models with more than 5 classes are not preferred. AIC values reach the lowest value in the 7-class model but based on the previous results this criterion will not be considered in this case due to the tendency to overfit the data.
The relative entropy given by Mplus software decreases when including more than 4 classes and increases again with the 6-class model; this would suggest that a model with at least 6 class or 4 class is preferred.
Together with the percentage of reduction in the log-likelihood value, that indicates that by adding two classes to the model the log-likelihood is reduced by 13.3%, this reduction is only increased by 1.2% if the model is a 3-class model and finally this value is reduced close to 0 if more than 5 classes are included.
Now, the preferred model must be either the 5-class or the 6-class model. Considering the residuals of each model, in figure \@ref(fig:resid1) all values are around -1.96 and 1.96. But based on the parsimony principle a 4-class model can be considered as well if just one value of the residuals is outside the acceptable range.
Theoretically, we tend to determine that the 4-class LCA model is the preferred model. We will show later that the classes identified by the 4-class model are more interpretable and representative than the rest of the models. And in particular that two classes can be compared across countries.
\blandscape
```{r modelfitlca1, echo=FALSE}
Modelfit("lcaGND", title = "Model fit statistics LCA Students' endorsement of gender equality", fontn = 11) %>%
pack_rows("All countries",1,4) %>%
#row_spec(4:5, background = "lightgray") %>%
row_spec(c(4,5), bold = TRUE) %>%
print()
cat('\n')
cat('\n')
```
\elandscape
```{r resid1, fig.width=5, fig.height=3, fig.cap="Bivariate model fit standardized residuals global model for Students' endorsement of gender equality", fig.pos='H'}
residuals_GND <- data.frame(cl1 = lcaGND$GND_lca_C3cl1.out$tech10$bivar_model_fit_info$z,
cl2 = lcaGND$GND_lca_C3cl2.out$tech10$bivar_model_fit_info$z,
cl3 = lcaGND$GND_lca_C3cl3.out$tech10$bivar_model_fit_info$z,
cl4 = lcaGND$GND_lca_C3cl4.out$tech10$bivar_model_fit_info$z,
cl5 = lcaGND$GND_lca_C3cl5.out$tech10$bivar_model_fit_info$z,
cl6 = lcaGND$GND_lca_C3cl6.out$tech10$bivar_model_fit_info$z)
residuals_GND <- residuals_GND %>% mutate(x = 1:nrow(residuals_GND)) %>%
reshape2::melt(id.vars = c("x"))
pGND <- residuals_GND %>% ggplot() +
geom_point(aes(x = x, y = value), size = 1) +
geom_hline(yintercept=c(-1.96,1.96), linetype = "dashed", size = 0.9, color = "black") +
scale_fill_grey() + theme_bw() +
facet_wrap(variable~., scales = "free_y") +
#ggtitle("Standardized bivariate residuals for \nAttitudes towards gender equality models") +
labs(x = "Parameters", y="Standardized residuals", color = "Number of classes") +
theme(legend.position = "none",
strip.text.y = element_text(size = 8),
legend.spacing.y = unit(-0.2, 'cm'),
title = element_text(size = 9),
axis.title = element_text(size = 8),
axis.text.x = element_blank(),
axis.text.y = element_text(size = 8)) +
scale_color_brewer(type = "qual", palette = "Dark2")
pGND
cat("\n")
cat("\n")
```
```{r, echo=FALSE, results='hide'}
#----GND 3 groups----
classes3GND <- c("Fully egalitarian",
"Competition- driven sexism",
"Random response")
orden3GND <- c(2,3,1)
lcaGND_C3cl3 <- lcaGND$GND_lca_C3cl3.out$parameters$probability.scale %>%
rename_with(~ c("Class", "value")[which(c("LatentClass", "est") == .x)], .cols = c("LatentClass", "est")) %>%
mutate_at( c("param", "category", "Class"), ~ as.factor(.x)) %>%
mutate(Class = factor(Class, levels = orden3GND, labels = classes3GND))
counts3GND <- full_join(lcaGND$GND_lca_C3cl3.out$class_counts$modelEstimated,
lcaGND$GND_lca_C3cl3.out$class_counts$mostLikely,by = c("class"))
counts3GND %>%
mutate(class = factor(class, levels = orden3GND, labels = classes3GND),
proportion.x = scales::percent(proportion.x,accuracy = 0.1),
proportion.y = scales::percent(proportion.y,accuracy = 0.1)) %>% arrange(desc(count.y)) %>%
kbl(col.names = c("Class", "Counts", "Proportion", "Counts", "Proportion"),
caption = paste0("Class sizes 3-class Gender equality model"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",3)), row.names = FALSE, digits = 1, escape = TRUE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 9) %>%
add_header_above(c(" " = 1 , "Model estimated" = 2, "Most likely" = 2))
sizelca3_GND <- lcaGND$GND_lca_C3cl3.out$class_counts$modelEstimated %>% dplyr::select(-count) %>%
rename_with(~ c("Gender", "Class")[which(c("proportion", "class") == .x)], .cols = c("proportion", "class")) %>%
mutate(Class = factor(Class, levels = orden3GND, labels = classes3GND)) %>%
reshape2::melt(id.vars = c("Class"), variable.name = "Group") %>%
dplyr::arrange(Group) %>%
dplyr::group_by(Group) %>%
dplyr::mutate(countT= sum(value, na.rm = TRUE)) %>%
dplyr::group_by(Class) %>%
dplyr::mutate(per=value/countT) %>%
dplyr::select(Group, Class, per)
VarClass(lcaGND_C3cl3) %>% group_by(Class, param) %>%
filter(category == 1) %>%
select(param, Class, value) %>%
mutate(value = cell_spec(value, color = ifelse(value >= 0.8, "Myblue",
ifelse(value < 0.8 & value > 0.49, "Mygreen","Myred")))) %>%
reshape2::dcast(param ~ Class) %>%
kbl(caption = paste0("Probabilities to agree each item 3-class Gender equality model"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",3)), row.names = FALSE, digits = 3, escape = FALSE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 9) %>%
column_spec(1, width = "15em") %>%
column_spec(2:4, width = "5em") %>%
collapse_rows(1, valign = "top") %>%
print()
# graphclass(lcaGND_C3cl3, nclass = 3, orden = c(1,2,5,3,4,6), title = "LCA Gender equality with 3 classes", mg = FALSE)
# cat("\n")
# cat("\n")
#HighProb(lcaGND_C3cl3, sizelca3_GND, orden = c(1,2,5,3,4,6), title = "Response categories probabilities and class size for\n #3-classes Gender equality model")
cat("\n")
cat("\n")
#----GND 4 groups----
classes4GND <- c("Fully egalitarian",
"Competition- driven sexism",
"Non-egalitarian",
"Political egalitarian")
orden4GND <- c(2,4,3,1)
lcaGND_C3cl4 <- lcaGND$GND_lca_C3cl4.out$parameters$probability.scale %>%
rename_with(~ c("Class", "value")[which(c("LatentClass", "est") == .x)], .cols = c("LatentClass", "est")) %>%
mutate_at( c("param", "category", "Class"), ~ as.factor(.x)) %>%
mutate(Class = factor(Class, levels = orden4GND, labels = classes4GND))
counts4GND <- full_join(lcaGND$GND_lca_C3cl4.out$class_counts$modelEstimated,
lcaGND$GND_lca_C3cl4.out$class_counts$mostLikely,by = c("class"))
counts4GND %>%
mutate(class = factor(class, levels = orden4GND, labels = classes4GND),
proportion.x = scales::percent(proportion.x,accuracy = 0.1),
proportion.y = scales::percent(proportion.y,accuracy = 0.1)) %>% arrange(desc(count.y)) %>%
kbl(col.names = c("Class", "Counts", "Proportion", "Counts", "Proportion"),
caption = paste0("Class sizes 4-class Gender equality model"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",4)), row.names = FALSE, digits = 1, escape = TRUE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 9) %>%
add_header_above(c(" " = 1 , "Model estimated" = 2, "Most likely" = 2))
sizelca4_GND <- lcaGND$GND_lca_C3cl4.out$class_counts$modelEstimated %>% dplyr::select(-count) %>%
rename_with(~ c("Gender", "Class")[which(c("proportion", "class") == .x)], .cols = c("proportion", "class")) %>%
mutate(Class = factor(Class, levels = orden4GND, labels = classes4GND)) %>%
reshape2::melt(id.vars = c("Class"), variable.name = "Group") %>%
dplyr::arrange(Group) %>%
dplyr::group_by(Group) %>%
dplyr::mutate(countT= sum(value, na.rm = TRUE)) %>%
dplyr::group_by(Class) %>%
dplyr::mutate(per=value/countT) %>%
dplyr::select(Group, Class, per)
VarClass(lcaGND_C3cl4) %>% group_by(Class, param) %>%
filter(category == 1) %>%
select(param, Class, value) %>%
mutate(value = cell_spec(value, color = ifelse(value >= 0.8, "Myblue",
ifelse(value < 0.8 & value > 0.49, "Mygreen","Myred")))) %>%
reshape2::dcast(param ~ Class) %>%
kbl(caption = paste0("Probabilities to agree each item 4-class Gender equality model"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",4)), row.names = FALSE, digits = 3, escape = FALSE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 9) %>%
column_spec(1, width = "15em") %>%
column_spec(2:5, width = "5em") %>%
collapse_rows(1, valign = "top") %>%
print()
# graphclass(lcaGND_C3cl4, nclass = 4, orden = c(1,2,5,3,4,6), title = "LCA Gender equality with 4 classes", mg = FALSE)
# cat("\n")
# cat("\n")
#HighProb(lcaGND_C3cl4, sizelca4_GND, orden = c(1,2,5,3,4,6), title = "Response categories probabilities and class size for\n #4-classes Gender equality model")
#cat("\n")
#cat("\n")
#----GND 5 groups----
classes5GND <- c("Fully egalitarian",
"Competition- driven sexism",
"Non-egalitarian",
"Political egalitarian",
"Reverse competition-driven sexism")
orden5GND <- c(2,3,5,4,1)
lcaGND_C3cl5 <- lcaGND$GND_lca_C3cl5.out$parameters$probability.scale %>%
rename_with(~ c("Class", "value")[which(c("LatentClass", "est") == .x)], .cols = c("LatentClass", "est")) %>%
mutate_at( c("param", "category", "Class"), ~ as.factor(.x)) %>%
mutate(Class = factor(Class, levels = orden5GND, labels = classes5GND))
counts5GND <- full_join(lcaGND$GND_lca_C3cl5.out$class_counts$modelEstimated,
lcaGND$GND_lca_C3cl5.out$class_counts$mostLikely,by = c("class"))
counts5GND %>%
mutate(class = factor(class, levels = orden5GND, labels = classes5GND),
proportion.x = scales::percent(proportion.x,accuracy = 0.1),
proportion.y = scales::percent(proportion.y,accuracy = 0.1)) %>% arrange(desc(count.y)) %>%
kbl(col.names = c("Class", "Counts", "Proportion", "Counts", "Proportion"),
caption = paste0("Class sizes 5-class Gender equality model"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",4)), row.names = FALSE, digits = 1, escape = TRUE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 9) %>%
add_header_above(c(" " = 1 , "Model estimated" = 2, "Most likely" = 2))
sizelca5_GND <- lcaGND$GND_lca_C3cl5.out$class_counts$modelEstimated %>% dplyr::select(-count) %>%
rename_with(~ c("Gender", "Class")[which(c("proportion", "class") == .x)], .cols = c("proportion", "class")) %>%
mutate(Class = factor(Class, levels = orden5GND, labels = classes5GND)) %>%
reshape2::melt(id.vars = c("Class"), variable.name = "Group") %>%
dplyr::arrange(Group) %>%
dplyr::group_by(Group) %>%
dplyr::mutate(countT= sum(value, na.rm = TRUE)) %>%
dplyr::group_by(Class) %>%
dplyr::mutate(per=value/countT) %>%
dplyr::select(Group, Class, per)
VarClass(lcaGND_C3cl5) %>% group_by(Class, param) %>%
filter(category == 1) %>%
select(param, Class, value) %>%
mutate(value = cell_spec(value, color = ifelse(value >= 0.8, "Myblue",
ifelse(value < 0.8 & value > 0.49, "Mygreen","Myred")))) %>%
reshape2::dcast(param ~ Class) %>%
kbl(caption = paste0("Probabilities to agree each item 5-class Gender equality model"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",5)), row.names = FALSE, digits = 3, escape = FALSE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 9) %>%
column_spec(1, width = "15em") %>%
column_spec(2:6, width = "5em") %>%
collapse_rows(1, valign = "top") %>%
print()
# graphclass(lcaGND_C3cl5, nclass = 5, orden = c(1,2,5,3,4,6), title = "LCA Gender equality with 5 classes", mg = FALSE)
# cat("\n")
# cat("\n")
#HighProb(lcaGND_C3cl5, sizelca5_GND, orden = c(1,2,5,3,4,6), title = "Response categories probabilities and class size for\n #5-classes Gender equality model")
#cat("\n")
#cat("\n")
#----GND 6 groups----
classes6GND <- c("Fully egalitarian",
"Competition- driven sexism",
"Non-egalitarian",
"Political egalitarian",
"Reverse competition sexism",
"Pro-women pay/job")
orden6GND <- c(4,5,2,6,1,3)
lcaGND_C3cl6 <- lcaGND$GND_lca_C3cl6.out$parameters$probability.scale %>%
rename_with(~ c("Class", "value")[which(c("LatentClass", "est") == .x)], .cols = c("LatentClass", "est")) %>%
mutate_at( c("param", "category", "Class"), ~ as.factor(.x)) %>%
mutate(Class = factor(Class, levels = orden6GND, labels = classes6GND))
counts6GND <- full_join(lcaGND$GND_lca_C3cl6.out$class_counts$modelEstimated,
lcaGND$GND_lca_C3cl6.out$class_counts$mostLikely,by = c("class"))
counts6GND %>%
mutate(class = factor(class, levels = orden6GND, labels = classes6GND),
proportion.x = scales::percent(proportion.x,accuracy = 0.1),
proportion.y = scales::percent(proportion.y,accuracy = 0.1)) %>% arrange(desc(count.y)) %>%
kbl(col.names = c("Class", "Counts", "Proportion", "Counts", "Proportion"),
caption = paste0("Class sizes 6-class Gender equality model"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",4)), row.names = FALSE, digits = 1, escape = TRUE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 9) %>%
add_header_above(c(" " = 1 , "Model estimated" = 2, "Most likely" = 2))
sizelca6_GND <- lcaGND$GND_lca_C3cl6.out$class_counts$modelEstimated %>% dplyr::select(-count) %>%
rename_with(~ c("Gender", "Class")[which(c("proportion", "class") == .x)], .cols = c("proportion", "class")) %>%
mutate(Class = factor(Class, levels = orden6GND, labels = classes6GND)) %>%
reshape2::melt(id.vars = c("Class"), variable.name = "Group") %>%
dplyr::arrange(Group) %>%
dplyr::group_by(Group) %>%
dplyr::mutate(countT= sum(value, na.rm = TRUE)) %>%
dplyr::group_by(Class) %>%
dplyr::mutate(per=value/countT) %>%
dplyr::select(Group, Class, per)
VarClass(lcaGND_C3cl6) %>% group_by(Class, param) %>%
filter(category == 1) %>%
select(param, Class, value) %>%
mutate(value = cell_spec(value, color = ifelse(value >= 0.8, "Myblue",
ifelse(value < 0.8 & value > 0.49, "Mygreen","Myred")))) %>%
reshape2::dcast(param ~ Class) %>%
kbl(caption = paste0("Probabilities to agree each item 6-class Gender equality model"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",6)), row.names = FALSE, digits = 3, escape = FALSE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"), font_size = 9) %>%
column_spec(1, width = "15em") %>%
column_spec(2:7, width = "5em") %>%
collapse_rows(1, valign = "top") %>%
print()
# graphclass(lcaGND_C3cl6, nclass = 6, orden = c(1,2,5,3,4,6), title = "LCA Gender equality with 6 classes", mg = FALSE)
# cat("\n")
# cat("\n")
#HighProb(lcaGND_C3cl6, sizelca6_GND, orden = c(1,2,5,3,4,6), title = "Response categories probabilities and class size for\n #6-classes Gender equality model")
cat("\n")
cat("\n")
```
Three, four, five and six class models were investigated profoundly. It is difficult to choose the best model fit without a full analysis. There are some patterns that can be clearly identified in the models as can be seen in figure \@ref(fig:compare1). Class 1 with class sizes of 81.5%, 79.2%, 77% and 78.6% in each model respectively, and estimated probabilities to agree for this latent class higher than 0.92 for all six items *Men and women should have equal opportunities to take part in government*, *Men and women should have the same rights in every way*, *Men and women should get equal pay when they are doing the same job*, *Women should stay out of politics*, *Not many jobs available, men should have more right to a job than women* and *Men are better qualified to be political leaders than women* the **Fully egalitarian** group.
The second class (Class 2 in Figure \@ref(fig:compare1)) identified in all the models, called **Competition-driven sexism**. For this class, the estimated probabilities to agree to the first 3 items are close or higher than 0.9 in all models. For the last three items, the estimated probabilities of agreement are not higher than 0.5 in all models. The class size differs in all four models, 11.3%, 11.5%, 12.8% and 8.6% in the 3, 4, 5, 6-class model respectively.
The third class (Class 3) that can be seen with a similar pattern in all the models, it is called **Non-egalitarian**, this class appears in the 4-class model. The pattern of this class is basically showing lower estimated conditional probabilities to agree with any of these statements, no greater than 0.4, except for one item *Men and women should get equal pay when they are doing the same jobs* with an estimated probability to agree no higher than 0.55. The estimated sizes for this class are 7.7%, 7% and 5.8% in each model respectively.
The fourth, fifth and sixth classes identified in the models differ in all the countries, nevertheless, one class appears to be consistent in the 5-class model, where this class called **Reverse competition-driven sexism** has opposite conditional probabilities compared to the second class identified previously.
```{r compare1, fig.width=6, fig.height=6, fig.cap="Comparative conditional probabilities to agree in 3 to 6 latent class global models for Students' endorsement of gender equality", fig.pos='H'}
g0 <- ClassGraph(lcaGND_C3cl3, sizelca3_GND, orden = c(1,2,5,3,4,6), title = "3-classes", selected = c(1,3,6))
g1 <- ClassGraph(lcaGND_C3cl4, sizelca4_GND, orden = c(1,2,5,3,4,6), title = "4-classes", selected = c(1,3,2,5))
g2 <- ClassGraph(lcaGND_C3cl5, sizelca5_GND, orden = c(1,2,5,3,4,6), title = "5-classes", selected = c(1,3,2,5,4))
g3 <- ClassGraph(lcaGND_C3cl6, sizelca6_GND, orden = c(1,2,5,3,4,6), title = "6-classes", selected = c(1,3,2,7,4,5))
g <- arrangeGrob(g0,g1,g2,g3, ncol = 2)#, top = "Conditional probabilities to agree to Students' endorsement of gender equality")
grid.draw(g)
cat("\n")
cat("\n")
```
The main two classes in the solutions with five and six classes do not strongly differ from other models, and the remaining classes are not informative at all or with a sample size very small, using this as a criterion, one can prefer a four-class solution.
In table \@ref(tab:bestfit11) the conditional probabilities to agree with the fourth latent class model are shown. These values are very close to 1 for the first class, Fully egalitarian. Similar values are obtained for the positive items in the second class Competition driven-sexism, meanwhile, for third item GND3, conditional probabilities are close to 0.5, this would mean a random response, but the last two items have lower conditional probabilities close to 0.2, which would mean not likely to agree to the statements. Table \@ref(tab:bestfit12) indicates the counts and proportions using the model estimated and the most likely probabilities.
```{r bestfit11}
#----GND 4 groups----
classes4GND <- c("Fully egalitarian",
"Competition- driven sexism",
"Non-egalitarian",
"Political egalitarian")
orden4GND <- c(2,4,3,1)
lcaGND_C3cl4 <- lcaGND$GND_lca_C3cl4.out$parameters$probability.scale %>%
rename_with(~ c("Class", "value")[which(c("LatentClass", "est") == .x)],
.cols = c("LatentClass", "est")) %>%
mutate_at( c("param", "category", "Class"), ~ as.factor(.x)) %>%
mutate(Class = factor(Class, levels = orden4GND, labels = classes4GND))
counts4GND <- full_join(lcaGND$GND_lca_C3cl4.out$class_counts$modelEstimated,
lcaGND$GND_lca_C3cl4.out$class_counts$mostLikely,
by = c("class"))
lcaGND_C3cl4$orden = rep(c(1,2,4,5,3,6), each = 2)
VarClass(lcaGND_C3cl4) %>% group_by(Class, param) %>%
filter(category == 1) %>%
select(orden, param, Class, value) %>%
mutate(value = cell_spec(value, color = ifelse(value >= 0.75, "Myblue",
ifelse(value < 0.75 & value >= 0.25, "Mygreen","Myred")))) %>%
reshape2::dcast(orden + param ~ Class) %>% arrange(orden) %>% select(-orden) %>%
kbl(caption = paste0("Probabilities to agree each item in the four-class global model for Students' endorsement of gender equality"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",4)),
row.names = FALSE, digits = 3, escape = FALSE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"),
font_size = 9) %>%
column_spec(1, width = "15em") %>%
column_spec(2:5, width = "5em") %>%
collapse_rows(1, valign = "top") %>%
print()
```
```{r bestfit12}
counts4GND %>%
mutate(class = factor(class, levels = orden4GND, labels = classes4GND),
proportion.x = scales::percent(proportion.x,accuracy = 0.1),
proportion.y = scales::percent(proportion.y,accuracy = 0.1)) %>%
arrange(desc(count.y)) %>%
kbl(col.names = c("Class", "Counts", "Proportion", "Counts", "Proportion"),
caption = paste0("Class sizes four-class global model for Students' endorsement of gender equality"),
booktabs = TRUE, longtable = TRUE, align = c("l", rep("r",4)),
row.names = FALSE, digits = 1, escape = TRUE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = c("repeat_header", "HOLD_position"),
font_size = 9) %>%
add_header_above(c(" " = 1 , "Model estimated" = 2, "Most likely" = 2))
sizelca4_GND <- lcaGND$GND_lca_C3cl4.out$class_counts$modelEstimated %>%
dplyr::select(-count) %>%
rename_with(~ c("Gender", "Class")[which(c("proportion", "class") == .x)],
.cols = c("proportion", "class")) %>%
mutate(Class = factor(Class, levels = orden4GND, labels = classes4GND)) %>%
reshape2::melt(id.vars = c("Class"), variable.name = "Group") %>%
dplyr::arrange(Group) %>%
dplyr::group_by(Group) %>%
dplyr::mutate(countT= sum(value, na.rm = TRUE)) %>%
dplyr::group_by(Class) %>%
dplyr::mutate(per=value/countT) %>%
dplyr::select(Group, Class, per)
```
```{r higprob1, eval=FALSE}
HighProb(lcaGND_C3cl4, sizelca4_GND, orden = c(1,2,5,3,4,6),
title = "Response categories probabilities and class size for four-\nclasses global model for Students' endorsement of gender equality")
cat("\n")
cat("\n")
```
```{r , eval=FALSE}
prob_gender <- read.table("data/MplusModels/LCA/GND_Prob_C3cl4.dat", header = FALSE)
colnames(prob_gender) <- c("GND1", "GND2", "GND3", "GND4", "GND5", "GND6", "GND_CPROB1", "GND_CPROB2", "GND_CPROB3", "GND_CPROB4",
"GND_C", "ws", "IDSTUD", "id_s", "id_j")
prob_eth <- read.table("data/MplusModels/LCA/ETH_Prob_C3cl4.dat", header = FALSE)
colnames(prob_eth) <- c("ETH1", "ETH2", "ETH3", "ETH4", "ETH5", "ETH_CPROB1", "ETH_CPROB2", "ETH_CPROB3", "ETH_CPROB4",
"ETH_C", "ws", "IDSTUD", "id_s", "id_j")
ISC_lca_prob <- full_join(prob_gender, prob_eth, by = c("id_s", "id_j", "IDSTUD")) %>%
select(-GND1, -GND2, -GND3, -GND4, -GND5, -GND6, -ETH1, -ETH2, -ETH3, -ETH4, -ETH5, -ws.y, -ws.x)
ISC_lca_prob2 <- ISC_lvRlca %>% left_join(ISC_lca_prob,
by = c("id_s", "id_j", "IDSTUD")) %>%
select(-IMM1, -IMM2, -IMM3, -IMM4, -IMM5)
classes4GND <- c("Fully egalitarian",
"Competition- driven sexism",
"Non-egalitarian",
"Political egalitarian")
orden4GND <- c(2,4,3,1)
ISC_lca_prob2$GND_C <- factor(ISC_lca_prob2$GND_C, levels = orden4GND, labels = classes4GND)
classes4ETH <- c("Fully egalitarian",
"Political non-egalitarian",
"Non-egalitarian",
"Employment non-egalitarian")
orden4ETH <- c(2,3,1,4)
ISC_lca_prob2$ETH_C <- factor(ISC_lca_prob2$ETH_C, levels = orden4ETH, labels = classes4ETH)
#head(ISC_lvRlca)
#head(ISC_lca_prob2)
#table(ISC_lca_prob$GND_C, ISC_lca_prob$ETH_C)
#table(ISC_lca_prob2$GND_C, ISC_lca_prob2$ETH_C)
#write_sav(ISC_lca_prob2, "data/MplusModels/LCA/Profiles_probabilities.sav")
#str(ISC_lca_prob2)
#sav <- read_sav("data/MplusModels/LCA/Profiles_probabilities.sav")
```
### Country comparability
```{r eval=FALSE}
load("data/MplusModels_CntCov.RData")
Modelfit("CntCovGND", title = "Model fit statistics Gender equality model with Country as covariate") %>%
pack_rows("Country numerical covariate",1,3) %>%
row_spec(2, background = "lightgray") %>%
row_spec(c(2), bold = TRUE) %>%
footnote(general = "The best loglikelihood value was not replicated in 6-class model. The solution may not be trustworthy due to local maxima.") %>%
print()
cat('\n')
cat('\n')
```
```{r, eval=FALSE}
#----GND 5 groups
CntCovclasses5GND <- c("Fully egalitarian",
"Strong competition- driven sexism",
"Equal fully egalitarian",
"Not involved",
"Soft competition- driven sexism")