The Psychometrics Centre


IRT and Mixture Modelling


In this two-day course, the instructors, Tim Croudace and Jon Heron, introduced the Mplus software and worked through a number of examples, from simple linear and logistic regression through to more complex mixture modelling for continuous and binary measures.

Practical examples were run with the demo version of the software that had been downloaded for free from http://www.statmodel.com/demo.shtml.

The basics through to CFA / EFA

Mplus Basics

An introduction to the Mplus environment

Simple regression using GHQ data

Mplus is a package often called upon when a researcher needs to fit a complex model of the type that cannot be fitted in a standard package such as SPSS or Stata. This presents the researcher with the joint hurdles of learning a new package and learning a new modelling technique. Here we showed that familiarity with the Mplus environment can be gained through simple regression models. Model outputs were compared with those from Stata. The data used for this example can be downloaded from here.

Linking SEM figures to the model command

The building blocks of the Mplus model-building language (ON/BY/WITH) were demonstrated. With these three simple commands, complex models can be built up. Equally, a complicated-looking SEM figure can be dissected into a number of smaller, simpler models and fitted gradually in sections.

Confirmatory Factor Analysis

Returning to the GHQ dataset, we demonstrated how the BY/WITH commands can be used to fit a confirmatory factor analysis (CFA). In a paper by Shevlin and Adamson (Psychological Assessment, 2005), the authors use CFA to assess the fit of six GHQ factor structures previously reported in the literature. We repeated this exercise with our own GHQ dataset. The GHQ-12 has too many items for the Mplus demo, so the output for the six models was circulated for the course attendees to match the syntax with each model and to decide which model was best supported by the data. The dataset is the same as that used in the regression examples above. The six outputs circulated are available here: m1, m2, m3, m4, m5, m6.

Exploratory Factor Analysis

In the final session of the day we demonstrated how to carry out an Exploratory Factor Analysis (EFA) on the GHQ-12. As the GHQ items are 4-level ordinal scales, we illustrated the pitfalls of failing to take account of this in the analysis.
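The course's own illustration is not reproduced here, but one well-known pitfall of treating ordinal items as continuous can be sketched with a small simulation. Two correlated normal traits are cut into 4-level items using skewed thresholds, and the ordinary Pearson correlation is computed before and after categorisation. The true correlation (0.6), the thresholds, the sample size and the function names are all assumptions chosen for illustration; nothing here uses the GHQ data.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def to_four_levels(value, cuts=(0.5, 1.2, 1.8)):
    """Map a continuous score to a 0-3 category (hypothetical skewed cut-points)."""
    return sum(value > c for c in cuts)


rng = random.Random(2024)
rho = 0.6  # assumed correlation between the two underlying continuous traits
latent_x, latent_y = [], []
for _ in range(5000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    latent_x.append(z1)
    latent_y.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)

# Observed 4-level items derived from the continuous traits.
item_x = [to_four_levels(v) for v in latent_x]
item_y = [to_four_levels(v) for v in latent_y]

r_continuous = pearson(latent_x, latent_y)
r_ordinal = pearson(item_x, item_y)
print(f"continuous: {r_continuous:.2f}, 4-level items: {r_ordinal:.2f}")
```

The correlation between the categorised items comes out lower than the correlation between the underlying traits. A factor analysis built on such attenuated correlations would be expected to understate the loadings, which is one reason to use an estimator designed for categorical items.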

IRT and mixture modelling

Item Response Theory

We introduced Item Response Theory (IRT) with an example based on four binary variables assessing whether respondents felt they had gained knowledge about cancer from radio, newspapers, books or lectures. The estimated factor score from an IRT model was compared with the simpler procedure of adding up the number of sources of information reported (scale 0-4).
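The comparison between a factor score and a simple sum score can be sketched as follows, assuming a two-parameter logistic (2PL) model with a standard normal prior on the latent trait. The item parameters below are hypothetical values made up for illustration, not estimates from the course data, and the course may have used a different IRT parameterisation.

```python
import math
from itertools import product

# Hypothetical 2PL parameters (discrimination a, difficulty b) for the four
# sources of information; illustrative values only, not course estimates.
ITEMS = {
    "radio": (0.8, -0.5),
    "newspapers": (1.5, 0.0),
    "books": (2.0, 0.5),
    "lectures": (1.0, 1.5),
}


def prob_yes(theta, a, b):
    """2PL item response function: P(item endorsed | theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def eap_score(pattern, grid_size=81, lo=-4.0, hi=4.0):
    """Expected a posteriori estimate of theta for a 0/1 response pattern."""
    num = den = 0.0
    for k in range(grid_size):
        theta = lo + (hi - lo) * k / (grid_size - 1)
        weight = math.exp(-0.5 * theta ** 2)  # unnormalised N(0, 1) prior
        for y, (a, b) in zip(pattern, ITEMS.values()):
            p = prob_yes(theta, a, b)
            weight *= p if y else 1.0 - p
        num += theta * weight
        den += weight
    return num / den


# Score all 16 possible response patterns and list them next to the sum score.
scores = {pattern: eap_score(pattern) for pattern in product((0, 1), repeat=4)}
for pattern, theta_hat in sorted(scores.items(), key=lambda kv: kv[1]):
    print(pattern, "sum =", sum(pattern), "EAP = %.2f" % theta_hat)
```

Under these assumptions, respondents with the same sum score can receive different factor scores, because the 2PL model weights each item by its discrimination. Reporting only "books" yields a higher estimate than reporting only "radio", even though both patterns have a sum score of 1.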

Cross Sectional Latent Class Analysis

In this example we fitted a Latent Class Analysis (LCA) model to a set of binary variables measuring a group of young people's responses to their first cigarette. We demonstrated how to determine the optimal number of classes, and also what to do when the results suggest that more classes are needed than can be tested using the data being modelled. A two-stage modelling procedure was described in which probabilities for class membership are exported to Stata to examine the relationship between latent classes and both covariates and binary outcomes.
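The class-membership probabilities used in the second stage follow from Bayes' rule once a latent class model has been fitted. The sketch below assumes a hypothetical two-class solution for four binary items; the class sizes, item probabilities and function name are invented for illustration and are not the course results.

```python
# Hypothetical two-class solution for four binary items: class proportions
# and per-class probabilities of endorsing each item (illustrative only).
CLASS_SIZES = [0.7, 0.3]
ITEM_PROBS = [
    [0.10, 0.20, 0.15, 0.05],  # class 1: mostly "no" responses
    [0.80, 0.70, 0.90, 0.60],  # class 2: mostly "yes" responses
]


def posterior_class_probs(responses):
    """Bayes' rule: P(class | responses), assuming local independence."""
    joint = []
    for size, probs in zip(CLASS_SIZES, ITEM_PROBS):
        likelihood = 1.0
        for y, p in zip(responses, probs):
            likelihood *= p if y else 1.0 - p
        joint.append(size * likelihood)
    total = sum(joint)
    return [j / total for j in joint]


for responses in [(0, 0, 0, 0), (1, 0, 1, 0), (1, 1, 1, 1)]:
    post = posterior_class_probs(responses)
    print(responses, [round(p, 3) for p in post])
```

Each respondent gets one probability per class, and these sum to one. It is these probabilities, rather than a hard class assignment, that would be carried into the second-stage analysis.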

Mixture Modelling of a Continuous Variable

Here we show that mixture modelling can also be used to find a latent grouping within a single continuous variable.
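As a minimal sketch of the idea, the code below fits a two-class normal mixture to a single simulated variable using the EM algorithm. The data, the number of classes, the starting values and the function names are all assumptions made for illustration; this is not the Mplus estimation routine, and the course example is not reproduced.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_class_mixture(data, iterations=200):
    """EM for a two-component normal mixture; returns (weights, means, sds)."""
    ordered = sorted(data)
    half = len(ordered) // 2
    # Crude starting values: split the sorted data at the median.
    means = [sum(ordered[:half]) / half,
             sum(ordered[half:]) / (len(ordered) - half)]
    overall_mean = sum(data) / len(data)
    start_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in data) / len(data))
    sds = [start_sd, start_sd]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability that each observation is in class 0.
        resp = []
        for x in data:
            d0 = weights[0] * normal_pdf(x, means[0], sds[0])
            d1 = weights[1] * normal_pdf(x, means[1], sds[1])
            resp.append(d0 / (d0 + d1))
        # M-step: weighted updates of class proportions, means and SDs.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - p for p in resp]
            total = sum(r)
            weights[k] = total / len(data)
            means[k] = sum(p * x for p, x in zip(r, data)) / total
            var = sum(p * (x - means[k]) ** 2 for p, x in zip(r, data)) / total
            sds[k] = math.sqrt(var)
    return weights, means, sds


# Simulated variable: 300 draws centred on 0 and 200 centred on 5.
rng = random.Random(1)
data = ([rng.gauss(0.0, 1.0) for _ in range(300)]
        + [rng.gauss(5.0, 1.0) for _ in range(200)])
weights, means, sds = fit_two_class_mixture(data)
print("proportions:", [round(w, 2) for w in weights])
print("means:", [round(m, 2) for m in means])
```

With two well-separated groups, the estimated class means and proportions should land close to the values used to simulate the data. With real data the number of classes is unknown, and models with different numbers of classes would need to be compared.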

Longitudinal Mixture Modelling

We demonstrated the use of Longitudinal Latent Class Analysis (LLCA) and Latent Class Growth Analysis (LCGA) on a dataset of reported bed-sharing practices between parents and their new-born infant. In this example, a set of cross-sectional models (using each repeated binary measure in turn) showed a bizarre pattern in which social class was inconsistently related to bed-sharing. By using LLCA it was possible to extract groups with more homogeneous longitudinal bed-sharing behaviours, with the result that the source of the social-class anomaly became clear.