
Longitudinal Data Analysis

In this two-day workshop the instructors introduced the statistical theory required to understand data using latent variable models for growth and change.

The emphasis was on data-analytic experience and hands-on practice rather than on mathematical underpinnings or statistical formulae. The workshop centred on applied topics that are integral to estimating latent growth curves and growth mixture models.

Course introduction by Tim Croudace, followed by a discussion of a variety of modelling options (part1/part2) with reference to Rosel and Plewis (2008).

The analysis of continuous measures

The aim of this day was threefold:

  1. To introduce the Mplus software and command language,
  2. To introduce the basics of latent growth curve modelling (with continuous outcomes), and
  3. To introduce the basics of General Mixture Models (with continuous outcomes).

The first presentation introduces Mplus, and goes on to cover the basics of setting up a latent growth curve model. The second presentation introduces general mixture models, as well as the need for theoretically based hypotheses in regard to the number and shape of distinct trajectories expected.
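As a rough illustration of what a linear latent growth curve model implies for continuous outcomes, the sketch below computes the model-implied means and covariances of the repeated measures from a set of growth parameters. The function name `implied_moments`, the time scores 0 to 3, and all parameter values are invented for illustration and are not taken from the workshop materials.

```python
def implied_moments(times, mean_i, mean_s, var_i, var_s, cov_is, resid_var):
    """Model-implied means and covariance matrix for a linear growth model.

    Assumed model: y_t = intercept + slope * t + e_t, where the intercept
    and slope vary across individuals and e_t is an independent residual
    with the same variance at every occasion.
    """
    means = [mean_i + mean_s * t for t in times]
    cov = []
    for a, ta in enumerate(times):
        row = []
        for b, tb in enumerate(times):
            # Cov(y_a, y_b) = Var(i) + (t_a + t_b) Cov(i, s) + t_a t_b Var(s)
            value = var_i + (ta + tb) * cov_is + ta * tb * var_s
            if a == b:
                value += resid_var  # residual variance only on the diagonal
            row.append(value)
        cov.append(row)
    return means, cov


# Hypothetical parameter values, four equally spaced occasions
means, cov = implied_moments(
    times=[0, 1, 2, 3],
    mean_i=10.0, mean_s=2.0,
    var_i=4.0, var_s=1.0, cov_is=0.5,
    resid_var=1.5,
)
```

Estimation works in the opposite direction: the software searches for the growth parameters whose implied means and covariances best match those observed in the data.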

Data and examples used during the day can be downloaded from here: examples-1, examples-2.  

The analysis of binary and categorical measures

Introduction by Tim Croudace

The aim of this day was to demonstrate a number of mixture models that could be fitted to repeated binary/categorical measures. Two kinds of model were introduced: (i) Longitudinal Latent Class Analysis (LLCA), which creates a number of different profiles (or typologies) of development, and (ii) Latent Class Growth Analysis (LCGA), which produces a number of trajectories (typically linear/quadratic) describing changing behaviour across time.
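The difference between the two model types can be sketched for a single class with binary measures. In an LLCA the class has its own freely estimated probability at every occasion, whereas in an LCGA the probabilities are constrained to follow a polynomial on the logit scale. The time scores, parameter values, and function names below are hypothetical, and the sign convention may differ from that used in any particular package.

```python
import math


def inv_logit(x):
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def lcga_probabilities(times, intercept, slope, quadratic=0.0):
    """P(y_t = 1 | class) under a polynomial trajectory on the logit scale."""
    return [inv_logit(intercept + slope * t + quadratic * t * t) for t in times]


# Hypothetical time scores for five measurement occasions
times = [0, 1, 2, 3, 4]

# LCGA: the whole trajectory is governed by two (linear) or three
# (quadratic) parameters per class
lcga_class = lcga_probabilities(times, intercept=1.0, slope=-0.8)

# LLCA: one free probability per occasion, so five parameters for this class
# (values invented for illustration)
llca_class = [0.70, 0.65, 0.20, 0.10, 0.05]
```

The LLCA profile can take any shape, at the cost of more parameters; the LCGA trajectory is more parsimonious but only fits well when the true profile is close to the chosen polynomial.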

The data used were taken from ALSPAC (The Avon Longitudinal Study of Parents and Children), a birth cohort based at the University of Bristol.  The majority of the models used data on nighttime incontinence (bedwetting) measured in children between the ages of 4.5 years and 9.5 years.  A description of the data used can be downloaded from here, with the datasets (Stata/Mplus txt-files) used in the practical sessions available as a zip file from here.

For each section that follows, the PowerPoint presentation can be downloaded by clicking on the section title, with supplementary material available from links within the text.

Binary models 1 

The motivation for the ALSPAC work on bedwetting was an LLCA/LCGA paper by Tim Croudace and colleagues published in the American Journal of Epidemiology in 2003.  Availability of data within ALSPAC meant that an attempted replication was possible.  The work presented here makes up part of a paper currently under review (as of July 2008).  The aim was to demonstrate how one would fit models within the Mplus package and then select the 'best' model based on the various statistical (and non-statistical) criteria available.
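The text does not list the specific criteria used. As an assumption, the sketch below shows two statistical criteria that are commonly consulted when choosing the number of latent classes: the Bayesian Information Criterion (lower is better) and relative entropy (closer to 1 indicates cleaner classification). The log-likelihoods are invented; the parameter counts correspond to an unrestricted latent class model with five binary indicators.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; smaller values are preferred."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def relative_entropy(posteriors):
    """Relative entropy of a classification, between 0 and 1.

    posteriors: one list of class-membership probabilities per subject.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))


# Hypothetical fits: number of classes -> (log-likelihood, free parameters).
# With 5 binary items a K-class model has 5*K + (K - 1) parameters.
fits = {2: (-5200.0, 11), 3: (-5150.0, 17), 4: (-5140.0, 23)}
n_obs = 2000

bic_by_k = {k: bic(ll, p, n_obs) for k, (ll, p) in fits.items()}
best_k = min(bic_by_k, key=bic_by_k.get)
```

Statistical criteria of this kind are only part of the picture; the interpretability and plausibility of the resulting classes are the non-statistical side of the decision.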

The aim of the first practical (prac-1a) was to replicate the model fitting / selection described above using a random sample of 2000 cases (1000 boys / 1000 girls) selected from those with complete data (all 5 bedwetting measures).  The practical materials / answers can be downloaded from here.

This session continued with a description of how one would incorporate covariates as part of a "2-stage" modelling procedure.  Model-based posterior probabilities were extracted from Mplus and combined with the covariate data in Stata.  Weighted multinomial regression models were used to predict latent class membership with a range of covariates.  The second practical (prac-1b) returned to our random sample of 2000 cases.  A Stata data file was provided with covariates already merged with the Mplus output, and the objective was to examine the weighted mlogit results and compare them with the (biased) results obtained when assigning each subject to their 'modal' class.  The practical materials / answers can be downloaded from here.
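The data preparation behind this comparison can be sketched as follows. Each subject is expanded into one row per class, weighted by the posterior probability of belonging to that class; the modal approach instead keeps a single row per subject with the most likely class. The posterior probabilities and helper names are invented, and the regression step itself (weighted mlogit in Stata) is not reproduced here.

```python
def expand_weighted(posteriors):
    """One row per subject per class, weighted by posterior probability."""
    rows = []
    for subject, probs in enumerate(posteriors):
        for cls, p in enumerate(probs):
            rows.append({"subject": subject, "class": cls, "weight": p})
    return rows


def modal_class(posteriors):
    """Index of the most likely class for each subject."""
    return [max(range(len(probs)), key=probs.__getitem__) for probs in posteriors]


# Hypothetical posterior probabilities for four subjects and two classes
posteriors = [
    [0.90, 0.10],
    [0.55, 0.45],
    [0.20, 0.80],
    [0.60, 0.40],
]

rows = expand_weighted(posteriors)
weighted_sizes = [
    sum(r["weight"] for r in rows if r["class"] == c) for c in range(2)
]

modal = modal_class(posteriors)
modal_sizes = [modal.count(c) for c in range(2)]
```

In this toy example the weighted class sizes are 2.25 and 1.75, whereas modal assignment gives 3 and 1. Modal assignment treats uncertain subjects (such as the second and fourth) as if their class were known, which is the source of the bias mentioned above.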

Binary Models 2

The aim here was to demonstrate how one would fit and compare LLCA and LCGA models.  The models were based on binary indicators of 'frequent' wetting since the resulting trajectory shapes were better behaved and more likely to be well approximated by LCGA polynomials.  Gender-specific LLCA and LCGA models were fitted using the "knownclass" option in Mplus, and gender invariance was then assessed by testing for equality of parameter values across the two genders.

The practical for this session (prac-2) was to recreate the models using our 1000 boys and 1000 girls.  This smaller sample led to various error messages for the LCGA models.  Part of the aim of this session was therefore to show how one would deal with such errors by constraining some parameter values. The practical materials / answers can be downloaded from here.

Ordinal models

3-level ordinal bedwetting variables (no wetting / infrequent / frequent) were used to show how LLCA models could be fitted, and the results examined graphically.  Covariates were incorporated with weighted mlogit models and then a distal outcome was predicted by class-membership in a similar way.
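For a three-category ordinal item, class-specific response probabilities are often parameterised through two ordered thresholds on the logit scale. The sketch below converts a pair of thresholds into the three category probabilities under an assumed cumulative-logit convention, P(y <= k) = 1 / (1 + exp(-tau_k)). The threshold values are invented, and the convention should be checked against the software output before it is relied upon.

```python
import math


def category_probabilities(thresholds):
    """Category probabilities implied by ordered logit-scale thresholds.

    Assumes P(y <= k) = 1 / (1 + exp(-tau_k)) within a latent class.
    Two thresholds give three categories.
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in increasing order")
    cumulative = [1.0 / (1.0 + math.exp(-t)) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(y = k) = P(y <= k) - P(y <= k - 1)
        previous = c
    return probs


# Hypothetical thresholds for one class at one age:
# categories are no wetting / infrequent / frequent
probs = category_probabilities([0.0, 2.0])
```

Repeating this for each class and each age gives the sets of probabilities that are typically plotted to compare classes graphically.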

Prac 3 demonstrated how ordinal models could be fitted and the results plotted within Mplus.  The practical materials / answers can be downloaded from here.

Parallel models

Finally, it was shown how two sets of binary variables (bedwetting and daytime wetting) could be modelled as a pair of parallel LLCA or LCGA models.  This permits an assessment of the association between class membership in two separate processes.  The procedure can also be used for sequential processes such as early-life temperament leading into early adolescent depressive symptoms.  The importance of considering conditional independence (residual correlations between measures within class) was stressed, and ways of assessing these correlations were presented.
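One simple way to read the association between two parallel processes is to turn the joint distribution of class membership into conditional probabilities. The sketch below does this for a two-by-two table; the joint probabilities and the class labels are invented for illustration and do not come from the ALSPAC analyses.

```python
def conditional_table(joint):
    """Row-normalise a joint probability table: P(column class | row class)."""
    result = []
    for row in joint:
        total = sum(row)
        result.append([cell / total for cell in row])
    return result


# Hypothetical joint class probabilities.
# Rows: bedwetting class (0 = normative, 1 = persistent)
# Columns: daytime wetting class (0 = normative, 1 = persistent)
joint = [
    [0.60, 0.05],
    [0.15, 0.20],
]

cond = conditional_table(joint)
# cond[1][1] is P(persistent daytime wetting | persistent bedwetting)
```

If the two processes were unrelated, the rows of the conditional table would be identical; here they differ markedly, which would indicate an association between the two sets of classes.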

Recommended Post-Course Reading

LDA Books:

Longitudinal Data Analysis by Donald Hedeker and Robert D. Gibbons

Epidemiological methods in life-course research by Andrew Pickles, Barbara Maughan and Mike Wadsworth

Latent Curve Models: A Structural Equation Perspective by Kenneth A. Bollen and Patrick J. Curran

Group-based modeling of development by Daniel S. Nagin

Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence by Judith D. Singer and John B. Willett

Applied Longitudinal Analysis by Garrett M. Fitzmaurice, Nan M. Laird and James H. Ware

An Introduction to Latent Variable Growth Curve Modeling: Concepts, Issues and Applications by Duncan, Duncan, Strycker, Li and Alpert

Design and analysis of quality of life studies in clinical trials by Diane L. Fairclough

Models for intensive longitudinal data by Theodore A. Walls and Joseph L. Schafer

Nonparametric Regression Methods for Longitudinal Data Analysis by Hulin Wu and Jin-Ting Zhang

Multilevel and Longitudinal Modeling Using Stata 2nd edn by Sophia Rabe-Hesketh and Anders Skrondal

Generalized Latent Variable Modeling:  Multilevel, Longitudinal, and Structural Equation Models by Anders Skrondal and Sophia Rabe-Hesketh


Dunn, G., Everitt, B., & Pickles, A. (1993). Modelling covariances and latent variables using EQS. London: Chapman & Hall.

Dwyer, J.H., Feinleib, M., Lippert, P., & Hoffmeister, H. (Eds.). (1992). Statistical models for longitudinal studies of health. New York: Oxford University Press.

Embretson, S.E. (2007). Impact of measurement scale in modeling developmental processes and ecological factors.  In T.D. Little, J.A. Bovaird, & N.A. Card (Eds.), Modeling contextual effects in longitudinal studies (pp. 63–87). Mahwah, NJ: Erlbaum.

Collins, L.M., & Horn, J.L. (Eds.). (1991). Best methods for the analysis of change. Washington, DC: American Psychological Association.

Collins, L.M., & Sayer, A. (Eds.). (2001). New methods for the analysis of change. Washington, DC: American Psychological Association.

von Eye, A., & Clogg, C.C. (Eds.). (1994). Latent variables analysis: Applications for developmental research. Thousand Oaks, CA: Sage.

Steyer, R., Krambeer, S., & Hannöver, W. (2004). Modeling latent trait-change. In K. Van Montfort, H. Oud, & A. Satorra (Eds.), Recent developments on structural equation modeling: Theory and applications (pp. 337–357). Amsterdam: Kluwer.

Little, T.D., Bovaird, J.A., & Slegers, D.W. (2006). Methods for the analysis of change. In T.D. Little & D. Mroczek (Eds.), Handbook of personality development (pp. 181–211). Mahwah, NJ: Erlbaum.

Little, T.D., Preacher, K.J., Selig, J.P., & Card, N.A. (2007). New developments in latent variable panel analyses of longitudinal data. International Journal of Behavioral Development, 31, 357–365.

Little, T.D., Schnabel, K.U., & Baumert, J. (Eds.). (2000). Modeling longitudinal and multilevel data: Practical issues, applied approaches, and specific examples. Mahwah, NJ: Erlbaum.

Card, N.A., & Little, T.D. (2007). Longitudinal modeling of developmental processes. International Journal of Behavioral Development, 31, 297–302.

Cole, D.A., & Maxwell, S.E. (2003). Testing mediational models with longitudinal data: Questions and tips in the use of structural equation modeling. Journal of Abnormal Psychology, 112, 558–577.

Meredith, W., & Tisak, J. (1990). Latent curve analysis. Psychometrika, 55, 107–122.

Muthén, B. (2001). Second-generation structural equation modeling with a combination of categorical and continuous latent variables: New opportunities for latent class/latent growth modeling. In L.M. Collins & A. Sayer (Eds.), New methods for the analysis of change (pp. 291–322). Washington, DC: American Psychological Association.

Muthén, B., & Shedden, K. (1999). Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics, 55, 463–469.

Ian Plewis:
Plewis, I. (1985). Analyzing change: Measurement and explanation using longitudinal data. Chichester: Wiley.

Plewis, I. (1996). Statistical methods for understanding cognitive growth: A review, a synthesis, and an application. British Journal of Mathematical and Statistical Psychology, 49, 25–42.

Plewis, I. (2005). Modelling behavior with multivariate multilevel growth curves. Methodology, 1, 71–80.

Plewis, I., Vitaro, F., & Tremblay, R. (2006). Modelling repeated ordinal reports from multiple informants. Statistical Modelling, 6, 251–263.