# Latent Class Analysis with a Binary Distal Outcome

## LCA Outcome Probability Calculator

This page provides

- a step-by-step procedure for estimating the effect of latent class membership on a binary distal outcome, and
- an Excel calculator that can be modified and used to obtain the effects of interest.

## Overview

This section gives the technical details of how to reparameterize a latent class analysis (LCA) model with covariates in order to calculate the probability of an outcome given latent class membership, using the LCA Outcome Probability Calculator. It is a step-by-step guide based on the empirical example described in Lanza and Rhoades (2011). With this information, you should be able to replicate these steps with your own LCA analyses.
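The reparameterization can be written out as follows. This is a sketch that assumes the calculator combines the multinomial logistic regression from the LCA with Bayes' theorem; the symbols are introduced here for illustration and do not appear in the SAS output. Let $C$ be the latent class, $Y$ the binary outcome (t2binge), $G$ the group, and $\beta_{0c|g}$, $\beta_{1c|g}$ the intercept and slope for class $c$ in group $g$, both fixed at zero for the reference class.

$$
P(C = c \mid Y = y, G = g) = \frac{\exp(\beta_{0c|g} + \beta_{1c|g}\, y)}{\sum_{k} \exp(\beta_{0k|g} + \beta_{1k|g}\, y)}
$$

$$
P(Y = 1 \mid C = c, G = g) = \frac{P(C = c \mid Y = 1, G = g)\, P(Y = 1 \mid G = g)}{\sum_{y \in \{0, 1\}} P(C = c \mid Y = y, G = g)\, P(Y = y \mid G = g)}
$$

The first expression uses the intercepts and slopes estimated by the LCA, and the marginal probabilities $P(Y = y \mid G = g)$ in the second come from the observed frequencies of the outcome in each group.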


## Introductory Example: Adolescent Subgroup Transitions

LCA was used to identify underlying subgroups of adolescents at risk for problem behavior. Six observed variables were used to measure risk exposure within the household, peer, and neighborhood contexts. Participants (N = 1900 adolescents in 8th grade) were drawn from the National Longitudinal Study of Adolescent Health (Add Health; Harris et al., 2009). A five-class model best fit the data, with the following classes: Low Risk, Peer Risk, Economic Risk, Household & Peer Risk, and Multi-Contextual Risk (see Lanza & Rhoades, 2011). We were specifically interested in whether the effect of the treatment on Grade 9 binge drinking varied across risk subgroups. Therefore, following model selection, the treatment group was incorporated in the model as a grouping variable; this LCA model provides the basis for the following steps.

Another example that relied on the LCA Outcome Probability Calculator appeared in Lanza, Rhoades, Nix, & Greenberg (2010).

## Recommended Citation

LCA outcome probability calculator (Version 1.0) [Software]. (2011). University Park: The Methodology Center, Penn State. Retrieved from http://methodology.psu.edu

**Below we present a step-by-step approach to predicting a binary distal outcome from latent class membership.**

### Step 1. Fit Latent Class Model with a Covariate using PROC LCA

Following model selection and the addition of the grouping variable (control, treatment), the outcome variable (t2binge) was included as a covariate so that the associations between latent class membership and Grade 9 binge drinking could be estimated. Below is the SAS syntax used to fit this LCA model.

**SAS syntax for LCA with a covariate**

    * Test for differential treatment effect across latent classes using outcome as covariate;
    PROC LCA DATA=assigned outparam=start;
      TITLE2 'Five Risk Classes';
      NCLASS 5;
      ITEMS HH_poverty single peer_cig peer_alc unemp below_pov;
      CATEGORIES 2 2 2 2 2 2;
      SEED 108;
      RHO PRIOR=1;
      COVARIATES t2binge;
      GROUPS tx_group;
      GROUPNAMES control treatment;
      MEASUREMENT groups;
    RUN;

### Step 2. Obtain Intercepts & Slopes from SAS Output

From the SAS output of the model above, we obtained the intercepts and slopes for adolescents in each treatment condition (highlighted in yellow below). In the SAS output, the row labeled "Intercept" contains the intercept estimates and the row labeled with the name of the outcome variable, in our case "t2binge," contains the slope estimates needed for the calculator.

In the SAS output, the classes are ordered as follows^{1}:

Class 1: Low Risk

Class 2: Economic Risk

Class 3: Household & Peer Risk

Class 4: Peer Risk

Class 5: Multi-contextual Risk

**SAS Output for Intercepts & Slopes**

Beta estimates (standard errors)

CONTROL:

| | 1: Low Risk | 2: Econ Risk | 3: HH & Peer | 4: Peer | 5: Multi |
|---|---|---|---|---|---|
| Intercept | Reference | -0.4157 | -1.3240 | -1.0338 | -2.0866 |
| | | (0.1260) | (0.2476) | (0.2062) | (0.2468) |
| t2binge | | 2.0386 | 4.4061 | 4.3420 | 4.3800 |
| | | (1.3727) | (1.3570) | (1.3477) | (1.3499) |

TREATMENT:

| | 1: Low Risk | 2: Econ Risk | 3: HH & Peer | 4: Peer | 5: Multi |
|---|---|---|---|---|---|
| Intercept | Reference | -0.2797 | -0.9489 | -0.0928 | -2.7260 |
| | | (0.1436) | (0.2435) | (0.1494) | (0.4524) |
| t2binge | | -0.9160 | 1.2739 | -0.5353 | 2.8876 |
| | | (0.4582) | (0.3208) | (0.4793) | (0.4963) |

^{1}The order of the classes in the SAS output is based on the seed that was used. This order does not match the order in the *LCA Outcome Probability Calculator* on the "Lanza & Rhoades Example" worksheet, which orders them according to level of risk in order to improve substantive interpretation.
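To make the role of these estimates concrete, the following Python sketch converts the control-group intercepts and slopes into class membership probabilities conditional on the outcome. It is a minimal illustration, not the calculator's own code. It assumes that Class 1 (Low Risk) is the reference class with intercept and slope equal to zero, and that the four t2binge estimates belong to Classes 2 through 5 in order. The function name `class_probs_given_outcome` is hypothetical.

```python
import math

# Control-group estimates copied from the SAS output above.
# Assumption: Class 1 is the reference class (intercept = slope = 0) and the
# four t2binge slopes belong to Classes 2-5, in that order.
INTERCEPTS = [0.0, -0.4157, -1.3240, -1.0338, -2.0866]
SLOPES = [0.0, 2.0386, 4.4061, 4.3420, 4.3800]


def class_probs_given_outcome(intercepts, slopes, y):
    """Return P(C = c | Y = y) for every class via the multinomial logit."""
    logits = [b0 + b1 * y for b0, b1 in zip(intercepts, slopes)]
    shift = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


p_class_no_binge = class_probs_given_outcome(INTERCEPTS, SLOPES, y=0)
p_class_binge = class_probs_given_outcome(INTERCEPTS, SLOPES, y=1)

print([round(p, 4) for p in p_class_no_binge])
print([round(p, 4) for p in p_class_binge])
```

Each list holds one probability per class, in the SAS class order, and sums to one because every adolescent belongs to exactly one class.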

### Step 3. Obtain Marginal Frequencies of Outcome by Grouping Variable

Next, PROC FREQ was used to obtain the known marginal distributions of the outcome (t2binge) in each treatment condition (control, treatment). Below is the SAS output from this set of analyses. The marginal frequencies needed for the Excel calculator are highlighted in yellow below.

**SAS Output for Marginal Frequencies**

Frequency table of treatment condition by Time 2 binge drinking (distal outcome):

| Treatment condition | Time 2 binge: No | Time 2 binge: Yes | Total |
|---|---|---|---|
| Control | 703 | 270 | 973 |
| Treatment | 778 | 149 | 927 |
| Total | 1413 | 487 | 1900 |
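The calculator takes the raw counts, but it can help to see the marginal proportions they imply. The short Python sketch below uses the control and treatment rows exactly as they appear in the table above; the helper name `outcome_proportions` is hypothetical.

```python
# Cell counts copied from the frequency table above: (no, yes) per condition.
counts = {"control": (703, 270), "treatment": (778, 149)}


def outcome_proportions(n_no, n_yes):
    """Return (P(Y = 0), P(Y = 1)) within one treatment condition."""
    total = n_no + n_yes
    return n_no / total, n_yes / total


proportions = {group: outcome_proportions(*cells) for group, cells in counts.items()}

for group, (p_no, p_yes) in proportions.items():
    print(f"{group}: P(Y=0) = {p_no:.4f}, P(Y=1) = {p_yes:.4f}")
```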

### Step 4. Input Parameters (Intercepts, Slopes & Marginal Frequencies) from Steps above into Excel Workbook

Using the output from Step #2, you will need to insert the intercepts and slopes for each class within each group (the control and treatment groups, in our case) into the *LCA Outcome Probability Calculator*. As is shown below, the top box represents the parameters for the control group and the bottom box represents the parameters for the treatment group. The cells highlighted in yellow represent those in which the user must enter information.

First, enter the intercept and slope estimates from the SAS output in Step #2 into the intercept and slope cells of the spreadsheet. Next, enter the marginal frequencies of the outcome (t2binge) in each treatment condition (control, treatment) from the SAS output in Step #3. For example, 703 adolescents in the control condition reported "no" to binge drinking, so this number goes into the cell labeled "N with outcome=0" in the top box, where the control parameters are located. Once all of the intercepts, slopes, and marginal frequencies have been entered for each class within each group, the calculator automatically populates the probabilities listed in the far right column (cells highlighted in blue are probabilities for the control group and cells highlighted in orange are probabilities for the treatment group). Each of these is the probability of the outcome given membership in one class, so the values need not sum to one within a group.
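The calculation behind the far right column can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the workbook's actual formulas: it assumes the calculator applies Bayes' theorem to the multinomial logit estimates, that Class 1 is the reference class with intercept and slope of zero, and that the four t2binge slopes belong to Classes 2 through 5. The function names are hypothetical, and only the control group is shown.

```python
import math


def softmax(logits):
    """Turn multinomial-logit linear predictors into probabilities."""
    shift = max(logits)  # guards against overflow in exp()
    weights = [math.exp(v - shift) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def outcome_prob_by_class(intercepts, slopes, n_no, n_yes):
    """Return P(Y = 1 | C = c) for each class within one group."""
    p_yes = n_yes / (n_no + n_yes)  # marginal P(Y = 1) from Step 3
    p_no = 1.0 - p_yes
    p_class_no = softmax(intercepts)  # P(C = c | Y = 0)
    p_class_yes = softmax([b0 + b1 for b0, b1 in zip(intercepts, slopes)])  # P(C = c | Y = 1)
    # Bayes' theorem, applied class by class.
    return [
        c_yes * p_yes / (c_yes * p_yes + c_no * p_no)
        for c_no, c_yes in zip(p_class_no, p_class_yes)
    ]


# Control-group inputs from Steps 2 and 3, in the SAS class order.
CLASSES = ["Low Risk", "Economic Risk", "Household & Peer Risk", "Peer Risk", "Multi-contextual Risk"]
CONTROL_INTERCEPTS = [0.0, -0.4157, -1.3240, -1.0338, -2.0866]
CONTROL_SLOPES = [0.0, 2.0386, 4.4061, 4.3420, 4.3800]

control_probs = outcome_prob_by_class(CONTROL_INTERCEPTS, CONTROL_SLOPES, n_no=703, n_yes=270)

for name, p in zip(CLASSES, control_probs):
    print(f"{name}: {p:.3f}")
```

Calling `outcome_prob_by_class` a second time with the treatment-group estimates and counts would give the corresponding treatment column. Comparing the two lists class by class is what reveals whether the treatment effect differs across risk subgroups.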

The following is one way to display the results obtained above from the LCA Outcome Probability Calculator. This graph will automatically populate based on the parameters entered into the calculator. In our case, this presents the proportion of treatment and control participants in each risk subgroup reporting binge drinking at Grade 9.

**References**

Harris, K. M., Halpern, C. T., Whitsel, E., Hussey, J., Tabor, J., Entzel, P., & Udry, J. R. (2009). *The National Longitudinal Study of Adolescent Health: Research design* [WWW document]. Retrieved from http://www.cpc.unc.edu/projects/addhealth/design

Lanza, S. T., & Rhoades, B. L. (2011). Latent class analysis: An alternative perspective on subgroup analysis in prevention and treatment. *Prevention Science.* Advance online publication. doi:10.1007/s11121-011-0201-1 PMCID: PMC3173585

Lanza, S. T., Rhoades, B. L., Nix, R., & Greenberg, M. T. (2010). Modeling the interplay of multilevel risk factors for future academic and behavior problems: A person-centered approach. *Development and Psychopathology, 22,* 313-335. PMCID: PMC3005302