In This Issue:
- Director's Report
- 2011 JASA Invited Paper: Adaptive Confidence Intervals for the Test Error in Classification
- Featured Scientist: Donna Coffman
- P50 Scientific Projects
- Methodology Minutes
- Recent Activity in The Methodology Center
- MC Friends' Corner
- Ask a Methodologist
Welcome to another edition of the Methodology Center Perspective! There is a lot of news to share with you.
I am very happy to announce that our P50 center grant, “Center for Prevention and Treatment Methodology,” has been renewed by the National Institute on Drug Abuse (NIDA) for an additional five years, Years 14-19. Writing such a large application is a team effort. I feel incredibly fortunate that my team includes Drs. Donna Coffman, Stephanie Lanza, Runze Li, and Susan Murphy. It is hard for me to imagine a better group of collaborators. The article on page 2 describes the scientific projects in the new P50, including two exciting pilot projects.
With this new funding, our work is moving in an important new direction: the HIV area. For the first time, our P50 has been AIDS-coded by NIDA. We are excited about this new emphasis, which is opening up opportunities for developing new collaborations and for investigating interesting methodological issues.
I want to take this opportunity to welcome some new hires. Katie Bode-Lang is our new assistant director (and, I believe, the first published poet we have ever had on our staff). Among Katie’s duties will be serving as the editor of this newsletter. Amanda Applegate has joined us as a writer/editor, and Tammy Knepp is our new administrative support assistant. We have also hired two new postdocs. Dr. Violet Xu joins us from the University of California, Davis, where she earned a Ph.D. in quantitative psychology. Dr. Charu Mathur comes to us from the University of Minnesota, where she earned a Ph.D. in epidemiology.
There are some other personnel changes. Lisa Litz, who had been our receptionist for five years, has left us to join Penn State Altoona. Dr. Jackie Wiersma, who was until recently a postdoctoral fellow in the Prevention and Methodology Training Program, is now an assistant professor in the School of Human Environmental Sciences at the University of Arkansas. Research associate Dr. Michael Cleveland remains at Penn State but is moving to our sister center, the Prevention Research Center for the Promotion of Human Development. Dr. Inbal Nahum-Shani is now a research associate at the Institute for Social Research at The University of Michigan. Both Michael and Inbal are part of ongoing center work and remain affiliated with the center.
Dr. Mark Stemmler, professor of psychological methodology and quality assurance and dean of the Faculty of Psychology and Sports Science, Bielefeld University, Germany, was a visiting professor in the Methodology Center for part of the fall semester. He taught a 1-credit graduate course on multiway contingency table analysis using configural frequency analysis (CFA). We enjoyed Mark’s visit very much and hope he visits again!
The Summer Institute on Longitudinal Methods (June 7-9, 2010) was a big success. Our topic this year was “Analysis of Longitudinal Dyadic Data,” and the instructors were Dr. Niall Bolger, professor of psychology at Columbia University, and Dr. Jean-Philippe Laurenceau, associate professor of psychology at the University of Delaware. This year we video recorded much of the summer institute. Watch for an announcement when the video is available for viewing on our website, methodology.psu.edu.
I hope you enjoy this issue of the Methodology Center Perspective!
Linda M. Collins, Ph.D.
Director, The Methodology Center
Penn State University
Drs. Eric Laber and Susan Murphy’s paper “Adaptive Confidence Intervals for the Test Error in Classification” was recently recognized by the American Statistical Association: it was selected as the Joint Statistical Meetings’ Invited Theory and Methods Paper for 2011. The authors will present the paper at the meeting, and it will appear with discussion in the Journal of the American Statistical Association in September 2011. In this paper, Drs. Laber and Murphy propose a novel method for constructing theoretically valid confidence intervals for the misclassification rate.
A common task in many areas of science is the classification of an object into one of a number of fixed categories based on a series of empirical observations. An example would be classifying a patient suffering from depression as either high or low risk for a suicide attempt using items from a questionnaire. A rule for mapping patients into categories (high risk/low risk) based on data is called a learned classifier.
A learned classifier, once built, is deployed to make predictions at new inputs. A critical question, then, is, “How frequently does the learned classifier assign an object to the wrong category?” For example, what proportion of high-risk depression patients will be incorrectly classified as low-risk? The implications of such a mistake are potentially quite severe. Consequently, it is imperative to report not only an estimate of the proportion of mistakes but also a measure of confidence in that estimate. Constructing a confidence interval for the misclassification rate of a learned classifier has been an open problem for nearly two decades, which is somewhat surprising given how common and important the task is. Previous attempts have made either unverifiable or unrealistic assumptions about the underlying data, leading to poor small-sample performance under many realistic scenarios.
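To make the idea of reporting an error estimate together with a measure of confidence concrete, here is a minimal sketch of the textbook normal-approximation interval for a misclassification rate. It is not the adaptive method proposed by Drs. Laber and Murphy. It assumes the classifier is evaluated on a held-out test set that is independent of the data used to build it, and the function name and the counts are hypothetical.

```python
import math


def wald_error_interval(n_errors, n_test, z=1.96):
    """Normal-approximation interval for a misclassification rate.

    Assumes n_test held-out cases that were not used to build the
    classifier; z=1.96 gives an approximate 95% interval.
    """
    if n_test <= 0 or not 0 <= n_errors <= n_test:
        raise ValueError("need 0 <= n_errors <= n_test and n_test > 0")
    p_hat = n_errors / n_test  # estimated proportion of mistakes
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n_test)
    # Clip to [0, 1], since a proportion cannot fall outside that range.
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical example: 12 of 80 held-out patients were misclassified.
estimate, lower, upper = wald_error_interval(n_errors=12, n_test=80)
print(f"estimated error {estimate:.3f}, interval ({lower:.3f}, {upper:.3f})")
```

An interval of this kind is known to behave poorly when the test set is small or the error rate is near 0 or 1, which is one reason small-sample performance matters so much in this problem.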
Drs. Laber and Murphy’s theoretical results rely on far weaker—and therefore more realistic—assumptions than previous methods and exhibit significant empirical improvement over previous methods. Congratulations to Drs. Laber and Murphy on being recognized for this important work!
Dr. Donna Coffman is a research associate at the Methodology Center. She is the scientific project director of one research component of the new NIDA-funded P50. Her component is titled “Causal Mediation in Non-Randomized and Multilevel Intervention Research.” The overall purpose of this project is to improve methods for drawing causal inferences from mediation models and to place these methods in the hands of prevention and treatment scientists.
An example of Dr. Coffman’s current interests is her work with HealthWise, a program that aims to increase leisure participation of students in South Africa, with the hope that increased leisure skills will decrease substance use. In this scenario, leisure participation is the mediator: an intermediate step between the intervention and the final outcome. However, subjects cannot be randomly assigned to levels of a mediator, since it is itself a result of the intervention. Therefore, it is difficult to determine whether leisure participation actually causes a change in substance use or whether that change is due to some other factor, such as an increase in community safety. These outside factors are called confounders, and they include things such as socio-economic status, access to leisure equipment, and the ability to safely participate in leisure activities in the community.
Dr. Coffman is also working to enable researchers to draw more valid conclusions from observational studies. For example, in a project with Julia Moore and Dr. Stephanie Lanza of Penn State, Dr. Coffman is examining the effect of participation in Head Start compared to parent-based child care on later verbal ability using data from a nationally representative sample of kindergartners (Early Childhood Longitudinal Study, Kindergarten; ECLS-K). Adjusting for selection effects associated with enrollment in Head Start using propensity score methods allows researchers to draw more valid inferences about the differential impact of preschool setting on later ability.
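As a rough illustration of how propensity score methods adjust for selection effects, the sketch below applies inverse probability weighting to a toy data set. It is not the analysis Dr. Coffman and her colleagues are conducting on the ECLS-K data. The records, the single binary covariate, and the function names are all hypothetical, and the propensity score is simply the observed share of treated cases within each covariate level.

```python
from collections import defaultdict


def propensity_by_stratum(records):
    """Estimate P(treated | x) as the treated share within each level of x."""
    counts = defaultdict(lambda: [0, 0])  # x -> [n_treated, n_total]
    for x, treated, _ in records:
        counts[x][0] += treated
        counts[x][1] += 1
    return {x: n_treated / n_total for x, (n_treated, n_total) in counts.items()}


def naive_difference(records):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for _, t, y in records if t == 1]
    control = [y for _, t, y in records if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_difference(records):
    """Inverse-probability-weighted difference in mean outcome.

    Requires every stratum to contain both treated and untreated cases,
    so that each propensity score lies strictly between 0 and 1.
    """
    score = propensity_by_stratum(records)
    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in records:
        if t == 1:
            w = 1.0 / score[x]
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - score[x])
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


# Hypothetical records (x, treated, outcome). Cases with x = 1 are more
# likely to be treated and also score higher regardless of treatment;
# the outcome is built as 2 * treated + 3 * x, so the true effect is 2.
records = (
    [(0, 1, 2.0)] * 10 + [(0, 0, 0.0)] * 30
    + [(1, 1, 5.0)] * 30 + [(1, 0, 3.0)] * 10
)
naive = naive_difference(records)
adjusted = ipw_difference(records)
print(f"naive difference {naive:.2f}, weighted difference {adjusted:.2f}")
```

In this constructed example the unadjusted comparison overstates the effect because the covariate drives both selection and outcome; weighting removes that imbalance only because the one confounder is observed, which is the key assumption behind any propensity score analysis.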
In her free time, Donna enjoys hiking, gardening, and cross-country skiing. She also enjoys spending time with her partner, bird, and three dogs.
Coffman, D. L. (in press). Estimating causal effects in mediation analysis using propensity scores. Structural Equation Modeling.
Scientific Project I
Causal Mediation in Non-randomized and Multilevel Intervention Research
PI: Dr. Donna Coffman, Research Associate, Methodology Center, Penn State
This project will generate new statistical models and software to enable scientists to draw more valid causal inferences about mediation processes and then apply those methods in secondary analysis of prevention and treatment data. This will ultimately inform the design of more cost-effective interventions for substance use and HIV. See the Featured Scientist article above to learn more about Donna Coffman.
Scientific Project II
Advances in Finite Mixture Modeling for Substance Use and HIV Research
PI: Dr. Stephanie Lanza, Senior Research Associate and Scientific Director, Methodology Center, Penn State
The goal of this project is to enable substance use and HIV scientists to identify underlying subgroups based on risk exposure, risk effects, treatment response, and other factors. The guiding principle underlying this research is that populations are comprised of more than one type of individual; findings from this project will inform development of intervention programs for maximum effect based on the targeted subgroups.
Scientific Project III
New Models for Joint Analysis of Intensive Longitudinal Data and Survival Data
PI: Dr. Runze Li, Professor of Statistics, Penn State
This scientific project will focus on developing new, flexible statistical procedures for joint analysis of intensive longitudinal data and survival data. User-friendly computer software will allow drug use prevention and treatment scientists to implement the new methods, which will be useful for testing key hypotheses in drug use and HIV research. A collaboration with investigators of the Women’s Interagency HIV Study (WIHS) is already under way to jointly examine the roles of alcohol, tobacco, and drug use on the disease course and survival time.
Scientific Project IV
SMART Methodology for Constructing Adaptive Interventions
PI: Dr. Susan Murphy, H.E. Robbins Professor of Statistics, The University of Michigan
The long-term goal of this project is to improve clinical practices by facilitating the evidence-based construction of effective, individualized interventions and treatments in drug abuse. This project will encourage broader use of the Sequential Multiple Assignment Randomized Trial (SMART) design by developing tutorials and case studies that illustrate how scientists can use the primary time-varying outcome in SMART data analyses to meet their scientific or clinical goals.
Pilot Project I
Causal Inference in the Presence of Time-Varying Confounders
PI: Dr. Daniel Almirall, Faculty Research Fellow, Institute for Social Research, The University of Michigan
This project will develop methodology to allow scientists to examine how time-varying moderators (such as history of response to treatment) influence the effect of additional treatment on later outcomes. By allowing such questions to be addressed, this project will inform the development of adaptive treatment strategies.
Pilot Project II
A System Sciences Approach to Modeling Compliance Dynamics
PI: Dr. Rachel Smith, Assistant Professor of Communication Arts and Sciences, Penn State
This project will look at how individuals influence change within a social network based on perceived power and levels of communication. This new approach of dynamical systems modeling will aid research in understanding adolescent substance use and HIV risk in the context of social networks.
Interested in learning more about missing data analysis beyond what you read in an article? Wish you could sit in on a conversation between interdisciplinary scientists? Our new podcast series, “Methodology Minutes,” offers information on a variety of topics. Episodes are around twenty-five minutes each, and they give listeners insights that only a conversation can provide. Next semester, look for podcasts featuring Dr. Bethany Bray on associative latent transition analysis; Dr. Mark Stemmler on configural frequency analysis; and Drs. Ed Smith, Linda Caldwell, and Linda Collins on Type 2 translation of an evidence-based intervention in South African schools (that is, studying the large-scale implementation of an existing intervention). Podcasts are available to download for free from iTunesU. You can also subscribe to “Methodology Minutes” so you’ll never miss a new podcast. Visit “Methodology Minutes” on our website for all the links.
“Summer Institute 2010” features an informal discussion with the presenters, Dr. Niall Bolger, professor of psychology at Columbia University, and Dr. Jean-Philippe Laurenceau, associate professor of psychology at the University of Delaware; the organizer, Dr. Stephanie Lanza; and several participants.
“Missing Data Analysis: Making It Work in the Real World,” in which Dr. John Graham of Penn State describes state-of-the-art methods for handling missing data.
“An Odd Couple in Interdisciplinary Research,” in which Dr. Linda Collins of Penn State and Dr. Daniel Rivera of Arizona State University are interviewed about the unlikely but fruitful collaboration between a quantitative psychologist and an electrical engineer.
“Society for Prevention Research (SPR) Competition Cup Winners from PSU” features an interview of several members of the 2009 winning team describing their experience in the competition.
“New Book on LCA and LTA: An Interview With the Authors,” where Drs. Linda Collins and Stephanie Lanza describe their 2010 book on latent class and latent transition analysis (part of the Wiley Series in Probability and Statistics) and talk about the experience of writing it.
“Introduction to the Methodology Center: Who We Are, What is Our Mission.” Listen to an interview with the director, Dr. Linda Collins, to find out more about the work we do!
Drs. Susan Murphy and Linda Collins both presented at the Behavioral Intervention Optimization: Capitalizing on Engineering, Computer Science, and Technology conference in Bethesda, Maryland, in June 2010.
A new web applet is now available to calculate the sample size necessary to detect a meaningful difference between two-stage adaptive treatment strategies in a SMART trial. You can access the Sample Size Calculator for a SMART Design with Censored Data at the Methodology Center website (methodology.psu.edu).
Center scientists attended the Center for AIDS Research (CFAR) Joint Symposium on HIV Research in Women in October 2010. The program was hosted by the Chicago Development for AIDS Research (www.chicagocfar.org).
Multiple center scientists, including Drs. Runze Li, Inbal Nahum-Shani, Linda Collins, Stephanie Lanza, Brittany Rhoades, Donna Coffman, and Michael Cleveland, presented papers and posters and participated in symposia at the annual meeting of the Society for Prevention Research (SPR) in Denver in June 2010.
Drs. Suellen Hopfer and Lauren Molloy, fellows in the Prevention and Methodology Training program (PAMT), along with graduate student Jessica Trail, helped bring the Sloboda and Bukoski SPR Competition Cup back to Penn State.
The Methodology Center’s Green Team, established last year, was the first in the College of Health and Human Development. Green Teams are groups of faculty or staff who volunteer to take specific actions to help their organization operate in a more efficient, innovative, and healthy way. Our Green Team is currently working toward Level I Green Paws Certification.
Check out our most recently published articles:
Almirall, D., Ten Have, T., & Murphy, S. A. (2010). Structural nested mean models for assessing time-varying effect moderation. Biometrics, 66(1), 131-139. PMCID: PMC2875310
Bray, B. C., Lanza, S. T., & Collins, L. M. (2010). Modeling relations among discrete developmental processes: A general approach to associative latent transition analysis. Structural Equation Modeling, 17, 541-569. NIHMSID: NIHMS156531
Cleveland, M. J., Collins, L. M., Lanza, S. T., Greenberg, M. T., & Feinberg, M. E. (2010). Does individual risk moderate the effect of contextual-level protective factors? A latent class analysis of substance use. Journal of Prevention and Intervention in the Community, 38(3), 213-228. PMCID: PMC2898733
Lanza, S. T., Patrick, M. E., & Maggs, J. L. (2010). Latent transition analysis: Benefits of a latent variable approach to modeling transitions in substance use. Journal of Drug Issues, 40(1), 93-120. PMCID: PMC2909684
Lanza, S. T., Savage, J., & Birch, L. (2010). Identification and prediction of latent classes of weight loss strategies among women. Obesity, 18(4), 833-840. PMCID: PMC2847025
Smith, R. A., & Fink, E. (2010). Compliance dynamics within a simulated friendship network I: The effects of agency, tactic, and node centrality. Human Communication Research, 36, 232-260.
Zhu, L. P., Tan, X., & Tu, D. (2010). Testing the homogeneity of two survival functions against a mixture alternative based on censored data. Communications in Statistics: Simulation and Computation, 39, 767-776.
Have you used our software recently? Did you utilize our methodology in a recent paper? Has the Methodology Center helped you solve a research problem? If so, we want to know about it. Every semester, we’ll be featuring a scientist in the new “MC Friends’ Corner” section of the newsletter. Tell us about your recent work and you could be featured here. Submissions can be sent to Katie Bode-Lang at firstname.lastname@example.org.
I want to investigate multiple risk factors for health risk behaviors in a national study, but do not know how to handle the high levels of covariation among the different risk factors. Do you recommend that I regress the outcome on the entire set of risk factors using multiple regression analysis? Or should I create a cumulative risk index by summing risk exposure, and regress the outcome on that index?
Signed, Waiting to Regress
Recognizing that individuals develop within multiple contexts, and therefore can be exposed simultaneously to numerous (often highly correlated) risk factors, is critical in studies of human behavior and development. Historically, multiple risks for a poor outcome have typically been modeled using one of two approaches: multiple regression analysis or a cumulative risk index.
Multiple regression allows us to examine the relative importance of each risk factor in predicting the outcome, but there can be drawbacks: without the inclusion of many higher-order interactions, it is impossible to examine how exposure to certain combinations of risk factors impacts the outcome. Also, high levels of multicollinearity (for example, among risk factors such as maternal education, neighborhood disorganization, and residential crowding) can severely distort inference based on multiple regression.
Alternatively, a cumulative risk index can be created by summing for each individual the number of risk factors to which they are exposed. Instead of regressing the outcome on all individual risk factors, the outcome is simply regressed on the index score. Early work in this area represented an important step forward in thinking about multiple risks. A downside to this approach is that each risk factor is equally weighted; this means that exposure to each additional risk factor (regardless of which one) corresponds to an equal level of increased risk. Furthermore, such an index does not provide any insight into how particular risk factors co-occur or interact with each other.
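The cumulative risk index described above can be sketched in a few lines. This is only an illustration: the risk factor names are borrowed from the examples earlier in this answer, while the five children, their outcome scores, and the function names are hypothetical.

```python
def cumulative_risk_index(exposures):
    """Count the risk factors a person is exposed to (each weighted equally)."""
    return sum(1 for exposed in exposures.values() if exposed)


def simple_regression(xs, ys):
    """Least-squares intercept and slope for regressing ys on a single xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical children: binary exposure indicators plus an outcome score.
children = [
    ({"low_maternal_education": 0, "neighborhood_disorganization": 0, "residential_crowding": 0}, 1.0),
    ({"low_maternal_education": 1, "neighborhood_disorganization": 0, "residential_crowding": 0}, 2.0),
    ({"low_maternal_education": 0, "neighborhood_disorganization": 0, "residential_crowding": 1}, 3.0),
    ({"low_maternal_education": 1, "neighborhood_disorganization": 1, "residential_crowding": 0}, 3.5),
    ({"low_maternal_education": 1, "neighborhood_disorganization": 1, "residential_crowding": 1}, 5.0),
]

index_scores = [cumulative_risk_index(exposures) for exposures, _ in children]
outcomes = [outcome for _, outcome in children]
intercept, slope = simple_regression(index_scores, outcomes)
print(index_scores, round(intercept, 3), round(slope, 3))
```

The second and third children are exposed to different risk factors yet receive the same index score, so the regression cannot distinguish them. That is the equal-weighting limitation in miniature: the single slope is the assumed increase in the outcome for each additional risk factor, whichever one it is.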
A person-centered approach to modeling multiple risks was recently demonstrated by Lanza and colleagues (Lanza, Rhoades, Nix, Greenberg, and CPPRG, 2010). This study used latent class analysis to identify four unique risk profiles based on exposure to thirteen risk factors across child, family, school, and neighborhood domains. Each risk profile characterized a unique group of children: (1) Two-Parent Low Risk, (2) Single Parent with a History of Problems, (3) One-Parent Multilevel Risk, and (4) Two-Parent Multilevel Risk. Compared to a cumulative risk index, an examination of the link between risk profile membership during kindergarten and externalizing problems, school failure, and low academic achievement in fifth grade provided a more nuanced understanding of the early precursors to negative outcomes. The application of latent class analysis to multiple risks holds promise for informing the refinement of preventive interventions for groups of children who share particular combinations of risk factors.
Lanza, S. T., Rhoades, B. L., Nix, R. L., Greenberg, M. T., & The Conduct Problems Prevention Research Group (2010). Modeling the interplay of multilevel risk factors for future academic and behavior problems: A person-centered approach. Development and Psychopathology, 22, 313-335.
A Note to Readers: Do you have a burning question you would like to ask a methodologist? We would like to hear from you! Submit questions you would like to see answered in the newsletter to email@example.com. Be sure to put ‘Ask a Methodologist’ in the subject line.