The Importance of Secondary Data and How It Has Impacted My Research

A blog post by Hayoung Kim

There have been many times during my PhD program in Applied Human Development when I have been truly invigorated and inspired by the work I get to do. But as much as I value discussions with fellow cohort mentors and conversations with faculty advisors, nothing compares to the moments when I find exciting, unexpected, or explanatory results in data that tell a richer and more important story than I could have imagined.

No matter what type of data I work with or how hard I grapple with it, these moments make me feel that the time and effort I put in are worthwhile. But when I find results in secondary data (data that another researcher has collected and made available), the joy I feel is even greater, even though the data was not collected by me or created for my research.

Why do I love secondary data so much? Here are three reasons:

  1. Secondary data is often collected from a representative sample of the target population, something that would be difficult for me to do as an individual researcher.
  2. Secondary data enables me to be creative and explore many possible scenarios across many different factors. It gives me an opportunity to consider ideas beyond the existing literature and my own limited knowledge.
  3. Secondary data makes research more accessible to master's and doctoral students, who have limited opportunities during graduate studies to lead large-scale research projects.

Last year, my classmate Danielle Richardson (a doctoral student in Counseling Psychology) and I began to wonder why we didn’t see much secondary data from marginalized or underrepresented populations. Danielle and I are both passionate about mental health and suicide prevention, so we started to search for secondary data and published studies in this topic area.

We found some secondary data that intrigued us. Each year approximately 800,000 people die by suicide worldwide, with an estimated 75% of these deaths occurring in low- and middle-income countries (LMIC) (World Health Organization, 2014). Despite this, the research on suicide is largely based on data from high-income countries (HIC; Knipe et al., 2019). Secondary data on suicide in LMIC is less available than it is for HIC, and what does exist hasn’t received much attention from researchers so far.

The Global School-Based Student Health Survey (GSHS), which does include data from LMIC, has been collected by the World Health Organization since 2003, but we found that not every country’s data was actively used by researchers. With this in mind, Danielle and I began to design a study using the GSHS to add to the literature on suicide determinants among diverse adolescents across the world.

Last summer, we developed our research project, “Prediction of Determinants of Suicide in Adolescents in Low and Middle-Income Countries using Machine Learning.” With the support of BU Wheelock’s counseling psychology and applied human development department, we used machine learning to examine prediction models for three determinants of suicide (suicide ideation, planning, and attempt), drawing on GSHS data from 82,494 adolescents across 32 low- and middle-income countries.

Using this secondary data, we found that the most important predictors of suicide ideation, planning, and attempt among adolescents in LMIC were age, school climate, and the parent-child relationship. Additionally, we found that psychological well-being (i.e., feeling lonely, having worries) was more important than physical well-being in predicting the three determinants of suicide.

By making good-quality data available to everyone, secondary data enables graduate students to engage in research more deeply, beyond writing literature reviews or proposing research ideas. My goal as a researcher is to shine a light on data about marginalized or underrepresented individuals so that more stories and voices from diverse populations are included in future research. This project was the first step toward accomplishing that goal.

Hayoung Kim is a doctoral student in Applied Human Development at BU Wheelock. She is currently working with Dr. Scott Solberg at the Center for Future Readiness. Her research focuses on machine learning, quantitative research methods, social support, suicide, resilience, adolescents, and emerging adults.