‘We Live in a World of Big Data’

From social media to social networks, machine learning to machine building, health information is easier to capture and analyze than ever before. Biostatistics plays a central role in evaluating this data to better understand and tackle the health challenges that individuals and populations face across the globe.
On Friday, November 15, the School of Public Health hosted the daylong symposium “Statistics and the Life Sciences: Creating a Healthier World” to explore the statistical progress and challenges in collecting, understanding, and utilizing data pertaining to a variety of health behaviors and diseases.
The symposium was organized by Josée Dupuis, chair and professor of biostatistics, and Eric Kolaczyk, professor and data science faculty fellow in the Department of Mathematics and Statistics at Boston University, and cohosted with the American Statistical Association, the Institute of Mathematical Statistics, and the National Institute of Statistical Sciences. Scholars from biostatistics, computer science, pharmacology, medicine, and more gathered for in-depth discussions centered on three critical areas of the health sciences: digital health, machine learning in causal inference, and networks for public health.
In opening remarks to a full audience in Hiebert Lounge, BU President Robert Brown reflected on how advancements in data storage and computing power have transformed the way health experts address perplexing questions and have “given us promise for understanding a myriad of factors that determine our health.” He said that one of BU’s major initiatives is to expand data science resources across the entire University. A state-of-the-art Center for Computing & Data Sciences is set to open in 2022.
“One of the biggest challenges facing universities is preparing all of our students to live and work in a data science-driven world,” Brown said. “Our goal is to put data sciences in every one of our 17 schools and colleges.”
In an interactive format with presentations, panel discussions, and questions from the audience, 18 speakers shared their data-driven work in a range of subjects, such as personalized interventions in public health, digital monitoring of home-based physical therapy, digital measurements in pharmaceutical development, the role of domain knowledge, and social network data analysis.
Susan Murphy, professor of statistics at Harvard University, spoke about a micro-randomized trial design that she developed to assess the effectiveness of HeartSteps, a mobile application that acts as a virtual health coach for people at risk of heart disease by encouraging them to engage in physical activity. Utilizing personal data collected through a wearable band, smartphone sensors, and self-reporting, the app delivered and continually optimized individually tailored “treatments” in the form of physical activity suggestions based on users’ daily schedules and walking habits.
“We all know that our behaviors form the major risk factors for many of the chronic problems that we later experience,” Murphy said. “Mobile health is about helping people change their behaviors.”
During a session on machine learning and causal inference, speakers examined the methodological opportunities and challenges in using machine learning to understand world patterns and improve patient care and health outcomes. Beth Ann Griffin, senior statistician and co-director of the RAND Center for Causal Inference at the RAND Corporation, spoke about the importance of causal effects and how machine learning can be used to estimate propensity scores.
“We live in a world of big data,” Griffin said, adding that to estimate causal effects, statisticians need to understand when and how this data can be used. “Using machine learning to estimate propensity scores has improved the science of statistics by reducing bias and improving precision,” she said.
David Dunson, professor of statistical science and mathematics at Duke University, delivered a keynote speech on network data analysis, which was followed by a panel discussion among statistics and sociology scholars on the impact of social networks on health.
“Mental health and behavior outcomes have an enormous public health impact, but mental health is difficult to assess,” said Dunson. Thanks to recent advancements in imaging technology, neuroscientists can now measure an individual’s physical brain connection structure and assess the relationship between brain structure and the traits of an individual.
In closing remarks, Bhramar Mukherjee, chair of biostatistics at the University of Michigan School of Public Health, revisited the major themes of the day.
“We saw today that different classes of scientific problems require different classes of techniques, and we also recognized the importance of incorporating domain knowledge into all of the brilliant statistics and data science work that is being done,” Mukherjee said. “This symposium was about human health and life sciences. There is more to life than prediction, and we saw a lot of examples of that today.”