Elaine Nsoesie and the Ethics of Data Science

Data is often thought of as a set of objective numbers that relay information about the world. What most people don’t realize is that the collection and reporting of data are often fraught with bias and ethical issues. Dr. Elaine Nsoesie is an Assistant Professor in the Department of Global Health at the Boston University School of Public Health. Her current work centers on applying data science methods to global health problems, with a particular focus on how data can be used for good and how bias and ethics come into play. She is also the Assistant Director of Research at the Boston University Center for Antiracist Research and is currently teaching a class titled ‘Applications of Machine Learning in Global Health’. In both her work and her class, Nsoesie emphasizes the careful consideration of data.

“I think having an understanding that data has opportunities but data can also be used as a tool to create inequality or promote inequality is important,” Nsoesie says. “So, we need to be aware and think very carefully about the data that we use and be able to question it.”

In her class, Nsoesie invites speakers from different backgrounds to talk to students about the ethics of data, in the hope that students will consider these perspectives when working with big data. One speaker presented a Western perspective, while another, from South Africa, provided a very different account.

“When we say something is ethical, what exactly do we mean?” Nsoesie asks. “We never really think about it because we’ve been trained to think about ethics in a particular way – in a Western way. But how does an African or Asian think about ethics? These are cultures that have existed for a very long time and they have their own ways of thinking and their own beliefs around these things.”

Nsoesie considers her work ‘interdisciplinary’, as she moves between different fields. She completed her Bachelor of Science in Mathematics at the University of Maryland before pursuing a master’s degree in Statistics at Virginia Tech. There, she worked in a computational lab building large-scale simulations to study the spread of different types of viruses and infectious diseases.

Nsoesie enjoyed this work. After completing her master’s, she wasn’t ready to leave and went on to complete a PhD in Computational Epidemiology through the Genetics, Bioinformatics and Computational Biology program at Virginia Tech. Her PhD dissertation, Sensitivity Analysis and Forecasting in Network Epidemiology Models, completed at the Network Dynamics and Simulations Science Lab at the Virginia Tech BioComplexity Institute, developed methods for forecasting the spread of infectious diseases. Here, Nsoesie faced a challenge. Her studies involved creating simulations that used real-world data, but she found that data owners, including public health departments, were often unwilling to share it.

“I started to look at other data sources available and that led me to data on the internet,” Nsoesie explains. “I started doing work on how we can utilize Google searches because a lot of people were searching for flu-like symptoms. So, that was information we could use to model the spread of influenza in a particular population. Twitter was also another one.”

Eventually, this became a major focus of her work. For the past seven years, Nsoesie has been investigating how big data on the internet can be used to address public health issues. This involves examining how data is being collected and used, who is left out of seemingly perfect modelling systems, and how best to create models that are representative of different groups in society. This year, Nsoesie has also taken a position at the Boston University Center for Antiracist Research, where her work involves thinking about racial disparities and how to address them.

“It is a very data focused position which is perfect,” Nsoesie says. “I get to think a lot about data and data gaps and how we can improve racial data to better understand racial inequities and address them. If we don’t have the data, we can’t address them.”

For students interested in data science, particularly where it intersects with global health, Nsoesie has a few words of advice.

“There is so much opportunity to do good in data science,” Nsoesie says. “As long as we are careful about what we are studying and we are open to being corrected.”