Q&A with AP Bestavros: Part I

The burgeoning field of computing and data science is shaping the future. Today, data and computation are present in our daily lives, be it in public discourse, economics, research, or government. Students have demanded to be involved in advancing computational and data-driven research in a wide variety of disciplines - and we’ve responded.

This fall, Boston University will offer both a PhD in Computing & Data Sciences and a BS in Data Science. Let’s sit down with Azer Bestavros, Associate Provost for the Faculty of Computing & Data Sciences (CDS), to learn more about why BU is investing in the field of the future and how this new academic unit is advancing data-driven, evidence-based advocacy.

*Note that this is Part 1 of a series of interviews with Associate Provost Bestavros. Part 2 is available separately.

Q: If you had to describe the mission of CDS in a short sentence, what would that be and why?

Azer Bestavros: Simply put, the mission of CDS is to democratize access to computing and data science.

I say that while intentionally leaving out who benefits from democratized access. For an academic unit, this includes students, faculty, industry, and the public sector. For every one of these constituencies, there is a widening gap between those with almost unfettered access (the "insiders") and those with cursory, if any, access (the "outsiders").

I see the mission of CDS as narrowing the gap between the "insiders" and "outsiders." For students, that means developing educational pathways that focus on students who may not have considered STEM for their education. For faculty, that means focusing on interdisciplinary research that allows a broader set of disciplines to leverage computing and data science. For industry and other external organizations, that means outreach that goes beyond “big tech” to provide access to data science research and a talent pipeline.

Q: Can you elaborate on why it is important to go beyond big tech?

When it comes to leveraging data science capacities, there is a widening gap between the haves and have-nots. On the one hand, big tech and big political action committees (PACs) are leveraging data science to the nth degree by monopolizing data collection, recruitment of computing and data scientists, and development of new information products. On the other hand, government, non-profit, and advocacy organizations are falling behind. I believe that academia has an important role to play in narrowing that gap. If academia does not step in to do this, who will?

Q: What do you see as the role of CDS in supporting public and civic organizations? What is the role of CDS in advancing data-driven, evidence-based advocacy and policymaking?

As it relates to “data-driven, evidence-based advocacy and policymaking,” what we need to do is to *empower* the public with the data and tools they need to build whatever evidence they need to push for policy reform and change. Today, data and data science are the purview of big tech, Wall Street, and lobbyists. Our job is to level that playing field, and the work that we are doing in conjunction with co-Labs is all about that. This is crucial in order to change the narrative.

Q: There seem to be conflicting views on all sides of the political spectrum about the veracity of data used in support of (or in opposition to) a policy or a cause. What is the role of data science in that respect? Can data science introduce objectivity into the process of building evidence from data?

I always say: “Data speaks louder than words.” People may quibble about what the data says, but at the end of the day, if the data analysis is bulletproof, then it is hard for them to argue.

A good example is the gender/racial pay equity project we did with the Boston Women's Workforce Council (BWWC) and the City of Boston. By relying on the backend HR data of *all* employees in the 150 participating companies and not just a sample of the data, we were able to quash the concerns of those who had questioned prior reports of pay inequities.

Of course, to use raw data, we had to deploy fairly deep mathematics and engineering to protect privacy and confidentiality. But, by using raw data, nobody could question the answers we came up with: 70 cents on the dollar for women and 50 cents on the dollar for Black women (compared to white males) for some types of professions. And now, for the fifth year in a row, BWWC is tracking these numbers to see if changes are happening.

Q: The example you gave above speaks to data as evidence but does not speak as much about how computing could be helpful in evaluation of policies or in policymaking. Can you give an example?

Well, historical data can be used not only to obtain evidence in support of or in opposition to a policy that is already in place, but also to build models that answer hypothetical questions about alternative future policies to enact. This is the realm of predictive modeling: for example, using statistical and machine learning approaches to build models that would predict changes in outcomes if different policies are adopted. Other examples where computation can be extremely valuable in predicting outcomes with high levels of certainty include the use of simulation, including simulation of socioeconomic processes, simulation of how people from various demographics react to misinformation and disinformation, and even simulation of racist behaviors! The entire field of economics was transformed with the advent of behavioral economics (which is all about simulating how people react to economic triggers, often irrationally). I can give at least a half-dozen other examples of how computation is integral to evidence-based policymaking in general (both the analysis and predictive dimensions, which are really two sides of the same coin).

Q: This is fascinating. Can you elaborate on simulation of socioeconomic processes? What is that about? Why is this important or even relevant to policymaking?

This is crucially important because one needs to “model” how people react to changes in (say) policy. How do people choose between different alternatives? For example, how many people will agree to take a COVID vaccine? And how is that different for different groups (e.g., by gender, race, income level, political affiliation, religious beliefs, or socioeconomic status)? A good example of computational social science and humanities research with significant impact in that space is the Center for Mind and Culture (CMAC), which is led by Wesley Wildman, a BU computational theology professor and a member of CDS. See https://mindandculture.org/projects/modeling-social-systems/ for more!