Using Data Science to Address the Gender and Racial Wage Gap
Boston-area women, on average, were paid 70 cents for every dollar earned by men in 2021
The overall gender wage gap hasn’t changed in the Boston area over the past two years, according to a recent report from the Boston Women’s Workforce Council (BWWC). But identifying a problem is often the first step toward finding a solution—and Boston University experts say that data science is increasingly becoming the tool that can provide the quantitative rigor and objectivity needed to enable effective change.
Software engineers and statisticians at BU collaborated with the BWWC, a public-private social impact initiative housed at the University’s Rafik B. Hariri Institute for Computing and Computational Science & Engineering, to securely collect and scrutinize payroll data from Greater Boston employers. The employers submitted the data anonymously as part of a pledge they made with the Boston Mayor’s Office and the BWWC to help close gender and racial wage gaps. This year, the collected data represented 156,000 employees, covering 14 percent of the Greater Boston area workforce. (BU signed the pledge, and as a signer, contributed its salary data.) The BWWC published the results of its wage gap analysis online on December 9.
About five years ago, the Hariri Institute’s Software & Application Innovation Lab (SAIL) developed a secure data collection and analysis platform that the BWWC uses. The software uses a cryptographic technique called multi-party computation, or MPC, to collect and analyze private wage information. “MPC is the idea of multiple parties pooling their data to give you an aggregate result without any one individual’s data being exposed,” says Arezoo Sadeghi, a software engineer at SAIL. “MPC lets you work with sensitive data and you’ll never be able to identify people.”
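To make the idea concrete, here is a minimal sketch of one common MPC building block, additive secret sharing. The article does not say which protocol the SAIL platform actually uses, so this is an illustration only: the modulus, the function names, and the payroll figures are all made up. Each employer splits its number into random-looking pieces, each server sees only one piece per employer, and only the combined total is ever reconstructed.

```python
import secrets

# Hypothetical modulus: all arithmetic is done modulo this value (an assumption).
MODULUS = 2**61 - 1


def split_into_shares(value, n_shares):
    """Split an integer into n_shares pieces that sum to value modulo MODULUS."""
    # The first n_shares - 1 pieces are uniformly random.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_shares - 1)]
    # The last piece is chosen so that all pieces add back up to the value.
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def aggregate(all_shares):
    """Sum the shares held by each server, then combine the server totals."""
    n_servers = len(all_shares[0])
    server_totals = [
        sum(shares[i] for shares in all_shares) % MODULUS
        for i in range(n_servers)
    ]
    return sum(server_totals) % MODULUS


# Made-up payroll totals for three hypothetical employers.
payroll_totals = [1_200_000, 850_000, 2_430_000]

# Each employer splits its total into three shares, one per server.
all_shares = [split_into_shares(total, 3) for total in payroll_totals]

combined = aggregate(all_shares)
print(combined)  # 4480000, the aggregate, with no single total revealed to any server
```

No server can recover an individual employer's figure from the share it holds, because each share on its own is a uniformly random number; only the sum across all servers carries meaning.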
After companies submit their information, statisticians process the data to support the BWWC’s analysis. This year, Anna Cook, a student in the College of Arts & Sciences MS in Statistical Practice (MSSP) program, and Masanao Yajima, director of MSSP Consulting and a CAS associate professor of the practice for MSSP, led these efforts. Cook, under Yajima’s supervision, crunched numbers to calculate the average gender and racial wage gaps and created visualizations to help the BWWC make sense of the data. The work on this report brought together the data science skills Cook has gained in MSSP and her interest in social justice.
Cook and Yajima found that Boston-area women, on average, were paid 70 cents for every dollar earned by men in 2021. Although the average gender wage gap hasn’t shrunk over the past two years in Boston, the racial wage gap was slightly smaller in 2021 than in 2019, according to the BWWC. When compared to 2019 figures, Asian and American Indian/Alaskan Native women in 2021 were paid two and three cents more, respectively, for every dollar earned by men.
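A "cents on the dollar" figure of this kind is conventionally a ratio of average earnings. The sketch below shows that arithmetic with made-up salaries. The BWWC's actual equations are not given here, so the function name and the numbers are assumptions for illustration only.

```python
def cents_on_the_dollar(group_pay, reference_pay):
    """Average pay of a group, expressed as cents per dollar of the reference group's average."""
    group_avg = sum(group_pay) / len(group_pay)
    reference_avg = sum(reference_pay) / len(reference_pay)
    return round(100 * group_avg / reference_avg)


# Made-up annual salaries, chosen so the averages are 70,000 and 100,000.
women = [56_000, 70_000, 84_000]
men = [80_000, 100_000, 120_000]

cents = cents_on_the_dollar(women, men)
gap = 100 - cents

print(f"{cents} cents on the dollar, a gap of {gap} cents")
# 70 cents on the dollar, a gap of 30 cents
```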
“I think of Boston as being a city that is highly educated and has a ton of job opportunities,” Cook says. “It was interesting to see that some of the gaps that we saw were comparable to what we would see on a national scale.”
An important caveat, however, is that the numbers are averages across all companies that submitted data and do not reflect changes over time within any individual company. Researchers at the Hariri Institute hope to refine methods at the interface of privacy and statistics so that these company-level changes can be better understood while confidentiality is preserved.
Benchmarking can help drive progress toward equal pay, even without historical wage information from individual companies. Both Mastercard and Starbucks, for example, have reported their gender wage gaps in the past few years and committed to transparency and a push for equity in the workforce. But without a standardized method of doing the math, such figures are hard to compare, and the claims feel insignificant to some.
In Boston, the BWWC gives employers both the equations for internal use and a standardized benchmark for comparison. “Working in a way that is data-centered is good. If we have data and analysis to back up claims, they are more likely to be believed,” says Vidya Akavoor, a software engineer at SAIL.
The computational tools used to create the BWWC report could be applied in a variety of other contexts where equity needs to be measured without compromising trust or privacy. For example, analyses of college admissions data, financial credit ratings, environmental justice practices, and even healthcare operations could benefit from data-centric thinking, but they often involve sensitive or proprietary data.
“In the modern era where everything is so data-driven, one of the key elements in addressing such grand challenges should be data, numbers, and quantifying things,” says Eric Kolaczyk, director of the Hariri Institute and a CAS professor of mathematics and statistics. “If you can quantify a problem, then you have some hope for sitting down together objectively to discuss policies and procedures to achieve a goal. The success of the BWWC’s work demonstrates that this can be done in a way that respects yet overcomes barriers arising around trust, governance, and the like.”