Making big data work

Big data analytics and data-based decision-making are changing every aspect of business. Here’s how to pick out the information you need—and put it to use.

Last year, when Amazon was scouting locations for a second headquarters, the company sent a 29-page request for information to 20 cities that it had short-listed from more than 100 hopefuls. The cities’ responses were strategic, of course, and they were voluminous. New York’s answer, for example, ran 253 pages, had 140 footnotes, and included thousands of data points. There was the obviously useful data, like the number of computer science degrees awarded by New York schools and colleges, but also some seemingly less relevant stats, like the cost of dry cleaning a shirt in Long Island City ($6.50) and the number of miles of dedicated bike lanes in New York City (24.9). In November 2018, after wringing meaning from answers submitted by the 20 cities, Amazon announced it would build one of its two new headquarters in Long Island City; three months later, it reversed course, pulling out in the face of local opposition. Perhaps better than any other company, Amazon collects data, mines it, and figures out what it all means, but its retreat from New York reminds us that even the best data analysis can fail to consider important human elements.

Still, that pitch document says a lot about the changing ways that important business decisions are made today. From that perspective, what New York City sent to Amazon was a 253-page testament to the power of big data, and it was not coincidental that it was demanded by a company whose growth has been fueled by the skillful mining and analysis of big data, as well as the commercialization of big data services. Amazon’s cloud computing business, Amazon Web Services, originally developed to serve only Amazon, has become the company’s most profitable business unit.

“You can answer all kinds of questions using data analysis,” says Ben Lubin, a clinical associate professor of information systems. As digitization makes it possible to measure more things in more ways, he says, “it is increasingly important to understand the relationship between these data and key business metrics, such as cost, quality, and behavior of employees. Once these relationships are understood, better decisions can be made.”

In response to this new reality, Questrom has hired faculty across six departments who specialize in analyzing big data (they’re experts in computer science, statistics, economics, and operations research) and is preparing to launch a master’s degree program in data analytics. In reporting this article, Everett talked to several Questrom experts in different applications of big data. Virtually every one of them began the discussion with this caveat: big data is not new to business. What is new, they told us, is the increasing ability to analyze and extract value from big data, thanks to developments in data storage, computing power, and the fine-tuning of algorithms.

Marketing departments, for example, no longer know only your address; they can find a picture of your house and see if it needs a new coat of paint. Supply chain managers can tell when their distributors are dawdling at a retail outlet. Accounting departments are using AI to interact with suppliers and to process and route invoices, often with help from optical character recognition technology. HR is mining social media to find and screen the best candidates for open positions. Financial analysts are asking machine learning models to redefine the relationship between risk and reward. And chances are good that the portfolio of stocks in your 401(k) is managed by a computer program.

In addition to replacing many of the gears of legacy enterprises, new tools are opening doors to entirely new kinds of business: location-targeted advertising, which is based on data collected from 200 million mobile devices, was estimated to be worth $21 billion in 2018; an internet-connected thermometer can take a child’s temperature and send information about zip codes where fevers are spiking to advertisers. And this is just the beginning. In May 2018, author Bernard Marr reported in Forbes that 2.5 quintillion bytes of data are created each day, and that 90 percent of all data in the world was generated in the previous two years.

And yet all of the data analytics experts who spoke with Everett also said that while the promise of big data is great, the realization of the promise is still a work in progress.

“Businesses do have a huge quantity of data,” says Garrett Johnson, an assistant professor of marketing. “But if you draw a Venn diagram between big data and good data, the intersect can still be pretty small. Successful data-driven businesses grow that intersection by building a culture of experimentation. These businesses create the data that lights the way forward.”

In the sections below, Questrom experts describe the ways that big data and big data analytics are changing business.

1. Supply Chain: Efficient Deliveries

Ask Arzum Akkas, an assistant professor of operations and technology management, which part of the supply chain is being changed by big data, and the answer is basically “all of it.” From the mines where ores are sourced to the shelves of retailers, Akkas says, data is helping to move things faster and more efficiently. It’s making it easier to deliver the right amount of product to each retail outlet at just the right time, which is what supply chain management is all about.

In 2018, Amazon had revenue of $232.89bn and shipped 1,111 packages every minute.
Sources: Nasdaq and Domo.

“One of the fundamental challenges of supply chain management is matching supply and demand,” says Akkas. “Today, you can get a better idea of demand because you can leverage social media information, in particular for new products that lack sales history. If you are doing e-commerce, you can use clickstream data to manage demand. All this can help you keep the right amount at the right warehouse.”

In the past, says Akkas, when a company like Pepsi wanted to map the most efficient routes for distribution, it would give delivery drivers handheld devices that they mainly used for collecting orders and printing invoices. Today, most companies have access to a constant influx of GPS data, which can be combined with streams of data about everything from traffic congestion to road construction to weather in order to determine the optimal route on any given day. Akkas suggests that data could also be used, for example, to help a beverage distributor estimate loading and unloading times of newer beverages whose package sizes vary from conventional bottles and cans.

Akkas’ own research has used supply chain data from a packaged goods company to determine the factors that impact waste of perishable products at retail stores. She also recently used an analysis of transaction and GPS data to help a food and beverage distributor diagnose problems in its sales and delivery operations.

As well as minimizing the waste of perishable products and finding optimal delivery times for different retail outlets, Akkas says, big data can even be used to define reasonable expectations for employee performance.

“You have all these salespeople on the road,” she says. “You want to know which one is doing a good job and which one is not, but it can be hard to compare them because all of the routes are different and traffic is different and the stores are different. But if you can leverage historical delivery data, you can establish what is a normal performance considering heterogeneity in locations, stores, and products. That’s important, because if you expect a higher performance and don’t get it, someone may be unfairly treated, which can impact retention. You have to establish goals that are realistic.”
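
The idea of judging each delivery against what is normal for its own route can be sketched in a few lines of Python. This is only an illustration and not the method used in Akkas’ research: the route names, the delivery times, and the choice of a simple z-score are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def route_baselines(history):
    """Group historical (route, minutes) records and return
    {route: (mean_minutes, std_minutes)} for each route."""
    by_route = defaultdict(list)
    for route, minutes in history:
        by_route[route].append(minutes)
    return {r: (mean(v), pstdev(v)) for r, v in by_route.items()}


def adjusted_score(route, minutes, baselines):
    """Return how many standard deviations a delivery time sits above (+)
    or below (-) the historical norm for that specific route."""
    mu, sigma = baselines[route]
    if sigma == 0:
        return 0.0  # no variation on record, so nothing to compare against
    return (minutes - mu) / sigma


# Hypothetical history: urban stops are slower than rural ones.
history = [
    ("urban", 50), ("urban", 60), ("urban", 70),
    ("rural", 20), ("rural", 30), ("rural", 40),
]
baselines = route_baselines(history)

# The same 60-minute delivery is ordinary on the urban route
# but far slower than usual on the rural one.
print(adjusted_score("urban", 60, baselines))  # 0.0
print(adjusted_score("rural", 60, baselines))
```

A real analysis would likely adjust for store and product mix as well, for example with a regression, but the principle is the same: compare like with like before setting goals.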

2. Marketing: Better Targeting

Georgios Zervas, an assistant professor of marketing, is mining data from user-generated reviews on several websites. Zervas’ research has revealed that Airbnb guests tend to be kinder and gentler in their reviews than guests on TripAdvisor and that 16 percent of Yelp reviews are fake.

“Marketing is primarily concerned with consumer behavior,” says Zervas. “And these are exactly the kind of data sets that marketing people are collecting. They’re looking at everything from web browsing behavior to your location, to your shopping activities, to the ads that you click on.” All of the above, he says, makes targeting a consumer today much easier than it was just a few years ago.

“Suppose you’re in the market for a new car,” he says. “You might go to a website for Edmunds or Kelley Blue Book. Then you go to another website and suddenly you see a very well-targeted ad. That’s how it works.”

Zervas says today’s marketers benefit not just from the abundance of data, but also from the computing power that can make sense of it. “Machine learning,” he says, “can incorporate unstructured data, like open-ended text, photos, and even videos. Today’s marketers have your age, your income, and a picture of your house. All of these things can be used to make predictions about your shopping behavior.”

Several other Questrom professors have also studied the marketing efforts of online platforms. Andrey Fradkin, an assistant professor of marketing, worked with Airbnb to devise ways to make the marketplace more efficient. Airbnb wondered if the methodology of its rating system, in which guests and hosts can see how they were rated before they rate the other party, affected the ratings. To find out, it switched things up, blocking the participants from seeing the others’ reviews. The number of reviews increased, says Fradkin, and the average rating decreased. “But,” he says, “the effect wasn’t very large. It wasn’t the primary determinant in what people rated each other.”

Johnson has helped Yahoo and Google assess the effectiveness of ads, including what are called retargeted ads, which are aimed at people who have already shown an interest in a particular product—the ads that crop up when you visit other sites.

“The big question is: Does this stuff actually work?” says Johnson. “A naïve marketer may believe the ads are really effective because many people who see retargeted ads go on to purchase the products.

“However, marketers need to remember that many of these people would have bought regardless. Plus, marketers run the risk of annoying people, because these ads are privacy intrusive.”

Johnson’s work found that the potential annoyance was worth the boost in sales, at least for one outdoor-goods retailer, where the ads lifted sales roughly 10 percent.
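
The gap between the naïve reading and the real effect is easy to see with a little arithmetic. The sketch below assumes a randomized holdout group that is eligible for the ads but never shown them. The article does not describe how Johnson’s studies were actually run, and every number here is made up for illustration.

```python
def conversion_rate(buyers, users):
    """Share of users in a group who went on to buy."""
    if users <= 0:
        raise ValueError("group must contain at least one user")
    return buyers / users


def incremental_lift(exposed_buyers, exposed_users,
                     holdout_buyers, holdout_users):
    """Relative increase in purchase rate caused by the ads, measured
    against a randomly chosen holdout group that saw no ads."""
    exposed = conversion_rate(exposed_buyers, exposed_users)
    baseline = conversion_rate(holdout_buyers, holdout_users)
    if baseline == 0:
        raise ValueError("holdout group has no purchases to compare against")
    return (exposed - baseline) / baseline


# Hypothetical campaign: 10,000 users see retargeted ads, 10,000 do not.
naive = conversion_rate(550, 10_000)            # 5.5% of ad viewers bought
lift = incremental_lift(550, 10_000, 500, 10_000)

print(f"Naive view: {naive:.1%} of people who saw the ads purchased")
print(f"Incremental lift over the holdout group: {lift:.1%}")
```

In this invented example, most of the buyers who saw an ad would have purchased anyway; only the difference from the holdout group can be credited to the campaign.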

3. Accounting: Automated Auditing

Peter Wysocki, a professor of accounting, says big data analytics is bringing to accounting the same kind of advantages it has brought to other fields, although perhaps at a more measured pace. The more cautious approach is, he says, in part explained by accounting professionals’ preference “for more transparent and interpretable analytics tools [versus black box tech that hides how it works] given the legal exposures faced by both auditors and their corporate clients if there are technological missteps.”

Every minute, more than 120 users join LinkedIn, which has more than 500 million members. Sources: Domo and LinkedIn.

Despite such reservations, he says, change is on the way. Deloitte already uses an application that analyzes volumes of financial information from SEC filings to identify and visualize potential accounting, fraud, and failure risks for all public companies. And blockchain, the decentralized cryptographic ledger, is believed to hold great promise in making auditing, compliance, and reconciliations faster and more accurate. Jean Baptiste Su, a vice president and principal analyst at consulting firm Atherton Research, predicts that AI technologies will automate tax, payroll, audits, and banking in most firms by 2020.

“Accounting professionals are starting to embrace big data,” says Wysocki. “The adoption will certainly be accelerated in cases where the applications use transparent and interpretable methods. In these cases, accounting professionals and their customers and clients can easily understand the intuition of which data is being used, how the technology works, and why it helps with decision-making.”

4. HR: Data-Driven Human Touch

Fred K. Foulkes, professor of management & organizations and director of Questrom’s Human Resources Policy Institute, says that in his field, big data analytics and AI have been transformative. “It’s a sea change,” says Foulkes. “Today, HR leaders no longer have to say things like ‘I think we could make improvements.’ They now have the data to find the answers.”

As an HR expert, Foulkes knows what kinds of people companies want to hire, and today, he says, they want data scientists, and not just to fill positions in marketing. They want them for HR itself.

Google, unsurprisingly, was among the first to realize that potential. The company has been regarded as a leader in automated HR since 2008, when it launched Project Oxygen, a multiyear analysis of the value of managers in a company that liked to think of itself as organizationally flat. To determine the merits of its managers, Google used a data-driven methodology similar to the one it used to deliberately hire ambitious self-starters and original thinkers. One result of the project was the identification of eight key behaviors of the most effective managers, such as being a good mentor and being results-oriented.

And when Google leaders wanted to know why it took so long to hire people, an analysis of data showed that the company’s practice of having candidates interview with 10 to 12 Google employees was inefficient overkill; seven interviews became the new limit.

Two years ago, three former Google employees used their data expertise to spin out their own HR company, Humu, which uses AI to keep employees happy, in part by “nudging” managers toward specific actions, such as meeting with employees to discuss advancement prospects.

Foulkes reports that companies that hire more aggressively are mining data from job sites like Indeed and Hired, and approaching people (so-called passive candidates) who have a desired skill set, even if those candidates appear to be happily employed elsewhere.

“HR is now more like finance or marketing,” says Foulkes. “They have data that they can use, and that makes them better business partners.”

5. Product Development: Why Didn’t We Think of That?

While new technologies like 3-D printing are changing the way we make things, other processes, like predictive analytics, are telling us what to make. Legacy giants like Procter & Gamble are using predictive analytics to gauge the likelihood of a new product’s success, and customer comments have long been telling Netflix what kind of content it should create. Web-based feedback, whether for hotel rooms or toasters, is widely used to inform the next iteration of a product or business.

That product development strategy made headlines in December 2016, when Airbnb CEO Brian Chesky used Twitter to ask, “If Airbnb could launch anything in 2017, what would it be?” The tweet attracted more than 1,000 suggestions, including the provision of toiletries and a tool to help guests meet hosts with similar interests. It inspired Twitter CEO Jack Dorsey to issue a similar tweet one week later.

Today’s abundance of data and the ability to combine it in new ways has plowed a fertile field, and creative entrepreneurs have rushed to plant exotic business hybrids. Alexa, Nest, Zillow, Uber, and dozens of other ventures that are essential tools of many lives would not be possible without the power of big data analytics.

6. Finance: Calculating Risk

Marcel Rindisbacher, an associate professor of finance and senior associate dean for faculty and research, is quick to remind us that finance professionals have always kept careful and extensive records. But, he says, machine learning and AI are helping them find new meaning in old numbers. On many days, says Rindisbacher, as much as 70 percent of Wall Street trades are determined by computers. More powerful computers have also enabled high-frequency trading, where the difference of a few milliseconds can make or lose millions of dollars.

When it comes to credit scoring, Rindisbacher says, many financial institutions have realized that they are sitting on very valuable information. “Analytic tools are now much more promising in determining who is credit worthy and who is not,” he says. “A credit score can be based on real behavior of consumers.”

The credit bureau Experian, for example, recently invited consumers with unsatisfactory credit scores to submit their history of bill paying for such things as utilities, cable, and mobile phones, information that could potentially lift a person’s credit rating.

“Really, all of the models that the financial industry used for this were very wrong,” says Rindisbacher. “Initially, the prepayment schedules that everyone used were very simplistic. Now there is a lot of data, and people use machine learning and AI to understand what the real prepayment behavior is. That improves the evaluation and management of risk.”

There’s also a technological revolution in trading and portfolio advice, one that can even shape your 401(k).

“The big change is that we can now let the data speak,” says Gustavo Schwenkler, an assistant professor of finance. “We let the data guide us to what kind of models we should be thinking about.”

Schwenkler points to LendingTree, an online broker of loans that factors in information from many sources, including social media. In a 2015 study published in Marketing Science, Chris Dellarocas, Shipley Professor of Management, investigated the increasing number of firms using network-based data to make lending decisions: if you’re connected online with financially responsible friends and family, for instance, you might have a lower risk of default, too. Schwenkler also sees a bright future for robo-advisers like Betterment and Wealthfront, whose business models (with algorithms rather than people deciding how to invest your money) have been borrowed by big players like Fidelity and Morgan Stanley.

“One of the big problems in finance has always been understanding the relationship between risk and reward,” says Schwenkler. “With data analytics, we’ve been able to disentangle those risks.”

7. Market Design: Letting the Algorithms Decide

Hedge funds and other institutional traders use massive amounts of data to drive their proprietary trading strategies, but data can also be used to help design the market itself. According to Lubin, data has been used to improve the efficiency of markets focused on profit as well as markets focused on social impact, such as those matching organ donors to recipients or students to schools. In a recent project, Lubin says he and his colleagues “designed a market mechanism that lets participants use the power of machine learning to massively reduce the difficulty of bidding in complex markets.” He gives the example of the FCC’s multibillion-dollar auctions of radio spectrum licenses, “in which cellular companies bid against each other in order to obtain packages of licenses that cover their regional or national businesses.”

“It’s a very hard problem,” adds Lubin. “These markets have a complex structure: bidders’ value for licenses depends on both geography and frequency. We can use machine learning to design mechanisms that help bidders navigate this complexity, and this in turn will help improve the efficiency and/or revenue of these types of auctions.”