Learning Not to Share
On social networks, how much information is too much? Evimaria Terzi says it’s less than you think | From Explorations | By Rich Barlow
The type size in the tag cloud indicates the sensitivity of information as perceived by social network users. Illustration by Adam McCauley
Many of us feel we couldn’t live without Facebook, Twitter, and other online social networks. But all this tweeting, friending, posting, and messaging carries privacy risks, ranging from fleeting embarrassment to all-out identity theft. Evimaria Terzi is worried about that.
“People participate in online social networks because they want to share,” says Terzi, a College of Arts & Sciences assistant professor of computer science, who has developed a mathematically derived “score” that can help users control their privacy. “People who might be introverts in their real life suddenly have these online personas and become extroverts. You want to appear cool to your online friends. And you can be cool by revealing something, like your photos. You want to show off.”
The problem, she says, is that information that most people consider perfectly safe for sharing can, in mathematically skilled hands, be puzzled together to reveal things that few people want others to know.
In a recent experiment, researchers at Carnegie Mellon University were able to deduce the Social Security numbers of five million Americans born between 1989 and 2003, mining information that is typically shared on social networks and other data from publicly available sources.
Sifting such data through complex statistical correlations, the Carnegie Mellon researchers hit pay dirt for almost a tenth of Americans born in the target years. Meanwhile, MIT researchers studying 4,000 Facebook student profiles correctly determined, in most cases, whether the profile was that of a gay man, even though the users had not disclosed their sexual preference.
A recent study in Consumer Reports found that 52 percent of social network users disclose information that could leave them vulnerable to cybercriminals. Information the magazine considers dangerous to post includes a full birth date, which can help identity thieves gain access to bank accounts, credit card accounts, and other information; vacation dates and other absences (3 percent of Facebook users reportedly advertise when their homes will be unoccupied); and a child’s name with photos or captions.
Terzi believes social network users would think more carefully about what they publish if they knew how the depth and breadth of their shared information compared to information shared by others.
Working with former IBM colleague Kun Liu, she co-invented a computable privacy score, a metric that could be used to compare a person’s information exposure with that of other users on a network or with the person’s exposures on different networks.
“We want to increase people’s awareness,” says Terzi in the rolling accent of her native Greece. “We say, ‘This is your score and this is how it compares with other people’s. You have been a little more extroverted than other people, or a little more introverted.’”
Terzi says a person’s score is based on both the sensitivity of the information and its visibility on a network. To define that sensitivity, the two researchers surveyed 200 network users worldwide, mostly computer scientists, who were given an array of profile information (name, gender, birthday, and political views, among others) and asked how willing they were to share each tidbit. Respondents could assign a sensitivity grade from 0 (wouldn’t share the info-bit with anyone) to 4 (would share it with everyone). In between were options to share with “some friends,” “all friends,” or “friends of friends.”
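To make the idea concrete, here is a minimal sketch of a score that combines sensitivity and visibility. It is not the researchers' formula, which the article does not give. The sample answers, the `sensitivity` measure, and the `privacy_score` function are all assumptions made for illustration; only the 0-to-4 sharing scale comes from the survey described above.

```python
from statistics import mean

# Hypothetical survey answers (invented for illustration). Each grade runs from
# 0 (share with no one) to 4 (share with everyone), as in the survey.
SURVEY = {
    "gender": [4, 4, 3, 4, 2],
    "birthday": [2, 3, 1, 2, 2],
    "home_address": [0, 1, 0, 0, 1],
}
MAX_GRADE = 4


def sensitivity(grades):
    """Assumed measure: 1.0 if nobody would share the item, 0.0 if everyone would."""
    return 1 - mean(grades) / MAX_GRADE


def privacy_score(visibility, survey=SURVEY):
    """Sum of sensitivity times visibility level over the items in a profile.

    `visibility` maps each item to the level (0 to 4) at which the user
    actually exposes it. A higher score means more exposure.
    """
    return sum(sensitivity(survey[item]) * level
               for item, level in visibility.items())


cautious = {"gender": 4, "birthday": 1, "home_address": 0}
open_book = {"gender": 4, "birthday": 4, "home_address": 4}

print(privacy_score(cautious) < privacy_score(open_book))  # prints True
```

In this toy version, revealing an item that few respondents would share raises the score far more than revealing one that nearly everyone shares, which is the intuition behind weighting visibility by sensitivity.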
Information that most people consider perfectly safe for sharing can, in mathematically skilled hands, be puzzled together to reveal things that few people want others to know, says Evimaria Terzi. Photograph by Vernon Doucette
Acknowledging that her survey sample was small, Terzi believes it did yield some information that seems broadly reflective. She used the respondents’ answers about sensitive and nonsensitive data to draw a “tag cloud,” in which sensitivity of information is illustrated by the font size of the words describing the information, with larger letters indicating greater sensitivity. The information that appears dead-center, dwarfing all words around it, is “mother’s maiden name.” Three-quarters of respondents said they would not share that information with anyone. Aware that many people use their mother’s maiden name as a password for secure Web sites, Terzi was not surprised.
“Gender,” on the other hand, is considered such innocuous information — in many cases it can be figured out from a name — that it’s all but invisible in the cloud; 57 percent of respondents said they’d tell everyone their sex; only 4 percent said they’d tell no one.
Other information scoring high on the sensitivity scale included street address (40 percent wouldn’t disclose that to anyone; just 3 percent would tell everyone), home telephone number (39 percent and 3 percent), and work phone number (36 percent and 8 percent).
Fewer people were bashful about their political and religious views, with 24 and 30 percent, respectively, refusing to share those with anyone, while 18 and 17 percent would share with everyone.
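The tag cloud described above can be approximated with a simple mapping from survey results to type size. The sketch below uses the share of respondents who would tell no one, as reported in this article, and scales it linearly to a point size. The linear scaling, the `font_sizes` function, and the 8-to-48-point range are assumptions for illustration, not a description of how the published illustration was produced.

```python
# Percent of respondents who would share the item with no one
# (figures as reported in the article).
NO_ONE_PERCENT = {
    "mother's maiden name": 75,
    "street address": 40,
    "home telephone number": 39,
    "work phone number": 36,
    "religious views": 30,
    "political views": 24,
    "gender": 4,
}


def font_sizes(percentages, min_pt=8, max_pt=48):
    """Scale each percentage linearly into a font size between min_pt and max_pt."""
    lo = min(percentages.values())
    hi = max(percentages.values())
    span = (hi - lo) or 1  # avoid dividing by zero if all values are equal
    return {
        item: round(min_pt + (pct - lo) / span * (max_pt - min_pt))
        for item, pct in percentages.items()
    }


sizes = font_sizes(NO_ONE_PERCENT)
# Print the items from largest type to smallest.
for item, size in sorted(sizes.items(), key=lambda pair: -pair[1]):
    print(f"{size:>2} pt  {item}")
```

With these numbers, “mother’s maiden name” gets the largest type and “gender” the smallest, matching the description of the cloud.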
For what would seem to most readers to be unnecessarily complicated mathematical reasons, the theoretical scale of Terzi’s privacy scores is unlimited, ranging from minus-infinity to infinity, with higher scores indicating a greater comfort in sharing information. In practice, the survey yielded an average score for North American networkers of 1,074. Europeans came in at 1,047, while Asians, at 764, and Australians, at 780, were more privacy-conscious. Terzi’s own score, based on information she shares on Facebook, is a comparatively inhibited 450. That’s because, for a young (thirty-one) computer scientist, she’s refreshingly restrained about using computer networks.
“I don’t share my photos,” she says. “Or my status — ‘I’m doing such-and-such right now.’ I am friends with some of my students, and they don’t need to know, for example, that I am currently discussing this with you.” Her relationships? None of your online business. Nor will you find information about any groups she’s in, because she tends to avoid joining any.
“They flood you with e-mails,” she says.
If she needs to contact someone, Terzi says, “I send them a personal e-mail. I don’t communicate with people on Facebook; for example, I don’t write on people’s walls. I look into my account and see what people are putting there, and I never imagined they would write so many things. I mean, they write every two hours, three hours. Every Monday, some people will write, ‘Have a good week.’”
Terzi suspects that the compulsive sharing she sees reflects the Internet’s narcotic power to induce us to undress, metaphorically, online.
A far wiser tack, she says, is to think of being online as being at a cocktail party. There are things you wouldn’t shout to a friend across the room, knowing that total strangers would hear. While sitting alone and staring at a computer screen, we don’t feel like we are surrounded by eavesdroppers. But we may well be.
“The privacy score is a useful contribution,” says Alessandro Acquisti, an associate professor of information technology and public policy at Carnegie Mellon and part of the team that cracked the Social Security numbers. “But studies show that providing salient privacy information can’t solve all problems. Sometimes even well-informed users make privacy decisions against their own best interests. So I see privacy scores as one part of a concerted and collective effort.”
Some observers are convinced that the collective effort may soon include government regulators. Last year, when Facebook, which claims 400 million users, loosened the default privacy settings for users, the move sparked a protest petition from the American Civil Liberties Union, as well as a federal investigation. Four U.S. senators wrote to Facebook founder Mark Zuckerberg, urging him to simplify privacy controls for the network’s members. Zuckerberg announced in May that he had done so.
“The Federal Trade Commission has expressed concerns about the dangers of putting too much burden on consumers to police all of the collection and uses of their information,” says Christopher Olsen, assistant director of the FTC’s privacy and identity protection division. Networks and regulators must play a more vigilant role, he argues, which is why the FTC is pondering follow-up steps to a recent series of public roundtables about online privacy.
Like Acquisti, Olsen thinks tools to help consumers understand the possible implications of sharing information online would be useful. But, he says, they would have to be simplified so users of many ages and backgrounds could understand their meaning.
Toward that end, Terzi and her students are working on an application designed to let consumers easily calculate their own scores and compare them to the scores of their friends. Users could share the application with friends on their networks, who would pass it on to other friends. Last year, she and Kun Liu discussed their work both at an international data-mining conference and at a Google-sponsored talk.
Terzi doesn’t think her privacy scoring system would have every networker yanking down the cyber-drapes and tightening their privacy settings. People like sharing too much for that, as Facebook has argued in defense of its relaxation of privacy guarantees. The Facebook problem, says Terzi, is that rather than undertake the tedious task of customizing their privacy settings, most Facebook users accept the company’s default settings.
“Once you make the decision to join a social network, you take some risks,” she says. “If you don’t want to take them, don’t participate. The most worrisome thing is that people are not aware of the risks that they are taking.”
That, says Terzi, and the fact that most people would rather be cool than safe.