The amount of information generated by modern-day technologies is accumulating at an astonishing pace. Commercial services using satellites, imaging devices, and sensor networks, to name a few, produce vast amounts of data. Social media and its communities on Facebook, Twitter and other sites are not far behind.
With so much information out there, researchers like Professor Alok N. Choudhary are hoping to take a look at all of the accruing data and see what can be discovered from it.
“When you have a huge amount of data, how can you use it to make the world better?” he asked a Boston University crowd on Wednesday.
Choudhary, who teaches in the Electrical Engineering and Computer Science Department at Northwestern University, visited Boston University’s Electrical and Computer Engineering Department last week. He was the first speaker of the Fall 2011 Distinguished Lecture Series, which brings groundbreaking engineers to the university.
He spoke to faculty, students, staff, and other members of Boston’s engineering community about his research team’s work on data mining.
Data mining, a fairly new field of computer science, draws on artificial intelligence, statistics, and database management to discover patterns in immense data sets.
As an example, Choudhary and his research team recently looked at tweets from the 2011 Egyptian revolution, an uprising in which social media played a large role.
One feature already enabled on Twitter is “Trending Topics,” which allows users to see the most popular words or phrases being used throughout the site at any given moment.
In the case of the Egyptian revolution, searching for tweets with the word “Egypt” would pull up a lot of posts about the revolution but might leave out messages that mentioned only “Cairo” or “Mubarak,” then the Egyptian president.
“Given a trending topic, one of the challenges is identifying similar trending topics on Twitter,” Choudhary said. “In data mining, defining these similarities or determining different kinds of relationships – theme, spatial, temporal – is important when looking at patterns.”
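One simple way to picture the "similar topics" problem is to ask which words tend to appear in the same tweets as a seed word. The Python sketch below is only an illustration under that assumption, not the method Choudhary's team used. The function name `related_terms`, the `min_score` cutoff, and the Jaccard similarity measure are all choices made for this example.

```python
from collections import defaultdict


def related_terms(tweets, seed, min_score=0.2):
    """Rank words by how much their tweets overlap with the seed word's tweets.

    Overlap is measured with Jaccard similarity: the number of tweets that
    contain both words divided by the number that contain either one.
    This is an illustrative assumption, not the researchers' algorithm.
    """
    seed = seed.lower()

    # Map each word to the set of tweet indices in which it appears.
    postings = defaultdict(set)
    for index, text in enumerate(tweets):
        for word in set(text.lower().split()):
            postings[word].add(index)

    seed_ids = postings.get(seed, set())
    scores = {}
    for word, ids in postings.items():
        if word == seed:
            continue
        union = seed_ids | ids
        score = len(seed_ids & ids) / len(union) if union else 0.0
        if score >= min_score:
            scores[word] = score

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Once words such as "cairo" or "mubarak" are linked to "egypt" through the tweets that mention both, a search can be widened to catch messages that use only one of them. Real systems would also need to weigh the spatial and temporal relationships Choudhary mentions, which this sketch ignores.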
Another challenge of collecting information from Twitter is identifying what sources are the most significant or persuasive. A user might tweet frequently about a topic like Egypt, but the messages may be spam.
“If someone has lots of tweets but no one following, it’s probably spam,” said Choudhary. “Quantity doesn’t necessarily mean influential.”
On the other hand, Choudhary and his research team found that CNN and Al Jazeera posted messages that were frequently shared during the revolution. Mona Eltahawy, a journalist of Egyptian origin, had only a few hundred followers at the start of January when the uprising began, but her posts became among the most influential and most retweeted.
Only time will tell what other information data mining can uncover, but Choudhary and his team are off to a good start.
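The two observations above, that heavy posting with no followers suggests spam and that influence shows up in how often posts are shared, can be sketched in a few lines of Python. Everything here is a hypothetical illustration: the `Account` fields, the threshold values, and the retweets-per-tweet score are assumptions made for this example and are not drawn from the team's research.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    tweets: int     # number of posts on the topic
    followers: int  # number of followers
    retweets: int   # times those posts were shared by others


def looks_like_spam(account, min_tweets=100, max_followers=5):
    """Flag accounts that post heavily but have almost no followers.

    The thresholds are arbitrary placeholders chosen for illustration.
    """
    return account.tweets >= min_tweets and account.followers <= max_followers


def rank_by_influence(accounts):
    """Order non-spam accounts by average retweets per tweet, highest first."""
    candidates = [
        a for a in accounts if a.tweets > 0 and not looks_like_spam(a)
    ]
    return sorted(candidates, key=lambda a: a.retweets / a.tweets, reverse=True)
```

Scoring by retweets per tweet rather than by volume or follower count lets an account with a small following rank above a much larger one when its posts are shared more often, which matches the pattern described for Eltahawy.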
Choudhary’s talk was the first in the three-part Fall 2011 Distinguished Lecture Series. The next talk features Professor Behnaam Aazhang of Rice University, who will speak on “Context Aware Wireless Networks: A Physical Layer Perspective.” Hear him on October 12, 2011, at 4 p.m. in PHO 211.
-Rachel Harrington (firstname.lastname@example.org)