Analyzing Global Media

Understanding how public communication flows around the world is vital to developing global policies and engaging in effective diplomacy on some of today’s most significant issues. Unfortunately, current methods of studying public communication worldwide are limited by language and cultural gaps in both visual and text-based information. To complicate things further, public communication generates data at high velocity, in large volumes, and with a substantial variety of perspectives, languages, and platforms.

To address these challenges, the National Science Foundation (NSF) awarded a $1M grant to an interdisciplinary team here at BU: ECE Professor Prakash Ishwar, CS Professors Margrit Betke and Derry Wijaya, and Communications Professor Lei Guo. Together, they will develop the methods and tools needed for collecting, annotating, and analyzing multilingual, multiplatform, and multimodal text and images originating in the U.S. and reported worldwide. This Artificial Intelligence & Emerging Media (AIEM) research group will tackle the challenge by combining its expertise in Machine Learning (Ishwar), Computer Vision (Betke), Natural Language Processing (NLP) (Wijaya), and Emerging Media Studies (Guo).

“Machine Learning provides mathematically-principled frameworks and algorithms to make decisions about, and to learn from, human-labeled data,” says Professor Ishwar about his contribution to this project. “For example, this can be used in deciphering the sentiment of a tweet, predicting how many resources should be allocated to most effectively evaluate an article, or identifying topics of interest in any post.”
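As a rough illustration of the kind of learning from human-labeled data that Professor Ishwar describes, the sketch below fits a generic sentiment classifier to a handful of hand-labeled example posts using scikit-learn. The example texts, labels, and model choice are hypothetical placeholders for this article, not the AIEM team’s actual data or methods.

```python
# Minimal illustrative sketch: learning sentiment from a few human-labeled posts.
# The texts, labels, and model below are hypothetical examples, not the
# AIEM project's pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A few human-labeled posts (1 = positive sentiment, 0 = negative).
posts = [
    "The new trade agreement is great news for local farmers",
    "Heartbroken by yet another shooting in our city",
    "Proud of how our community welcomed new arrivals today",
    "This policy is a disaster for working families",
]
labels = [1, 0, 1, 0]

# Bag-of-words features plus a linear classifier: a standard baseline
# for making decisions about labeled text.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(posts, labels)

# Predict the sentiment of an unseen post.
print(model.predict(["Grateful for the support our town received"]))
```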

The project ultimately aims to produce analytical tools that let social scientists effectively and accurately examine the international flow of public communication, particularly concerning the media and press. With these tools, analysis of news and social media chatter in local languages could vastly improve the transparency of public media sources. The hope is to eventually automate and optimize these methods, advancing research on NLP tasks for many languages, especially those with limited human-labeled data.

The methods developed in this project will help us understand how issues such as gun violence, trade, and immigration are perceived and discussed both within the U.S. population and in other parts of the world.