[Evimaria Terzi] On Selection and Ranking
Wednesdays @Hariri / Meet our Fellows
3:00 PM Wednesday, March 27, 2013
Entity Selection and Ranking for Data-Mining Applications
Evimaria Terzi
Assistant Professor of Computer Science
Hariri Institute Junior Faculty Fellow
Abstract: In many data-mining applications, the input consists of a collection of entities (e.g., reviews about a product, experts that declare certain skills, network nodes or edges) and the goal is to identify a subset of important entities (e.g., useful reviews, competent experts, or influential nodes, respectively). Existing work identifies important entities either by entity ranking or by entity selection. Entity-ranking methods associate a score with every entity. The main drawback of these approaches is that they ignore the redundancy among the highly scored entities. Entity-selection methods try to overcome this drawback by evaluating the goodness of a group of entities collectively. These methods identify the best set of entities, implying that all entities not in the group are unimportant. Such a dichotomy conceals the fact that there may be other subsets of entities with equally good (or almost as good) goodness scores.
In this talk, we will discuss how the drawbacks of the above methods can be overcome by integrating the entity-ranking and entity-selection paradigms. That is, we will introduce entity-ranking mechanisms that are based on entity selection and entity-selection mechanisms that are based on entity ranking. In this framework, the importance scores of individual entities are determined by how many good groups of entities they participate in. Consequently, a good group of entities consists of entities with high importance scores. The main challenge we will discuss is how to explore the solution space of combinatorial problems in order to identify many entities that participate in many good solutions. In the talk, we will describe how our methods can be applied to applications related to expert management systems, management of online product reviews, and network analysis (including physical and social networks).
Bio: Evimaria Terzi is a member of the first class of Junior Faculty Fellows for the Hariri Institute. She joined the Department of Computer Science in 2009. Before coming to Boston University, she was a member of the research staff at IBM Almaden Research Center. Her current research focuses on data mining with emphasis on social-network analysis, analysis of sequential data, ranking, clustering, and bioinformatics. In particular, she is working on problems related to expert identification and team formation in social networks, analysis of online product reviews, and privacy-preserving social-network analysis. Evimaria is a Microsoft Faculty Fellow and her research is supported by NSF and gifts from Yahoo!, Google, and Microsoft. She was recently honored with an NSF CAREER Award.