Luke Miratrix - Harvard University
Title: An introspection on using sparse regression techniques to analyze text.

Abstract: In this talk, I propose a general framework for topic-specific summarization of large text corpora, and illustrate how it can be used for analysis in two quite different contexts: legal decisions on workers' compensation claims (to understand relevant case law) and an OSHA database of occupation-related accident reports (to search for high-risk circumstances). Our summarization framework, built on sparse classification methods, is a lightweight and flexible tool that offers a compromise between the simple word-frequency-based methods currently in wide use and more heavyweight, model-intensive methods such as Latent Dirichlet Allocation (LDA). For a particular topic of interest (e.g., emotional disability, or chemical gas), we automatically label documents as being either on- or off-topic, and then use sparse classification methods to predict these labels from the high-dimensional counts of all the other words and phrases in the documents. The small set of phrases found to be predictive is then harvested as the summary. Using a branch-and-bound approach, this method can be extended to allow for phrases of arbitrary length, which allows for potentially rich summarization. I further discuss how a focus on specific aspects of the corpus and the purpose of the summaries can inform the choice of regularization parameters and constraints on the model. Overall, I argue that sparse methods have much to offer text analysis, and hope that this work opens the door for a new branch of research in this important field.
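The labeling-then-sparse-classification idea in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation. It assumes single-word features only (the branch-and-bound extension to longer phrases is not covered), an L1-penalized logistic regression fitted by proximal gradient descent as a stand-in for the unspecified sparse classifier, and a tiny made-up corpus. All names (`build_features`, `fit_sparse_logistic`, `summarize`, `EXAMPLE_DOCS`) and parameter values are hypothetical.

```python
import math
from collections import Counter

# Made-up accident-report snippets, for illustration only.
EXAMPLE_DOCS = [
    "worker inhaled chemical gas leak in tank",
    "gas leak near valve caused evacuation",
    "chemical gas exposure after valve leak",
    "worker fell from ladder on roof",
    "ladder slipped and worker injured hand",
    "forklift struck worker in warehouse",
    "employee cut hand on saw",
]


def build_features(docs, topic_word):
    """Label a document on-topic (1.0) if it contains topic_word.

    Features are the counts of all *other* words, so the classifier
    cannot trivially predict the label from the topic word itself.
    """
    labels, counts = [], []
    for doc in docs:
        tokens = doc.lower().split()
        labels.append(1.0 if topic_word in tokens else 0.0)
        counts.append(Counter(t for t in tokens if t != topic_word))
    vocab = sorted({word for c in counts for word in c})
    rows = [[float(c[word]) for word in vocab] for c in counts]
    return vocab, rows, labels


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, amount):
    """Proximal operator of the L1 penalty: shrink toward zero."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_sparse_logistic(rows, labels, l1_penalty=0.05, step=0.2, iterations=1000):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(rows), len(rows[0])
    weights, bias = [0.0] * p, 0.0
    for _ in range(iterations):
        grad_w, grad_b = [0.0] * p, 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            error = sigmoid(z) - y
            grad_b += error / n
            for j, v in enumerate(x):
                if v:
                    grad_w[j] += error * v / n
        bias -= step * grad_b  # the intercept is left unpenalized
        weights = [
            soft_threshold(w - step * g, step * l1_penalty)
            for w, g in zip(weights, grad_w)
        ]
    return weights, bias


def summarize(docs, topic_word, top_k=3, **fit_args):
    """Return up to top_k words with the largest positive weights."""
    vocab, rows, labels = build_features(docs, topic_word)
    weights, _ = fit_sparse_logistic(rows, labels, **fit_args)
    positive = sorted(
        ((w, word) for w, word in zip(weights, vocab) if w > 0), reverse=True
    )
    return [word for _, word in positive[:top_k]]


if __name__ == "__main__":
    print(summarize(EXAMPLE_DOCS, "gas"))
```

Raising `l1_penalty` drives more weights to exactly zero and so shortens the summary, which is one concrete way the regularization choice mentioned in the abstract shapes the output.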
When | 4:00 pm to 5:00 pm on Thursday, March 28, 2013 |
---|---|
Building | MCS 148 |
Fees | Free |