- Application Period Opens for 2015 Graduate Summer Fellows Program (All day)
- Biostatistics Dissertation Defense of Zhaoyang Teng (9:30 am)
- GRS Dissertation Defense of Andrew McMurry (10:00 am)
- Exploring the Role of Social and Cultural Determinants Influencing Latino HIV and Substance Abuse Health Disparities (10:45 am)
- Scars of Partition: Franco-British Colonial Legacies in Borderlands (12:00 pm)
- Monday Meditation (12:00 pm)
- Particle and Fields Seminar (2:00 pm)
- GMS OPDPA Career Development Seminar: How to Use (My Individual Development Plan) myIDP in Your Career Search (4:00 pm)
- Study Abroad in Haifa, Israel - Info Session (4:00 pm)
- ECE Seminar: Palash Bharadwaj (4:00 pm)
- Lo-Bin Chang – Johns Hopkins University (4:00 pm)
- Haifa Information Session (4:00 pm)
- Welcome Back Dinner and Town Hall Meeting with the Orthodox Minyan (5:00 pm)
- Alpha Kappa Psi Recruitment (5:00 pm)
- IS&T RCS Tutorial - Introduction to MATLAB (5:00 pm)
- Community Dinner (6:00 pm)
- Inspiration in the Desert, a chamber music concert of works by Max Stern (7:00 pm)
- Muir String Quartet (Cancelled) (8:00 pm)

Lo-Bin Chang – Johns Hopkins University
Title: Tracking cross-validated estimates of prediction error as studies accumulate.

Abstract: In recent years “reproducibility” has emerged as a key factor in evaluating applications of statistics to the biomedical sciences, for example learning predictors of disease phenotypes from high-throughput “omics” data. In particular, “validation” is undermined when error rates on newly acquired data are sharply higher than those originally reported. More precisely, when data are collected from m “studies” representing possibly different sub-phenotypes, more generally different mixtures of sub-phenotypes, the error rates in cross-study validation (CSV) are observed to be larger than those obtained in ordinary randomized cross-validation (RCV), although the “gap” seems to close as m increases. Whereas these findings are hardly surprising for a heterogeneous underlying population, this discrepancy is then seen as a barrier to translational research. In this talk, I will provide a statistical formulation in the large sample limit: studies themselves are modeled as components of a mixture and all error rates are optimal (Bayes) for a two-class problem. Our results cohere with the trends observed in practice and suggest what is likely to be observed with large samples and consistent density estimators, namely that the CSV error rate exceeds the RCV error rates for any m, the latter (appropriately averaged) increases with m, and both converge to the optimal rate for the whole population.
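
For readers unfamiliar with the two validation schemes named in the abstract, the following is a minimal sketch of how RCV and CSV error rates could be estimated on synthetic data. It is not the speaker's method: the talk concerns optimal (Bayes) error rates in the large sample limit, whereas this sketch swaps in a 1-nearest-neighbor classifier on small simulated samples. The Gaussian data model, the per-study offset, and all function and parameter names (`simulate_studies`, `rcv_error`, `csv_error`, `compare`, `study_sd`) are assumptions made purely for illustration.

```python
import numpy as np


def simulate_studies(m, n_per_study, rng, study_sd=1.5, dim=2):
    """Draw m hypothetical studies; each gets its own random offset
    (an assumed stand-in for a study-specific sub-phenotype mixture)."""
    X, y, study = [], [], []
    for j in range(m):
        offset = rng.normal(0.0, study_sd, size=dim)    # study-specific shift
        labels = rng.integers(0, 2, size=n_per_study)   # two-class problem
        # Class means sit at offset - 1 (class 0) and offset + 1 (class 1).
        means = offset + np.outer(2 * labels - 1, np.ones(dim))
        X.append(means + rng.normal(size=(n_per_study, dim)))
        y.append(labels)
        study.append(np.full(n_per_study, j))
    return np.vstack(X), np.concatenate(y), np.concatenate(study)


def one_nn_error(X_train, y_train, X_test, y_test):
    """Misclassification rate of a 1-nearest-neighbor rule (assumed classifier)."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    pred = y_train[d.argmin(axis=1)]
    return float(np.mean(pred != y_test))


def rcv_error(X, y, rng, k=5):
    """Randomized cross-validation: folds ignore which study a sample came from."""
    folds = np.array_split(rng.permutation(len(y)), k)
    errs = []
    for fold in folds:
        train = np.ones(len(y), dtype=bool)
        train[fold] = False
        errs.append(one_nn_error(X[train], y[train], X[~train], y[~train]))
    return float(np.mean(errs))


def csv_error(X, y, study):
    """Cross-study validation: hold out one whole study at a time (needs m >= 2)."""
    errs = []
    for j in np.unique(study):
        test = study == j
        errs.append(one_nn_error(X[~test], y[~test], X[test], y[test]))
    return float(np.mean(errs))


def compare(m, n_per_study=100, seed=0):
    """Return (RCV error, CSV error) for one simulated collection of m studies."""
    rng = np.random.default_rng(seed)
    X, y, study = simulate_studies(m, n_per_study, rng)
    return rcv_error(X, y, rng), csv_error(X, y, study)


if __name__ == "__main__":
    for m in (2, 4, 8):
        rcv, csv = compare(m)
        print(f"m={m}: RCV error={rcv:.3f}, CSV error={csv:.3f}")
```

The only structural difference between the two estimates is how the held-out set is chosen: random folds in `rcv_error` versus an entire study in `csv_error`. Any gap this sketch produces depends on the assumed data model and classifier and should not be read as a result from the talk.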
| When | 4:00 pm to 5:00 pm on Monday, January 26, 2015 |
|---|---|
| Location | MCS 148 |