Four-Week Machine Learning Symposium, Fall 2024
The CDS Machine Learning Symposium, developed by Assistant Professors Aldo Pacchiano and Xuezhou Zhang and hosted by the Faculty of Computing and Data Sciences, brings together leading scholars to examine the cutting-edge developments and foundational technologies shaping the field of machine learning. By uniting experts from technical disciplines including algorithmic design, model architectures, and optimization, the symposium aims to illuminate the latest advancements and open challenges in core machine learning methodologies.
Speaker listing is coming soon!
- Friday, Nov. 8: 2-3PM
- Friday, Nov. 15: 12-1PM
- Friday, Nov. 22: 2-3PM
- Friday, Dec. 6: 10-11AM
Past Speakers
Adaptive Neyman Allocation (Jinglong Zhao)
Date: Wednesday, Nov. 1st at 4:30 PM
Location: CDS 1750
Abstract: In experimental design, Neyman allocation refers to the practice of allocating subjects into treated and control groups, potentially in unequal numbers proportional to their respective standard deviations, with the objective of minimizing the variance of the treatment effect estimator. This widely recognized approach increases statistical power in scenarios where the treated and control groups have different standard deviations, as is often the case in social experiments, clinical trials, marketing research, and online A/B testing. However, Neyman allocation cannot be implemented unless the standard deviations are known in advance. Fortunately, the multi-stage nature of the aforementioned applications allows the use of earlier-stage observations to estimate the standard deviations, which in turn guide allocation decisions in later stages. In this paper, we introduce a competitive analysis framework to study this multi-stage experimental design problem. We propose a simple adaptive Neyman allocation algorithm that almost matches the information-theoretic limit of conducting experiments. Using online A/B testing data from a social media site, we demonstrate the effectiveness of our adaptive Neyman allocation algorithm, highlighting its practicality especially when applied with only a limited number of stages.
Bio: Jinglong Zhao is an Assistant Professor of Operations and Technology Management at the Questrom School of Business at Boston University. He works at the interface between optimization and econometrics. His research leverages discrete optimization techniques to design field experiments with applications in online platforms. Jinglong completed his PhD in Social and Engineering Systems and Statistics at the Massachusetts Institute of Technology.
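To make the allocation rule concrete, here is a minimal sketch of a plug-in adaptive scheme, not the algorithm from the talk: each stage after the first re-estimates the two standard deviations from the data collected so far and then treats a fraction sigma_t / (sigma_t + sigma_c) of the next stage's subjects. The function names and stage structure are illustrative assumptions.

```python
import numpy as np

def neyman_fraction(sigma_t, sigma_c):
    # Treated fraction proportional to the treated arm's standard
    # deviation; equal sigmas recover the usual 50/50 split.
    return sigma_t / (sigma_t + sigma_c)

def adaptive_neyman(draw_treated, draw_control, n_per_stage, n_stages):
    # Multi-stage plug-in scheme: stage 0 splits evenly, later stages
    # allocate using standard deviations estimated from earlier stages.
    treated, control = [], []
    for stage in range(n_stages):
        if stage == 0:
            frac = 0.5  # no variance estimates yet
        else:
            frac = neyman_fraction(np.std(treated, ddof=1),
                                   np.std(control, ddof=1))
            frac = min(max(frac, 0.05), 0.95)  # keep both arms sampled
        n_t = int(round(frac * n_per_stage))
        treated += [draw_treated() for _ in range(n_t)]
        control += [draw_control() for _ in range(n_per_stage - n_t)]
    # Difference-in-means estimate of the average treatment effect.
    return np.mean(treated) - np.mean(control)

# Toy example: the treated arm is three times noisier, so later stages
# should drift toward roughly 75% treated.
rng = np.random.default_rng(0)
ate_hat = adaptive_neyman(lambda: rng.normal(1.0, 3.0),
                          lambda: rng.normal(0.0, 1.0),
                          n_per_stage=500, n_stages=4)
```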
Toward General Virtual Agents (Stephen McAleer)
Date & Time: Thursday, Nov. 9th at 4:30 PM
Location: CDS 1646
Abstract: Agents capable of carrying out general tasks on a computer can greatly improve efficiency and productivity. Ideally, such agents should be able to solve new computer tasks presented to them through natural language commands. However, previous approaches to this problem require large amounts of expert demonstrations and task-specific reward functions, both of which are impractical for new tasks. In this talk, I show that pre-trained LLMs are able to achieve state-of-the-art performance on MiniWoB, a popular computer task benchmark, by recursively criticizing and improving outputs. I then argue that RLHF is a promising approach toward improving LLM agents, and introduce new work on countering over-optimization in RLHF via constrained RL.
Bio: Stephen McAleer is a postdoc at Carnegie Mellon University working with Tuomas Sandholm. His research has led to the first reinforcement learning algorithm to solve the Rubik's cube and the first algorithm to achieve expert-level performance on Stratego. His work has been published in Science, Nature Machine Intelligence, ICML, NeurIPS, and ICLR, and has been featured in news outlets such as the Washington Post, the LA Times, MIT Technology Review, and Forbes. He received a PhD in computer science from UC Irvine working with Pierre Baldi, and a BS in mathematics and economics from Arizona State University.

Geospatial Machine Learning: Data-focused Algorithm Design, Development, and Evaluation (Esther Rolf)
Date & Time: Wednesday, Nov. 15th at 3 PM
Location: CDS 548
Bio: Esther Rolf is a postdoctoral fellow with the Harvard Data Science Initiative and the Center for Research on Computation and Society. In fall 2024, Esther will join the University of Colorado Boulder as an assistant professor of computer science. Esther's research in statistical and geospatial machine learning blends methodological and applied techniques to study and design machine learning algorithms and systems with an emphasis on usability, data-efficiency, and fairness. Her projects span developing algorithms and infrastructure for reliable environmental monitoring using machine learning, responsible and fair algorithm design and use, and the influence of data acquisition and representation on the efficacy and applicability of machine learning systems. Esther completed her PhD in Computer Science in 2022 at UC Berkeley, where she was advised by Ben Recht and Michael I. Jordan. Esther's PhD was supported by an NSF Graduate Research Fellowship, a Google Research Fellowship, and a UC Berkeley Stonebreaker Fellowship. Esther has won best paper awards at ICML (2018) and the Workshop on AI for Social Good at NeurIPS (2019), and the impact of her work has been recognized with an SDG Digital Gamechangers Award (2023) from the United Nations Development Programme and the International Telecommunication Union.

Machine Learning in the Space of Datasets: an Optimal Transport Perspective (David Alvarez-Melis)
Date: Wednesday, Nov. 29th at 3 PM
Location: CDS 548
Abstract: Machine learning, as taught in classrooms and textbooks, typically involves a single, fixed, and homogeneous dataset on which models are trained and evaluated. But machine learning in practice is rarely so 'pristine'. In most real-life applications clean labeled data is typically scarce, so it is often necessary to leverage multiple heterogeneous data sources. In particular, there is an almost-universal discrepancy between training and testing data distributions. This phenomenon has been profoundly amplified by the recent advent of massive reusable 'pre-trained' deep learning models, which rely on vast amounts of highly heterogeneous data for training and are then repurposed for a variety of distinct (and often unrelated) tasks. This emerging paradigm of machine learning on collections of datasets necessitates new theoretical and algorithmic tools. In this talk, I will argue that Optimal Transport provides an ideal framework on which to lay the foundations for this novel paradigm. It allows us to define semantically meaningful distances between datasets, to elucidate correspondences between them, and to solve optimization objectives over them. Through applications in dataset selection, transfer learning, and dataset shaping, I will show that besides enjoying sound theoretical footing, these OT-based approaches yield powerful, highly scalable, and at times surprisingly insightful methods.
Bio: David Alvarez-Melis is an assistant professor of computer science at Harvard SEAS, and is a faculty affiliate at the Harvard Data Science Initiative, the Kempner Institute, and the Center for Research on Computation and Society (CRCS). Before Harvard, he spent a few years at Microsoft Research New England as part of the core Machine Learning and Statistics group. His research seeks to make machine learning more broadly applicable (especially to data-poor applications) and trustworthy (e.g., robust and interpretable). For this, he draws on ideas from various fields including statistics, optimization, and applied mathematics, and takes inspiration from problems arising in the application of machine learning to the natural sciences.
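As a rough illustration of treating datasets as distributions, the sketch below computes an exact optimal transport cost between two feature point clouds using the POT library (`pip install pot`). It is a feature-space-only toy: the label-aware dataset distances in this line of work additionally fold a distance between label distributions into the ground cost, which is omitted here.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

def dataset_ot_cost(X1, X2):
    # Exact OT cost between two empirical feature distributions,
    # placing uniform weight on every sample.
    a = np.full(len(X1), 1.0 / len(X1))
    b = np.full(len(X2), 1.0 / len(X2))
    M = ot.dist(X1, X2)      # pairwise squared-Euclidean ground costs
    return ot.emd2(a, b, M)  # solve the transport linear program

# Two synthetic "datasets": a small mean shift vs. a large one.
rng = np.random.default_rng(0)
base = rng.normal(0.0, 1.0, size=(200, 5))
near = rng.normal(0.1, 1.0, size=(200, 5))
far = rng.normal(3.0, 1.0, size=(200, 5))
assert dataset_ot_cost(base, near) < dataset_ot_cost(base, far)
```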
Learning Rate Tuning from Theory to Practice (Ashok Cutkosky)
Date & Time: Thursday, Dec. 7th at 4:30 PM
Location: CDS 1646
Abstract: Training large neural networks requires careful tuning of a large number of so-called "hyperparameters": improper tuning results in dramatically worse performance. The learning rate is perhaps the most important such hyperparameter, and over the past decades a wide variety of schemes for setting this value have been proposed and have gone in and out of popularity. Although the core ideas behind many of the more venerable methods (e.g., the Adam optimizer) are directly inspired by theory, in recent years there has been a significant divergence between learning rates used in practice and those suggested by theory. In this talk, I will provide an overview of some recent work that can help shed light on this theory/practice gap via new analysis of learning rates that not only justifies the heuristics popular in practice, but also provides actionable guidance for improving on these heuristics.
Bio: Ashok Cutkosky is an assistant professor in the ECE department at Boston University. Previously, he was a research scientist at Google, and he earned a PhD in computer science from Stanford University in 2018. He is interested in all aspects of machine learning and stochastic optimization theory. He has worked extensively on optimization algorithms for machine learning that adaptively tune themselves to the a priori unknown statistical properties of their input datasets, as well as on non-convex stochastic optimization.
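For contrast, here is a small sketch of the two kinds of schedules at issue: a classic theory-style eta_0 / sqrt(t) decay next to the warmup-plus-cosine heuristic common in large-scale training. The constants are illustrative assumptions, not recommendations from the talk.

```python
import math

def theory_lr(t, base_lr=1e-3):
    # Decay rate suggested by standard stochastic optimization analyses.
    return base_lr / math.sqrt(t + 1)

def practice_lr(t, total_steps, base_lr=1e-3, warmup_frac=0.1):
    # Linear warmup followed by cosine decay to zero: the heuristic
    # widely used in practice despite weaker classical guarantees.
    warmup_steps = max(1, int(warmup_frac * total_steps))
    if t < warmup_steps:
        return base_lr * (t + 1) / warmup_steps
    progress = (t - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# The two schedules disagree most early (warmup vs. immediate decay)
# and late (cosine anneals to zero; 1/sqrt(t) decays far more slowly).
schedule = [(theory_lr(t), practice_lr(t, 10_000)) for t in range(10_000)]
```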
Fall 2023 Speaker Lineup
- Adaptive Neyman Allocation - Jinglong Zhao, Assistant Professor, Questrom School of Business
- Toward General Virtual Agents - Stephen McAleer, Postdoc, Carnegie Mellon University
- Geospatial Machine Learning: Data-focused Algorithm Design, Development, and Evaluation - Esther Rolf, Postdoc at Harvard University; Incoming Assistant Professor, University of Colorado Boulder
- Machine Learning in the Space of Datasets: an Optimal Transport Perspective - David Alvarez-Melis, Assistant Professor at Harvard; Senior Researcher at MSR New England
- Learning Rate Tuning from Theory to Practice - Ashok Cutkosky, Assistant Professor, Boston University Department of Electrical and Computer Engineering