Scalable, reliable Bayesian inference via approximate likelihoods and random features (Jonathan Huggins - Harvard University)

Data analysis for decision-making in areas such as medicine, economics, environmental science, and computer security commonly shares two features: (1) vast quantities of data are available, so the inference algorithm used for a data analysis must work at scale, and (2) the safety-critical, high-impact nature of the decisions involved makes it crucial that the results of the analysis be reliable. An inference algorithm therefore needs to satisfy two competing goals: scalability and reliability. I consider the Bayesian approach to inference because it provides a framework for coherent modeling of our prior beliefs about how the data were generated, along with representations of uncertainty about inferential conclusions. I develop two approaches to scalable, reliable Bayesian inference. First, I present an approximate-likelihood framework for developing scalable inference algorithms with a priori accuracy guarantees in terms of Wasserstein distance. One of the resulting algorithms scales to tens of millions of observations while reducing computation and memory requirements by two to three orders of magnitude. Second, I consider a complementary approach applicable to the many heuristic, yet empirically effective, scalable inference algorithms in wide use. The key idea is to validate the output of an inference algorithm to check whether it is trustworthy. I introduce the first validation method for modern scalable algorithms that provides theoretical guarantees and is computationally efficient, opening up the possibility of using these heuristic algorithms in a wide range of high-impact settings.
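The talk's title mentions random features, a standard tool for scaling kernel-based computations. As background only (the abstract does not specify the speaker's construction), the sketch below shows the classic random Fourier features of Rahimi and Rechte, which approximate an RBF kernel with an explicit finite-dimensional feature map; all function and parameter names here are illustrative, not from the talk.

```python
import numpy as np

def random_fourier_features(X, n_features=100, lengthscale=1.0, seed=0):
    """Map X (n, d) to random Fourier features whose inner products
    approximate the RBF kernel exp(-||x - y||^2 / (2 * lengthscale^2))."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Frequencies drawn from the kernel's spectral density (Gaussian).
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf(x, y, lengthscale=1.0):
    """Exact RBF kernel, for comparison."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * lengthscale ** 2))

# With many features, Z @ Z.T converges to the exact kernel matrix.
X = np.random.default_rng(1).normal(size=(5, 3))
Z = random_fourier_features(X, n_features=5000)
K_approx = Z @ Z.T
K_exact = np.array([[rbf(x, y) for y in X] for x in X])
```

The appeal for scalability is that downstream computations can work with the explicit `n × n_features` matrix `Z` instead of the full `n × n` kernel matrix, with approximation error shrinking as `n_features` grows.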

When: 4:00 pm to 5:00 pm on Thursday, January 31, 2019
Location: MCS, Room 180, 111 Cummington Mall