"Cardinality Estimation: Achilles’ Heel of Query Optimizers" | CS Distinguished Lecture Series

  • Starts: 11:00 am on Wednesday, February 21, 2024
  • Ends: 12:00 pm on Wednesday, February 21, 2024
**Abstract** Cardinality estimation is the problem of estimating the size of the output of a query, without actually evaluating the query. The estimator has access to some limited statistics on the database, such as table cardinalities, number of distinct values in various columns, sometimes histograms, and needs to estimate the output size of complex queries, often involving many joins and filter predicates. The cardinality estimator is a critical piece of any query optimizer, and is often the main culprit when the optimizer chooses a poor plan. In this talk I will briefly survey existing cardinality estimation methods, discussing their pros and cons, then I will introduce a new approach to the problem, which is based on computing an upper bound instead of an estimate. The upper bound is given by a mathematical formula, and can use the available database statistics in surprising ways. **Bio** Dan Suciu is a Microsoft Endowed Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington. Suciu is conducting research in data management, on topics such as query optimization, probabilistic data, data pricing, parallel data processing, data security. He is a co-author of two books Data on the Web: from Relations to Semistructured Data and XML, 1999, and Probabilistic Databases, 2011. He received the ACM SIGMOD Codd Innovation Award, received several best paper awards and test of time awards, and is a Fellow of the ACM. Suciu is currently an associate editor for the Journal of the ACM. Suciu’s PhD students Gerome Miklau, Christopher Re and Paris Koutris received the ACM SIGMOD Best Dissertation Award in 2006, 2010, and 2016 respectively, and Nilesh Dalvi was a runner up in 2008.
CDS 1750