Data Science

The listing of a course description here does not guarantee that the course will be offered in a particular term. Please refer to the published schedule of classes on the MyBU Student Portal to confirm that a class is actually being taught and for specific course meeting dates and times.

  • CDS DS 592: Special Topics in Mathematical and Computational Sciences
    Topic for Spring 2025: Introduction to Sequential Decision Making. This course introduces the study, design, and analysis of algorithms for sequential decision making, with a particular focus on bandit algorithms and other topics in statistical learning theory. Designed for upper-level undergraduate and graduate students, the course covers foundational concepts and cutting-edge research in multi-armed bandits, linear bandits, and contextual bandits. Students will gain an understanding of fundamental algorithmic principles in sequential decision making, such as optimism and multiplicative weights, as well as bandit algorithms such as UCB, EXP3, and OFUL. Additionally, the class will cover bandit problems in the general function approximation regime via the study of algorithms such as SquareCB and statistical dimensions for function approximation, including the eluder dimension, dissimilarity dimension, and decision estimation coefficient. Finally, the course will also explore miscellaneous yet essential topics such as online model selection and offline estimation. Through a combination of theoretical insights and practical applications, students will gain a comprehensive understanding of how to design, analyze, and implement algorithms for sequential decision-making tasks.
  • CDS DS 593: Topics in Data Science Methodologies
    Spring 2026 Topic: Theory and Applications of Large Language Models. In this course, students will become savvy consumers and sophisticated developers of LLMs and related tooling. We will start by orienting ourselves to the history of natural language processing and the current state of AI tools, which students will learn to critically evaluate and use extensively throughout the course. Students will develop a deep intuition for LLM concepts including attention and transformer architectures, sampling, and search, and will build small models from scratch. They will then apply pre-trained LLMs to solve real-world problems, working with advanced techniques including fine-tuning, prompt engineering, RAG, and AI agents. Throughout, the course emphasizes bias, safety, and responsible deployment. Through reflections, labs, and projects, students will demonstrate learning and develop a professional portfolio.
  • CDS DS 594: Spark! Data Visualization X-Lab Practicum
    The Data Visualization X-Lab Practicum offers students an opportunity to learn data visualization skills through course and project-based work. Projects will be completed on a schedule that aligns with the topics and assignments being covered in class. This course provides a realistic experience of solving real-world problems with data visualization and of the various tradeoffs that need to be considered. Whether it's how to efficiently use color and space, effectively understand the profile of a dataset, or cautiously avoid bias, this course will provide students with a solid understanding of applicable data visualization practices. Effective Fall 2024, this course fulfills a single unit in each of the following BU Hub areas: Digital/Multimedia Expression, Oral and/or Signed Communication, Writing-Intensive Course.
    • Digital/Multimedia Expression
    • Oral and/or Signed Communication
    • Writing-Intensive Course
  • CDS DS 595: Special Topics in Physical and Engineering Sciences
    Spring 2026 Topic: AI for Science. The goal of the course is to equip students with the tools necessary to understand and carry out research at the forefront of AI and the natural sciences. Prerequisites: Multivariable calculus, linear algebra, probability theory; familiarity with neural networks and deep learning frameworks (PyTorch or JAX); proficiency in Python. Exemplary topics:
    - Preliminaries: the AI4Science landscape, core ML concepts, automatic differentiation, Bayesian statistics, simulators, common scientific data modalities
    - Scientific computing infrastructure: data management, compute accelerators, benchmarking and evaluation, reproducibility
    - Bayesian inference: MCMC and variational methods
    - Generative modeling (e.g., diffusion models) and surrogate models
    - Differentiable programming for scientific computing
    - Neural network building blocks: encoding scientific inductive biases
    - Neural ODEs and operator learning
    - Uncertainty quantification
    - Interpretability and symbolic regression
    - Foundation models and LLMs for scientific applications
    - Case studies from across the natural sciences
  • CDS DS 596: Special Topics in Natural, Biological and Medical Sciences
    Prerequisites: One of CDS DS430/630, CDS DS436/636, ENG BE562, CDS DS526, or equivalent, or prior experience with computational biology. Spring 2026 Topic: Learning From Large-Scale Biological Data. We are living in the age of large-scale biological data. Over the last two decades, the cost of genetic sequencing has decreased faster than Moore's law, meaning that the abundance of data is outpacing the improvement in computational power to analyze it. So while these data are incredibly exciting, extracting biological meaning from the sheer quantity of heterogeneous data being generated requires cutting-edge computational methods. In this course we will study the modern algorithms and machine learning tools that have been developed to extract biological insights from these data, including deep learning approaches such as those for protein structure prediction and learning from sequence data, Bayesian approaches for gene function prediction and network construction, and graph-based approaches. We will examine these through the lens of common pitfalls in learning from biological data and how best to avoid them. By the end of this course students will understand both the contexts in which it is appropriate to use such tools and their limitations.
  • CDS DS 597: Special Topics in Social and Behavioral Sciences
    Coverage of a specific topic in relation to social and behavioral sciences in data science. Topics vary semester to semester.
  • CDS DS 598: Special Topics in Machine Learning
    Special Topics in Machine Learning. Please see notes section for current topic.
  • CDS DS 599: CDS Research Development Seminar
    The first-year doctoral seminar is a required two-semester cohort-based course (4 credits) that must be taken during the first full academic year that a student enrolls in the PhD program in CDS. It is divided into two parts, each providing 2 credits. "CDS Research Initiation Seminar" is offered in the fall semester, and "CDS Research Development Seminar" is offered in the spring semester. The seminar serves three key purposes: 1. It introduces students to the scholarship of (and the rich set of research projects pursued by) the CDS faculty and their guests through colloquia pitched to a multidisciplinary audience. 2. It guides students through the challenging transition into the graduate program in CDS by introducing them to the variety of skills and capacities that are needed to succeed as a scholar. 3. It engenders a sense of community amongst the group of students entering the program as a cohort. 4 cr. Either sem.
  • CDS DS 630: Introduction to Bioinformatics and Computational Biology
    Prerequisites: DS 320 (or equivalent introduction to algorithms) is required. Prior knowledge of Python is required. No background in biology is required. Bioinformatics is an interdisciplinary field combining data science, computing, algorithms, and programming with biology. This course teaches the fundamental algorithms that form the backbone of modern bioinformatics as well as their implementations and applications to data. Topics covered include genome assembly, sequence alignment, phylogenetic trees, gene regulation, and large-scale genomics data as well as associated computational methods including graph algorithms, dynamic programming, combinatorial pattern matching, tree algorithms, and machine learning.
  • CDS DS 653: Crypto for Data Science
    This course investigates techniques for performing trustworthy data analyses without a trusted party, and for conducting data science without data. The first half of the course investigates cryptocurrencies, the blockchain technology underpinning them, and the incentives for each participant, while the second half of the course focuses on privacy and anonymity using advanced tools from cryptography. The course concludes with a broader exploration into the power of conducting data science without being able to see the underlying data.
  • CDS DS 657: Law for Algorithms
    Algorithms - those information-processing machines designed by humans - reach ever more deeply into our lives, creating alternate and sometimes enhanced manifestations of social and biological processes. In doing so, algorithms yield powerful levers for good and ill amidst a sea of unforeseen consequences. This crosscutting and interdisciplinary course investigates several aspects of algorithms and their impact on society and law. Specifically, the course connects concepts of proof, verifiability, privacy, security, trust, and randomness in computer science with legal concepts of autonomy, consent, governance, and liability, and examines interests at the evolving intersection of technology and the law. Grades will be based on a combination of short weekly reflection papers and a final project, to be completed collaboratively in mixed teams of law and computer and data science students. This course will include attendees from the computer science faculty, students and scholars based at Boston University and UC Berkeley.
  • CDS DS 680: Data, Society, and AI Ethics
    This course develops students' ability to critically examine and question the interplay between artificial intelligence (AI), data science, and computational technologies on the one hand, and society and public policy on the other. Students will complete exercises to demonstrate their facility with key ethics tools and techniques, and analyze a series of real-world case studies presented alongside ethical tools and analyses that are useful both for staying alert to emerging ethical challenges and for responding to them as they arise in employment settings and everyday life. Effective Spring 2026, this course fulfills a single unit in each of the following BU Hub areas: Ethical Reasoning, Research and Information Literacy, Social Inquiry II.
    • Ethical Reasoning
    • Research and Information Literacy
    • Social Inquiry II
  • CDS DS 682: Responsible AI, Law, Ethics & Society
    This course addresses the deployment of Artificial Intelligence systems across various societal domains, raising fundamental challenges and concerns such as accountability, liability, fairness, transparency, and privacy. Tackling these challenges necessitates an interdisciplinary approach, integrating principles and practices from data science, ethics, and law. This unique course will bring together students from computing and data science disciplines as well as law and public policy disciplines from multiple institutions. Permission is required to register for this course. Course page: https://learn.responsibly.ai. Please fill out an application form here: https://forms.gle/bMRECdYcMUwHj7xG8. Instructor: shlomi@bu.edu.
  • CDS DS 690: Directed Study in Computing & Data Sciences
    Directed study in Computing & Data Sciences provides students the opportunity to complete directed research in a selected topic not covered in a regularly scheduled course under the supervision of a faculty member. Student and supervising faculty member arrange and document expectations and requirements. Examples include in-depth study of a special topic or independent research project.
  • CDS DS 701: Tools for Data Science
    This course is designed specifically for the MS DS program. Students will take this course in their first semester. The goal of the course is to give students exposure to, and practical experience in, formulating data science questions, particularly learning how to ask good questions in a specific domain. The course will also cover methods of obtaining data and common methods of processing data from a practical standpoint. It will be organized around a semester-long group project in which students work in teams and engage with "clients" who bring data science questions from a particular domain. The course will include a formal presentation of results at the end of the semester.
  • CDS DS 719: Data Science Product Management 1
    DS PROD MGNT 1
  • CDS DS 729: Data Science Product Management 2
    DS PROD MGNT 2
  • CDS DS 790: MSDS Internship
    Internship course for MSDS students only.
  • CDS DS 791: Teaching Practicum 1
    This is the first of two required 2-credit courses that must be taken before the last semester in which a student is enrolled in the PhD program in CDS. These two courses (DS-791 and DS-792) are designed to provide graduate students with the practical training necessary to be effective teachers not only in classroom settings but also in research settings that require the communication of technical concepts to peers and mentees. In DS-791, students will be assigned as teaching fellows working as apprentices under a faculty member in support of various teaching duties. Depending on the nature of the teaching assignment, this may include leading discussion sections or labs, offering office hours to assist learners with homework assignments, engaging with learners through online learning management systems, helping with learning assessments, etc. In addition, students are expected to participate in regular meetings of the two courses (DS-791 and DS-792) that introduce them to pedagogical innovation and best practices in communicating and teaching a variety of data science subjects.
  • CDS DS 792: Teaching Practicum 2
    This is the second of two required 2-credit courses that must be taken before the last semester in which a student is enrolled in the PhD program in CDS. These two courses are designed to provide graduate students with the practical training necessary to be effective teachers not only in classroom settings but also in research settings that require the communication of technical concepts to peers and mentees. In this course, students will be assigned as teaching fellows working as apprentices under a faculty member. Beyond the teaching duties in DS-791, students taking this course are expected to take on supervisory roles involving teaching assistants, class assistants, or graders. In addition, students are expected to participate in regular meetings of the two courses (DS-791 and DS-792) that introduce them to pedagogical innovation and best practices in communicating and teaching a variety of data science subjects.