The lower-division data science courses are organized into the following four sequences. Course descriptions (and in some cases course numbers) are being revised as the details of the program evolve.

- Introduction and overview (COGS 9)
- Data, Inference, Prediction, and Computation (DSC 10, 20, and 30)
- Data meets Theory (currently special editions of CSE 20 and 21; DSC 40 and 42 after the course revisions are completed)
- Practice and Application of Data Science (DSC 80)

**Introduction and Overview**

#### COGS 9: Introduction to Data Science

**Data, Inference, Prediction, and Computation**

The following three courses together provide the basic skills needed to work with data statistically and computationally.

#### DSC 10: Principles of Data Science

This introductory course in data science is aimed at developing the computational tools necessary to answer questions that arise from large-scale datasets derived from real-world phenomena. The course emphasizes an end-to-end approach to data science, introducing programming techniques and libraries in Python that cover (1) data collection, (2) data modeling, and (3) data analysis. First, how can data that describes real-world phenomena be extracted? This part of the course includes data collection, processing, and cleaning ("munging"), and dealing with formatted and semi-formatted data (e.g. JSON). Second, how can data be modeled and used to make predictions? This includes methods in regression and classification, and experimental design. Third, how can the results of this analysis be understood and reasoned about? This includes topics in visualization, and methods for hypothesis testing and validation. The course will involve hands-on analysis of a variety of real-world datasets, including economic data, document collections, geographical data, and social networks.

Instructors: Marina Langlois, Janine Tiefenbruck, Julian McAuley

Prerequisites: None

#### DSC 20: Algorithms, Programming and Data Structures for Data Science I

This sequence of two courses provides a formal and rigorous introduction to the programming topics that appear in DSC 10, expands the repertoire of computational concepts, and exposes students to techniques of abstraction at several levels, including layers of software and machines from a programmer's point of view. It provides an understanding of the structures that underlie the programs, algorithms, and languages used in data science and other settings. Mastery of a particular programming language is a valuable side effect of studying these general techniques. The sequence also provides practical experience with composing larger computational systems through several significant programming projects.

Instructors: Marina Langlois, Julian McAuley

Prerequisites: DSC 10

#### DSC 30: Algorithms, Programming and Data Structures for Data Science II

This sequence of two courses provides a formal and rigorous introduction to the programming topics that appear in DSC 10, expands the repertoire of computational concepts, and exposes students to techniques of abstraction at several levels, including layers of software and machines from a programmer's point of view. It provides an understanding of the structures that underlie the programs, algorithms, and languages used in data science and other settings. Mastery of a particular programming language is a valuable side effect of studying these general techniques. The sequence also provides practical experience with composing larger computational systems through several significant programming projects.

Instructors: Marina Langlois, Julian McAuley

Prerequisites: DSC 20

**Data meets Theory**

#### DSC 40: Data meets Theory I

Currently, a special edition of CSE 20 aimed at data science majors and minors serves in place of DSC 40 while the course revision is being approved.

This sequence of two courses will introduce the mathematical foundations of data science, including: sets and combinatorics; graphs; probability; statistics; linear algebra; and the fundamentals of algorithms. Students will become familiar with the mathematical language for expressing data analysis problems and solution strategies, and will receive training in probabilistic reasoning, mathematical modeling of data, and algorithmic problem solving. These courses connect to DSC 10, 20, and 30 by providing a unified view of the mathematical methods that underlie data science.

Instructors: Janine Tiefenbruck, Sanjoy Dasgupta and Mohan Paturi

Prerequisites: Math 20C and Math 18, DSC 10

#### DSC 42: Data meets Theory II

Currently, a special edition of CSE 21 aimed at data science majors and minors serves in place of DSC 42 while the course revision is being approved.

This sequence of two courses will introduce the mathematical foundations of data science, including: sets and combinatorics; graphs; probability; statistics; linear algebra; and the fundamentals of algorithms. Students will become familiar with the mathematical language for expressing data analysis problems and solution strategies, and will receive training in probabilistic reasoning, mathematical modeling of data, and algorithmic problem solving. These courses connect to DSC 10, 20, and 30 by providing a unified view of the mathematical methods that underlie data science.

Instructors: Janine Tiefenbruck, Sanjoy Dasgupta and Mohan Paturi

Prerequisites: DSC 40

**Practice and Application of Data Science**

#### DSC 80: The Practice and Application of Data Science

The marriage of data, computation, and inferential thinking, or "data science", is redefining how people and organizations solve challenging problems and understand the world. This intermediate-level class bridges DSC 10, 20, and 30 and upper-division data science courses, as well as methods courses in other fields. Students master the data science life-cycle and learn many of the fundamental principles and techniques of data science, spanning algorithms, statistics, machine learning, visualization, and data systems. Compared to DSC 10, 20, and 30, this class adopts an end-to-end approach to data science, focused on building large-scale, working systems on real data and putting knowledge from previous courses into practice. Skills and expertise developed in this course enable students to pursue careers in data science or apply it to research.

Instructor: Julian McAuley

Prerequisites: DSC 30 and 42