Machine Learning DS-GA 1003 · Spring 2023 · NYU Center for Data Science

Instructors Mengye Ren
Ravid Shwartz-Ziv
Lecture Tuesday 4:55pm-6:35pm (GCASL C95)
Lab Wednesday 4:55pm-5:45pm (GCASL C95)

About This Course

This course covers a wide variety of introductory topics in machine learning and statistical modeling, including statistical learning theory, convex optimization, generative and discriminative models, kernel methods, boosting, and latent variable models. The primary goal is to provide students with the tools and principles needed to solve the data science problems found in practice. This course was designed as part of the core curriculum for the Center for Data Science's Master's degree in Data Science, and is intended as a continuation of DS-GA-1001 Intro to Data Science. This course also serves as a foundation on which more specialized courses and further independent study can build. The course syllabus can be found here.

For registration information, please contact Tina Lam.

Prerequisites

If you'd like to waive the prerequisites, please send an email to Mengye Ren (mengye@cs.nyu.edu) and Ravid Shwartz-Ziv (rs8020@nyu.edu). Note that this course requires some basic understanding of machine learning (covered by DS-GA-1001). For each prerequisite, please clearly list the equivalent courses you've taken, and highlight them in your transcript. In addition, please complete the Prerequisite Questionnaire for self-assessment.

Logistics

Grading

Resources

Related courses

Textbooks

The Elements of Statistical Learning (Hastie, Tibshirani, and Friedman)
This will be our main textbook for L1 and L2 regularization, trees, bagging, random forests, and boosting. It's written by three statisticians who invented many of the techniques discussed. There's an easier version of this book that covers many of the same topics, described below. (Available for free as a PDF.)
An Introduction to Statistical Learning (James, Witten, Hastie, and Tibshirani)
This book is written by two of the same authors as The Elements of Statistical Learning. It's much less intense mathematically, and it's good for a lighter introduction to the topics. (Available for free as a PDF.)
Understanding Machine Learning: From Theory to Algorithms (Shalev-Shwartz and Ben-David)
Covers a lot of theory that we don't go into, but it would be a good supplemental resource for a more theoretical course, such as Mohri's Foundations of Machine Learning course. (Available for free as a PDF.)
Pattern Recognition and Machine Learning (Christopher Bishop)
Our primary reference for probabilistic methods, including Bayesian regression, latent variable models, and the EM algorithm. It's highly recommended, but unfortunately not free online.
Bayesian Reasoning and Machine Learning (David Barber)
A very nice resource for our topics in probabilistic modeling, and a possible substitute for the Bishop book. Would serve as a good supplemental reference for a more advanced course in probabilistic modeling, such as DS-GA 1005: Inference and Representation. (Available for free as a PDF.)
Hands-On Machine Learning with Scikit-Learn and TensorFlow (Aurélien Géron)
This is a practical guide to machine learning that corresponds fairly well with the content and level of our course. While most of our homework is about coding ML from scratch with numpy, this book makes heavy use of scikit-learn and TensorFlow. Comfort with the first two chapters of this book would be part of the ideal preparation for this course, and it will also be a handy reference for practical projects and work beyond this course, when you'll want to make use of existing ML packages, rather than rolling your own.
Data Science for Business (Provost and Fawcett)
Ideally, this would be everybody's first book on machine learning. The intended audience is both the ML practitioner and the ML product manager. It's full of important core concepts and practical wisdom. The math is so minimal that it's perfect for reading on your phone, and we encourage you to read it in parallel with this class, especially if you haven't taken DS-GA 1001.

Other tutorials and references

Software

Lectures

Week 1


Lecture (RSZ) Jan 24

Topics

Materials

References

(None)

Lab Jan 25

Topics

Materials

(None)

References

Week 2


Lecture (RSZ) Jan 31

Topics

  • Gradient descent
  • Stochastic gradient descent
  • Loss functions
  • Slides

Materials

References

(None)
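
To make this week's topics concrete, here is a minimal numpy sketch of batch gradient descent on the average square loss for linear regression. It is illustrative only: the toy data, step size, and iteration count are made up and are not from the lecture slides. Replacing the full gradient with the gradient at one randomly chosen example gives stochastic gradient descent.

    import numpy as np

    # Toy 1-D regression problem: y = 1 + 2x plus noise (made-up data).
    rng = np.random.default_rng(0)
    X = np.hstack([np.ones((100, 1)), rng.uniform(-1, 1, size=(100, 1))])
    y = X @ np.array([1.0, 2.0]) + 0.1 * rng.standard_normal(100)

    w = np.zeros(2)
    step_size = 0.1
    for _ in range(500):
        # Gradient of the average square loss (1/n) * ||Xw - y||^2.
        grad = 2.0 / len(y) * (X.T @ (X @ w - y))
        w -= step_size * grad

    print(w)  # should end up close to [1, 2]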

Lab Feb 1

Topics

Materials

References

(None)

Week 3


Lecture (MR) Feb 7

Topics

  • Feature selection
  • Regularization
  • Lasso Optimization
  • Slides

Materials

(None)

References

  • HTF Ch. 3.3-3.4
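
As a concrete companion to the lasso-optimization topic, here is a minimal numpy sketch of proximal gradient descent (ISTA) for the L1-regularized least-squares objective. The step size, penalty strength, and toy data are illustrative assumptions, not the setup used in the homework.

    import numpy as np

    def soft_threshold(z, t):
        # Proximal operator of t * ||.||_1, applied coordinate-wise.
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def lasso_ista(X, y, lam, step_size=0.01, n_steps=5000):
        # Minimizes (1/2n) * ||Xw - y||^2 + lam * ||w||_1 by alternating a gradient
        # step on the smooth part with a soft-thresholding step on the L1 part.
        w = np.zeros(X.shape[1])
        for _ in range(n_steps):
            grad = X.T @ (X @ w - y) / len(y)
            w = soft_threshold(w - step_size * grad, step_size * lam)
        return w

    # With a sparse true weight vector, lasso tends to zero out irrelevant features.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10))
    y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.standard_normal(200)
    print(lasso_ista(X, y, lam=0.1))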

Lab Feb 8

Topics

Materials

(None)

References

(None)

Week 4


Lecture (MR) Feb 14

Topics

  • Support Vector Machines
  • Subgradient Descent
  • SVM Dual
  • Slides

Materials

References

(None)
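
For the subgradient-descent topic, below is a minimal numpy sketch of subgradient descent on the L2-regularized average hinge loss (a linear SVM). The regularization strength, step size, and the +1/-1 label convention are illustrative assumptions.

    import numpy as np

    def svm_subgradient(X, y, lam=0.1, step_size=0.01, n_steps=1000):
        # Minimizes lam/2 * ||w||^2 + (1/n) * sum_i max(0, 1 - y_i * w.x_i),
        # with labels y_i in {+1, -1}.
        w = np.zeros(X.shape[1])
        for _ in range(n_steps):
            margins = y * (X @ w)
            active = margins < 1  # examples where the hinge loss has a nonzero subgradient
            subgrad = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / len(y)
            w -= step_size * subgrad
        return w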

Lab Feb 15

Topics

Materials

(None)

References

Week 5


Lecture (RSZ) Feb 21

Topics

  • Feature Maps
  • Kernel Trick
  • Representer Theorem
  • Slides

Materials

(None)

References

Lab Feb 22

Topics

Materials

References

(None)
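
To illustrate the kernel trick and the representer theorem together, here is a minimal numpy sketch of kernelized ridge regression with an RBF kernel: by the representer theorem, the fitted function is a weighted combination of kernel evaluations at the training points. The kernel bandwidth, regularization strength, and objective scaling are illustrative assumptions.

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        # k(a, b) = exp(-gamma * ||a - b||^2), computed for all pairs of rows.
        sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq_dists)

    def kernel_ridge_fit(X, y, lam=0.1, gamma=1.0):
        # Representer theorem: f(x) = sum_i alpha_i * k(x_i, x); for ridge with an
        # average square loss this gives alpha = (K + n * lam * I)^{-1} y.
        K = rbf_kernel(X, X, gamma)
        n = len(y)
        return np.linalg.solve(K + n * lam * np.eye(n), y)

    def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
        return rbf_kernel(X_new, X_train, gamma) @ alpha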

Week 6


Lecture (RSZ) Feb 28

Topics

  • Maximum Likelihood Estimation
  • Generative and discriminative models
  • Slides

Materials

References

(None)
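
As a small worked example of maximum likelihood estimation, the sketch below computes the closed-form MLE for a univariate Gaussian (the sample mean and the biased sample variance) on made-up data; it is illustrative only.

    import numpy as np

    # Made-up sample from N(3, 2^2).
    rng = np.random.default_rng(0)
    x = rng.normal(loc=3.0, scale=2.0, size=1000)

    # Gaussian MLE in closed form: sample mean and average squared deviation.
    mu_hat = x.mean()
    sigma2_hat = ((x - mu_hat) ** 2).mean()

    def avg_log_likelihood(mu, sigma2):
        # Average log-density of the data under N(mu, sigma2).
        return -0.5 * np.log(2 * np.pi * sigma2) - ((x - mu) ** 2).mean() / (2 * sigma2)

    print(mu_hat, sigma2_hat, avg_log_likelihood(mu_hat, sigma2_hat))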

Lab Mar 1

Topics

Materials

References

(None)

Week 8


Lecture (MR) Mar 21

Topics

  • Bayesian Methods
  • Bayesian Regression
  • Slides

Materials

(None)

References

(None)
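
For the Bayesian regression topic, here is a minimal numpy sketch of the standard conjugate-Gaussian posterior over the weights, assuming a zero-mean isotropic Gaussian prior and a known noise variance; both variance values below are made-up illustrations.

    import numpy as np

    def bayes_linreg_posterior(X, y, noise_var=0.25, prior_var=1.0):
        # Prior w ~ N(0, prior_var * I); likelihood y_i ~ N(w.x_i, noise_var).
        # The posterior over w is Gaussian with the mean and covariance below.
        d = X.shape[1]
        post_cov = np.linalg.inv(X.T @ X / noise_var + np.eye(d) / prior_var)
        post_mean = post_cov @ X.T @ y / noise_var
        return post_mean, post_cov

    # The posterior predictive mean at a new input x_new is x_new @ post_mean,
    # with predictive variance x_new @ post_cov @ x_new + noise_var.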

Lab Mar 22

Topics

Materials

(None)

References

(None)

Week 9


Lecture (MR) Mar 28

Topics

  • Reduction to Binary Classification
  • Linear Multiclass Predictors
  • Structured Prediction
  • Slides

Materials

References

Lab Mar 30

Topics

Materials

(None)

References

(None)
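
To illustrate a linear multiclass predictor (one score per class, with prediction by argmax), here is a minimal numpy sketch of the multiclass perceptron update. It is a simple illustrative trainer for such a predictor, not the specific method used in the homework.

    import numpy as np

    def multiclass_perceptron(X, y, n_classes, n_epochs=10):
        # One weight vector per class; predict argmax_k w_k . x and, on a mistake,
        # move the true class's weights toward x and the predicted class's away.
        W = np.zeros((n_classes, X.shape[1]))
        for _ in range(n_epochs):
            for x_i, y_i in zip(X, y):
                pred = int(np.argmax(W @ x_i))
                if pred != y_i:
                    W[y_i] += x_i
                    W[pred] -= x_i
        return W

    def predict(W, X):
        return np.argmax(X @ W.T, axis=1)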

Week 10


Lecture (MR) Apr 4

Topics

  • Decision Trees
  • Random Forests
  • AdaBoost
  • Slides

Materials

References

  • JWHT 8.1 (Trees)
  • HTF 9.2 (Trees)
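
As a concrete companion to the AdaBoost topic, here is a minimal sketch of AdaBoost with depth-1 trees (stumps) as weak learners, using scikit-learn only for the stump itself. The number of rounds and the +1/-1 label convention are illustrative assumptions; scikit-learn also ships a ready-made AdaBoostClassifier.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_fit(X, y, n_rounds=50):
        # Labels y are assumed to be in {+1, -1}.
        n = len(y)
        weights = np.full(n, 1.0 / n)
        stumps, alphas = [], []
        for _ in range(n_rounds):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=weights)
            pred = stump.predict(X)
            err = np.clip(weights[pred != y].sum(), 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - err) / err)   # the weak learner's vote
            weights *= np.exp(-alpha * y * pred)    # up-weight the mistakes
            weights /= weights.sum()
            stumps.append(stump)
            alphas.append(alpha)
        return stumps, alphas

    def adaboost_predict(stumps, alphas, X):
        scores = sum(a * s.predict(X) for a, s in zip(stumps, alphas))
        return np.sign(scores)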

Lab Apr 5

Topics

Materials

(None)

References

  • JWHT 5.2 (Bootstrap)
  • HTF 7.11 (Bootstrap)

Week 11


Lecture (RSZ) Apr 11

Topics

  • Forward stagewise additive modeling
  • Gradient boosting
  • Slides

Materials

(None)

References

Lab Apr 12

Topics

Materials

References
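
Not a course reference, but a minimal sketch of gradient boosting for the square loss: each round fits a small regression tree to the current residuals (the negative gradient) and adds it with a shrinkage factor. The learning rate, tree depth, and number of rounds are illustrative assumptions, and the tree itself comes from scikit-learn for brevity.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def gradient_boost_fit(X, y, n_rounds=100, lr=0.1, max_depth=2):
        # Forward stagewise additive modeling for the square loss.
        f0 = y.mean()
        pred = np.full(len(y), f0)
        trees = []
        for _ in range(n_rounds):
            residual = y - pred                 # negative gradient of 1/2 * (y - f)^2
            tree = DecisionTreeRegressor(max_depth=max_depth)
            tree.fit(X, residual)
            pred = pred + lr * tree.predict(X)
            trees.append(tree)
        return f0, trees

    def gradient_boost_predict(f0, trees, X, lr=0.1):
        return f0 + lr * sum(tree.predict(X) for tree in trees)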

Week 12


Lecture (MR) Apr 18

Topics

  • Feature Learning
  • Backpropagation
  • Slides

Materials

(None)

References

Lab Apr 19

Topics

Materials

(None)

References

(None)
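
To make backpropagation concrete, here is a minimal numpy sketch of one forward and backward pass through a tiny two-layer network with a ReLU hidden layer and square loss. The architecture and shapes are illustrative assumptions, not the network from the homework.

    import numpy as np

    def forward_backward(x, y, W1, W2):
        # Forward pass.
        h_pre = W1 @ x                      # hidden pre-activations
        h = np.maximum(h_pre, 0.0)          # ReLU
        y_hat = W2 @ h                      # output
        loss = 0.5 * np.sum((y_hat - y) ** 2)

        # Backward pass: apply the chain rule node by node, in reverse.
        d_y_hat = y_hat - y                 # dL/dy_hat
        dW2 = np.outer(d_y_hat, h)          # dL/dW2
        d_h = W2.T @ d_y_hat                # dL/dh
        d_h_pre = d_h * (h_pre > 0)         # back through the ReLU
        dW1 = np.outer(d_h_pre, x)          # dL/dW1
        return loss, dW1, dW2

    # Example shapes: 3 inputs, 4 hidden units, 2 outputs (all made up).
    rng = np.random.default_rng(0)
    W1, W2 = rng.standard_normal((4, 3)), rng.standard_normal((2, 4))
    loss, dW1, dW2 = forward_backward(rng.standard_normal(3), rng.standard_normal(2), W1, W2)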

Week 13


Lecture (MR) Apr 25

Topics

  • k-Means
  • Gaussian Mixture Models
  • Expectation Maximization
  • Slides

Materials

(None)

References

  • HTF 13.2.1 (k-means)
  • Bishop 9.2, 9.3 (GMM/EM)
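
As a companion to this week's topics, here is a minimal numpy sketch of plain k-means, alternating the assignment step and the mean-update step (the same alternation that the EM algorithm performs, in soft form, for a Gaussian mixture). The initialization and stopping rule are illustrative assumptions.

    import numpy as np

    def kmeans(X, k, n_iters=100, seed=0):
        # Alternate: assign each point to its nearest center, then move each
        # center to the mean of the points assigned to it.
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        labels = np.zeros(len(X), dtype=int)
        for _ in range(n_iters):
            dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
            labels = dists.argmin(axis=1)
            new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                    else centers[j] for j in range(k)])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        return centers, labels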

Week 14


Lecture (RSZ) May 2

Topics

Materials

(None)

References

(None)

Lab May 3

Topics

Materials

References

(None)

Assignments

Late Policy: Homeworks are due at 11:59 PM EST on the date specified. You have seven late days in total, which can be used throughout the semester without penalty. Once you run out of late days, each additional late day incurs a 20% penalty. For example, if you submit an assignment one day late after using all of your late days, a score of 90 will only be counted as 72. Note that the maximum number of late days per homework is two, meaning that Gradescope will not accept submissions more than 48 hours after the due date.

Collaboration Policy: You may form study groups and discuss problems with your classmates. However, you must write up your homework solutions and your code from scratch, without referring to notes from your joint sessions. In your solution to each problem, you must list the names of everyone with whom you discussed the problem; this will not affect your grade.

Submission: Homework should be submitted through Gradescope. If you have not used Gradescope before, please watch this short video: "For students: submitting homework." At the beginning of the semester, you will be added to the Gradescope class roster. This will give you access to the course page and the assignment submission form. To submit assignments, you will need to:

  1. Upload a single PDF document containing all the math, code, plots, and exposition required for each problem.
  2. Where homework assignments are divided into sections, please begin each section on a new page.
  3. You will then select the appropriate page ranges for each homework problem, as described in the "submitting homework" video.

Feedback: Check Gradescope to get your scores on each individual problem, as well as comments on your answers. Regrading requests should be submitted on Gradescope.

Homework 0

Typesetting your homework

Due: January 1st, 11:59 PM EST

Homework 1

Error Decomposition and Polynomial Regression

Due: February 1st, 11:59 PM EST

Homework 2

Gradient Descent & Regularization

Due: February 15th, 11:59 PM EST

Homework 3

SVMs and Kernel Methods

Due: March 1st, 11:59 PM EST

Homework 4

Probabilistic models

Due: March 22nd, 11:59 PM EST

Homework 5

Multiclass Linear SVM

Due: April 5th, 11:59 PM EST

Homework 6

Decision Trees and Boosting

Due: April 19th, 11:59 PM EST

Homework 7

Computation Graphs, Back-propagation, and Neural Networks

Due: May 3rd, 11:59 PM EST

People

Instructors

Section Leaders

Graders