Applied Machine Learning

Fall 2018, INFO-4604, University of Colorado Boulder


Instructors: Prof. Michael Paul (Office hour: Tuesdays 11:30-12:30, TLC 266)
Arcadia Zhang (TA) (Office hour: Wednesdays 11:30-12:30, TLC 215)
Time/Place: Tuesday/Thursday 3:30pm–4:45pm, DUAN G125
Canvas: https://canvas.colorado.edu/courses/22139
Textbook: Raschka and Mirjalili (2017) Python Machine Learning, 2nd Edition.
Introduces algorithms and tools for building intelligent computational systems. Methods will be surveyed for classification, regression and clustering in the context of applications such as document filtering and image recognition. Students will learn the theoretical underpinnings of common algorithms (drawing from mathematical disciplines including statistics and optimization) as well as the skills to apply machine learning in practice.
Prerequisites:
  • One of INFO 2201, INFO 2301, CSCI 2270, CSCI 3022, or equivalent.
  • Students must be able to program in Python, or be willing to learn it quickly.
Lecture Materials Readings
Part 1: How Machine Learning Works (Foundations of ML)
Tuesday, August 28, 2018
What is machine learning?
An informal introduction. Types of machine learning.
[slides-0]
[slides-1]
Thursday, August 30, 2018
What is machine learning?
A formal introduction. Statistical learning framework.
[slides-1]
Tuesday, September 4, 2018
Mathematical foundations
Geometry of data. Linear regression, K-nearest neighbors classification, K-means clustering.
[slides-2]
  • UML 2.1-2.2
Thursday, September 6, 2018
Linear predictors
Perceptron algorithm.
[slides-3]
  • PML Ch. 2, “Artificial neurons” and “Implementing a perceptron” sections
  • 5604: UML 9.1 (skip 9.1.3)
Tuesday, September 11, 2018
Gradient descent
Optimization methods, stochastic gradient descent.
[slides-4]
Thursday, September 13, 2018
Catch up day
Review concepts so far.
[practice]
Tuesday, September 18, 2018
Logistic regression
Probabilistic classification.
[slides-5]
Thursday, September 20, 2018
Regularization
Overfitting and bias-variance tradeoff. Introducing inductive bias.
[slides-6]
  • PML Ch. 3, “Tackling overfitting” section
  • For more depth: UML Ch. 13
Tuesday, September 25, 2018
Multiclass prediction
Multiclass and multi-label classification. Multinomial logistic regression.
[slides-7]
  • UML 17.1
Thursday, September 27, 2018
Support vector machines
Large margin classification. Kernel methods.
[slides-8]
[notes]
  • PML Ch. 3, “Maximum margin classification” and “Solving nonlinear problems” sections
  • For more depth: UML Ch. 15-16
Tuesday, October 2, 2018
Review day
Practice problems.
[practice]
Thursday, October 4, 2018
Nonlinear predictors
Decision trees.
[slides-9]
  • PML Ch. 3, “Decision tree learning” section
  • For more depth: UML 18.2
Tuesday, October 9, 2018
Nonlinear predictors
Neural networks and multilayer perceptron.
[slides-9]
  • PML Ch. 12, “Modeling complex functions” section
  • For more depth: UML Ch. 20
Thursday, October 11, 2018
Catch up day
Finish Part 1 material.
Part 2: Making Machine Learning Work (ML in Practice)
Tuesday, October 16, 2018
Data creation
Data preprocessing. Feature encoding and normalization.
[slides-10]
  • PML Ch. 4, up to the “Selecting meaningful features” section
  • 5604: UML 25.2
Thursday, October 18, 2018
Data creation
Data collection and annotation.
[slides-11]
Tuesday, October 23, 2018
Feature creation
Feature engineering, extraction, and selection.
[slides-12]
  • PML Ch. 4, “Selecting meaningful features” section
  • For more depth: UML 25.1
Thursday, October 25, 2018
Feature creation
Dimensionality reduction.
[slides-13]
  • PML Ch. 5, “Unsupervised dimensionality reduction” section
  • 5604: PML Ch. 5, “Supervised dimensionality reduction” section
  • For more depth: UML 23.1
Tuesday, October 30, 2018
Model evaluation
Held-out data and cross-validation. Evaluation metrics.
[slides-14]
  • PML Ch. 6
  • For more depth: UML 11.2
Thursday, November 1, 2018
Model diagnosis
Learning curves and confusion matrices.
[slides-14]
  • PML Ch. 6
  • For more depth: UML 11.3
Tuesday, November 6, 2018
Review day
Practice problems.
[practice]
Thursday, November 8, 2018
Midterm Exam
Tuesday, November 13, 2018
Responsible machine learning
Fairness, accountability, and transparency in machine learning.
[slides-15]
Thursday, November 15, 2018
Industry Q&A
Guest speaker.
Tuesday, November 20, 2018
Fall Break – no class
Thursday, November 22, 2018
Thanksgiving – no class
Tuesday, November 27, 2018
Ensemble learning
Combining classifiers.
[slides-16]
  • PML Ch. 7, everything except the “Leveraging weak learners” section
  • 5604: Finish PML Ch. 7
Thursday, November 29, 2018
Generative models
Naive Bayes.
[slides-17]
Tuesday, December 4, 2018
Semi-supervised learning
Utilizing unlabeled data. Self-training.
[slides-18]
Thursday, December 6, 2018
Semi-supervised learning
Latent variables and expectation maximization.
[slides-18]
Tuesday, December 11, 2018
Topic models
Unsupervised Naive Bayes and Latent Dirichlet Allocation.
[slides-19]
Thursday, December 13, 2018
Bayesian learning
Revisiting priors and regularization.
[slides-19]
Final Exam Period
Tuesday, December 18, 2018
Final Project Presentations
Time: 1:30pm-4:00pm
Location: DUAN G125 (our usual room)