Applied Machine Learning

Fall 2017, INFO-4604, University of Colorado Boulder


Instructor: Prof. Michael Paul (Office hours: Thursdays 3:30–4:45pm, ENVD 207)
Time/Place: Tuesday/Thursday 5:00pm–6:15pm, HUMN 135
Discussion: http://piazza.com/colorado/fall2017/info4604
Textbook: Sebastian Raschka (2015) Python Machine Learning, 1st Edition.
Introduces algorithms and tools for building intelligent computational systems. Methods will be surveyed for classification, regression and clustering in the context of applications such as document filtering and image recognition. Students will learn the theoretical underpinnings of common algorithms (drawing from mathematical disciplines including statistics and optimization) as well as the skills to apply machine learning in practice.
Prerequisites:
  • One of INFO 2201, INFO 2301, CSCI 2270, CSCI 3022, or equivalent.
  • Students must be able to program in Python, or be willing to quickly learn.
Schedule
Lecture Materials Readings
Part 1: How Machine Learning Works (Foundations of ML)
Tuesday, August 29, 2017
What is machine learning?
An informal introduction. Types of machine learning.
[slides-0]
[slides-1]
Thursday, August 31, 2017
What is machine learning?
A formal introduction. Statistical learning framework.
[slides-1]
Tuesday, September 5, 2017
Mathematical foundations
Geometry of data. Linear regression, K-nearest neighbors classification, K-means clustering.
[slides-2]
  • UML 2.1-2.2
Thursday, September 7, 2017
Linear predictors
Perceptron algorithm.
[slides-3]
  • PML Ch. 2, “Artificial neurons” and “Implementing a perceptron” sections
  • 5604: UML 9.1 (skip 9.1.3)
Tuesday, September 12, 2017
Gradient descent
Optimization methods, stochastic gradient descent.
[slides-4]
Thursday, September 14, 2017
Logistic regression
Probabilistic classification.
[slides-5]
Tuesday, September 19, 2017
Regularization
Overfitting and bias-variance tradeoff. Introducing inductive bias.
[slides-6]
  • PML Ch. 2, “Tackling overfitting” section
  • For more depth: UML Ch. 13
Thursday, September 21, 2017
Multiclass prediction
Multiclass and multi-label classification. Multinomial logistic regression.
[slides-7]
  • UML 17.1
Tuesday, September 26, 2017
Class canceled
Thursday, September 28, 2017
Support vector machines
Large margin classification. Kernel methods.
[slides-8]
  • PML Ch. 3, “Maximum margin classification” and “Solving nonlinear problems” sections
  • For more depth: UML Ch. 15-16
Tuesday, October 3, 2017
Review day
Practice problems.
Thursday, October 5, 2017
Nonlinear predictors
Decision trees.
[slides-9]
  • PML Ch. 3, “Decision tree learning” section
  • For more depth: UML 18.2
Tuesday, October 10, 2017
Nonlinear predictors
Neural networks and multilayer perceptron.
[slides-9]
  • PML Ch. 12, “Modeling complex functions” section
  • For more depth: UML Ch. 20
Thursday, October 12, 2017
Catch up day
Finish Part 1 material.
Part 2: Making Machine Learning Work (ML in Practice)
Tuesday, October 17, 2017
Data creation
Data preprocessing. Feature encoding and normalization.
[slides-10]
  • PML Ch. 4, up to the “Selecting meaningful features” section
  • 5604: UML 25.2
Thursday, October 19, 2017
Data creation
Data collection and annotation.
[slides-11]
Tuesday, October 24, 2017
Feature creation
Feature engineering, extraction, and selection.
[slides-12]
  • PML Ch. 4, “Selecting meaningful features” section
  • For more depth: UML 25.1
Thursday, October 26, 2017
Feature creation
Dimensionality reduction.
[slides-13]
  • PML Ch. 5, “Unsupervised dimensionality reduction” section
  • 5604: PML Ch. 5, “Supervised dimensionality reduction” section
  • For more depth: UML 23.1
Tuesday, October 31, 2017
Model evaluation
Held-out data and cross-validation. Evaluation metrics.
[slides-14]
  • PML Ch. 6
  • For more depth: UML 11.2
Thursday, November 2, 2017
Model diagnosis
Learning curves and confusion matrices.
[slides-14]
  • PML Ch. 6
  • For more depth: UML 11.3
Tuesday, November 7, 2017
Review day
Practice problems.
Thursday, November 9, 2017
Midterm Exam
Tuesday, November 14, 2017
Responsible machine learning
Fairness, accountability, and transparency in machine learning.
[slides-15]
Thursday, November 16, 2017
Industry Q&A
Guest speaker.
Tuesday, November 21, 2017
Fall Break – no class
Thursday, November 23, 2017
Thanksgiving – no class
Tuesday, November 28, 2017
Ensemble learning
Combining classifiers.
[slides-16]
  • PML Ch. 7, everything except the “Leveraging weak learners” section
  • 5604: Finish PML Ch. 7
Thursday, November 30, 2017
Generative models
Naive Bayes.
[slides-17]
Tuesday, December 5, 2017
Semi-supervised learning
Utilizing unlabeled data. Self-training.
[slides-18]
Thursday, December 7, 2017
Semi-supervised learning
Latent variables and expectation maximization.
[slides-18]
Tuesday, December 12, 2017
Topic models
Unsupervised Naive Bayes and Latent Dirichlet Allocation.
[slides-19]
Thursday, December 14, 2017
Bayesian learning
Revisiting priors and regularization.
[slides-19]