All material is here: https://github.com/amueller/scipy_2015_sklearn_tutorial
Outline
Morning Session
- What is machine learning? (Sample applications)
- Kinds of machine learning: unsupervised vs supervised.
- Data formats and preparation.
- Supervised learning
    - Interface
    - Training and test data
    - Classification
    - Regression
- Unsupervised Learning
    - Unsupervised transformers
    - Preprocessing and scaling
    - Dimensionality reduction
    - Clustering
- Summary : Estimator interface
- Application : Classification of digits
- Application : Eigenfaces
- Methods: Text feature extraction, bag of words
- Application : SMS spam detection
- Summary : Model building and generalization
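
As a rough illustration of the estimator interface that the morning session works toward (training and test data, classification, digits), here is a minimal sketch. The choice of `KNeighborsClassifier`, `n_neighbors=3`, and `random_state=0` is an assumption made for illustration and is not taken from the tutorial material; the import path `sklearn.model_selection` assumes a recent scikit-learn release (releases from 2015 exposed `train_test_split` under `sklearn.cross_validation`).

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the bundled handwritten-digits dataset.
digits = load_digits()
X, y = digits.data, digits.target  # X: (n_samples, n_features), y: (n_samples,)

# Hold out part of the data so the model is scored on samples it has not seen.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Every scikit-learn classifier follows the same pattern: construct, fit, predict/score.
clf = KNeighborsClassifier(n_neighbors=3)  # illustrative choice of model
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)      # predicted labels for the held-out samples
accuracy = clf.score(X_test, y_test)   # mean accuracy on the held-out samples
print("test accuracy: %.3f" % accuracy)
```

Swapping in a different classifier should only require changing the line that constructs `clf`, which is the point of the shared interface.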
Afternoon Session
- Cross-Validation
- Model Complexity: Overfitting and underfitting
- Complexity of various model types
- Grid search for adjusting hyperparameters
- Basic regression with cross-validation
- Application : Titanic survival with Random Forest
- Building Pipelines
    - Motivation and Basics
    - Preprocessing and Classification
    - Grid-searching Parameters of the feature extraction
- Application : Image classification
- Model complexity, learning curves and validation curves
- In-Depth supervised models
    - Linear Models
    - Kernel SVMs
    - Trees and Forests
- Learning with Big Data
    - Out-Of-Core learning
    - The hashing trick for large text corpora
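
To give a flavor of the hashing trick named in the last item, here is a hand-rolled sketch using only the standard library. The function name `hashed_bag_of_words`, the use of `zlib.crc32`, and the tiny `n_features=16` are illustrative assumptions; scikit-learn ships its own `HashingVectorizer`, which is not what is shown here.

```python
import zlib


def hashed_bag_of_words(text, n_features=16):
    """Map a document to a fixed-length count vector without storing a vocabulary."""
    counts = [0] * n_features
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(token.encode("utf-8")) % n_features
        counts[index] += 1  # distinct tokens may collide in the same bucket
    return counts


# Hypothetical example documents, not taken from the tutorial's SMS dataset.
docs = ["free prize call now", "are we still meeting for lunch"]
vectors = [hashed_bag_of_words(d) for d in docs]
for doc, vec in zip(docs, vectors):
    print(vec, "<-", doc)
```

Because the vector length is fixed up front and no vocabulary is kept in memory, documents can be processed one batch at a time, which is what makes the approach a natural fit for out-of-core learning.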