Machine Learning for Big Data
On average, the course takes 3 days to complete, including practical work.
Having problems? Check the errata.
Introduction | 24m 2s | What is Machine Learning, Supervised vs Unsupervised Learning and the Model Building Process | Preview
Building a Linear Regression | 30m 40s | Assembling vectors of features and Model Fitting | Watch
Training Data | 26m 33s | Training vs Test and Holdout Data, Using data from Kaggle, RMSE and R² tests | Watch
Model Fitting Parameters | 25m 41s | Setting Linear Regression Parameters | Watch
Feature Selection | 36m 23s | Correlation of features, Identifying duplicate features, Data preparation | Watch
Non Numeric Data | 25m 48s | Using one-hot encoding and Vectors | Watch
Pipelines | 19m 42s | How to build a pipeline in SparkML | Watch
Case Study | 34m 51s | A full practical exercise | Watch
Logistic Regression | 26m 12s | True and False Positives and Negatives, Coding a Logistic Regression Model | Watch
Decision Trees | 46m 21s | Building a decision tree model, Interpreting a tree and Random Forests | Watch
Unsupervised Learning: K-Means Clustering | 10m 49s | K-Means Clustering and how to implement it in SparkML | Watch
Recommender Systems | 29m 7s | Matrix Factorisation and how to build a model in SparkML | Watch