MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.
Machine learning brings together computer science and statistics to harness the predictive power of data. It’s a must-have skill for all aspiring data analysts and data scientists, or anyone else who wants to wrestle raw data into refined trends and predictions.
This class teaches you the end-to-end process of investigating data through a machine learning lens. You will learn how to extract and identify the features that best represent your data, work with a few of the most important machine learning algorithms, and evaluate the performance of those algorithms.
This course is also a part of our Data Analyst Nanodegree.
Syllabus
LESSON 1
Welcome to Machine Learning
- Learn what Machine Learning is and meet Sebastian Thrun!
- Find out where Machine Learning is applied in Technology and Science.
LESSON 2
Naive Bayes
- Use Naive Bayes with scikit-learn in Python.
- Split data between training sets and testing sets with scikit-learn.
- Calculate the posterior probability and the prior probability of simple distributions.
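The posterior calculation in this lesson can be illustrated with a small worked example. This is a minimal sketch of Bayes' rule for a two-hypothesis case; the function name `posterior`, its parameter names, and the numbers used are all hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(hypothesis | positive evidence) using Bayes' rule."""
    # Joint probability of seeing the evidence when the hypothesis is true.
    joint_true = prior * likelihood
    # Joint probability of seeing the evidence when the hypothesis is false.
    joint_false = (1.0 - prior) * false_positive_rate
    # Normalize by the total probability of the evidence.
    return joint_true / (joint_true + joint_false)


# Hypothetical numbers: 1% prior, 90% hit rate, 10% false-positive rate.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.1)
print(round(p, 4))
```

Even with a fairly accurate test, a small prior keeps the posterior low, which is why the prior and the posterior need to be kept distinct.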
LESSON 3
Support Vector Machines
- Learn the simple intuition behind Support Vector Machines.
- Implement an SVM classifier in scikit-learn.
- Identify how to choose the right kernel for your SVM and learn about RBF and Linear Kernels.
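As a rough illustration of the points above, the sketch below trains scikit-learn's `SVC` with a linear and an RBF kernel and compares their accuracy on held-out data. The synthetic dataset, the cluster centers, and the `C` value are assumptions made for this example, not material from the course.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class data with well-separated, hand-picked centers (assumption).
X, y = make_blobs(
    n_samples=200, centers=[[-3, -3], [3, 3]], cluster_std=1.0, random_state=0
)

# Hold out 25% of the points for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit one classifier per kernel and record its test accuracy.
scores = {}
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, C=1.0)
    clf.fit(X_train, y_train)
    scores[kernel] = clf.score(X_test, y_test)

print(scores)
```

On data this cleanly separated both kernels should perform similarly; differences between kernels tend to show up when the class boundary is not a straight line.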
LESSON 4
Decision Trees
- Code your own decision tree in Python.
- Learn the formulas for entropy and information gain and how to calculate them.
- Implement a mini project where you identify the authors in a body of emails using a decision tree in Python.
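The entropy and information gain formulas named above can be written out directly. Entropy is $-\sum_i p_i \log_2 p_i$ over the class proportions $p_i$, and information gain is the parent's entropy minus the size-weighted entropy of the children. The following is a minimal sketch; the helper names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted average entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Toy labels: two "slow" and two "fast" examples.
parent = ["slow", "slow", "fast", "fast"]

# A split that leaves one child impure, and a split that separates the classes.
partial_split = [["slow", "slow", "fast"], ["fast"]]
perfect_split = [["slow", "slow"], ["fast", "fast"]]

print(entropy(parent))
print(information_gain(parent, partial_split))
print(information_gain(parent, perfect_split))
```

A decision tree picks, at each node, the split with the highest information gain, so the perfect split would be preferred here.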
LESSON 5
Choose your own Algorithm
- Decide how to pick the right Machine Learning algorithm among K-Means, AdaBoost, and Decision Trees.
LESSON 6
Datasets and Questions
- Apply your Machine Learning knowledge by looking for patterns in the Enron Email Dataset.
- You'll be investigating one of the biggest frauds in American history!
LESSON 7
Regressions
- Understand how continuous supervised learning is different from discrete learning.
- Code a Linear Regression in Python with scikit-learn.
- Understand different error metrics, such as SSE and R-squared, in the context of Linear Regressions.
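To make the metrics above concrete, here is a small sketch that fits scikit-learn's `LinearRegression` and computes the sum of squared errors (SSE) and R-squared by hand, using $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$. The data points are made up for illustration and roughly follow y = 2x + 1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data that roughly follows y = 2x + 1.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

reg = LinearRegression()
reg.fit(X, y)
predictions = reg.predict(X)

# SSE: sum of squared differences between targets and predictions.
sse = float(np.sum((y - predictions) ** 2))
# SST: total variation of the targets around their mean.
sst = float(np.sum((y - y.mean()) ** 2))
# R-squared: fraction of the variation explained by the fit.
r_squared = 1.0 - sse / sst

print(reg.coef_[0], reg.intercept_, sse, r_squared)
```

The hand-computed `r_squared` should agree with `reg.score(X, y)`. Unlike SSE, R-squared does not grow just because more data points are added, which makes it easier to compare across datasets.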
LESSON 8
Outliers
- Remove outliers to improve the quality of your linear regression predictions.
- Apply your learning in a mini project where you remove the points with the largest residuals from a real dataset and reimplement your regressor.
- Apply the same understanding of outliers and residuals to the Enron Email Corpus.
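One common way to carry out the idea above is to fit a line, discard the points with the largest residuals, and fit again. The sketch below does this in plain Python on made-up data; the helper names and the 10% removal fraction are assumptions for illustration, not the course's prescribed procedure.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def drop_largest_residuals(xs, ys, fraction=0.1):
    """Fit a line, then drop the given fraction of points with the largest residuals."""
    slope, intercept = fit_line(xs, ys)
    residuals = [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]
    keep = len(xs) - int(len(xs) * fraction)
    # Indices of the points with the smallest residuals, restored to input order.
    kept = sorted(sorted(range(len(xs)), key=lambda i: residuals[i])[:keep])
    return [xs[i] for i in kept], [ys[i] for i in kept]


# Hypothetical data on the line y = 2x + 1, with one injected outlier.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[7] = 40.0

slope_before, _ = fit_line(xs, ys)
clean_xs, clean_ys = drop_largest_residuals(xs, ys, fraction=0.1)
slope_after, intercept_after = fit_line(clean_xs, clean_ys)

print(slope_before, slope_after, intercept_after)
```

The single outlier pulls the first fit away from the true slope; after it is removed, the refit recovers the underlying line.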
LESSON 9
Clustering
- Identify the difference between Unsupervised Learning and Supervised Learning.
- Implement K-Means in Python with scikit-learn to find the centers of clusters.
- Apply your knowledge to the Enron Finance Data to find clusters in a real dataset.
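As an illustration of the clustering step above, this sketch runs scikit-learn's `KMeans` on six hand-made points that form two obvious groups and reads back the cluster centers. The points, `n_clusters=2`, and the other parameter values are assumptions for this example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Six hypothetical 2-D points forming two tight groups.
points = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.3],
])

# No labels are passed in: K-Means is unsupervised.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)
centers = kmeans.cluster_centers_

print(labels)
print(centers)
```

Which group receives label 0 or 1 is arbitrary; only the grouping itself and the center coordinates are meaningful.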
LESSON 10
Feature Scaling
- Understand how to preprocess data with feature scaling to improve your algorithms.
- Use a min-max scaler in sklearn.
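Min-max scaling maps each feature to the range [0, 1] using $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$. A minimal sketch with scikit-learn's `MinMaxScaler` follows; the three weight values are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# One feature (a column) with three hypothetical observations.
# The scaler expects a 2-D array of floats: rows are samples, columns are features.
weights = np.array([[115.0], [140.0], [175.0]])

scaler = MinMaxScaler()
rescaled = scaler.fit_transform(weights)

print(rescaled)
```

The smallest value maps to 0, the largest to 1, and the middle value to (140 - 115) / (175 - 115). Because the result depends on the observed minimum and maximum, outliers can strongly affect the scaled values.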