Want to learn the basics of large-scale data processing? Need to make predictive models but don’t know the right tools? This course will introduce you to open source tools you can use for parallel, distributed and scalable machine learning.
Are you interested in predicting future outcomes using your data? This course helps you do just that! Machine learning is the process of developing, testing, and applying predictive algorithms to achieve this goal. Make sure to familiarize yourself with Course 3 of this Specialization before diving into these machine learning concepts. Building on Course 3, which introduces students to core supervised machine learning concepts, this course will provide an overview of many additional concepts, techniques, and algorithms in machine learning, from basic classification to decision trees and clustering.
By completing this course, you will learn how to apply, test, and interpret machine learning algorithms as alternative methods for addressing your research questions.
Machine Learning for Data Analysis is course 4 of 5 in the Data Analysis and Interpretation Specialization.
Learn SAS or Python programming, expand your knowledge of analytical methods and applications, and conduct original research to inform complex decisions. The Data Analysis and Interpretation Specialization takes you from data novice to data expert in just four project-based courses. You will apply basic data science tools and techniques, including data management and visualization, modeling, and machine learning using your choice of either SAS or Python (including, but not limited to, the popular pandas and scikit-learn Python libraries). Throughout the Specialization, you will analyze a research question of your choice and summarize your insights. In the final Capstone Project, you will use real data to address an important issue in society, and report your findings in a professional-quality report. You will have the opportunity to work with our industry partner, DrivenData, to help them solve some of the world's biggest social challenges by joining one of their competitions. Regular feedback from peers will give you a chance to shape your question in new ways. This Specialization is designed to help you whether you are considering a career in data, work in a context where supervisors are looking to you for guidance about using data, or you just have some burning questions you want to explore. No prior experience is required, but by the end you will have mastered analytical methods and applications to conduct original research that can inform complex decisions.
Week 1: Decision Trees
Week 2: Random Forests
Week 3: Lasso Regression
Week 4: K-Means Cluster Analysis