Analyzing Data with Python (edX)

Offered by IBM

In this course, you will learn how to analyze data in Python using multidimensional arrays in NumPy, manipulate DataFrames in pandas, use the SciPy library of mathematical routines, and perform machine learning using scikit-learn!

Learn how to analyze data using Python in this introductory course. You will go from understanding the basics of Python to exploring many different types of data through lecture, hands-on labs, and assignments. You will learn how to prepare data for analysis, perform simple statistical analyses, create meaningful data visualizations, predict future trends from data, and more!

What you'll learn
You will learn how to:

  • Import data sets, clean and prepare data for analysis, summarize data, and build data pipelines
  • Use pandas DataFrames, NumPy multidimensional arrays, and SciPy libraries to work with various datasets
  • Load, manipulate, analyze, and visualize datasets with pandas, an open-source library
  • Build machine-learning models and make predictions with scikit-learn, another open-source library

It includes the following parts:
Data Analysis libraries: You will learn to use pandas DataFrames, NumPy multidimensional arrays, and SciPy libraries to work with various datasets. We will introduce you to pandas, an open-source library, and we will use it to load, manipulate, analyze, and visualize cool datasets. Then we will introduce you to another open-source library, scikit-learn, and we will use some of its machine learning algorithms to build smart models and make cool predictions.

Syllabus

Module 1 - Importing Datasets

  • Learning Objectives
  • Understanding the Domain
  • Understanding the Dataset
  • Python package for data science
  • Importing and Exporting Data in Python
  • Basic Insights from Datasets

Module 2 - Cleaning and Preparing the Data

  • Identify and Handle Missing Values
  • Data Formatting
  • Data Normalization Sets
  • Binning
  • Indicator variables
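
As a rough illustration of the Module 2 topics, the sketch below walks through each step with pandas. The toy table, its column names, and the bin labels are hypothetical and are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not the dataset used in the course.
df = pd.DataFrame({
    "horsepower": [111.0, np.nan, 154.0, 102.0],
    "length": [168.8, 171.2, 176.6, 192.7],
    "fuel": ["gas", "diesel", "gas", "gas"],
})

# Missing values: replace NaN with the column mean.
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].mean())

# Data formatting: store horsepower as whole numbers.
df["horsepower"] = df["horsepower"].round().astype("int64")

# Normalization: min-max scale length into the range [0, 1].
length = df["length"]
df["length_scaled"] = (length - length.min()) / (length.max() - length.min())

# Binning: split horsepower into three equal-width bins.
df["hp_bin"] = pd.cut(df["horsepower"], bins=3, labels=["low", "medium", "high"])

# Indicator variables: one 0/1 column per fuel type.
dummies = pd.get_dummies(df["fuel"], prefix="fuel")
df = pd.concat([df, dummies], axis=1)

print(df)
```

Filling with the mean is only one way to handle missing values; dropping rows or filling with the most frequent value are common alternatives.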

Module 3 - Summarizing the Data Frame

  • Descriptive Statistics
  • Basics of Grouping
  • ANOVA
  • Correlation
  • More on Correlation
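
The Module 3 topics could look roughly like this with pandas and SciPy. The data values and column names below are made up for illustration, and the course labs may use different functions.

```python
import pandas as pd
from scipy import stats

# Hypothetical toy data; not the dataset used in the course.
df = pd.DataFrame({
    "drive": ["fwd", "fwd", "fwd", "rwd", "rwd", "rwd", "4wd", "4wd", "4wd"],
    "engine_size": [97, 108, 110, 130, 152, 164, 108, 110, 136],
    "price": [7000, 8500, 9000, 16500, 18000, 21000, 9500, 11000, 12500],
})

# Descriptive statistics: count, mean, std, min, quartiles, max.
summary = df["price"].describe()

# Grouping: average price per drive type.
mean_by_drive = df.groupby("drive")["price"].mean()

# ANOVA: do the drive-type groups differ in mean price?
groups = [g["price"].to_numpy() for _, g in df.groupby("drive")]
f_stat, p_anova = stats.f_oneway(*groups)

# Correlation: Pearson coefficient between engine size and price.
r, p_corr = stats.pearsonr(df["engine_size"], df["price"])

print(summary)
print(mean_by_drive)
print(f_stat, p_anova, r, p_corr)
```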

Module 4 - Model Development

  • Simple and Multiple Linear Regression
  • Model Evaluation Using Visualization
  • Polynomial Regression and Pipelines
  • R-squared and MSE for In-Sample Evaluation
  • Prediction and Decision Making

Module 5 - Model Evaluation

  • Model Evaluation
  • Over-fitting, Under-fitting and Model Selection
  • Ridge Regression
  • Grid Search
  • Model Refinement
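
Several of the topics in Modules 4 and 5 (polynomial regression, pipelines, ridge regression, grid search, R-squared and MSE) can be combined in one short scikit-learn sketch. The synthetic data, the parameter grid, and the variable names are assumptions made for illustration, not the course's own lab code.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data: a noisy quadratic curve (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 - X[:, 0] + 2 + rng.normal(scale=0.3, size=200)

# Hold out a test set for out-of-sample evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Pipeline: polynomial features, scaling, then ridge regression.
pipe = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("scale", StandardScaler()),
    ("ridge", Ridge()),
])

# Grid search over polynomial degree and ridge penalty with 5-fold CV.
param_grid = {
    "poly__degree": [1, 2, 3, 4],
    "ridge__alpha": [0.01, 0.1, 1.0, 10.0],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="r2")
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out data.
y_pred = search.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
test_mse = mean_squared_error(y_test, y_pred)
print(search.best_params_, test_r2, test_mse)
```

A degree that is too low under-fits the curve and a degree that is too high risks over-fitting; cross-validated grid search is one way to pick between them.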

Related Courses

Data, Analytics and Learning (edX)
University of Texas at Arlington, UTArlingtonX

An introduction to the logic and methods of analysis of data to improve teaching and learning. Capturing and analyzing data has changed how decisions are made and resources are allocated in businesses, journalism, government, and military and intelligence fields. Through better use of data, leaders are able to plan and enact strategies with greater clarity and confidence.

No sessions available
4 Weeks

Recommender Systems: Behind the Screen (edX)
Université de Montréal, UMontrealX

How are items recommended when you’re browsing for movies, jobs or clothing online? Register here and you’ll discover the fundamental concepts and methods allowing the most relevant item suggestions to users from e-commerce to online advertisement. In this course, you will explore and learn the best methods and practices in recommender systems, which are an essential component of the online ecosystem. This course was developed by IVADO and HEC Montréal as part of a workshop that took place in Montreal.

Sep 26th 2023
5-12 Weeks

Statistics for Business Analytics: Modelling and Forecasting (edX)
University of Queensland, UQx

This is a great course for anyone who wants to gain foundational and critical analysis and statistics skills with no prior background. In this course, we explore statistical methods for examining the relationships between variables. We also consider how data from the past can be used to make forecasts about likely future trends.

Apr 7th 2023
4 Weeks

Big Data Analytics Using Spark (edX)
University of California, San Diego, UC San DiegoX

Learn how to analyze large datasets using Jupyter notebooks, MapReduce and Spark as a platform. In data science, data is called “big” if it cannot fit into the memory of a single standard laptop or workstation. The analysis of big datasets requires using a cluster of tens, hundreds or thousands of computers. Effectively using such clusters requires the use of distributed file systems, such as the Hadoop Distributed File System (HDFS) and corresponding computational models, such as Hadoop, MapReduce and Spark.

Dec 5th 2023
5-12 Weeks

Foundations of Data Analysis - Part 1: Statistics Using R (edX)
University of Texas at Austin, UTAustinX

This is a hands-on course with a data lab to teach fundamental statistical topics such as descriptive statistics, inferential testing, and modeling. In this first part of a two-part course, we’ll walk through the basics of statistical thinking, starting with an interesting question. Then, we’ll learn the correct statistical tool to help answer our question of interest, using R and hands-on labs.

No sessions available
5-12 Weeks

Introduction to Statistics: Descriptive Statistics (edX)
University of California, Berkeley, BerkeleyX

An introduction to descriptive statistics, emphasizing critical thinking and clear communication. We are surrounded by information, much of it numerical, and it is important to know how to make sense of it. Stat2x is an introduction to the fundamental concepts and methods of statistics, the science of drawing conclusions from data.

No sessions available
4 Weeks

Statistical Predictive Modelling and Applications (edX)
University of Edinburgh, EdinburghX

Learn how to apply statistical modelling techniques to real-world business scenarios using Python. In this course, you will learn three predictive modelling techniques - linear and logistic regression, and naive Bayes - and their applications in real-world scenarios. The first half of the course focuses on linear regression. This technique allows you to model a continuous outcome variable using both continuous and categorical predictors. This technique enables you to predict product sales based on several customer variables.

Jan 18th 2022
5-12 Weeks

Machine Learning with Python: from Linear Models to Deep Learning (edX)
MIT, MITx

An in-depth introduction to the field of machine learning, from linear models to deep learning and reinforcement learning, through hands-on Python projects. Machine learning methods are commonly used across engineering and sciences, from computer systems to physics. Moreover, commercial sites such as search engines, recommender systems (e.g., Netflix, Amazon), advertisers, and financial institutions employ machine learning algorithms for content recommendation, predicting customer behavior, compliance, or risk.

May 27th 2024
13-24 Weeks

Knowledge Management and Big Data in Business (edX)
The Hong Kong Polytechnic University, HKPolyUx

Learn why and how knowledge management and Big Data are vital to the new business era. The business landscape is changing so rapidly that traditional management, business and computing courses do not meet the needs for the next generation of workers in the business world. Most traditional methods are of a repetitive, rule-based nature and will be gradually replaced by Artificial Intelligence.

Self-Paced