Big Data Analysis with Apache Spark (edX)

Start Date
No sessions available
Programming background and experience with Python required. All exercises will use PySpark (part of Apache Spark). Previous experience with Spark equivalent to CS105x: Introduction to Spark required.

MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Learn how to apply data science techniques using parallel programming in Apache Spark to explore big data. Organizations use their data to support and influence decisions and build data-intensive products and services, such as recommendation, prediction, and diagnostic systems. The collection of skills required by organizations to support these functions has been grouped under the term ‘data science’.

This statistics and data analysis course will attempt to articulate the expected output of data scientists and then teach students how to use PySpark (part of Spark) to deliver against these expectations. The course assignments include log mining, textual entity recognition, and collaborative filtering exercises that teach students how to manipulate data sets using parallel processing with PySpark.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises will use PySpark (the Python API for Spark), and previous experience with Spark equivalent to Introduction to Apache Spark is required.


What you'll learn:

- How to use Apache Spark to perform data analysis

- How to use parallel programming to explore data sets

- How to apply log mining, textual entity recognition, and collaborative filtering techniques to real-world data questions




Course Auditing
83.00 EUR