Introduction to Apache Spark (edX)

Free Course
Programming background and experience with Python required. All exercises will use PySpark (the Python API for Apache Spark). Previous experience with Spark is NOT required.

MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Learn the fundamentals and architecture of Apache Spark, the leading cluster-computing framework among professionals. Spark is rapidly becoming the compute engine of choice for big data. Spark programs are more concise and often run 10-100 times faster than Hadoop MapReduce jobs. As companies realize this, Spark developers are becoming increasingly valued.

This statistics and data analysis course will teach you the basics of working with Spark and will provide you with the necessary foundation for diving deeper into Spark. You’ll learn about Spark’s architecture and programming model, including commonly used APIs. After completing this course, you’ll be able to write and debug basic Spark applications. This course will also explain how to use Spark’s web user interface (UI), how to recognize common coding errors, and how to proactively prevent errors. The focus of this course will be Spark Core and Spark SQL.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises will use PySpark (the Python API for Spark), but previous experience with Spark or distributed computing is NOT required. Before the course, students should take the Python mini-quiz and, if they need to learn Python or refresh their knowledge, the Python mini-course.

What you'll learn:

- Basic Spark architecture

- Common operations

- How to avoid coding mistakes

- How to debug your Spark program


