EdX

Introduction to Apache Spark (edX)

Offered by University of California, Berkeley

Learn the fundamentals and architecture of Apache Spark, the leading cluster-computing framework among professionals. Spark is rapidly becoming the compute engine of choice for big data. Spark programs are more concise and often run 10-100 times faster than Hadoop MapReduce jobs. As companies realize this, Spark developers are becoming increasingly valued.

This statistics and data analysis course will teach you the basics of working with Spark and will provide you with the necessary foundation for diving deeper into Spark. You’ll learn about Spark’s architecture and programming model, including commonly used APIs. After completing this course, you’ll be able to write and debug basic Spark applications. This course will also explain how to use Spark’s web user interface (UI), how to recognize common coding errors, and how to proactively prevent errors. The focus of this course will be Spark Core and Spark SQL.

This course covers advanced undergraduate-level material. It requires a programming background and experience with Python (or the ability to learn it quickly). All exercises will use PySpark (the Python API for Spark), but previous experience with Spark or distributed computing is NOT required. Students should take the Python mini-quiz before the course, and the Python mini-course if they need to learn Python or refresh their Python knowledge.

What you'll learn:

Basic Spark architecture
Common operations
How to avoid coding mistakes
How to debug your Spark program

Related Courses

EdX

MIT, MITx

Building Mobile Experiences (edX)

Management & Leadership, CS: Software Engineering

A project-based course that guides students through creating a novel mobile application - from generative research to design, usability, implementation and field evaluation.

No sessions available

5-12 Weeks

Programming, Java, Android

EdX

University of California, Berkeley, BerkeleyX

The Beauty and Joy of Computing - AP® CS Principles Part 2 (edX)

CS: Software Engineering, Computer Science

A computer science principles course for anyone who wants to learn how to translate ideas into code. Discover the big ideas and thinking practices in computer science plus learn how to code using one of the friendliest programming languages, Snap! (based on Scratch).

No sessions available

13-24 Weeks

Programming, Artificial Intelligence, Computing

EdX

University of Pennsylvania, PennX

Big Data and Education (edX)

Education, Statistics & Data Analysis

Learn the methods and strategies for using large-scale educational data to improve education and make discoveries about learning. Online and software-based learning tools have been used increasingly in education. This movement has resulted in an explosion of data, which can now be used to improve educational effectiveness and support basic research on learning.

Self-Paced

Education, Big Data, Data Mining

EdX

StanfordOnline

Computer Science 101 (edX)

CS: Software Engineering, Engineering

Introduction to computer science for a zero-prior-experience audience. Play with little phrases of code to understand what computers are all about. CS101 is a self-paced course that teaches the essential ideas of computer science.

Self-Paced

Programming, Computer Science, Coding

EdX

World Wide Web Consortium - W3C, W3Cx

HTML5 Coding Essentials and Best Practices (edX)

CS: Software Engineering, CS: Programming

Learn how to write Web pages and Web sites by mastering HTML5 coding techniques and best practices. HTML5 is the standard language of the Web, developed by W3C. For application developers and industry, HTML5 represents a set of features that people will be able to rely on for years to come. HTML5 is supported on a wide variety of devices, lowering the cost of creating rich applications to reach users everywhere.

Self-Paced

Programming, HTML5, Coding

EdX

Harvey Mudd College, HarveyMuddX

Programming in Scratch (edX)

CS: Software Engineering, CS: Programming

See how easy learning computer science can be. Use Scratch to create games, animations, stories and more. Want to learn computer programming, but unsure where to begin? This is the course for you! Scratch is the computer programming language that makes it easy and fun to create interactive stories, games and animations and share them online.

No sessions available

5-12 Weeks

Programming, Computer Science, Games

EdX

Trinity College, Hartford

Mobile Computing with App Inventor - CS Principles (edX)

CS: Software Engineering

This course introduces basic principles of computer science by designing and building mobile apps in App Inventor for Android. Learn to use the open development tool, App Inventor, to program on Android devices. You will learn how to design and build mobile apps -- apps that are aware of their location, send and receive text messages, and give advice and directions. The only limit on the types of apps you will learn to build is your own imagination!

No sessions available

5-12 Weeks

Programming, Android, Mobile Applications

EdX

Harvey Mudd College

MyCS: Computer Science for Beginners (edX)

CS: Software Engineering

In this fun and creative introduction to computer science for learners of all ages, you'll learn and apply concepts by programming in Scratch. How do computers work? What do computer scientists do? What does it take to make a computer or a computer program work? We answer these questions and more with MyCS: Computer Science for Beginners.

No sessions available

5-12 Weeks

Programming, Computer Science, Scratch

EdX

Cornell University

Wiretaps to Big Data: Privacy and Surveillance in the Age of Interconnection (edX)

Social Sciences

Explore the privacy issues of an interconnected world. How does cellular technology enable massive surveillance? Do users have rights against surveillance? How does surveillance affect how we use cellular and other technologies? How does it affect our democratic institutions? Do you know that the metadata collected by a cellular network speaks volumes about its users? In this course you will explore all of these questions while investigating related issues in WiFi and Internet surveillance.

No sessions available

5-12 Weeks

Cryptography, Technology, Big Data

EdX

World Wide Web Consortium - W3C, W3Cx

HTML5 Apps and Games (edX)

CS: Software Engineering, CS: Programming

Today, developers are increasingly moving from native to HTML5-based apps. Increase your ability to design and deliver innovative services on the Web! Want to learn advanced HTML5 tips and techniques? This is the course for you! Find out more about the powerful Web features that will help you create great content and apps.

Self-Paced

Programming, HTML5, Applications Development

EdX

Harvey Mudd College, HarveyMuddX

CS For All: Introduction to Computer Science and Python Programming (edX)

CS: Software Engineering

A fun, fast-paced introduction to solving interesting problems with computer science through Python programming. Looking to get started with computer science while learning to program in Python? This computer science course provides an introduction to computer science that’s both challenging and fun.

No sessions available

13-24 Weeks

Programming, Python, Computer Science

EdX

University of California, San Diego, UC San DiegoX

Big Data Analytics Using Spark (edX)

Statistics & Data Analysis, Data Science

Learn how to analyze large datasets using Jupyter notebooks, MapReduce and Spark as a platform. In data science, data is called “big” if it cannot fit into the memory of a single standard laptop or workstation. The analysis of big datasets requires using a cluster of tens, hundreds or thousands of computers. Effectively using such clusters requires the use of distributed file systems, such as the Hadoop Distributed File System (HDFS), and corresponding computational models, such as Hadoop, MapReduce and Spark.

Dec 5th 2023

5-12 Weeks

Machine Learning, Big Data, Hadoop