EdX

Big Data Computing with Spark (edX)

Offered by The Hong Kong University of Science and Technology - HKUST, HKUSTx

Learn the theory and gain hands-on experience of big data systems, using Spark as the exemplary platform. Big data systems such as Hadoop and Spark have emerged as enabling technologies for managing massive amounts of data across hundreds or even thousands of computing nodes. Meanwhile, cloud computing platforms have made these technologies easily accessible to individuals as well as large enterprises.

This course exposes students to both the theory and hands-on experience of big data systems, using Spark as the exemplary platform.

What you'll learn

Spark programming using both RDD and DataFrame APIs
Useful packages including ML, GraphX/GraphFrames, and Spark Streaming
Spark internals and performance optimizations
Algorithm design for big data systems

Syllabus

Week 1: Overview, MapReduce, and Hadoop
Weeks 2-3: Spark Basics and RDD
Week 4: Spark SQL and MLlib
Week 5: Spark internals
Week 6: Algorithm design for big data
Week 7: GraphX/GraphFrames
Week 8: Spark Streaming

Related Courses

EdX

University of Adelaide, AdelaideX

Big Data Analytics (edX)

Statistics & Data Analysis, Data Science

Learn key technologies and techniques, including R and Apache Spark, to analyse large-scale data sets to uncover valuable business information. Gain essential skills in today’s digital age to store, process and analyse data to inform business decisions.

Self-Paced

Big Data, R Language, Statistical Analysis

EdX

Universidad Anáhuac, AnahuacX

Minería de Datos: Segmentación de Mercados (edX)

Computer Science

Do you really know your customers? In this course you will learn to build models based on data mining techniques that let you discover your customers' purchasing habits, so that you can define marketing strategies suited to their profiles, make sound decisions, and gain a competitive advantage through data mining tools.

Self-Paced

Marketing, Big Data, Data Mining

EdX

University of Adelaide, AdelaideX

Programming for Data Science (edX)

CS: Programming, Data Science

Learn how to apply fundamental programming concepts, computational thinking, and data analysis techniques to solve real-world data science problems. There is a rising demand for people with the skills to work with Big Data sets, and this course can start you on your journey through our Big Data MicroMasters program towards a recognised credential in this highly competitive area. Through practical activities you will learn how digital technologies work and develop your coding skills in engaging and collaborative assignments.

Self-Paced

Programming, Big Data, Data Analysis

EdX

Tokyo Institute of Technology, TokyoTechX

Introduction to Computer Science and Programming (edX)

CS: Programming

The term “computation” refers to the action performed by a computer. A computation can be a basic operation, or it can be a sophisticated computer simulation requiring a large amount of data and substantial resources. This course introduces learners with no prior knowledge to the basics and key concepts of computer science. By following the lectures and exercises of this course, you will gain an understanding of algorithms and hands-on programming experience with the Ruby language.

Self-Paced

Programming, Artificial Intelligence, Big Data

EdX

The Hong Kong Polytechnic University, HKPolyUx

Industry 4.0: How to Revolutionize your Business (edX)

Engineering, Business

An introduction to the fourth industrial revolution, its major systems and technologies, and how new products and services will impact business and society. We witnessed the power of mechanization in the early nineteenth century, automation in the seventies, and information and the internet in recent decades. Now the adaptation of connected intelligence into the business and social fabric is advancing at an astonishing speed, and it will completely change the way we conduct business.

Self-Paced

Digital Technology, Big Data

EdX

Universidad Carlos III de Madrid - UC3M, UC3Mx

Introduction to Management Information Systems (MIS): A Survival Guide (edX)

Management & Leadership

Gain the skills and knowledge needed to succeed in an MIS-dominated corporate world. This MIS course covers the supporting tech infrastructures (Cloud, Databases, Big Data), the MIS development/procurement process, and the main integrated systems (ERPs) such as SAP®, Oracle® or Microsoft Dynamics Navision®, as well as their relationship with Business Process Redesign.

Self-Paced

Cloud, Databases, Big Data

EdX

Rochester Institute of Technology, RITx

Data Analytics and Visualization in Health Care (edX)

Health & Society, Statistics & Data Analysis

Learn best practices in data analytics, informatics, and visualization to gain literacy in data-driven, strategic imperatives that affect all facets of health care. Big data is transforming the health care industry relative to improving quality of care and reducing costs—key objectives for most organizations. Employers are desperately searching for professionals who have the ability to extract, analyze, and interpret data from patient health records, insurance claims, financial records, and more to tell a compelling and actionable story using health care data analytics.

Self-Paced

Healthcare, Artificial Intelligence, Informatics

EdX

Georgetown University, GeorgetownX

Demystifying Biomedical Big Data: A User’s Guide (edX)

Sci: Biology & Life Sciences, Medicine & Pharmacology

Whether you are a student, basic scientist, researcher, clinician, or librarian, this course is designed to help you understand, analyze, and interpret biomedical big data.

No session available

5-12 Weeks

Big Data, Biomedical, Systems Biology

EdX

AI (Pragmatic AI Labs)

Spark, Hadoop, and Snowflake for Data Engineering (edX)

Computer Science

Gain the skills for building efficient and scalable data pipelines. Explore essential data engineering platforms (Hadoop, Spark, and Snowflake) and learn how to optimize them using Python, PySpark, and MLflow.

Self-Paced

Python, Hadoop, Spark

EdX

University of Texas Medical Branch

Biostatistics for Big Data Applications (edX)

Statistics & Data Analysis

Learn data analysis basics for working with biomedical big data with practical hands-on examples using R. This course provides a broad foundation of statistical terms and concepts as well as an introduction to the R statistical software package. The topics covered are fundamental components of biostatistical methods used in both omics and population health research.

No sessions available

5-12 Weeks

Big Data, Biostatistics, Biomedical

EdX

Arizona State University, ASUx

Leading Digital and Data Decision Making (edX)

Business

In this course, you will learn how leaders make managerial and relevant decisions based on data across multiple global industries. You will also explore how companies benefit from a digital ecosystem including sensors (IoT), Blockchain, artificial intelligence (AI), and augmented reality (AR) that move data-driven insights from the data scientist to the boardroom.

This course is archived

5-12 Weeks

Data, Artificial Intelligence, Digital

EdX

The Hong Kong University of Science and Technology - HKUST, HKUSTx

Foundations of Data Analytics (edX)

Statistics & Data Analysis, Computer Science

Learn the fundamental techniques of data analytics and prepare to learn and apply more advanced big data technologies. This course covers data collection, data extraction, data integration, data cleansing, and basic machine learning techniques.

Self-Paced

Machine Learning, Big Data, Data Privacy