PySpark

Introduction to PySpark (Coursera)

Sep 16th 2024
Welcome to Introduction to PySpark, a short course designed to give you the skills to grasp the concepts of Big Data management and efficiently perform data analysis using PySpark. Throughout this short course, you will acquire the expertise to perform data processing with PySpark, enabling you [...]
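
The kind of data processing mentioned here can be illustrated with a minimal PySpark sketch. It assumes a local Spark session and a hypothetical sales.csv file with "region" and "amount" columns; it is not taken from the course materials.

    # Minimal PySpark data-processing sketch; sales.csv and its columns are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("intro-pyspark").getOrCreate()

    # Read a CSV file into a DataFrame, letting Spark infer column types.
    df = spark.read.csv("sales.csv", header=True, inferSchema=True)

    # A typical processing step: filter rows, then aggregate by a key column.
    summary = (
        df.filter(F.col("amount") > 0)
          .groupBy("region")
          .agg(F.sum("amount").alias("total_amount"))
    )
    summary.show()

    spark.stop()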

Spark, Hadoop, and Snowflake for Data Engineering (edX)

Self Paced
Gain the skills to build efficient and scalable data pipelines. Explore essential data engineering platforms (Hadoop, Spark, and Snowflake) and learn how to optimize them using Python, PySpark, and MLflow.
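
As a rough illustration of how PySpark and MLflow can appear together in a pipeline step, here is a hedged sketch; the paths, the event_id column, and the run name are hypothetical and not part of the course.

    # Sketch: instrument a PySpark pipeline step with MLflow tracking.
    import time
    import mlflow
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pipeline-step").getOrCreate()

    with mlflow.start_run(run_name="daily_events_dedup"):
        start = time.time()

        events = spark.read.parquet("events/")           # hypothetical input path
        deduped = events.dropDuplicates(["event_id"])    # hypothetical key column
        deduped.write.mode("overwrite").parquet("events_clean/")

        # Record basic run parameters and metrics for later comparison.
        mlflow.log_param("input_path", "events/")
        mlflow.log_metric("rows_out", deduped.count())
        mlflow.log_metric("duration_s", time.time() - start)

    spark.stop()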

Python and Pandas for Data Engineering (edX)

Self Paced
Master Python essentials and Pandas for data engineering. Learn to set up development environments, manipulate data, and efficiently solve real-world problems.
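
A minimal Pandas sketch of this kind of data manipulation follows; the orders.csv file and its columns are hypothetical.

    # Load, clean, and summarise tabular data with Pandas.
    import pandas as pd

    # Read a CSV and parse the order_date column as datetimes.
    orders = pd.read_csv("orders.csv", parse_dates=["order_date"])

    # Drop rows with missing prices, derive a revenue column,
    # and total revenue per month.
    orders = orders.dropna(subset=["price"])
    orders["revenue"] = orders["price"] * orders["quantity"]
    monthly = orders.groupby(orders["order_date"].dt.to_period("M"))["revenue"].sum()
    print(monthly)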

Big Data Analytics Using Spark (edX)

Learn how to analyze large datasets using Jupyter notebooks, MapReduce and Spark as a platform. In data science, data is called “big” if it cannot fit into the memory of a single standard laptop or workstation. The analysis of big datasets requires using a cluster of tens, hundreds or [...]
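
To make the idea concrete, here is a hedged sketch of an analysis that does not require the data to fit in one machine's memory: a distributed word count with PySpark over a hypothetical directory of text files. The same code runs unchanged on a cluster.

    # Distributed word count; the logs/ directory is hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("big-data-analytics").getOrCreate()

    lines = spark.read.text("logs/")   # one row per line, in a single "value" column

    word_counts = (
        lines.select(F.explode(F.split(F.lower(F.col("value")), r"\s+")).alias("word"))
             .where(F.col("word") != "")
             .groupBy("word")
             .count()
             .orderBy(F.col("count").desc())
    )
    word_counts.show(20)

    spark.stop()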

Spark (Udacity)

Self Paced
Master working with big data and building machine learning models at scale using Spark! In this course, you'll learn how to wrangle and model massive datasets with PySpark, the Python [...]
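
As a rough sketch of modeling at scale with PySpark's ML library, the example below fits a logistic regression inside an ML Pipeline; the Parquet file and column names are hypothetical, not taken from the course.

    # Fit a simple classification pipeline with PySpark MLlib.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression

    spark = SparkSession.builder.appName("spark-ml").getOrCreate()

    # Hypothetical dataset with numeric feature columns f1..f3 and a "label" column.
    data = spark.read.parquet("features.parquet")

    assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
    lr = LogisticRegression(featuresCol="features", labelCol="label")
    pipeline = Pipeline(stages=[assembler, lr])

    train, test = data.randomSplit([0.8, 0.2], seed=42)
    model = pipeline.fit(train)
    model.transform(test).select("label", "prediction").show(5)

    spark.stop()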

Introduction to Apache Spark (edX)

Learn the fundamentals and architecture of Apache Spark, the leading cluster-computing framework among professionals. Spark is rapidly becoming the compute engine of choice for big data. Spark programs are more concise and often run 10-100 times faster than Hadoop MapReduce jobs. As companies realize this, Spark developers [...]