Apache Spark

Spark, Hadoop, and Snowflake for Data Engineering (Coursera)

May 27th 2024
Course Auditing
This course is primarily aimed at first- and second-year undergraduates interested in engineering or science, along with high school students and professionals with an interest in programming. Gain the skills for building efficient and scalable data pipelines.

Big Data Analysis with Scala and Spark (Scala 2 version) (Coursera)

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written [...]

Apache Spark (TM) SQL for Data Analysts (Coursera)

Apache Spark is one of the most widely used technologies in big data analytics. In this course, you will learn how to leverage your existing SQL skills to start working with Spark immediately. You will also learn how to work with Delta Lake, a highly performant, open-source storage layer [...]

Scalable Machine Learning on Big Data using Apache Spark (Coursera)

May 27th 2024
Course Auditing
This course will empower you with the skills to scale data science and machine learning (ML) tasks on Big Data sets using Apache Spark. Most real-world machine learning work involves very large data sets that go beyond the CPU, memory, and storage limitations of a single computer. Apache [...]

Big Data Analysis with Scala and Spark (Coursera)

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written [...]

Machine Learning With Big Data (Coursera)

Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use [...]

Machine Learning with Apache Spark (Coursera)

May 20th 2024
Course Auditing
Explore the exciting world of machine learning with this IBM course. Start by learning ML fundamentals before unlocking the power of Apache Spark to build and deploy ML models for data engineering applications. Dive into supervised and unsupervised learning techniques and discover the revolutionary possibilities of Generative AI through [...]
45.00 EUR

Microsoft Azure Databricks for Data Engineering (Coursera)

In this course, you will learn how to harness the power of Apache Spark and powerful clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud. You will discover the capabilities of Azure Databricks and the Apache Spark notebook for processing huge files. [...]

Perform data science with Azure Databricks (Coursera)

May 20th 2024
Course Auditing
In this course, you will learn how to harness the power of Apache Spark and powerful clusters running on the Azure Databricks platform to run data science workloads in the cloud. This is the fourth course in a five-course program that prepares you to take the DP-100: Designing and [...]

Data Engineering with MS Azure Synapse Apache Spark Pools (Coursera)

May 20th 2024
Course Auditing
In this course, you will learn how to perform data engineering with Azure Synapse Apache Spark Pools, which enable you to boost the performance of big-data analytic applications through in-memory cluster computing. You will learn how to differentiate between Apache Spark, Azure Databricks, HDInsight, and SQL Pools and understand [...]