Spark

Spark, Hadoop, and Snowflake for Data Engineering (Coursera)

May 27th 2024
This is primarily aimed at first- and second-year undergraduates interested in engineering or science, along with high school students and professionals with an interest in programming. Gain the skills for building efficient and scalable data pipelines.

Big Data Analysis with Scala and Spark (Scala 2 version) (Coursera)

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written [...]

Arquitecturas de Big Data (Coursera)

The Big Data Architectures course aims to help you identify the characteristics of a Big Data solution, the data associated with these solutions, the required infrastructure, and scalable processing techniques. We will develop examples using infrastructures based on Hadoop and Spark, keeping in mind the relevance of the [...]

Scalable Machine Learning on Big Data using Apache Spark (Coursera)

May 27th 2024
This course will empower you with the skills to scale data science and machine learning (ML) tasks on Big Data sets using Apache Spark. Most real world machine learning work involves very large data sets that go beyond the CPU, memory and storage limitations of a single computer. Apache [...]

Distributed Programming in Java (Coursera)

This course teaches learners (industry professionals and students) the fundamental concepts of Distributed Programming in the context of Java 8. Distributed programming enables developers to use multiple nodes in a data center to increase throughput and/or reduce latency of selected applications. By the end of this course, [...]

Big Data Analysis with Scala and Spark (Coursera)

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written [...]

Data Manipulation at Scale: Systems and Algorithms (Coursera)

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making: we are drowning in it. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but the programming abstractions to use them effectively. The abstractions that emerged in the last decade [...]

Machine Learning With Big Data (Coursera)

Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use [...]

Data Engineering Capstone Project (Coursera)

May 20th 2024
In this course you will apply a variety of data engineering skills and techniques you have learned as part of the previous courses in the IBM Data Engineering Professional Certificate. You will assume the role of a Junior Data Engineer who has recently joined the organization and be presented [...]