Spark

Spark, Hadoop, and Snowflake for Data Engineering (Coursera)

Apr 29th 2024
This course is primarily aimed at first- and second-year undergraduates interested in engineering or science, along with high school students and professionals with an interest in programming. Gain the skills for building efficient and scalable data pipelines.

Big Data Analysis with Scala and Spark (Scala 2 version) (Coursera)

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written [...]

Arquitecturas de Big Data (Coursera)

The Arquitecturas de Big Data course aims to help you identify the characteristics of a Big Data solution, the data associated with these solutions, the required infrastructure, and scalable processing techniques. We will develop examples using infrastructures based on Hadoop and Spark, keeping in mind the relevance of the [...]

Introduction to Big Data with Spark and Hadoop (Coursera)

Apr 29th 2024
Bernard Marr defines Big Data as the digital trace that we are generating in this digital era. In this course, you will learn about the characteristics of Big Data and its application in Big Data Analytics. You will gain an understanding of the features, benefits, limitations, and applications of [...]

Scalable Machine Learning on Big Data using Apache Spark (Coursera)

Apr 29th 2024
This course will empower you with the skills to scale data science and machine learning (ML) tasks on Big Data sets using Apache Spark. Most real-world machine learning work involves very large data sets that go beyond the CPU, memory, and storage limitations of a single computer. Apache [...]

Distributed Programming in Java (Coursera)

This course teaches learners (industry professionals and students) the fundamental concepts of Distributed Programming in the context of Java 8. Distributed programming enables developers to use multiple nodes in a data center to increase throughput and/or reduce latency of selected applications. By the end of this course, [...]

Big Data Analysis with Scala and Spark (Coursera)

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written [...]

Data Manipulation at Scale: Systems and Algorithms (Coursera)

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making: we are drowning in it. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but the programming abstractions to use them effectively. The abstractions that emerged in the last decade [...]

Machine Learning With Big Data (Coursera)

Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use [...]