Coursera

Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames (Coursera)

Offered by Yandex.

No doubt working with huge data volumes is hard, but to move a mountain, you have to deal with a lot of small stones. But why strain yourself? MapReduce and Spark tackle only part of the problem, leaving room for higher-level tools. Stop struggling to make your big data workflow productive and efficient, and make use of the tools we are offering you.

This course will teach you how to:

Warehouse your data efficiently using Hive, Spark SQL and Spark DataFrames.
Work with large graphs, such as social graphs or networks.
Optimize your Spark applications for maximum performance.

Specifically, you will build skills in:

Writing and executing Hive & Spark SQL queries;
Reasoning about how queries are translated into actual execution primitives (be it MapReduce jobs or Spark transformations);
Organizing your data in Hive to optimize disk space usage and execution times;
Constructing Spark DataFrames and using them to write ad-hoc analytical jobs easily;
Processing large graphs with Spark GraphFrames;
Debugging, profiling and optimizing Spark application performance.

Still in doubt? Become a data ninja by taking this course!

Syllabus

WEEK 1: Welcome to the Second Course: Big Data Analysis; Big Data SQL: Hive
WEEK 2: Big Data SQL: Hive (practice week)
WEEK 3: Spark SQL and Spark Dataframe
WEEK 4: Graph Analysis from Big Data Perspective
WEEK 5: PageRank and Recent Advances
WEEK 6: Spark Internals and Optimization

Related Courses

Coursera

Google Cloud

Analyzing and Visualizing Data the Google Way (Coursera)

CS: Information & Technology

This learning experience guides you through the process of utilizing various data sources and multiple Google Cloud products (including BigQuery and Google Sheets using Connected Sheets) to analyze, visualize, and interpret data to answer specific questions and share insights with key decision makers.

Aug 17th 2026

1 Week

Data Analysis Data Visualization Google Sheets

Coursera

Johns Hopkins University

Python for Genomic Data Science (Coursera)

Statistics & Data Analysis Data Science

This class provides an introduction to the Python programming language and the IPython notebook. This is the third course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Aug 17th 2026

4 Weeks

Programming Python Big Data

Coursera

Indian Institute of Management Ahmedabad (IIMA)

Pre-MBA Statistics (Coursera)

Statistics & Data Analysis

Welcome to the Pre-MBA Statistics course! By the end of this course, you will be able to describe how statistics can be used to summarize, analyze, and interpret data. This course introduces you to some aspects of descriptive and inferential statistics. You will learn to distinguish between various data types and describe the operations that you can execute with each type of data and the right tools to use.

Aug 17th 2026

5-12 Weeks

Statistics Probability Data Analysis

Coursera

Knowledge Accelerators

Data-Driven Decisions with Power BI (Coursera)

Business

New Power BI users will begin the course by gaining a conceptual understanding of the Power BI desktop application and the Power BI service. Learners will explore the Power BI interface while learning how to manage pages and understand the basics of visualizations. Learners will engage in numerous hands-on experiences to discover how to import, connect, clean, transform, and model their own data in the Power BI desktop application.

Aug 17th 2026

5-12 Weeks

Data Modeling Data Analysis Power BI

Coursera

Johns Hopkins University

Mathematical Biostatistics Boot Camp 2 (Coursera)

Statistics & Data Analysis Data Science

Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

Aug 17th 2026

4 Weeks

Math Statistics Probability

Coursera

Coursera Instructor Network

Advanced Data Analysis and Collaboration in Qlik Sense (Coursera)

Statistics & Data Analysis Data Science

This is an advanced-level course designed for learners who want to use Qlik Sense to perform sophisticated data analytics, build dashboards, and communicate full reports and stories from their data. The advanced concepts go beyond visualization features such as dynamic filtering and conditional formatting to cover data functionality such as advanced expressions, drill-downs, leads, lags, and more. These skills matter because they are directly required when creating sophisticated business analyses and dashboards.

Aug 17th 2026

1 Week

Data Analysis Collaboration Visualization

Coursera

University of Illinois at Urbana-Champaign

Infonomics I: Business Information Economics and Data Monetization (Coursera)

Business

Thriving in the Information Age compels organizations to deploy information as an actual business asset, not as an IT asset or merely as a business byproduct. This demands creativity in conceiving and implementing new ways to generate economic benefits from the wide array of information assets available to an organization. Unfortunately, information too frequently is underappreciated and therefore underutilized.

Aug 17th 2026

4 Weeks

Business Economics Big Data

Coursera

University of Illinois at Urbana-Champaign

Infonomics II: Business Information Management and Measurement (Coursera)

Business

Even decades into the Information Age, accounting practices still fail to recognize the financial value of information. Moreover, traditional asset management practices fail to recognize information as an asset to be managed with earnest discipline. This has led to a business culture of complacency and to the inability of most organizations to fully leverage available information assets. This second course in the two-part Infonomics series explores how and why to adapt well-honed asset management principles and practices to information, and how to apply accepted and new valuation models to gauge information’s potential and realized economic benefits.

Aug 17th 2026

4 Weeks

Business Big Data Accounting

Coursera

Johns Hopkins University

Introduction to Reproducibility in Cancer Informatics (Coursera)

Statistics & Data Analysis Data Science

The course is intended for students in the biomedical sciences and researchers who use informatics tools in their research and have not had training in reproducibility tools and methods.

Aug 17th 2026

5-12 Weeks

Informatics GitHub Data Analysis

Coursera

Universidad Austral

Fundamentos de Excel para Negocios (Coursera)

Statistics & Data Analysis Data Science

By the end of this course, you will have gained a wide range of skills, such as entering, sorting, and manipulating information; performing calculations of various kinds (mathematical, trigonometric, statistical, financial, engineering, probabilistic); drawing conclusions; working with dates and times; building charts; printing reports; and much more.

Aug 17th 2026

5-12 Weeks

Business Excel Data Analysis

Coursera

University of Michigan

Applied Machine Learning in Python (Coursera)

Statistics & Data Analysis Data Science

This course will introduce the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind these methods. The course will start with a discussion of how machine learning is different from descriptive statistics, and introduce the scikit-learn toolkit through a tutorial.

Aug 17th 2026

4 Weeks

Python ML Machine Learning

Coursera

Erasmus University Rotterdam

Econometrics: Methods and Applications (Coursera)

Statistics & Data Analysis Data Science

Do you wish to know how to analyze and solve business and economic questions with data analysis tools? Then Econometrics by Erasmus University Rotterdam is the right course for you, as you learn how to translate data into models to make forecasts and to support decision making.

Aug 10th 2026

5-12 Weeks

Econometrics Data Analysis Linear Regression