Apache Spark (TM) SQL for Data Analysts (Coursera)

Offered by Databricks

Apache Spark is one of the most widely used technologies in big data analytics. In this course, you will learn how to leverage your existing SQL skills to start working with Spark immediately. You will also learn how to work with Delta Lake, a highly performant, open-source storage layer that brings reliability to data lakes. By the end of this course, you will be able to use Spark SQL and Delta Lake to ingest, transform, and query data to extract valuable insights that can be shared with your team.

Course 1 of 3 in the Data Science with Databricks for Data Analysts Specialization.

Syllabus

WEEK 1: Welcome to Apache Spark SQL for Data Analysts
WEEK 2: Spark makes big data easy
WEEK 3: Using Spark SQL on Databricks
WEEK 4: Spark Under the Hood
WEEK 5: Complex Queries
WEEK 6: Applied Spark SQL
WEEK 7: Data Storage and Optimization
WEEK 8: Delta Lake with Spark SQL
WEEK 9: SQL Coding Challenges

Related Courses

Introduction to Machine Learning (Coursera)
Duke University

This course will provide you a foundational understanding of machine learning models (logistic regression, multilayer perceptrons, convolutional neural networks, natural language processing, etc.) as well as demonstrate how these models can solve complex problems in a variety of industries, from medical diagnostics to image recognition to text prediction.

Jun 5th 2026
5-12 Weeks

Python Project for Data Science (Coursera)
IBM

This mini-course is intended for you to demonstrate foundational Python skills for working with data. Completing the course involves a hands-on project in which you develop a simple dashboard using Python. This course is part of the IBM Data Science Professional Certificate and the IBM Data Analytics Professional Certificate.

Jun 4th 2026
1 Week

Genomic Data Science and Clustering (Bioinformatics V) (Coursera)
University of California, San Diego

How do we infer which genes orchestrate various processes in the cell? How did humans migrate out of Africa and spread around the world? In this class, we will see that these two seemingly different questions can be addressed using similar algorithmic and machine learning techniques arising from the general problem of dividing data points into distinct clusters.

Jun 1st 2026
3 Weeks

Fundamentals of GIS (Coursera)
University of California, Davis

Explore the world of spatial analysis and cartography with geographic information systems (GIS). What you will learn: define core geospatial concepts; practice with subset data using selections and feature attributes; create map books using advanced mapping techniques; create layer and map packages.

Jun 1st 2026
4 Weeks

Teaching Impacts of Technology: Workplace of the Future (Coursera)
University of California, San Diego

In this course you’ll focus on how the Internet has enabled new careers and changed expectations in traditional work settings, creating a new vision for the workplace of the future. This will be done through a series of paired teaching sections, exploring a specific “Impact of Computing” in your typical day and the “Technologies and Computing Concepts” that enable that impact, all at a K-12-appropriate level.

Jun 3rd 2026
4 Weeks

Ask Questions to Make Data-Driven Decisions (Coursera)
Google

This is the second course in the Google Data Analytics Certificate. These courses will equip you with the skills needed to apply to introductory-level data analyst jobs. You’ll build on your understanding of the topics that were introduced in the first Google Data Analytics Certificate course. The material will help you learn how to ask effective questions to make data-driven decisions, while connecting with stakeholders’ needs. Current Google data analysts will continue to instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Jun 2nd 2026
4 Weeks

Accounting Analytics (Coursera)
University of Pennsylvania

Accounting Analytics explores how financial statement data and non-financial metrics can be linked to financial performance. In this course, taught by Wharton’s acclaimed accounting professors, you’ll learn how data is used to assess what drives financial performance and to forecast future financial scenarios. While many accounting and financial organizations deliver data, accounting analytics deploys that data to deliver insight, and this course will explore the many areas in which accounting data provides insight into other business areas including consumer behavior predictions, corporate strategy, risk management, optimization, and more.

Jun 1st 2026
4 Weeks

Data Visualization and Communication with Tableau (Coursera)
Duke University

One of the skills that characterizes great business data analysts is the ability to communicate practical implications of quantitative analyses to any kind of audience member. Even the most sophisticated statistical analyses are not useful to a business if they do not lead to actionable advice, or if the answers to those business questions are not conveyed in a way that non-technical people can understand. In this course you will learn how to become a master at communicating business-relevant implications of data analyses.

Jun 1st 2026
5-12 Weeks

Data Science Companion (Coursera)
MathWorks

The Data Science Companion provides an introduction to data science. You will gain a quick background in data science and core machine learning concepts, such as regression and classification. You’ll be introduced to the practical knowledge of data processing and visualization using low-code solutions, as well as an overview of the ways to integrate multiple tools effectively to solve data science problems.

Jun 5th 2026
4 Weeks

Foundations: Data, Data, Everywhere (Coursera)
Google

This is the first course in the Google Data Analytics Certificate. These courses will equip you with the skills you need to apply to introductory-level data analyst jobs. Organizations of all kinds need data analysts to help them improve their processes, identify opportunities and trends, launch new products, and make thoughtful decisions. In this course, you’ll be introduced to the world of data analytics through hands-on curriculum developed by Google. The material shared covers plenty of key data analytics topics, and it’s designed to give you an overview of what’s to come in the Google Data Analytics Certificate. Current Google data analysts will instruct and provide you with hands-on ways to accomplish common data analyst tasks with the best tools and resources.

Jun 2nd 2026
5-12 Weeks

Big Data Science with the BD2K-LINCS Data Coordination and Integration Center (Coursera)
Icahn School of Medicine at Mount Sinai

In this course we briefly introduce the DCIC and the various Centers that collect data for LINCS. We then cover metadata and how metadata is linked to ontologies, and present data processing and normalization methods to clean and harmonize LINCS data. This is followed by discussions of how data is served through RESTful APIs. Most importantly, the course covers computational methods including data clustering, gene-set enrichment analysis, interactive data visualization, and supervised learning. Finally, we introduce crowdsourcing/citizen-science projects where students can work together in teams to extract expression signatures from public databases and then query such collections of signatures against LINCS data for predicting small molecules as potential therapeutics.

Jun 1st 2026
5-12 Weeks