This Specialization teaches the essential skills for working with large-scale data using SQL.
Maybe you are new to SQL and you want to learn the basics. Or maybe you already have some experience using SQL to query smaller-scale data with relational databases. Either way, if you are interested in gaining the skills necessary to query big data with modern distributed SQL engines, this Specialization is for you.
Most courses that teach SQL focus on traditional relational databases, but today, more and more of the data that’s being generated is too big to be stored there, and it’s growing too quickly to be efficiently stored in commercial data warehouses. Instead, it’s increasingly stored in distributed clusters and cloud storage. These data stores are cost-efficient and infinitely scalable.
To query these huge datasets in clusters and cloud storage, you need a newer breed of SQL engine: distributed query engines, like Hive, Impala, Presto, and Drill. These are open source SQL engines capable of querying enormous datasets. This Specialization focuses on Hive and Impala, the most widely deployed of these query engines.
This Specialization is designed to provide excellent preparation for the Cloudera Certified Associate (CCA) Data Analyst certification exam. You can earn this certification credential by taking a hands-on practical exam using the same SQL engines that this Specialization teaches—Hive and Impala.
In this course, you'll learn how to manage big datasets, how to load them into clusters and cloud storage, and how to apply structure to the data so that you can run queries on it using distributed SQL engines like Apache Hive and Apache Impala. You’ll learn how to [...]
In this course, you'll get a big-picture view of using SQL for big data, starting with an overview of data, database systems, and the common querying language (SQL). Then you'll learn the characteristics of big data and SQL tools for working on big data platforms. You'll also install an [...]
In this course, you'll get an in-depth look at the SQL SELECT statement and its main clauses. The course focuses on big data SQL engines Apache Hive and Apache Impala, but most of the information is applicable to SQL with traditional RDBMs as well; the instructor explicitly addresses differences [...]