NoSQL, Big Data, and Spark Fundamentals Professional Certificate

What you will learn
- The four categories of NoSQL databases, Database-as-a-Service (DaaS) offerings, and how to work with MongoDB, Cassandra, and IBM Cloudant NoSQL databases.
- The characteristics, features, benefits, limitations, and applications of the more popular Big Data processing tools, including Hadoop, HDFS, Hive, and HBase.
- How data and machine learning engineers use Spark Structured Streaming, GraphFrames, and Spark ML, including regression, classification, and clustering with the k-means algorithm, as well as ETL using Spark.
Data engineers and Big Data professionals are in high demand. NoSQL and Big Data technology skills, such as Apache Spark, are a must-have for modern-day data engineers who need to enable data-driven decision-making. This three-course Professional Certificate from IBM opens the door to data engineering and Big Data careers.
The program starts with NoSQL Database Basics, which introduces you to NoSQL fundamentals, including the four key non-relational database categories. By the end of the course, you will have hands-on skills working with MongoDB, Cassandra, and IBM Cloudant NoSQL databases.
A crucial aspect of data engineering is Big Data and Big Data analytics. When you enroll in Big Data, Hadoop, and Spark Basics, you'll discover the characteristics, features, benefits, limitations, and applications of some of the more popular Big Data processing tools. You'll explore the open-source ecosystem of Apache tools, including Apache Hadoop, Apache Hive, and Apache Spark, and discover how to use Spark to deliver reliable insights. You'll gain hands-on skills analyzing data using PySpark and Spark SQL, creating a streaming analytics application using Spark Streaming, and more.
Then enroll in Apache Spark for Data Engineering and Machine Learning to discover how data and machine learning engineers use Spark Structured Streaming, GraphFrames, regression, classification, and clustering. You'll learn how to apply the k-means clustering algorithm using Spark MLlib. ETL is at the heart of data and machine learning engineering, and you'll gain skills using Spark to perform extract, transform, and load (ETL) tasks. The course culminates with a hands-on Spark project.
This Professional Certificate does not require any prior programming or data science skills; however, basic data literacy and SQL skills will prove valuable in completing the program.

Apache Spark for Data Engineering and Machine Learning (edX)

Self Paced
Course Auditing
This short course introduces you to the fundamentals of Data Engineering and Machine Learning with Apache Spark, including Spark Structured Streaming, ETL for Machine Learning (ML) Pipelines, and Spark ML. By the end of the course, you will have hands-on experience applying Spark skills to ETL and ML [...]

Big Data, Hadoop, and Spark Basics (edX)

Self Paced
Course Auditing
This course provides foundational big data practitioner knowledge and analytical skills using popular big data tools, including Hadoop and Spark. Learn and practice your big data skills hands-on. Organizations need skilled, forward-thinking Big Data practitioners who can apply their business and technical skills to unstructured data such as tweets, [...]
Self Paced
Course Auditing
83.00 EUR