Coursera

Text Retrieval and Search Engines (Coursera)

Offered by University of Illinois at Urbana-Champaign.

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.


This course will cover search engine technologies, which play an important role in any data mining application involving text data for two reasons. First, while the raw data may be large for any particular problem, often only a relatively small subset of the data is relevant, and a search engine is an essential tool for quickly discovering that small subset in a large text collection. Second, search engines help analysts interpret any patterns discovered in the data by allowing them to examine the relevant original text and make sense of each discovered pattern. You will learn the basic concepts, principles, and major techniques in text retrieval, which is the underlying science of search engines.

Course 2 of 6 in the Data Mining Specialization

Syllabus

WEEK 1
Orientation
You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.
During this week's lessons, you will learn about natural language processing techniques, which are the foundation for all kinds of text-processing applications, the concept of a retrieval model, and the basic idea of the vector space model.

WEEK 2
In this week's lessons, you will learn how the vector space model works in detail, the major heuristics used in designing a retrieval function for ranking documents with respect to a query, and how to implement an information retrieval system (i.e., a search engine), including how to build an inverted index and how to score documents quickly for a query.
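The two implementation ideas named above, building an inverted index and scoring documents quickly for a query, can be illustrated with a minimal sketch. The function names, the whitespace tokenizer, the toy documents, and the particular TF-IDF weighting are assumptions chosen for illustration; they are not taken from the course materials.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to a postings dict {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Assumption: lowercase whitespace tokenization is enough for the demo.
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def score(query, index, num_docs):
    """Rank documents by a simple TF-IDF sum, visiting only query-term postings."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue
        # Rarer terms get a larger weight (one of several possible IDF forms).
        idf = math.log((num_docs + 1) / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy collection.
docs = {
    "d1": "news about text retrieval",
    "d2": "text mining and text analysis",
    "d3": "sports news",
}
index = build_inverted_index(docs)
ranking = score("text retrieval", index, len(docs))
```

Because scoring walks only the postings of the query terms, documents that share no term with the query (here `d3`) are never touched, which is what makes an inverted index fast.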

WEEK 3
In this week's lessons, you will learn how to evaluate an information retrieval system (a search engine). Topics include the basic measures for evaluating a set of retrieved results, the major measures for evaluating a ranked list, such as average precision (AP) and normalized discounted cumulative gain (nDCG), and practical issues in evaluation, including statistical significance testing and pooling.
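As a rough illustration of the two ranked-list measures named above, the sketch below computes AP from binary relevance flags and nDCG from graded gains. It assumes the common log2 rank discount and normalizes by the ideal ordering of the same result list; the function names are hypothetical, and the course may define the measures with slightly different conventions.

```python
import math


def average_precision(ranked_rel, total_relevant):
    """AP for a ranked list of 0/1 relevance flags.

    total_relevant is the number of relevant documents in the collection,
    so relevant documents that were never retrieved still lower the score.
    """
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(ranked_rel, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / total_relevant if total_relevant else 0.0


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering of the same gains."""
    ideal_dcg = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


ap = average_precision([1, 0, 1, 0], total_relevant=2)
perfect = ndcg([3, 3, 2, 0])
imperfect = ndcg([3, 2, 3, 0])
```

Here `ap` averages the precision at ranks 1 and 3, and `imperfect` falls below `perfect` because a highly relevant document is ranked third instead of second.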

WEEK 4
In this week's lessons, you will learn probabilistic retrieval models and statistical language models, particularly the details of the query likelihood retrieval function with two specific smoothing methods, and how the query likelihood retrieval function is connected with the retrieval heuristics used in the vector space model.
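The query likelihood idea can be sketched as follows. The listing does not name the two smoothing methods, so this example assumes Jelinek-Mercer (linear interpolation) smoothing as one widely used choice; the function name, the parameter `lam`, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood_jm(query, doc, collection, lam=0.5):
    """Return log p(query | doc) under a smoothed unigram language model.

    Assumed smoothing (Jelinek-Mercer):
        p(w | d) = (1 - lam) * c(w, d) / |d| + lam * p(w | C)
    where p(w | C) is the relative frequency of w in the whole collection.
    """
    doc_tf, coll_tf = Counter(doc), Counter(collection)
    doc_len, coll_len = len(doc), len(collection)
    log_likelihood = 0.0
    for w in query:
        p_coll = coll_tf[w] / coll_len
        p = (1 - lam) * doc_tf[w] / doc_len + lam * p_coll
        if p == 0.0:
            return float("-inf")  # term never occurs anywhere in the collection
        log_likelihood += math.log(p)
    return log_likelihood


# Hypothetical toy documents, already tokenized.
d1 = "text retrieval models".split()
d2 = "web search engines".split()
collection = d1 + d2
query = ["text", "retrieval"]
s1 = query_log_likelihood_jm(query, d1, collection)
s2 = query_log_likelihood_jm(query, d2, collection)
```

Without smoothing, `d2` would receive probability zero for any query term it does not contain. Mixing in the collection model keeps the score finite, and terms that are rare in the collection end up discriminating more strongly between documents, which is one way to see the link to IDF-style heuristics in the vector space model.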

WEEK 5
In this week's lessons, you will learn feedback techniques in information retrieval, including the Rocchio feedback method for the vector space model, and a mixture model for feedback with language models. You will also learn how web search engines work, including web crawling, web indexing, and how links between web pages can be leveraged to score web pages.
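Rocchio feedback, mentioned above, moves the query vector toward the centroid of relevant documents and away from the centroid of non-relevant ones. The sketch below assumes sparse vectors stored as term-to-weight dicts and uses illustrative values for `alpha`, `beta`, and `gamma`; dropping negative weights is a common practical choice rather than something stated in the listing.

```python
def rocchio(query_vec, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated query vector (dict of term -> weight).

    new_q = alpha * q + beta * centroid(relevant) - gamma * centroid(non_relevant)
    """
    terms = set(query_vec)
    for doc_vec in relevant + non_relevant:
        terms.update(doc_vec)

    def centroid(doc_vecs, term):
        if not doc_vecs:
            return 0.0
        return sum(d.get(term, 0.0) for d in doc_vecs) / len(doc_vecs)

    updated = {}
    for term in terms:
        weight = (
            alpha * query_vec.get(term, 0.0)
            + beta * centroid(relevant, term)
            - gamma * centroid(non_relevant, term)
        )
        if weight > 0:  # assumption: negative weights are discarded
            updated[term] = weight
    return updated


# Hypothetical feedback example.
query_vec = {"search": 1.0}
relevant = [{"search": 1.0, "engine": 2.0}]
non_relevant = [{"sports": 1.0}]
updated = rocchio(query_vec, relevant, non_relevant)
```

The updated query gains the term `engine` from the relevant document, which is how feedback can expand a short query with useful vocabulary.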

WEEK 6
In this week's lessons, you will learn how machine learning can be used to combine multiple scoring factors to optimize ranking of documents in web search (i.e., learning to rank), and learn techniques used in recommender systems (also called filtering systems), including content-based recommendation/filtering and collaborative filtering. You will also have a chance to review the entire course.
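Collaborative filtering, one of the recommender techniques listed above, can be sketched as a similarity-weighted average of other users' ratings. This is a minimal user-based variant that assumes cosine similarity over raw ratings; the function names and the toy ratings are hypothetical, and practical systems usually add rating normalization and neighbor selection.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    common = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted mean of other users' ratings."""
    numerator = denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        numerator += sim * other_ratings[item]
        denominator += abs(sim)
    return numerator / denominator if denominator else None


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"a": 5, "b": 1},
    "bob": {"a": 5, "b": 1, "c": 4},
    "cat": {"a": 1, "b": 5, "c": 2},
}
prediction = predict(ratings, "ann", "c")
```

Because `ann` rates items much like `bob` and unlike `cat`, the prediction for item `c` lands closer to `bob`'s rating than to `cat`'s.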


Related Courses

Coursera

Eindhoven University of Technology

Process Mining: Data science in Action (Coursera)

Statistics & Data Analysis Data Science

Process mining is the missing link between model-based process analysis and data-oriented analysis techniques. Through concrete data sets and easy-to-use software, the course provides data science knowledge that can be applied directly to analyze and improve processes in a variety of domains. Data science is the profession of the future, because organizations that are unable to use (big) data in a smart way will not survive. It is not sufficient to focus on data storage and data analysis. The data scientist also needs to relate data to process analysis.

Jul 27th 2026

5-12 Weeks

Business Data Mining Data Analysis

Coursera

Illinois Tech

Cloud: Platform as a Service - Bachelor's (Coursera)

CS: Information & Technology

This course is aimed at preparing individuals to gain knowledge, skills, and abilities to demonstrate the knowledge for managing Platform as a Service (PaaS) in the Cloud. Students will learn to deploy, operate, and maintain cloud platforms for storing, processing, and transferring information with architecture design principles and a structured approach. Students will also learn the shared responsibility model and cloud security best practices to secure PaaS platforms for the application-hosting environments.

Aug 3rd 2026

5-12 Weeks

Cloud Machine Learning PaaS

Coursera

Columbia University

Decision Making and Reinforcement Learning (Coursera)

Computer Science

This course is an introduction to sequential decision making and reinforcement learning. We start with a discussion of utility theory to learn how preferences can be represented and modeled for decision making. We first model simple decision problems as multi-armed bandit problems and discuss several approaches to evaluate feedback. We will then model decision problems as finite Markov decision processes (MDPs), and discuss their solutions via dynamic programming algorithms. We touch on the notion of partial observability in real problems, modeled by POMDPs and then solved by online planning methods.

Aug 3rd 2026

5-12 Weeks

Algorithms Machine Learning Monte Carlo Method

Coursera

University of Michigan

Information Extraction from Free Text Data in Health (Coursera)

Sci: Biology & Life Sciences Health & Society

In this MOOC, you will be introduced to advanced machine learning and natural language processing techniques to parse and extract information from unstructured text documents in healthcare, such as clinical notes, radiology reports, and discharge summaries. Whether you are an aspiring data scientist or an early or mid-career professional in data science or information technology in healthcare, it is critical that you keep your skills in information extraction and analysis up to date.

Aug 3rd 2026

4 Weeks

Machine Learning Natural Language Natural Language Processing

Coursera

Google Cloud

Google Cloud Product Fundamentals en Español (Coursera)

Business

This course, a continuation of Business Transformation with Google Cloud, introduces the technology perspective of an organization's transformation. More specifically, we explain how Google Cloud technology can digitally transform an organization in the following areas: modernizing IT infrastructure; improving how teams develop the applications the business uses; learning how to leverage machine learning and artificial intelligence to generate more value; recognizing the fundamental role of cloud-based productivity tools, such as G Suite, in getting work done; and understanding the cost-management challenges and opportunities that come with a changing, cloud-based IT infrastructure.

Aug 3rd 2026

5-12 Weeks

Artificial Intelligence Cloud Machine Learning

Coursera

University of Illinois at Urbana-Champaign

Empathy, Data, and Risk (Coursera)

Business

Risk Management and Innovation develops your ability to conduct empathy-driven and data-driven analysis in the domain of risk management. This course introduces empathy as a professional competency. It explains the psychological processes that inhibit empathy-building and the processes that determine how organizational stakeholders respond to risk.

Aug 3rd 2026

4 Weeks

Risk Management Data Analysis Brainstorming

Coursera

University of Illinois at Urbana-Champaign

Machine Learning and Human Learning (Coursera)

Education

This course examines the differences between machine and human learning and the ways in which machines can complement human learning. It examines technical definitions of supervised and unsupervised machine learning, as well as broader views of mechanical intelligence able to replicate or exceed human intelligence.

Aug 3rd 2026

4 Weeks

Learning Artificial Intelligence Machine Learning

Coursera

University of Geneva

Global Statistics - Composite Indices for International Comparisons (Coursera)

Statistics & Data Analysis Data Science

In this course on global statistics, offered by the University of Geneva jointly with the ETH Zürich KOF, you will learn the general approach of constructing composite indices and some of the resulting problems. We will discuss the technical properties, the internal structure (like aggregation, weighting, stability of time series), the primary data used and the variable selection methods. These concepts will be illustrated using a sample of the most popular composite indices. We will try to address not only statistical questions but also focus on the distinction between policy-, media- and paradigm-driven indicators.

Aug 3rd 2026

5-12 Weeks

Statistics Data Analysis Comparison

Coursera

Coursera Project Network

Machine Learning in Retail (Coursera)

Data Science

Who are your customers? What are they like? How do they interact with your business? This Short Course was created to help analysts better understand their customer behaviour through the power of machine learning. In this course, you will apply two different machine learning techniques to segment customers according to their purchasing behaviour and provide actionable insights for each group. Along the way, you'll also examine some other retail case studies, including web visitor analysis for marketing and store clustering for logistics.

Aug 3rd 2026

1 Week

ML Machine Learning Decision Tree

Coursera

Northeastern University

Machine Learning in Healthcare: Fundamentals & Applications (Coursera)

Health & Society Data Science

Examines data mining perspectives and methods in a healthcare context. Introduces the theoretical foundations for major data mining methods and studies how to select and use the appropriate data mining method and the major advantages for each. Students are exposed to contemporary data mining software applications and basic programming skills. Focuses on solving real-world problems, which require data cleaning, data transformation, and data modeling.

Aug 3rd 2026

4 Weeks

Healthcare Artificial Intelligence Machine Learning

Coursera

Edge Impulse

Computer Vision with Embedded Machine Learning (Coursera)

Data Science

Computer vision (CV) is a fascinating field of study that attempts to automate the process of assigning meaning to digital images or videos. In other words, we are helping computers see and understand the world around us! A number of machine learning (ML) algorithms and techniques can be used to accomplish CV tasks, and as ML becomes faster and more efficient, we can deploy these techniques to embedded systems.

Aug 3rd 2026

3 Weeks

Machine Learning Computer Vision Object Detection

Coursera

Johns Hopkins University

Gender Foundations in Health Data: A Data for Health Course (Coursera)

Health & Society

Welcome to Gender Foundations in Health Data: A Data for Health course. This course was developed from an online seminar series of the same name, which was hosted by Johns Hopkins University Bloomberg School of Health in 2021-22. The course instructors are Drs. Michelle Kaufman and Tahilin Sanchez Karver. This course will raise learners' awareness of the necessity of utilizing a gender lens in global public health data, policy, and practice, and will feature how-tos and key examples of integrating gender in data collection, analysis, and use from Data for Health partners.

Aug 3rd 2026

1 Week

Gender Data Analysis Health Data