Coursera

Robotics: Perception (Coursera)

Offered by University of Pennsylvania.

How can robots perceive the world and their own movements so that they can accomplish navigation and manipulation tasks? In this module, we will study how images and videos acquired by cameras mounted on robots are transformed into representations like features and optical flow. These 2D representations then allow us to extract 3D information about where the camera is and in which direction the robot moves. You will come to understand how grasping objects is facilitated by computing the 3D pose of objects, and how navigation can be accomplished by visual odometry and landmark-based localization.

Course 4 of 6 in the Robotics Specialization.

Syllabus

WEEK 1
Geometry of Image Formation
Welcome to Robotics: Perception! We will begin this course with a tutorial on the standard camera models used in computer vision. These models allow us to understand, in a geometric fashion, how light from a scene enters a camera and projects onto a 2D image. By defining these models mathematically, we will be able to understand exactly how a point in 3D corresponds to a point in the image and how an image will change as we move a camera in a 3D environment. In the later modules, we will be able to use this information to perform complex perception tasks such as reconstructing 3D scenes from video.
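
As a rough illustration of how a 3D point maps to an image point, here is a minimal sketch of the ideal pinhole model. The function name, focal length, and principal point are assumptions chosen for the example and are not taken from the course materials; lens distortion is ignored.

```python
def project_point(point_3d, focal_length, principal_point):
    """Project a 3D point (camera coordinates) to pixel coordinates
    using the ideal pinhole model: u = f * x / z + cx, v = f * y / z + cy."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    u = focal_length * x / z + cx
    v = focal_length * y / z + cy
    return (u, v)


# Hypothetical camera: 500-pixel focal length, image center at (320, 240).
near = project_point((1.0, 0.5, 2.0), focal_length=500.0, principal_point=(320.0, 240.0))
far = project_point((1.0, 0.5, 4.0), focal_length=500.0, principal_point=(320.0, 240.0))

print(near)  # (570.0, 365.0)
print(far)   # (445.0, 302.5)
```

Doubling the depth of the point halves its offset from the image center, which is the perspective effect the camera model captures.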

WEEK 2
Projective Transformations
Now that we have a good camera model, we will explore the geometry of perspective projections in depth. We will find that this projection causes the main challenge in perception: it discards a dimension, which we can no longer observe directly. In this module, we will learn about several properties of projective transformations, such as vanishing points, which allow us to infer complex information beyond our basic camera model.
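
To make the idea of a vanishing point concrete, the sketch below projects two parallel 3D lines with an ideal pinhole camera and shows that points far along both lines approach the same image location. All names and numeric values are assumptions made for illustration.

```python
def project(point_3d, focal_length=500.0, principal_point=(320.0, 240.0)):
    """Ideal pinhole projection of a 3D point in camera coordinates."""
    x, y, z = point_3d
    cx, cy = principal_point
    return (focal_length * x / z + cx, focal_length * y / z + cy)


def point_on_line(start, direction, t):
    """Return the point start + t * direction."""
    return tuple(s + t * d for s, d in zip(start, direction))


# Two parallel lines sharing one 3D direction but starting at different points.
direction = (1.0, 0.0, 1.0)

# The vanishing point is the projection of the direction itself.
vanishing_point = project(direction)

# Points very far along each line project close to the vanishing point.
far_a = project(point_on_line((0.0, -1.0, 1.0), direction, 1e6))
far_b = project(point_on_line((2.0, 1.0, 1.0), direction, 1e6))

print(vanishing_point)  # (820.0, 240.0)
print(far_a, far_b)
```

Because the vanishing point depends only on the line direction and the camera parameters, observing it in an image gives information about 3D orientation.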

WEEK 3
Pose Estimation
In this module we will learn about feature extraction and pose estimation from two images. We will learn how to find the most salient parts of an image and track them across multiple frames (i.e., in a video sequence). We will then learn how to use features to find the position of the camera with respect to another reference frame on a plane using homographies. We will also learn how to make these techniques more robust, using least squares to handle noisy feature points or RANSAC to remove completely erroneous feature points.
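
The interplay between RANSAC and least squares can be sketched on a deliberately simpler problem. The example below fits a 2D line rather than a homography, purely to keep the code short; the function names, threshold, iteration count, and data are all assumptions. RANSAC first rejects gross outliers, then least squares refines the model on the remaining inliers.

```python
import random


def fit_line(p, q):
    """Line through two points as (slope, intercept); None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def ransac_line(points, iterations=200, threshold=0.5, seed=0):
    """Keep the two-point model that agrees with the most points."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue
        slope, intercept = model
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


def least_squares_line(points):
    """Ordinary least-squares line fit, used to refine on the inliers."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Ten points on y = 2x + 1 plus two gross outliers (synthetic data).
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(3.0, 40.0), (7.0, -25.0)]

model, inliers = ransac_line(points)
refined = least_squares_line(inliers)
print(len(inliers), refined)
```

The same pattern applies to homography estimation: a minimal sample of four point correspondences replaces the two points used here, and the reprojection error replaces the vertical residual.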

WEEK 4
Multi-View Geometry
Now we will take what we learned from two-view geometry and extend it to sequences of images, such as a video. We will explain the fundamental geometric constraint between point features in images, the epipolar constraint, and learn how to use it to extract the relative poses between multiple frames. We will finish by combining all this information for the application of Structure from Motion, where we will compute the trajectory of a camera and a map across many frames and refine our estimates using bundle adjustment.
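
The epipolar constraint can be checked numerically. The sketch below assumes the common convention that a point transforms between camera frames as X2 = R X1 + t, builds the essential matrix E = [t]x R, and verifies that x2^T E x1 is zero for a true correspondence in normalized image coordinates. The rotation, translation, and 3D point are made-up values.

```python
import math


def skew(t):
    """Cross-product matrix [t]x, so that skew(t) applied to v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def normalize(p):
    """Normalized image coordinates (x/z, y/z, 1) of a 3D point."""
    return [p[0] / p[2], p[1] / p[2], 1.0]


def epipolar_residual(essential, x1, x2):
    """Value of x2^T E x1; zero for a perfect correspondence."""
    return sum(a * b for a, b in zip(x2, matvec(essential, x1)))


# Hypothetical relative pose: 10 degree rotation about the y axis plus a translation.
angle = math.radians(10.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [1.0, 0.0, 0.2]
E = matmul(skew(t), R)

# The same 3D point expressed in each camera frame.
X1 = [0.5, 0.2, 4.0]
X2 = [a + b for a, b in zip(matvec(R, X1), t)]

x1, x2 = normalize(X1), normalize(X2)
residual_match = epipolar_residual(E, x1, x2)

# Shifting the second observation breaks the constraint.
x2_wrong = [x2[0], x2[1] + 0.1, 1.0]
residual_mismatch = epipolar_residual(E, x1, x2_wrong)

print(residual_match, residual_mismatch)
```

In practice the residual is used the other way around: given many correspondences, one estimates E and then recovers the relative pose from it.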

Related Courses

Coursera

Technische Universität München - TUM

Digitalisation in the Aerospace Industry (Coursera)

Engineering Sci: Physics

The online course Digitalisation in Aerospace aims at making you aware of special production requirements connected with digitalisation. You will learn about the role of robotics and automation in manufacturing and gain a better understanding of differing perspectives on research and manufacturing as well as the points where these intersect.

Jun 29th 2026

3 Weeks

Artificial Intelligence Robotics Engineering

Coursera

University at Buffalo, The State University of New York

Computer Vision Basics (Coursera)

Computer Science

By the end of this course, learners will understand what computer vision is, as well as its mission of making computers see and interpret the world as humans do, by learning core concepts of the field and receiving an introduction to human vision capabilities. They will be equipped to identify some key application areas of computer vision and understand the digital imaging process. The course covers crucial elements that enable computer vision: digital signal processing, neuroscience and artificial intelligence.

Jun 29th 2026

4 Weeks

MATLAB Computer Vision Computer Programming

Coursera

Northwestern University

Modern Robotics, Course 4: Robot Motion Planning and Control (Coursera)

Robotics & Computer Vision

Do you want to know how robots work? Are you interested in robotics as a career? Are you willing to invest the effort to learn fundamental mathematical modeling techniques that are used in all subfields of robotics? If so, then the "Modern Robotics: Mechanics, Planning, and Control" specialization may be for you. This specialization, consisting of six short courses, is serious preparation for serious students who hope to work in the field of robotics or to undertake advanced study. It is not a sampler.

Jul 6th 2026

4 Weeks

Robotics Motion Planning Modern Robotics: Mechanics Planning and Control Specialization

Coursera

Arm

Getting Started with Machine Learning at the Edge on Arm (Coursera)

Computer Science

The age of machine learning has arrived! Arm technology is powering a new generation of connected devices with sophisticated sensors that can collect a vast range of environmental, spatial and audio/visual data. Typically this data is processed in the cloud using advanced machine learning tools that are enabling new applications reshaping the way we work, travel, live and play.

Jul 6th 2026

5-12 Weeks

Machine Learning Computer Vision IoT

Coursera

MathWorks

Introduction to Deep Learning for Computer Vision (Coursera)

Data Science

Starting with zero deep learning knowledge, this foundational course will guide you to effectively train cutting-edge models for image classification purposes. From analyzing medical images to recognizing traffic signs, classification is important for many applications. Classification models also serve as the backbone for more complicated object detection models.

Jul 6th 2026

4 Weeks

Artificial Intelligence MATLAB Computer Vision

Coursera

Columbia University

Visual Perception (Coursera)

Computer Science

The ultimate goal of a computer vision system is to generate a detailed symbolic description of each image shown. This course focuses on the all-important problem of perception. We first describe the problem of tracking objects in complex scenes. We look at two key challenges in this context. The first is the separation of an image into object and background using a technique called change detection.

Jun 29th 2026

5-12 Weeks

Neural Networks Computer Vision Visual Perception

Coursera

Microsoft

Artificial Intelligence on Microsoft Azure (Coursera)

CS: Information & Technology

Whether you're just beginning to work with Artificial Intelligence (AI) or you already have AI experience and are new to Microsoft Azure, this course provides you with everything you need to get started. AI empowers amazing new solutions and experiences, and Microsoft Azure provides easy-to-use services to help you build solutions that seemed like science fiction a short time ago, enabling incredible advances in health care, financial management, environmental protection, and other areas to make a better world for everyone.

Jun 29th 2026

1 Week

Artificial Intelligence NLP Machine Learning

Coursera

Microsoft

Computer Vision in Microsoft Azure (Coursera)

CS: Information & Technology

In Microsoft Azure, the Computer Vision cognitive service uses pre-trained models to analyze images, enabling software developers to easily build applications that "see" the world and make sense of it. This ability to process images is the key to creating software that can emulate human visual perception. In this course, you'll explore some of these capabilities as you learn how to use the Computer Vision service to analyze images.

Jun 29th 2026

3 Weeks

Artificial Intelligence Machine Learning Computer Vision

Coursera

Northwestern University

Modern Robotics, Course 3: Robot Dynamics (Coursera)

Robotics & Computer Vision

Jun 29th 2026

4 Weeks

Robotics Mechanical Engineering Robot

Coursera

Northwestern University

Modern Robotics, Course 6: Capstone Project, Mobile Manipulation (Coursera)

Engineering Sci: Physics

The capstone project of the Modern Robotics specialization is on mobile manipulation: simultaneously controlling the motion of a wheeled mobile base and its robot arm to achieve a manipulation task. This project integrates several topics from the specialization, including trajectory planning, odometry for mobile robots, and feedback control. Beginning from the Modern Robotics software library provided to you (written in Python, Mathematica, and MATLAB), and software you have written for previous courses, you will develop software to plan and control the motion of a mobile manipulator to perform a pick and place task.

Jul 6th 2026

4 Weeks

Robotics MATLAB Mobile

Coursera

Columbia University

Features and Boundaries (Coursera)

Statistics & Data Analysis Data Science

This course focuses on the detection of features and boundaries in images. Feature and boundary detection is a critical preprocessing step for a variety of vision tasks including object detection, object recognition and metrology – the measurement of the physical dimensions and other properties of objects. The course presents a variety of methods for detecting features and boundaries and shows how features extracted from an image can be used to solve important vision tasks.

Jun 29th 2026

5-12 Weeks

Computer Vision Metrology Coursera Plus

Coursera

Columbia University

3D Reconstruction - Multiple Viewpoints (Coursera)

Computer Science

This course focuses on the recovery of the 3D structure of a scene from images taken from different viewpoints. We start by first building a comprehensive geometric model of a camera and then develop a method for finding (calibrating) the internal and external parameters of the camera model. Then, we show how two such calibrated cameras, whose relative positions and orientations are known, can be used to recover the 3D structure of the scene. This is what we refer to as simple binocular stereo.

Jun 29th 2026

5-12 Weeks

Computer Vision 3D Reconstruction 3D Structures