Jianbo's group is developing vision algorithms for both human and image recognition. Their ultimate goal is to develop computational algorithms that understand human behavior and interaction with objects in video, and to do so at multiple levels of abstraction: from basic body limb tracking to human identification, gesture recognition, and activity inference. Jianbo and his group are working to develop a visual thinking model that allows computers not only to understand their surroundings, but also to achieve higher-level cognitive abilities such as machine memory and learning.

How can robots perceive the world and their own movements so that they can accomplish navigation and manipulation tasks? In this module, we will study how images and videos acquired by cameras mounted on robots are transformed into representations such as features and optical flow. These 2D representations then allow us to extract 3D information about where the camera is and in which direction the robot is moving. You will come to understand how grasping objects is facilitated by computing the 3D pose of objects, and how navigation can be accomplished through visual odometry and landmark-based localization.

Learn how to design robot vision systems that avoid collisions, work safely with humans, and understand their environment. How do robots “see”, respond to, and learn from their interactions with the world around them? This is the fascinating field of visual intelligence and machine learning. Visual intelligence allows a robot to “sense” and “recognize” its surrounding environment. It also enables a robot to “learn” from the memory of past experiences by extracting patterns from visual signals.

