• Multiple Object Tracking: Object tracking, by definition, is the task of following an object (or multiple objects) over a sequence of images. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise from abrupt object motion, changing appearance patterns of both the object and the scene, non-rigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. See examples
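
One building block of multi-object tracking is data association: linking each existing track to a detection in the new frame. The sketch below uses greedy nearest-neighbor matching with made-up coordinates and a made-up distance threshold; real trackers also model motion, appearance and occlusion.

```python
import math

def associate(tracks, detections, max_dist=50.0):
    """Greedily match existing track positions to new detections
    by nearest Euclidean distance (simplified data association)."""
    assignments = {}
    used = set()
    for tid, (tx, ty) in tracks.items():
        best, best_d = None, max_dist
        for i, (dx, dy) in enumerate(detections):
            if i in used:
                continue
            d = math.hypot(dx - tx, dy - ty)
            if d < best_d:
                best, best_d = i, d
        if best is not None:
            assignments[tid] = best
            used.add(best)
    return assignments

# Two tracks and two detections in the next frame (toy data)
tracks = {1: (10.0, 10.0), 2: (100.0, 50.0)}
detections = [(102.0, 51.0), (11.0, 9.0)]
print(associate(tracks, detections))  # {1: 1, 2: 0}
```

Greedy matching is the simplest choice; practical systems often solve the assignment globally (e.g. with the Hungarian algorithm) to avoid order-dependent mistakes.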

  • Visual Mono/Stereo Odometry: A key component of autonomous vehicles and driver assistance systems is the navigation module. Accurate and real-time estimation of the vehicle's position is essential for tasks such as motion planning, emergency braking or trajectory correction. Computer vision offers a cheap, accurate and reliable alternative for vehicle localization. The decision on whether to use a one-camera (mono) or two-camera (stereo) setup depends on the assumptions that can be made about the environment and on the number of degrees of freedom of the vehicle's motion. See examples
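
The core of any odometry pipeline is chaining frame-to-frame motion estimates into a global trajectory. A minimal planar (3 degrees of freedom) sketch, with illustrative numbers standing in for motions that would really come from image matching:

```python
import math

def compose(pose, delta):
    """Chain a relative motion (dx, dy, dtheta), expressed in the
    vehicle frame, onto a global 2D pose (x, y, theta)."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            th + dth)

# Drive 1 m forward while turning 90 degrees, then 1 m forward again
pose = (0.0, 0.0, 0.0)
for delta in [(1.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)]:
    pose = compose(pose, delta)
print(pose)  # roughly (1.0, 1.0, pi/2)
```

Because each step's error is carried forward by this composition, odometry drifts over time, which is one reason stereo setups (which recover metric scale directly) are attractive.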

  • Supervised Machine Learning: Supervised machine learning is the task of inferring a function from labeled training data. It takes a known set of input data and known responses to that data, and seeks to build a predictor model that generates reasonable predictions of the response to new data. In an optimal scenario, the algorithm correctly determines the class labels of unseen instances. Supervised learning has several applications in computer vision, such as the detection, recognition and tracking of visual objects. See examples
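
A minimal sketch of the idea, using a nearest-centroid classifier on toy 2D data (the points and labels below are invented for illustration): the "model" inferred from the labeled data is one centroid per class, and prediction picks the closest centroid.

```python
def fit_centroids(X, y):
    """Average the training points of each class: a minimal
    supervised 'model' with one centroid per label."""
    sums, counts = {}, {}
    for (a, b), label in zip(X, y):
        sa, sb = sums.get(label, (0.0, 0.0))
        sums[label] = (sa + a, sb + b)
        counts[label] = counts.get(label, 0) + 1
    return {lab: (s[0] / counts[lab], s[1] / counts[lab])
            for lab, s in sums.items()}

def predict(centroids, point):
    """Assign the label of the closest centroid."""
    px, py = point
    return min(centroids,
               key=lambda lab: (centroids[lab][0] - px) ** 2
                             + (centroids[lab][1] - py) ** 2)

# Toy labeled training set and an unseen query point
X = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
y = ["bg", "bg", "face", "face"]
model = fit_centroids(X, y)
print(predict(model, (5.2, 4.8)))  # face
```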

  • Detection-by-Classification: Detection-by-classification is a methodology for detecting objects in images using statistical classifiers, which are usually trained with supervised learning techniques. The detection method consists of classifying image regions at different positions and scales in order to find the target objects in the image. See examples
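
The scanning step can be sketched as a sliding window over the image, scoring every region with a classifier. Here the "classifier" is a stub (mean window intensity on a synthetic image), and only a single scale is scanned for brevity; a real detector would run a trained classifier over an image pyramid.

```python
def window_score(img, r, c, size):
    """Stub 'classifier': mean intensity of the window. A real
    detector would evaluate a trained classifier here."""
    total = sum(img[r + i][c + j] for i in range(size) for j in range(size))
    return total / (size * size)

def detect(img, size, threshold):
    """Scan all window positions and keep those scoring above threshold."""
    rows, cols = len(img), len(img[0])
    hits = []
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            if window_score(img, r, c, size) > threshold:
                hits.append((r, c))
    return hits

# 8x8 synthetic image with a bright 3x3 'object' at row 2, column 3
img = [[0] * 8 for _ in range(8)]
for i in range(2, 5):
    for j in range(3, 6):
        img[i][j] = 255
print(detect(img, size=3, threshold=200))  # [(2, 3)]
```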

  • Perspective Analysis and Camera Calibration: How a video camera performs the perspective projection from the 3D world to the 2D image plane depends on several parameters: focal length, lens distortion, field of view, sensor size, etc. These parameters can be estimated with a camera calibration process. With this information it is possible to recover the camera's orientation with respect to the horizon or a ground plane, or to estimate metric measurements such as a person's height or the speed of a car. Working with multiple cameras, or with a single moving camera, it is also possible to recover the 3D structure of the scene the cameras observe. See examples
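
The projection these parameters describe can be sketched with the pinhole model (lens distortion omitted; the intrinsic values below are illustrative stand-ins for what a calibration would estimate):

```python
def project(point3d, fx, fy, cx, cy):
    """Pinhole perspective projection of a camera-frame 3D point
    to pixel coordinates. fx, fy: focal lengths in pixels;
    (cx, cy): principal point."""
    X, Y, Z = point3d
    return (fx * X / Z + cx, fy * Y / Z + cy)

# Illustrative intrinsics for a VGA-like camera
u, v = project((1.0, 0.5, 2.0), fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(u, v)  # 570.0 365.0
```

Inverting this mapping is what makes metric estimates possible: once fx, fy, cx, cy (and the camera pose) are known, a pixel plus a scene constraint such as a known ground plane yields a 3D position.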

  • Face Detection, Tracking and Recognition: Detecting faces consists of automatically locating the image regions where faces can be observed. Tracking faces refers to relating, from frame to frame in a video sequence, the positions of the same faces. Finally, recognizing faces refers to labelling a face image according to a knowledge database that relates face appearances to identities. See examples
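
The recognition step can be sketched as nearest-neighbor matching of a face descriptor against a gallery of known identities. The low-dimensional feature vectors and the distance threshold below are hypothetical; real systems use learned high-dimensional embeddings.

```python
import math

def recognize(gallery, query, threshold=1.0):
    """Label a face feature vector with the identity of its nearest
    gallery entry, or 'unknown' if nothing is close enough."""
    best_name, best_d = "unknown", threshold
    for name, feat in gallery.items():
        d = math.dist(feat, query)
        if d < best_d:
            best_name, best_d = name, d
    return best_name

# Hypothetical face descriptors for two enrolled identities
gallery = {"alice": (0.1, 0.9, 0.3), "bob": (0.8, 0.2, 0.5)}
print(recognize(gallery, (0.12, 0.88, 0.31)))  # alice
print(recognize(gallery, (5.0, 5.0, 5.0)))     # unknown
```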

  • Eye-Gaze Estimation: Eye-gaze estimation consists of estimating the direction of the eye-gaze vectors, normally to find where a person is looking on a screen. See examples
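
Once the eye position and gaze vector are estimated, finding the point of regard on the screen reduces to a ray-plane intersection. A minimal sketch, assuming the screen is the plane z = 0 and using invented eye and gaze values:

```python
def gaze_hit(eye, direction):
    """Intersect a gaze ray with the screen plane z = 0.
    eye: 3D eye position (metres); direction: gaze vector (need not
    be normalised). Returns the (x, y) point hit on the screen."""
    ex, ey, ez = eye
    dx, dy, dz = direction
    t = -ez / dz  # ray parameter at which z reaches 0
    return (ex + t * dx, ey + t * dy)

# Eye 0.6 m in front of the screen, gazing slightly right and down
print(gaze_hit((0.0, 0.0, 0.6), (0.1, -0.05, -1.0)))  # roughly (0.06, -0.03)
```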

  • Temporal Event Recognition: Recognizing temporal events refers to finding temporal events in an observed scene according to a knowledge database that relates observations to semantic labels, such as human actions, activities, behaviors or expressions. See examples

  • Human Motion Capture: Motion capture is the process of tracking the movement of human body parts. See examples

  • Deformable Model Fitting: Deformable model fitting is the problem of finding the optimal configuration of a parameterized shape model that best describes the object of interest in an image. See examples
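
As a toy stand-in for fitting a parameterized shape model, the sketch below fits a circle (parameters: centre and radius) to observed contour points, assuming the points sample the contour roughly evenly. Real deformable models (e.g. active shape models) optimize far richer shape parameters against image evidence.

```python
import math

def fit_circle(points):
    """Fit the parameters (centre, radius) of a circle model to 2D
    points: the centre is the centroid and the radius is the mean
    distance to it (valid for roughly even contour sampling)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    r = sum(math.hypot(p[0] - cx, p[1] - cy) for p in points) / n
    return (cx, cy), r

# Four points on a circle centred at (2, 1) with radius 3
pts = [(5.0, 1.0), (2.0, 4.0), (-1.0, 1.0), (2.0, -2.0)]
print(fit_circle(pts))  # ((2.0, 1.0), 3.0)
```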