MAT200A 02W

Report by: Mahesh Kale

Motion-Based Recognition

in engineering, surveillance, and art.....

Definition Motion Recognition may be defined as the perception of a change of state of specific objects in the observed (sensed) environment, both spatially and temporally, to provide data for a more abstract analysis.

Basic Concept Motion-based recognition deals with the recognition of an object and/or its motion, based on motion in a series of images. In one of the approaches, a sequence containing a large number of frames is used to extract motion information. The advantage is that a longer sequence leads to recognition of higher level motions, like walking or running, which consist of a complex and coordinated series of events. Unlike much previous research in motion, this approach does not require explicit reconstruction of shape from the images prior to recognition.

The application of currently available automatic facial expression recognition systems to the analysis of natural scenes is often very restricted due to the limited robustness of these systems and due to the hard constraints posed on the subjects and on the recording conditions.

Recognition Techniques Most existing computer vision techniques for the recognition of motion patterns can be classified coarsely into
  • Model-based methods
  • Methods that extract spatio-temporal features
In addition, a few computational and neural models for the recognition of biological motion in humans have been proposed.

Some Fields of Application Motion-based recognition is applied in
  • Lipreading
  • Gesture recognition
  • Facial expression recognition
  • Gait analysis
  • Cyclic motion detection
  • Activity recognition

Motion-based Recognition in Engineering.....

Sensor-Based Pedestrian Protection A complementary approach is to focus on sensor-based solutions, which let vehicles "look ahead" and detect pedestrians in their surroundings. This article investigates the state of the art in this domain, reviewing passive, video-based approaches and approaches involving active sensors (radar and laser range finders).

Recognition by Locomotion Surprisingly, we can identify a walking person from viewing just a few moving point lights. People can further determine above chance the gender of the walker, even from viewing only two moving ankle points. We posed a related question of whether we could have a computer vision system recognize the differences between children and adults from the way they walk, rather than from static physical traits requiring a calibrated camera (e.g., body height). Specifically, the question is whether there are two distinct motion categories, one for young children and another for adults.

Recursive Models of Human Motion Perception is mediated by expectation.

If we hope to build computers that help people, we must build computers that are able to understand people. One step is the ability to understand human activity. Human motion is a very complex phenomenon, but it is not entirely arbitrary. The physical limitations of the body, the patterns encoded into our physiological structures, and even the habits of motion that we acquire over time all combine to provide strong constraints on how we move. Modeling these constraints is an important step toward eventual understanding.

Eye Tracker Cybernet's EyeTracker System reproduces an image of the real world marked by the user's gaze location, and provides a means to observe the pupil with a sophisticated system at a low cost. The head mounted eye tracker works with the supplied Unix-based computer system to bring you a customizable and easy to use graphic interface. The lightweight headpiece allows extended usage with physical convenience. Left, right, or dual eye tracking is available. EyeTracker allows hands-free control of computer interface systems and has many configurable options available to provide flexibility for research and specialized tasks.

Robotics - Cooperation Motion Recognition Based Cooperation between Human Operating Robot and Autonomous Assistant Robot

The purpose of this research is to make multiple robots perform a designated task according to the intentions of a human operator. Since operating multiple robots simultaneously is difficult for a human operator, we propose a cooperation style in which the human operates one robot and the other autonomous robots assist it. In order to assist the human-operated robot, the autonomous robots must recognize its motions in real time. For this we use the same motion recognition mechanism that is used in the human cognition model. We implement the motion symbolization method and the motion recognition method using a fuzzy filter and template matching with predefined patterns. We implement the "Task-Operation Model" to describe the motion of the autonomous robot's assistance mechanism, and the "Event Driven Method" to manage the execution of this motion. The effectiveness of these methods is tested through an experiment in which two hexapod robots lift a box in cooperation.
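The symbolization and template-matching steps mentioned above might look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' implementation: the fuzzy filter is replaced by a plain dead-band threshold, matching is exact string comparison, and the symbol alphabet, the function names and the patterns in TEMPLATES are all hypothetical.

```python
def symbolize(velocities, dead_band=0.1):
    """Turn a sequence of forward velocities into motion symbols.

    "F" = forward, "B" = backward, "S" = stopped. Consecutive repeats are
    merged, so the result describes the order of motions, not their duration.
    A dead-band threshold stands in for the fuzzy filter (assumption).
    """
    symbols = []
    for v in velocities:
        s = "F" if v > dead_band else "B" if v < -dead_band else "S"
        if not symbols or symbols[-1] != s:
            symbols.append(s)
    return "".join(symbols)


# Hypothetical predefined patterns, keyed by the motion they describe.
TEMPLATES = {
    "approach": "SFS",
    "retreat": "SBS",
    "shuttle": "SFSBS",
}


def recognize(symbols, templates=TEMPLATES):
    """Return the name of the template that equals the symbol string, or None."""
    for name, pattern in templates.items():
        if symbols == pattern:
            return name
    return None


observed = symbolize([0.0, 0.0, 0.5, 0.6, 0.4, 0.05, 0.0])
motion = recognize(observed)
```

Merging repeated symbols keeps the templates short and independent of how long each phase lasts; a real-time system would presumably also need to tolerate noisy or partial sequences.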

A few more links.....

Motion-based Recognition in Surveillance.....

DARPA Image Understanding Program

The goal of this project is to develop and demonstrate new video understanding technology for battlefield information systems in rural and urban environments. New technology to be developed in this program will focus on video-specific problems that have not been solved reliably in previous image understanding research, with two key technical objectives:
1) Develop and demonstrate algorithms for robust moving object detection
2) Develop and demonstrate algorithms for video event description and recognition

Airborne Video Surveillance The goal of this group is to develop a real-time computer vision system to monitor and detect human and vehicle activities from an airborne camera. They have developed a real-time tracker and object classifier for AM, and the components that integrate AM into the rest of the Airborne Video Surveillance (AVS) system. The system was successfully demonstrated in a live test in October 1999.

Visual and Acoustic Surveillance and Monitoring This project explores fundamental research problems related to the integration of visual (color video and IR) and acoustic sensory sources for visual surveillance of urban areas for military or law enforcement purposes, and demonstrates the research in a series of integrated surveillance scenarios. The vision of an autonomous urban battlefield surveillance system is based on distributed visual and auditory sensors monitoring an urban site in the context of a pre-existing site model. The site model contains knowledge used by the surveillance system to focus its attention and constrain its image analysis to detect people, vehicles and their interactions.

Motion-based Recognition in Art.....

Very Nervous System Very Nervous System is the third generation of interactive sound installations which David Rokeby has created. In these systems, he uses video cameras, image processors, computers, synthesizers and a sound system to create a space in which the movements of one's body create sound and/or music. It has been primarily presented as an installation in galleries but has also been installed in public outdoor spaces, and has been used in a number of performances.

Master of Space by David Rokeby In this installation, string hanging from the ceiling at 1/2 meter intervals was used to establish the edges of the perceptual field of the system's camera and its shadows. The camera was set up near the ceiling and angled so that the top of its field corresponded to the floor at the entrance to the space. As one walked into the installation, one was in effect increasingly submerged in the interactive space.

Sounds included heartbeats and breathing that increased pace with continued interaction, pebbles and waves, wind, and footsteps recorded within the space. As one approached the camera at the apex of the two walls of string, the intensity of the interaction increased. Certain sounds were mapped to specific areas of the space. At the "European Media Arts Festival", the space had a number of columns in it. Behind the most dominant column a behaviour was set up that generated the sound of breaking glass only if someone passed behind the column from the point of view of the camera.

Reflexions Reflexions was David Rokeby's first interactive sound installation. He constructed some very bulky 8 x 8 pixel video cameras (the large black box over the monitor in the image), connected them to a wire-wrapped card in the Apple ][ which digitized the images, and wrote a program for the Apple ][ which controlled a Korg MS-20 analog synthesizer to make sounds in response to the movements seen by the cameras. Movement also controlled the volume of two tape loops of water sounds. The synthesizer and water sounds were mixed individually to 4 speakers in a tetrahedron (one on the ceiling and three in a triangle on the floor). The sounds would move around you in addition to responding to your movement.
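One simple way movement seen by such a low-resolution camera could drive a volume level is frame differencing. The sketch below is only an illustration of that general idea and is not the program written for the Apple ][. The function names, the nested-list frame layout, the brightness threshold of 16 and the 0 to 127 volume scale are all assumptions.

```python
def motion_energy(prev_frame, curr_frame, threshold=16):
    """Fraction of pixels whose brightness changed by more than threshold."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(c - p) > threshold:
                changed += 1
    return changed / total if total else 0.0


def energy_to_volume(energy, max_volume=127):
    """Map motion energy in [0, 1] to an integer volume level (hypothetical scale)."""
    return round(max(0.0, min(1.0, energy)) * max_volume)


# Two 8 x 8 grayscale frames: a bright 2 x 2 blob appears in the second one.
still = [[0] * 8 for _ in range(8)]
moved = [row[:] for row in still]
for y in (3, 4):
    for x in (3, 4):
        moved[y][x] = 200

energy = motion_energy(still, moved)   # 4 of 64 pixels changed
volume = energy_to_volume(energy)
```

With only 64 pixels per frame, the whole comparison is a few dozen subtractions, which suggests why an 8 x 8 image was workable on early 1980s hardware.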

d-Rhum d-Rhum is an environmental installation which responds to the presence and activities of its participants. Movements are sensed by a series of sonar motion detectors which register the speed and density of people wandering throughout the space. Computers translate the activities of the occupants to a variety of mechanical devices which control walls and percussion elements of the room. The enclosure of d-Rhum consists primarily of malleable materials such as stretched latex panels which provide sonic responses and spatial deformations. Motors controlled by the computer, which processes the sensor data, stretch, push, strike with mallets, and contort sections of the walls themselves.

The d-Rhum team consists of Peter Franck, Architect, Adjunct Associate Professor, Pratt Graduate Program in Architecture; Richard Hughes, Computer Hardware Specialist and Robotics Enthusiast; Dan Schwartz, President, RomeBlack, Inc.; and Colin Faber and Eugene James Flotteron, Architectural Research and Design.

Plasma In the middle of the room is a pneumatic projection screen that can be controlled with a computer. In front of the projection screen is a circular pattern painted on the floor: The center of action and interaction.

Audio Park-The Party Effect by Christian Moeller. An interactive light and 3D audio sculpture in the Museum Park designed by Rem Koolhaas, on the occasion of the City of Rotterdam Summer Festival. In this demonstration a sound can be moved around in space and listened to over headphones by alternately entering the four light sensors positioned on the floor.

Boundary Functions by Scott Sona Snibbe. We think of personal space as something that belongs solely to ourselves. However, Boundary Functions shows us that personal space exists only in relation to others. Our personal space changes dynamically in relation to those around us.

Boundary Functions is realized as a set of lines projected from overhead onto the floor which divide the people in the gallery from one another. With one person in the gallery there is no response. When two are present, there is a single line drawn halfway between them, segmenting the room into two regions. As each person moves, this line dynamically changes, maintaining an even distance between the two. With more than two people, the floor becomes divided into cellular regions, each with the mathematical quality that all space within the region is closer to the person inside than to any other person.
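The division described above corresponds to what is usually called a Voronoi diagram. A minimal sketch of the idea, assuming the floor is sampled as a coarse grid and each cell is labeled with its nearest person; the function names and grid layout are illustrative, and the installation's actual tracking and line-drawing code is not described in the source.

```python
import math


def nearest_person(point, people):
    """Index of the person closest to a floor point (ties go to the lower index)."""
    return min(range(len(people)), key=lambda i: math.dist(point, people[i]))


def label_floor(width, height, people):
    """Label each cell of a width x height grid with the index of its nearest person.

    Cell centers are used as sample points. With fewer than two people there
    is nothing to divide, so None is returned.
    """
    if len(people) < 2:
        return None
    return [
        [nearest_person((x + 0.5, y + 0.5), people) for x in range(width)]
        for y in range(height)
    ]


# Two people standing 4 units apart: the boundary falls halfway, at x = 3.
people = [(1.0, 2.0), (5.0, 2.0)]
grid = label_floor(6, 4, people)
```

The projected lines would lie wherever neighbouring cells carry different labels. Recomputing the labels each time a person moves gives the dynamically shifting boundaries.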

the meadow Stepping into the installation space, the viewer is surrounded by four large colour monitors. Displayed on each monitor is real-time, full motion video of a different view: the four edges or corners of a meadow as seen from a central vantage point. It is winter in the meadow, then suddenly the season shifts. The views remain the same, but a certain motion or sequence of movements has triggered a transformation. Suddenly it is spring. The viewer discovers, as they move within the installation space, how to trigger these seasonal changes, and finds it is possible to move backwards in time, from winter to fall, or across seasons, from fall to spring, as well.

Text Rain To interact with the installation participants stand or move in front of a large projection screen. On the screen they see a mirrored video projection of themselves in black and white, combined with a color animation of falling text. Like rain or snow, the text appears to land on participants' heads and arms. The text responds to the participants' motions and can be caught, lifted, and then let fall again. The falling text will 'land' on anything darker than a certain threshold, and 'fall' whenever that obstacle is removed.
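The landing rule described above can be sketched as a per-letter update on a grayscale video frame. This is an illustrative reading of the description, not the installation's code: the function and helper names, the threshold of 100, the one-row-per-step fall speed and the choice to stop letters at the bottom of the frame are all assumptions.

```python
def step_letter(x, y, frame, threshold=100, speed=1):
    """Advance one falling letter by one time step and return its new row.

    frame is a list of rows of grayscale values (0 = black, 255 = white).
    The letter rests on any pixel darker than threshold and rises with it.
    """
    height = len(frame)
    # If a dark obstacle has moved into the letter's cell, lift the letter above it.
    while y > 0 and frame[y][x] < threshold:
        y -= 1
    # Otherwise fall one row at a time until something dark is directly below.
    for _ in range(speed):
        below = y + 1
        if below >= height or frame[below][x] < threshold:
            break
        y = below
    return y


def make_frame(dark_rows, height=6, width=3, column=1):
    """Build a white test frame with dark pixels in one column (illustrative helper)."""
    frame = [[255] * width for _ in range(height)]
    for row in dark_rows:
        frame[row][column] = 0
    return frame


# A letter falls onto an "arm" occupying rows 4 and 5 of column 1.
arm_low = make_frame({4, 5})
y = 0
for _ in range(5):
    y = step_letter(1, y, arm_low)
landed_row = y

# Raising the arm lifts the letter; removing it lets the letter fall again.
lifted_row = step_letter(1, landed_row, make_frame({2, 3, 4, 5}))
released_row = step_letter(1, lifted_row, make_frame(set()))
```

Because the rule only compares brightness against a threshold, any dark object in front of the camera can catch the text, not only a person.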

Skies SKIES involves people's cooperation with themselves and with nature. Visitors encounter sound and moving video imagery projected onto the floor and wall. The installation is larger than the floor projection, permitting visitors to walk onto the imagery or in the surrounding area. When no visitors stand on the projection, sound and imagery of a night sky are presented. As visitors walk onto the projection, they discover black paths hidden within the projected imagery. Discovery of the paths causes presentation of different video sequences and sounds. Thirty-two looped video sequences and sound tracks are contained within the installation, selected by the specific combination of paths being discovered.

Intersection This installation presents visitors with sounds of speeding cars travelling across a completely dark exhibition space. The illusion of traffic is created using various car sounds which are played through four pairs of stereo speakers placed at either end of four invisible lanes of traffic.

Art & Research This page contains short descriptions of some of Bill Keays's art and/or research projects, many of which were done at the MIT Media Lab.

Sensing/Speaking Space by George Legrady (Visual artist) & Stephen Pope (Sound composition). "Sensing/Speaking Space" is an interactive digital media installation that is a real-time feedback environment where visualization and sound will be generated to represent the presence and movement of spectators within a public space such as a museum or shopping center. The interaction will focus on the notion of the "intelligent space", a space that knows you are there and reacts to your presence and movements through a custom camera tracking system. The installation will be able to accommodate simultaneously anywhere from 1 to 20 spectators.

A few more links.....