
Image Construction - Tracking and Rendering - Challenges

Technically, this project is implemented by tracking the position of the viewers, recording their faces as they move about, and reprojecting their heads (both the front and back) to moving locations in the space. Camera tracking is accomplished via overhead and wall-mounted cameras, each of which has its own dedicated computer. A central server collects the capture data and performs reconstruction and rendering. The computers communicate with one another over a dedicated TCP/IP network.
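
The source does not specify a wire format for this network, but the architecture implies small, structured status messages (positions, enter/leave events) rather than raw video. A minimal sketch of such messaging, assuming a hypothetical length-prefixed JSON framing over a TCP stream:

```python
import json
import socket
import struct

def send_msg(sock: socket.socket, msg: dict) -> None:
    """Send one message as a 4-byte big-endian length prefix + JSON payload.
    (Hypothetical framing; the actual protocol is not described in the source.)"""
    payload = json.dumps(msg).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_msg(sock: socket.socket) -> dict:
    """Receive one length-prefixed JSON message from the stream."""
    header = sock.recv(4)
    (length,) = struct.unpack(">I", header)
    data = b""
    while len(data) < length:
        data += sock.recv(length - len(data))
    return json.loads(data)
```

A tracking unit could then report, for example, `send_msg(sock, {"event": "person_entered", "id": 3, "pos": [1.2, 0.8]})`, and the server would dispatch on the `event` field.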

The pipeline from raw visual data to rendered image involves a series of steps, described as follows:
  1. An overhead video camera tracks the position of all people in the space.
  2. Up to eight video cameras mounted at eye level in the walls record the people in the space from several different angles.
  3. When a new person enters the room, the overhead camera position is correlated with the wall-mounted cameras to generate a complete 3D virtual reconstruction of that individual's head.
  4. As the person moves about the room, minimal reconstruction is used to track their movement as well as the direction they are facing.
  5. As the person looks around the room, the head-tracking data from step 4 is combined with the 3D reconstruction from step 3 to place real-time rendered images of that person's head in front of them and in their peripheral vision.
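
The steps above amount to a small server-side state machine: a full reconstruction on entry, lightweight updates while tracking, and cleanup on exit. A minimal sketch, with hypothetical names (the source does not specify the server's internal data structures):

```python
from dataclasses import dataclass

@dataclass
class Person:
    pid: int
    position: tuple            # (x, y) floor coordinates from the overhead camera
    heading: float = 0.0       # facing direction (radians), from step 4
    head_model: object = None  # 3D head reconstruction from step 3, once built

class Server:
    """Hypothetical central-server state mirroring steps 1-5."""

    def __init__(self):
        self.people = {}

    def on_person_entered(self, pid, pos):
        # Step 3: a new arrival triggers a full head reconstruction.
        self.people[pid] = Person(pid, pos)

    def on_tracking_update(self, pid, pos, heading):
        # Step 4: minimal per-frame update as the person moves and turns.
        p = self.people[pid]
        p.position, p.heading = pos, heading

    def on_person_left(self, pid):
        self.people.pop(pid, None)
```

Step 5 (rendering) would then read each `Person`'s position, heading, and `head_model` every frame to place the projected images.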
In the above process, step one (overhead tracking) is relatively straightforward. Background subtraction is performed, along with centroid detection, to track the absolute position of each person in the space. This information (but not the camera image) is passed in real time to the central server. When a new person enters the room, or when someone leaves, a signal is sent to the server indicating this event.
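
The core of this step can be sketched in a few lines. This simplified version assumes a static reference frame and a single person; a real tracker would maintain an adaptive background model and separate multiple people via connected-component labeling:

```python
import numpy as np

def track_centroid(frame, background, threshold=30):
    """Background subtraction + centroid detection on a grayscale frame.

    Pixels differing from the background by more than `threshold` are
    treated as foreground; the centroid of all foreground pixels is the
    person's position. Returns (x, y) or None if nothing is detected.
    """
    diff = np.abs(frame.astype(int) - background.astype(int))
    mask = diff > threshold
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return (float(xs.mean()), float(ys.mean()))
```

The appearance of a centroid where none existed (or its disappearance) is exactly the enter/leave event this step reports to the server.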



Step two consists of a network of eight computers connected to eight video cameras, each recording a particular high-resolution view of the space. No video data is passed between computers at this stage.



In step three, the overhead tracking computer has indicated that someone new has entered the room. The main server receives this message and forwards a similar message to all wall-camera units, along with the absolute location of that person as determined by the overhead camera. Using the absolute location of the person, the known physical location of each camera, and feature detection (i.e., skin color), each wall-mounted camera can extract an image of that person's head from its own viewpoint.
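
The source names skin color as the feature cue but not the detector itself. A crude stand-in, using a common RGB skin-color rule to isolate the head region (real systems would restrict the search to the image region predicted by the person's known location, and use a more robust color model):

```python
import numpy as np

def skin_mask(rgb):
    """Rough RGB skin-color rule: R > G > B with minimum channel values.
    (A stand-in; the project's actual feature detector is unspecified.)"""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (r > 95) & (g > 40) & (b > 20) & (r > g) & (g > b) & (r - b > 15)

def head_bbox(rgb):
    """Bounding box (x0, y0, x1, y1) of skin-colored pixels, or None."""
    ys, xs = np.nonzero(skin_mask(rgb))
    if len(xs) == 0:
        return None
    return (xs.min(), ys.min(), xs.max(), ys.max())
```

The cropped box from each camera is then the per-view head image handed to step three's reconstruction.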



Each of these images of the individual's head is passed (just once) to the main system. Using a technique called inverse projection, the server creates a 3D reconstruction of the head by back-projecting each camera image onto a head-shaped mesh centered on the viewer. The resulting textured mesh is a virtual model of the head of the person who entered the room.
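
Back-projection can be sketched as: generate the mesh points, project each one into a camera, and sample the image there. This minimal version assumes a spherical head mesh and a single camera with a caller-supplied projection function (a stand-in for real calibration); a full implementation would blend samples from all eight cameras weighted by visibility:

```python
import numpy as np

def sphere_points(center, radius, n_lat=8, n_lon=16):
    """Vertices of a latitude/longitude sphere approximating the head mesh."""
    lat = np.linspace(-np.pi / 2, np.pi / 2, n_lat)
    lon = np.linspace(-np.pi, np.pi, n_lon, endpoint=False)
    lat, lon = np.meshgrid(lat, lon, indexing="ij")
    x = radius * np.cos(lat) * np.cos(lon) + center[0]
    y = radius * np.cos(lat) * np.sin(lon) + center[1]
    z = radius * np.sin(lat) + center[2]
    return np.stack([x, y, z], axis=-1)

def backproject(points, image, project):
    """Sample `image` at each mesh point's projection (nearest neighbor).

    `project` maps a 3D point to (u, v) pixel coordinates — a hypothetical
    calibrated camera model. Points projecting outside the frame get NaN,
    to be filled in by other cameras' views.
    """
    h, w = image.shape[:2]
    tex = np.full(points.shape[:2] + (3,), np.nan)
    for i in range(points.shape[0]):
        for j in range(points.shape[1]):
            u, v = project(points[i, j])
            ui, vi = int(round(u)), int(round(v))
            if 0 <= ui < w and 0 <= vi < h:
                tex[i, j] = image[vi, ui]
    return tex
```

Repeating `backproject` for each wall camera and merging the per-camera textures yields the complete spherical texture described above.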