Live Stream from 3 Cameras

3D Scene Overview

Swarm Vision, 2013
3 custom designed rails each with Sony PTZ camera, custom software animation, Apple MacPro, 2 projectors (Panasonic PT-DZ6710U or equivalent) or 2 HD large screens, dimensions variable

George Legrady, Marco Pinter, Danny Bazo

Swarm Vision explores the translation of rules of human photographic behavior into machine language. Initiated by research in autonomous swarm-robotic camera behavior, Swarm Vision is an installation consisting of multiple pan-tilt-zoom cameras on rails positioned above spectators in an exhibition space, where each camera behaves autonomously based on selected rules of computer vision that simulate aspects of how human vision functions. Each camera is programmed to detect visual information of interest based on a separate algorithm, and each negotiates with the other two, collectively influencing what subject matter to study.

Viewers can perceive both the individual robotic camera behaviors (microcosmic) and their relationships to one another (macrocosmic) on two large screens. Visual fragments of spectators who enter the viewing space populate the images, leaving an imprint of their presence that is erased over time as the stream of new images replaces the older ones.


[View camera on rails video here]

Robotic Cameras on Rails
Three cameras, mounted on motorized rails and positioned in a horseshoe layout on three walls of the gallery space, scan the room for visual information of interest as defined by computer vision algorithms. Although each camera functions autonomously, they compare and respond to each other's results. When one camera identifies subject matter of high saliency for its particular algorithm, the other cameras suspend their searches and converge to examine the identified subject of interest from their different angles.
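The negotiation described above can be sketched in a few lines of Python. This is a minimal illustration, not the installation's actual code: the saliency threshold, the detection format, and the winner-takes-all convergence rule are all assumptions made for the example.

```python
# Hypothetical parameter: the text does not publish the actual
# threshold above which the cameras converge on one subject.
SALIENCY_THRESHOLD = 0.8

class Camera:
    def __init__(self, name):
        self.name = name
        self.target = None  # point in the room the camera is examining

    def best_detection(self, detections):
        """Return this camera's most salient detection, or None."""
        if not detections:
            return None
        return max(detections, key=lambda d: d["saliency"])

def negotiate(cameras, detections_per_camera):
    """One negotiation step: if any camera finds a subject of high
    saliency, all cameras suspend their searches and converge on it;
    otherwise each camera keeps tracking its own best find."""
    best = None
    for cam, dets in zip(cameras, detections_per_camera):
        found = cam.best_detection(dets)
        if found and (best is None or found["saliency"] > best["saliency"]):
            best = found
    if best and best["saliency"] >= SALIENCY_THRESHOLD:
        for cam in cameras:
            cam.target = best["location"]  # converge from different angles
    else:
        for cam, dets in zip(cameras, detections_per_camera):
            found = cam.best_detection(dets)
            cam.target = found["location"] if found else None
    return best

cams = [Camera("A"), Camera("B"), Camera("C")]
detections = [
    [{"saliency": 0.3, "location": (1, 2)}],
    [{"saliency": 0.95, "location": (4, 4)}],  # a high-saliency subject
    [],
]
best = negotiate(cams, detections)
print(best["location"])                        # -> (4, 4)
print(all(c.target == (4, 4) for c in cams))   # -> True
```

In this toy version one high-saliency detection overrides everything else; the real system presumably weighs each camera's algorithm-specific results against the others in a richer way.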

[View 3 Cameras in 3D spatialized screen here]

Visualization Screens
In the installation, four visualizations are featured on two screens/projections. The first screen features what each of the three cameras "sees": a depiction of what their vision algorithms are currently processing. The second screen shows an overview in a 3D reconstruction of the environment, featuring a live video stream of the cameras' locations and of the images they generate. Each camera continuously produces ten still frames per second and fills the 3D space with up to a hundred images per camera, resulting in a volumetric form of layered, stacked photographs that continuously changes as images fade away. The images' sizes and locations are determined by the locations and poses of the cameras, as well as their focal planes and focus locations at a given moment. The fourth visualization features the sum of activities, situating all generated images and the three camera locations within a reduced virtual 3D spatial reconstruction of the exhibition space.
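The per-camera image stack described above can be sketched as a bounded, fading buffer. This is a hedged illustration: the hundred-image cap and ten-frames-per-second rate come from the text, but the fade rate, the data layout, and the use of a `deque` are assumptions for the example.

```python
from collections import deque

MAX_IMAGES_PER_CAMERA = 100  # the text states up to a hundred images per camera
FADE_STEP = 0.01             # hypothetical per-frame opacity decay

class ImageStack:
    """Layered photographs one camera contributes to the virtual 3D
    scene; old images fade out and are dropped as new ones arrive."""
    def __init__(self):
        self.images = deque(maxlen=MAX_IMAGES_PER_CAMERA)

    def add_frame(self, pose, focal_plane):
        # Each image is placed in the virtual room according to the
        # camera's pose and focus distance at the moment of capture.
        self.images.append({"pose": pose, "plane": focal_plane, "alpha": 1.0})

    def fade(self):
        # Decay every image's opacity and drop fully transparent ones.
        for img in self.images:
            img["alpha"] = max(0.0, img["alpha"] - FADE_STEP)
        self.images = deque(
            (i for i in self.images if i["alpha"] > 0.0),
            maxlen=MAX_IMAGES_PER_CAMERA,
        )

stack = ImageStack()
for t in range(250):  # ~25 seconds at ten frames per second
    stack.add_frame(pose=(t, 0, 0), focal_plane=t * 0.1)
    stack.fade()
print(len(stack.images))  # stays within the 100-image cap
```

The `deque(maxlen=...)` means the oldest image is silently evicted once the cap is reached, matching the description of new images replacing older ones even before they have fully faded.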

Danny Bazo has a background in film studies, engineering, and robotics; his contributions include building most of the custom hardware and developing the software. Marco Pinter's background is in dance performance and kinetic artworks; his contribution to the project also includes his expertise in live video technology, robotics, and telepresence. George Legrady is project manager and brings conceptual direction based on his background in photography, conceptual art, and interactive digital media installations.

An MAT ExpVisLab project. Funded by the Robert W. Deutsch Foundation and National Science Foundation grant IIS #1149001.