A DISTRIBUTED MULTIPLE CAMERA SURVEILLANCE SYSTEM (p.107-108)
T. Ellis, J. Black, M. Xu and D. Makris
Digital Imaging Research Centre (DIRC), Kingston University, UK
An important capability of an ambient intelligent environment is the capacity to detect, locate and identify objects of interest. In many cases the objects of interest can move, and meaningful interaction depends on capturing and tracking that motion. Doing so creates a perceptually-enabled interface, capable of understanding and reacting to a wide range of actions and activities. CCTV systems fulfil an increasingly important role in the modern world, providing live video access to remote environments. Whilst the role of CCTV has been primarily focused on rather specific surveillance and monitoring tasks (e.g. security and traffic monitoring), the potential uses cover a much wider range.
Video security surveillance systems have proliferated over the past 5-10 years in both public and commercial environments, and are extensively used to remotely monitor activity in sensitive locations and publicly accessible spaces. In town and city centres, surveillance has been acknowledged to result in significant reductions in crime. However, in order to provide comprehensive and large area coverage of anything but the simplest environments, a large number of cameras must be employed.
In complex and cluttered environments with even moderate numbers of moving objects (e.g. 10-20), the problem of tracking individual objects is significantly complicated by occlusions in the scene, where an object may be partially occluded or may disappear completely from the camera view for short or extended periods of time. Static occlusion results from objects moving behind (with respect to the camera) fixed elements in the scene (e.g. walls, bushes), whilst dynamic occlusion occurs when moving objects in the scene occlude each other, so that targets may merge or separate (e.g. a group of people walking together).
Information can be combined from multiple viewpoints to improve reliability, particularly taking advantage of the additional information where it minimises occlusion within the field-of-view (FOV). We treat the non-visible regions between camera views as simply another type of occlusion, and employ spatio-temporal reasoning to match targets moving between cameras that are spatially adjacent. The "boundaries" of the system represent locations from which previously unseen targets can enter the network.
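The spatio-temporal matching described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the `Event` record, the `TRANSIT_WINDOWS` table, the camera names and the transit-time values are all hypothetical, and the sketch assumes that adjacency between cameras can be summarised by a minimum and maximum expected transit time through the non-visible region.

```python
from dataclasses import dataclass

# Hypothetical adjacency table (an assumption, not from the chapter):
# (from_camera, to_camera) -> (min, max) expected transit time in seconds.
TRANSIT_WINDOWS = {
    ("cam1", "cam2"): (2.0, 10.0),
    ("cam2", "cam3"): (5.0, 20.0),
}


@dataclass(frozen=True)
class Event:
    """A target leaving or entering a camera field-of-view."""
    target_id: int
    camera: str
    time: float  # seconds


def candidate_matches(exit_event, entry_events, windows=TRANSIT_WINDOWS):
    """Return the entry events that could plausibly continue exit_event.

    An entry is kept only if its camera is listed as adjacent to the exit
    camera and the time gap lies inside the expected transit window.
    """
    matches = []
    for entry in entry_events:
        window = windows.get((exit_event.camera, entry.camera))
        if window is None:
            continue  # the two cameras are not spatially adjacent
        shortest, longest = window
        gap = entry.time - exit_event.time
        if shortest <= gap <= longest:
            matches.append(entry)
    return matches
```

In a fuller system the surviving candidates would presumably be ranked using further cues such as appearance; the sketch only shows the spatial and temporal gating step.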
Robust tracking across the camera network requires the system to maintain a record of each target from the moment it enters the system and for as long as it remains there. When a target disappears from any camera FOV, motion prediction, colour identification, and learnt route patterns are used to re-establish tracking when the target reappears. Each target is maintained as a persistent object in the active database, and spatial and temporal reasoning are used to detect disappearances and reappearances and to ensure that entries are not retained for indefinite periods.
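As a rough illustration of how entries might be kept from persisting indefinitely, the sketch below keeps a last-seen timestamp per target and purges targets that have gone unseen for too long. The class name `ActiveTargetStore`, its methods and the timeout value are assumptions made for this example; the chapter's database uses spatial as well as temporal reasoning, which a simple timeout does not capture.

```python
class ActiveTargetStore:
    """Hypothetical in-memory stand-in for an active target database."""

    def __init__(self, max_unseen_seconds=60.0):
        # Assumed policy: drop a target once it has been unseen this long.
        self.max_unseen = max_unseen_seconds
        self._last_seen = {}  # target_id -> time of most recent observation

    def observe(self, target_id, timestamp):
        """Record that a target was seen by some camera at the given time."""
        self._last_seen[target_id] = timestamp

    def purge(self, now):
        """Remove targets unseen for longer than the timeout; return their ids."""
        expired = [tid for tid, seen in self._last_seen.items()
                   if now - seen > self.max_unseen]
        for tid in expired:
            del self._last_seen[tid]
        return sorted(expired)

    def active_ids(self):
        """Return the ids of targets still held as active, in sorted order."""
        return sorted(self._last_seen)
```

For example, with a 30-second timeout, a target last seen at t = 0 s is purged at t = 40 s, while one last seen at t = 25 s is retained.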
This chapter describes a multi-camera surveillance network that can detect and track objects (principally pedestrians and vehicles) moving through an outdoor environment. The remainder of this chapter is divided into four sections. The first describes the architecture of our multi-camera surveillance system. The second considers the image analysis methods for detecting and tracking objects within a single camera view. The next section deals with the integration of information from multiple cameras. The final section describes the structure of the database.