The Personal Media Management (PERMM) project aims to provide a coherent set of tools and a storage architecture for managing archives of digital media, specifically digital still images and video. The project follows on from the DART project, developing further the image archiving and retrieval work done on the ShoeBox digital picture storage product. The work on still image archiving and retrieval is being actively pursued in the lab, while the more speculative video retrieval side of the project is being researched by CASE students in the Engineering Lab at Cambridge University and at Imperial College.
The still image retrieval side of the PERMM project has as its goal a natural language query interface for large, unfamiliar collections of images. This works as follows. Image processing is used to segment an image and provide a parametric description of the colour, shape and texture properties of the resulting image regions. These region properties are passed to a series of neural-net-based classifiers which recognise a set of classes of visual 'stuff': skin, sky, cloud, snow, sand, tarmac, water, wood, trees, grass, cloth and interior wall. Image retrieval is currently possible using an area-based metric and either a graphical query composition tool or similarity to a set of seed images. Hard constraints on the existence of a particular region within an image may also be included explicitly in the composed query. The currently active area of research is extending the competence of the image processing and stuff recognition model to cope with generic classes of compound object, rather than simple categories of visual stuff, and developing a well-founded image description language including a grammar for compounding descriptive terms.
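The pipeline above (segment, classify regions into stuff classes, rank by an area-based metric subject to hard constraints) can be sketched in miniature. Everything below is an illustrative assumption: the data structures, the use of histogram intersection as the area-based metric, and the constraint format are all plausible stand-ins, not the system's actual implementation.

```python
# Illustrative sketch only: per-image "stuff" area fractions, one
# plausible area-based similarity metric, and hard existence
# constraints. None of these names come from the PERMM system itself.

STUFF_CLASSES = ["skin", "sky", "cloud", "snow", "sand", "tarmac",
                 "water", "wood", "trees", "grass", "cloth", "interior wall"]

def area_vector(regions):
    """Collapse classified regions into per-class area fractions.

    `regions` is a list of (class_name, area_fraction) pairs, i.e. the
    assumed output of segmentation followed by per-region classification."""
    v = {c: 0.0 for c in STUFF_CLASSES}
    for cls, frac in regions:
        v[cls] += frac
    return v

def similarity(query_v, image_v):
    """Histogram intersection over class-area vectors: one simple
    choice of area-based metric (an assumption, not the actual one)."""
    return sum(min(query_v[c], image_v[c]) for c in STUFF_CLASSES)

def retrieve(query_regions, archive, constraints=()):
    """Rank archive images by similarity to the composed query.

    `constraints` holds hard constraints such as ("water", 0.2),
    meaning water must cover at least 20% of the image; images that
    fail any constraint are excluded outright."""
    qv = area_vector(query_regions)
    results = []
    for name, regions in archive.items():
        iv = area_vector(regions)
        if all(iv[cls] >= min_frac for cls, min_frac in constraints):
            results.append((similarity(qv, iv), name))
    return [name for _, name in sorted(results, reverse=True)]
```

With a toy archive, a query composed of sky and water regions plus a hard water constraint excludes an all-forest image entirely and ranks a beach scene first, mirroring the behaviour of a composed query with an explicit region-existence constraint.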
To showcase the image processing and retrieval technology developed in the PERMM project, the ICON (Image Content based Organisation and Navigation) system has been written. This is a Java image browsing and retrieval tool (the ICON client) that allows thumbnail browsing of images in the local file system and, on request, exports images to an archiving server (the ICON server). The ICON server processes the images and makes them searchable by image content. The ICON client contains a sophisticated visual query composition interface incorporating a visual thesaurus as well as hard region property constraints.
The video archiving and retrieval side of the project is not being actively pursued in the lab, but through CASE awards to PhD students. The hope is that algorithms for motion tracking in video will lead to automated annotation of motion events in video and, through still keyframe selection, to a natural language searchable annotation system for video. By-products of motion event tracking in video are likely to include mosaicing and smart editing facilities which will allow semi-automatic separation and removal of foreground objects, advert substitution and reconstruction of the geometry of the scene.
The image segmentation code developed for the image retrieval project has been used to form the basis of a 'smart' image editor. This has been christened the CVIeditor (Computer Vision Image editor), as it aims to make some of the developments in computer vision available to the public. The editor is written in Java (naturally cross-platform and therefore available to the Unix/Linux community) and is a departure from existing image editors in being fundamentally region-based.
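The essence of a region-based editor is that a single click selects a whole segmented region, and edits then apply to every pixel of that region rather than to a hand-drawn selection. A minimal sketch of that idea, under the assumption that segmentation has already produced a per-pixel label map (the function names and the flat-colour fill are illustrative, not CVIeditor's API):

```python
# Minimal sketch of region-based editing over a segmentation label
# map. A click picks a region id; an edit is applied region-wide.
# All names here are hypothetical, for illustration only.

def select_region(labels, x, y):
    """Return the region id of the segment under the clicked pixel."""
    return labels[y][x]

def fill_region(image, labels, region_id, colour):
    """Apply an edit (here a flat colour fill) to every pixel whose
    label matches the selected region, leaving other regions intact."""
    for img_row, lab_row in zip(image, labels):
        for i, lab in enumerate(lab_row):
            if lab == region_id:
                img_row[i] = colour
    return image
```

The design choice this illustrates is that the selection boundary comes from the segmenter, not the user, which is what distinguishes a region-based editor from conventional pixel- or lasso-based tools.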
The following articles are available describing image segmentation and aspects of the ICON system:
Voronoi seeded image segmentation
Content Based Image Retrieval using Semantic Visual Categories
Ontological Query Language for Content Based Image Retrieval
© AT&T Laboratories Cambridge, 2001