The ICON system aims to let users view, navigate, and search collections of digital images easily and effectively. It combines an intuitive, cross-platform, thumbnail-based user interface with powerful image processing and content description functionality to facilitate automated organisation and retrieval of large heterogeneous image sets based on both meta data and visual content.
To adapt to the varying demands of home users, professional photographers, and commercial image collections, the system is designed to be inherently flexible and extensible. A client-server split and an object-oriented design provide layers of abstraction and encapsulation, allowing the system to be customised for a given application domain to meet specific user requirements. ICON currently consists of two parts:
In order to provide a quick and convenient means of viewing and manipulating images via the ICON client, the user is presented with a familiar directory tree view onto the local and remote filesystems. With a single mouse click the client can be made to scan the directory structure to search for images, create thumbnails, and extract meta data such as digital camera settings and annotations.
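As a rough illustration of the scanning step described above, the following Python sketch walks a directory tree and collects image files by extension. The function name `find_images` and the extension list are assumptions made for this example; the formats ICON recognises are not stated here, and thumbnail creation and meta data extraction are left out.

```python
import os

# Assumed set of image extensions; the formats ICON actually handles are not stated.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".bmp"}


def find_images(root):
    """Walk the directory tree under `root` and return the sorted paths of image files."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare extensions case-insensitively so that "IMG_01.JPG" is included.
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Calling `find_images` on a folder returns every matching file beneath it, which a client could then pass on to thumbnail and meta data routines.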
Images can then be exported to the repository, both for archiving and so that visual content descriptors can be generated. The repository stores meta data and images on a per-user basis, but it also supports collaborative access and allows pictures and associated data to be re-imported into the ICON client. Each user may access the repository through an arbitrary number of clients running on any machine or operating system with a current version of the Java runtime environment.
Both images stored locally and those in the repository can be browsed and organised according to meta data and visual properties rather than just on a per-directory basis. The ICON client provides a range of methods for easy navigation of potentially very large image sets, including sophisticated clustering and visualisation functionality.
Our research effort aims to make robust content-based image retrieval (CBIR) of general digital images a reality. The image analysis carried out by the ICON repository segments pictures into regions with associated visual properties and uses neural network classifiers to label these regions probabilistically with semantic terms corresponding to visual categories such as grass, sky, and water.
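To make the idea of a probabilistic labelling concrete, the sketch below turns raw per-category classifier scores for one region into probabilities using a softmax. This is an illustrative assumption only: the output layer of the ICON classifiers is not described here, and the function name `label_probabilities` and the example scores are invented for the example.

```python
import math

# Categories named in the text; the full category set used by ICON is not given.
CATEGORIES = ["grass", "sky", "water"]


def label_probabilities(scores):
    """Convert raw per-category classifier scores into probabilities via a softmax."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return {cat: e / total for cat, e in zip(CATEGORIES, exps)}


# Made-up scores for a single image region.
region_scores = [2.0, 0.5, -1.0]
probs = label_probabilities(region_scores)
most_likely = max(probs, key=probs.get)
```

Each region then carries a distribution over categories rather than a single hard label, so a query can match regions that are only probably grass or sky.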
The ICON client allows image databases to be searched according to meta data (e.g. picture date and digital camera make), annotations, and classified image content. Queries can be formulated in a number of different ways to cater for a range of different retrieval needs and levels of detail in the user's conceptualisation of desired image material. A query may comprise one or several of the following elements:
The user may assign different weights to the various elements that comprise a query and can choose from a set of similarity metrics to specify the emphasis that is to be placed on the relative localisation of target content within images and overall compositional aspects.
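One simple way to picture user-assigned weights is a weighted average of per-element similarity scores. The sketch below assumes each query element yields a score in [0, 1] for every candidate image; the element names, the scores, and the functions `combined_score` and `rank_images` are hypothetical and do not reflect the similarity metrics ICON actually offers.

```python
def combined_score(element_scores, weights):
    """Weighted average of per-element similarity scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in element_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(weights[name] * score for name, score in element_scores.items())
    return weighted_sum / total_weight


def rank_images(candidates, weights):
    """Return image identifiers ordered from best to worst combined score."""
    return sorted(
        candidates,
        key=lambda image_id: combined_score(candidates[image_id], weights),
        reverse=True,
    )


# Made-up similarity scores for two images and two query elements.
candidates = {
    "beach.jpg": {"metadata": 0.2, "content": 0.9},
    "lawn.jpg": {"metadata": 0.9, "content": 0.3},
}
content_first = rank_images(candidates, {"metadata": 1.0, "content": 3.0})
metadata_first = rank_images(candidates, {"metadata": 3.0, "content": 1.0})
```

With these numbers, shifting weight from content to meta data reverses the ranking of the two images, which is the kind of control the weighting is meant to give the user.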
Despite its sophistication, the retrieval system is easy to use, and simple queries can be created very rapidly. The search process is also interactive: after an initial search, users can provide relevance feedback by selecting a few relevant or non-relevant images, which causes the query elements to be re-weighted to match the user's retrieval requirements and expectations.
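The re-weighting step can be illustrated with a simple heuristic: raise the weight of query elements that score higher on the images marked relevant than on those marked non-relevant, and lower the others. This is a sketch under that assumption, not the update rule used by ICON; the function `reweight`, the `rate` parameter, and all scores are hypothetical.

```python
def mean(values):
    """Arithmetic mean, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def reweight(weights, relevant, non_relevant, rate=0.5):
    """Shift weight towards elements that separate relevant from non-relevant images.

    `relevant` and `non_relevant` are lists of per-image score dictionaries,
    keyed by the same element names as `weights`.
    """
    updated = {}
    for name, weight in weights.items():
        gain = mean([s[name] for s in relevant]) - mean([s[name] for s in non_relevant])
        updated[name] = max(0.0, weight + rate * gain)  # weights never go negative
    total = sum(updated.values())
    if total == 0:
        return dict(weights)  # nothing useful learned; keep the old weights
    return {name: w / total for name, w in updated.items()}  # normalise to sum to 1


# Made-up feedback: the user marked two images as relevant and one as non-relevant.
weights = {"metadata": 0.5, "content": 0.5}
relevant = [{"metadata": 0.2, "content": 0.9}, {"metadata": 0.3, "content": 0.8}]
non_relevant = [{"metadata": 0.8, "content": 0.2}]
new_weights = reweight(weights, relevant, non_relevant)
```

In this example the content element distinguishes the relevant images well, so its weight grows at the expense of the meta data element.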
In order to enable retrieval of images based on their visual properties and semantically labelled content, ICON performs a number of pre-processing and image analysis stages on images exported to the repository:
The choice of visual categories such as grass or water, which mirror aspects of human perception, allows the implementation of intuitive and versatile query composition methods while greatly reducing the search space. Through the relationship graph representation of regions, we can make the matching of clusters of regions invariant with respect to displacement and rotation, whereas the grid pyramid representation caters for a comparison of absolute position and size. Together these may be regarded as an intermediate-level representation which does not preclude additional stages of visual inference and composite object recognition in light of query-specific saliency measures and the integration of contextual information.
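The contrast between the two representations can be shown with a toy example. Below, a set of region centroids is described once by its sorted pairwise distances (a stand-in for a relational description, which is unchanged by rotation and displacement) and once by the grid cells the centroids occupy at each pyramid level (which depend on absolute position). Both descriptors are simplifications assumed for illustration; the actual ICON graph and grid pyramid structures are not specified here, and region size is ignored.

```python
import math
from itertools import combinations


def pairwise_distances(centroids):
    """Sorted distances between all centroid pairs; unchanged by rotation and translation."""
    return sorted(math.dist(a, b) for a, b in combinations(centroids, 2))


def grid_cells(centroids, levels=2):
    """Occupied grid cells per pyramid level, where level k divides the unit square into 2**k x 2**k cells."""
    pyramid = []
    for level in range(1, levels + 1):
        n = 2 ** level
        cells = sorted((min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in centroids)
        pyramid.append(cells)
    return pyramid


def transform(centroids, angle, shift):
    """Rotate centroids about the origin by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in centroids]


# Made-up centroids in normalised image coordinates, and a rotated, shifted copy.
regions = [(0.1, 0.1), (0.3, 0.1), (0.1, 0.3)]
moved = transform(regions, math.pi / 2, (0.9, 0.25))

same_relations = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(pairwise_distances(regions), pairwise_distances(moved))
)
same_cells = grid_cells(regions) == grid_cells(moved)
```

Here `same_relations` holds while `same_cells` does not, which mirrors the division of labour described above: the relational description matches the same arrangement wherever it appears, and the grid description distinguishes where in the image it appears.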
Click the links below to see some screen shots of the ICON client (these pages may require some time to load over a slow connection).
© AT&T Laboratories Cambridge, 2001