Sentient Computing: using sensors and resource status data to maintain a model of the world which is shared between users and applications.
What could we do if computer programs could see a model of the world? By acting within the world, we would be interacting with programs via the model. It would seem to us as though the whole world were a user interface.
Here is a diagram showing what we are trying to achieve. While people can observe and act on the environment directly, application programs observe and act on the environment via the world model, which is kept up to date using sensors and provides an interface to various actuators. If the terms used by the model are natural enough, people can interpret their perceptions of the world in terms of the model, and it appears to them as though they and the computer programs share a perception of the real world.
These pictures show how well our model corresponds to the real world. On the left is a photograph of a real-life situation -- on the right is a screen shot from a real-time 3D display of the model, as it was when the photograph was taken. (Notice that the model is representing the phone in red because it has determined from its interface with the telephone switch that the phone is off the hook).
The technological challenges for our system are: creating an accurate and robust sensor system which can detect the locations of objects in the real world; integrating, storing and distributing the model's sensor and telemetry information to applications so that they get an accurate and consistent view of the model; and finding suitable abstractions for representing location and resource information so that the model is usable by application programs and also comprehensible by people.
To solve these problems we have built an ultrasonic location system, which provides a location accuracy of about 3cm throughout our 10000 square foot building, making it the most accurate large-scale wireless sensor system in the world; we have built a distributed object model, which integrates location and resource status data for all the objects and people in our building; and we have built a spatial monitoring system to implement an abstraction for expressing spatial relationships, which lets applications detect spatial relationships between objects in a way that seems natural to human users.
The location sensor system uses small (8cm long) devices called bats, each of which has a unique id, an ultrasound transmitter and a radio transceiver. The bats are located by a central controller, and the world model stores the correspondence between bats and their owners, applying type-specific filtering algorithms to the bat location data to determine the location of the person or object which owns each bat.
|A bat, showing from right to left: two copper coiled antennae, the radio transmitter module (in gold, receiver underneath), the AA battery (the large white and green object) and two ultrasonic transmitters. Total length of the device is about 2.5 inches.|
To locate a bat, the central controller sends its unique id over the radio channel, simultaneously resetting counters in a network of ultrasound receivers hidden in the ceiling. When a bat detects its id, it sends an ultrasonic pulse, which is picked up by some of the ceiling receivers. From the times of flight of the pulse from the bat to each of the receivers that detected it, the system can calculate the 3D position of the bat, to an accuracy of about 3cm.
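The geometry behind this calculation can be sketched in a few lines. The fragment below is purely illustrative, not the production code: it assumes the receivers lie in a flat ceiling plane, uses a nominal speed of sound, and solves for the bat's horizontal position from linearised range equations before recovering its height from a single range.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate at room temperature

def locate_bat(receivers_xy, ceiling_z, times_of_flight):
    """receivers_xy: (x, y) positions of ceiling receivers in the plane
    z = ceiling_z. times_of_flight: seconds from the radio trigger to
    pulse arrival at each receiver. Returns an estimated (x, y, z)."""
    d = [SPEED_OF_SOUND * t for t in times_of_flight]
    (x0, y0), d0 = receivers_xy[0], d[0]
    # Subtract the first range equation from each of the others: the
    # quadratic terms cancel, leaving equations linear in (x, y).
    rows = []
    for (xi, yi), di in zip(receivers_xy[1:], d[1:]):
        a = 2 * (xi - x0)
        b = 2 * (yi - y0)
        c = d0 * d0 - di * di + xi * xi + yi * yi - x0 * x0 - y0 * y0
        rows.append((a, b, c))
    # Two independent rows suffice for (x, y); a real system would use
    # least squares over every receiver that heard the pulse.
    (a1, b1, c1), (a2, b2, c2) = rows[0], rows[1]
    det = a1 * b2 - a2 * b1
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    # Recover the height from one range; take the solution below the
    # ceiling (the bat cannot be above it).
    dz2 = d0 * d0 - (x - x0) ** 2 - (y - y0) ** 2
    z = ceiling_z - math.sqrt(max(dz2, 0.0))
    return (x, y, z)
```

Note the need to break the mirror symmetry about the ceiling plane by hand: with all receivers coplanar, the times of flight alone admit two solutions, one above and one below the ceiling.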
The bat also contains a pair of buttons and a beeper which can be used to provide context-sensitive control and feedback.
The location and resource status data are represented by a set of persistent CORBA objects implemented using omniORB, our own GPL-ed CORBA ORB. Each real-world object is represented by a corresponding CORBA software object. Around 40 different types of object are modelled, each by its own CORBA object type; for example, the system includes types such as Person, Computer, Mouse, Camera, Scanner, Printer and Phone.
As well as the location of the real object, each software object makes available the real object's current properties and provides an interface to control it. For example, a scanner can be set up and made to perform a scan by an application via its corresponding Scanner software object.
The set of all these persistent CORBA objects makes up the world model that is seen by applications. The objects themselves take care of transactions, fail-over, session management, event distribution and all the other issues implicit in a large-scale distributed system, presenting a simple programming interface to the applications.
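The real interfaces are defined in CORBA IDL and served by omniORB; the Python sketch below is purely illustrative (every name and field in it is our invention, not the actual IDL), but it conveys the flavour of the programming model: each software object carries the real object's location and properties, distributes events to interested applications, and offers control operations.

```python
from dataclasses import dataclass, field

@dataclass
class WorldObject:
    """Base for all modelled objects: a filtered location plus a simple
    event-distribution hook (stand-in for the real event mechanism)."""
    name: str
    position: tuple = (0.0, 0.0, 0.0)
    listeners: list = field(default_factory=list)

    def move_to(self, position):
        self.position = position
        for callback in self.listeners:
            callback(self, position)   # notify subscribed applications

@dataclass
class Scanner(WorldObject):
    """One modelled type; applications drive the real scanner through it."""
    resolution_dpi: int = 300
    colour: bool = False

    def configure(self, resolution_dpi, colour):
        self.resolution_dpi, self.colour = resolution_dpi, colour

    def scan(self):
        # The real object would drive the networked scanner and return
        # image data; here we just report what would happen.
        mode = "colour" if self.colour else "monochrome"
        return f"scanned {mode} at {self.resolution_dpi} dpi"
```

An application would obtain such an object by name from the world model, subscribe to its events, and invoke its operations without caring which machine serves it.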
Location data is transformed into containment relations by the spatial monitor. Objects can have one or more named spaces defined around them. A quad-tree based indexing method is used to quickly determine when spaces overlap, or when one space contains another, and applications are notified using a scalable event mechanism.
|A map of one of our offices, showing visibility spaces around computers, and usage spaces around people. The red shading indicates a containment state.|
Containment relationships between 2D geometric shapes are a good way of formalising vague spatial relationships. Simpler abstractions fail to capture complexities in the environment which are obvious to the user, while more sophisticated ones risk being too complex for the user to understand. It turns out that people are very well-suited to reasoning about and remembering 2D geometric shapes.
This abstraction also works well for application programmers because they can use traditional GUI programming styles, treating spaces around objects as though they were buttons on a computer screen.
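For illustration, here is a much-simplified sketch of the containment test. The real monitor indexes arbitrary 2D spaces with a quad-tree; in this hypothetical fragment spaces are circles, the index is omitted, and the pairwise scan stands in for the pruned traversal:

```python
import math

class Space:
    """A named 2D space around an object, e.g. a visibility space
    around a computer or a usage space around a person."""
    def __init__(self, name, owner, cx, cy, radius):
        self.name, self.owner = name, owner
        self.cx, self.cy, self.radius = cx, cy, radius

def overlaps(a, b):
    """True if two spaces intersect, e.g. a person's usage space
    entering a computer's visibility space."""
    return math.hypot(a.cx - b.cx, a.cy - b.cy) < a.radius + b.radius

def contains(outer, inner):
    """True if `outer` wholly contains `inner`."""
    centre_gap = math.hypot(outer.cx - inner.cx, outer.cy - inner.cy)
    return centre_gap + inner.radius <= outer.radius

def check_events(spaces, notify):
    """Naive O(n^2) scan; pruning this loop is the quad-tree's job."""
    for i, a in enumerate(spaces):
        for b in spaces[i + 1:]:
            if overlaps(a, b):
                notify(a, b)
```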
|A map which helps users contact each other by phone. The user rmc is on the phone, and there is some kind of meeting going on in the small room near the bottom-left of the map.|
One of our applications is this map which helps users contact each other by phone. As people move around the building, their positions are updated in real time. If you want to talk to somebody, just type in their name and the map will zoom to their current location. Work out whether you should interrupt them by seeing who they are with, which way they are facing and whether they are moving -- you will see on the map if they are meeting a visitor, or talking to the boss. If you want to call them, just click on the telephone nearest them to set up a phone call. If they are already on the phone it will be coloured red, but as soon as they put it down it will go grey again.
|Controlling the 3D visualisation of the world model by using the bat as a pointing device.|
In our reception area, we have a 3D visualisation of our world model, again updated in real time with the positions of people and equipment. This can be controlled by using the bat as a pointer. The user can zoom in on any room by simply moving the bat over it and pressing the bat's button. To support this the world model contains a special kind of software object, called a mouse panel, which represents a plane covering the screen, onto which 3D bat locations are projected.
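The projection a mouse panel performs is simple to sketch. In this hypothetical fragment (the names and the plane parameterisation are ours), the panel is described by a 3D corner and two edge vectors, and the bat's position is orthogonally projected into the screen's coordinate frame:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def project_to_panel(bat_pos, panel_origin, u_axis, v_axis):
    """panel_origin: a 3D corner of the screen; u_axis and v_axis: 3D
    vectors along the screen's width and height. Returns (u, v) screen
    coordinates, each in [0, 1] when the bat is over the panel."""
    rel = sub(bat_pos, panel_origin)
    u = dot(rel, u_axis) / dot(u_axis, u_axis)   # orthogonal projection
    v = dot(rel, v_axis) / dot(v_axis, v_axis)
    return (u, v)
```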
The sentient computing system uses our VNC display-remoting software to create computer desktops which follow their owner around. Approaching a display and pressing a button on the bat causes the user's VNC desktop to be moved to that display. So if I'm in another office, but want to show a colleague what I'm working on, I can simply teleport my desktop to his PC, and then teleport away when I'm finished.
Because the system is also integrated with our telephone switch, we can automatically route external telephone calls. Each user has their own direct dial number and, when that number is dialled, their bat's beeper makes a short ringing tone, wherever they are. If the user decides to take the call it is automatically forwarded to the nearest phone.
|Using the Follow Me camera application to support mobile videoconferencing. The remote participant always sees a view of the person they are talking to (inset), and, because he knows which cameras cover which spaces, the local participant always knows which camera to look at.|
Our Follow Me camera application selects a suitable camera to watch a person wherever they go. This makes applications like video conferencing more useful -- if you want to show the other participant a diagram on your whiteboard, you just walk over to the whiteboard and point to it, knowing that a suitable camera will be selected.
Sentient computing can help us to store and retrieve data. Whenever information is created, the system knows who created it, where they were and who they were with. This contextual metadata can support the retrieval of multimedia information.
In our system, each user has an `information hopper', which is a timeline of information which they have created. Two items of information created at the same time will be in the same place in the timeline -- this allows us to associate data items in a composable way, without having to maintain referential integrity. The system knows who the user was with at any point on the timeline, and the timelines of users who were working together can be merged to generate another timeline. This lets us generate records of collaborative work without any maintenance effort, by using the sentient computing system as a kind of ubiquitous filing clerk. The timeline can be browsed using a normal web browser. Here is an example timeline which illustrates some of these points.
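Merging timelines is cheap precisely because each one is already ordered by time. A minimal sketch, with a data layout of our own invention (each entry is a timestamp, an owner and a description), might look like this:

```python
import heapq

def merge_timelines(*timelines):
    """Merge already-sorted per-user timelines into a single record of
    collaborative work, ordered by timestamp."""
    return list(heapq.merge(*timelines, key=lambda entry: entry[0]))
```

Because entries are associated purely by position on the time axis, no cross-references between the users' hoppers need to be maintained.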
Because the sentient computing system creates an interface that extends throughout the environment, we can treat it just like a traditional user interface and create a `button' anywhere in the environment. In a traditional GUI, a button is just an area of the screen, which usually (but not necessarily) has a label associated with it. In a sentient computing system, a button can be a small space anywhere in a building -- again, it may have a label associated with it. Of course, the label need be nothing more than a piece of paper. To press the button, the user just puts the bat on the label and clicks a button on the bat. As a bonus, because the system knows which bat was used, it knows who pressed the button.
We can create a poster with several of these buttons on it -- the poster is a user interface which can be printed out and stuck on the wall.
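Conceptually, such a button is just a small space with a callback, much like a GUI widget. A hypothetical sketch (names are ours):

```python
class SpatialButton:
    """A small labelled space in the world that acts like a GUI button.
    Because clicks arrive tagged with a bat, the handler also learns
    who pressed the button."""
    def __init__(self, label, cx, cy, radius, on_press):
        self.label = label
        self.cx, self.cy, self.radius = cx, cy, radius
        self.on_press = on_press

    def handle_click(self, bat_owner, x, y):
        """Called when a bat button is clicked at world position (x, y).
        Fires the callback if the click falls inside this button's space."""
        if (x - self.cx) ** 2 + (y - self.cy) ** 2 <= self.radius ** 2:
            self.on_press(self.label, bat_owner)
            return True
        return False
```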
|Using a smart poster to control a networked scanner. The highlighted spots on the poster are buttons in the sentient computing system's model of the world. This user is telling the system to scan a document that is lying on the scanner glass. He will then press the 'scan to hopper' button to scan the document to his information hopper.|
Here is one `smart poster' we have created which is used to control a networked scanner. The user can use buttons on the poster to select high-level options, like colour/monochrome, data compression format, resolution and whether to use the sheet feeder, and then scan the document to their information hopper or to their email inbox. Because the system knows who is pressing which button, it knows where to send the scan and it can even remember a user's preferred scanner setup and use that.
|This smart poster controls the phone call forwarding service. The button at the bottom left toggles the service on or off. There is another button over the picture of the bat which makes the bat ring with the sound it will use when the user is called.|
We also use smart posters to control our ubiquitous services -- the one above is the poster which controls our phone forwarding service. A user can turn the service on or off, or even hear an example of the sound the bat will make when they get a phone call.
Smart Posters are a good way of advertising new services: they catch the eye, explain what the service does, and provide a way for users to opt in to it, all on a sheet of paper -- the ultimate thin interface.
We can control other devices using located tags like the bat. We have implemented a video streaming and camera control system using networked MPEG codecs and pan-tilt-zoom controllable cameras.
One of our applications of this is a distributed video bulletin board which lets users create video messages, and organise them into threads, controlling a camera by using a bat as a pointing device. Users can control the camera's pan, tilt and zoom by pointing at things with their bat, or by wearing their bat, letting the camera track them as they move around. Here's an example clip (warning: 40Mbytes), which was recorded using the bulletin board and gives a brief description of what it does. The bulletin board uses standard networked MPEG-1 codec boxes which transmit via IP multicast, so it could also be made to support recording of n-way video conferences.
|Selecting a thread from our video bulletin board.|
Bats are small enough to be attached to most portable devices. Because it knows exactly where people and devices are, the sentient computing system can work out who is using a device at any time.
|Using a digital audio recorder to record a memo which is automatically filed and transcribed using a speaker-dependent transcription service.|
We have implemented two applications which exploit this. Using a standard digital camera which is located using a bat, a user can take photographs which are automatically filed in their information hopper. And using a digital audio recorder, a user can make audio memos which are also automatically filed in their information hopper, together with a textual transcription of what they said. Because we know who was holding the audio memo recorder, we can improve the quality of the text transcription by using a voice model and vocabulary appropriate to that user.
Over the next few years we expect wireless devices and LANs to become more widespread. But without a sentient computing system, the value of a wireless device is limited. There is a widespread assumption that radio devices themselves have some kind of innate sensing capability because useful proximity information can be inferred from radio signal strength. This assumption is incorrect, firstly because multipath effects within buildings greatly distort the relationship between signal strength and distance, and secondly because it fails to take account of environmental discontinuities like walls and floors.
We are investigating the use of sentient computing systems to measure how buildings are used. The movements of personnel and resources are monitored in real time, and used to drive a model of space usage and interactions. The aim is to detect and characterise work events such as meetings, and to provide a wealth of information to building managers and designers.
We have made a recording of the occupation density of our building over the course of a day. Intensity in this animation is proportional to the logarithm of the integral over time of the number of people present, windowed by an exponential decay with a half-life of four minutes. Each frame in the video represents a minute of real time. For comparison, this is the layout of the floors represented in the animation:
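The intensity computation just described can be sketched as follows (parameter names are ours, and the real pipeline applies this per map cell before rendering):

```python
import math

HALF_LIFE_S = 4 * 60                      # four-minute half-life
DECAY_PER_S = math.log(2) / HALF_LIFE_S   # continuous decay rate

def update_cell(accumulator, people_present, dt):
    """Advance one cell of the occupancy map by dt seconds: decay the
    running integral, then add the new occupancy contribution."""
    decay = math.exp(-DECAY_PER_S * dt)
    return accumulator * decay + people_present * dt

def intensity(accumulator):
    """Display intensity proportional to the logarithm of the decayed
    integral (log1p keeps empty cells at zero)."""
    return math.log1p(accumulator)
```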
The sentient computing system we have described here is a real working system. It works throughout our 10000 square foot building here at AT&T Labs in Cambridge, England. Each of our 50 members of staff has a bat, which they use continuously when they are in the building. The system and all its applications are always available apart from a few minutes backing-up time at night.
We are now looking at new applications, refining the software architecture, simplifying the infrastructure installation process, and integrating other sensor technologies into the system. When our next round of enhancements is complete, we aim to have a system which will be easy to install, maintain, and integrate with a site's existing systems.
The most obvious real-world application areas for this technology are in large buildings with highly mobile populations who spend their time generating information, looking at information, using different kinds of equipment and communicating with each other. Examples of such environments include hospitals and large office buildings. Ultimately, we believe that sentient computing provides benefits wherever people have to interact with machines. One day, all user interfaces could work this way.
© AT&T Laboratories Cambridge, 2001