Features

Making tiny computers recognise people? It can be done!

by Guy Kewney | posted on 09 April 2002

Normally, we try to make computers see things the way we do: with a lens and a retina. The result is that computers are very bad at it. You need a brain of human size to do it well, even with binocular vision. Except - there is a better way!

That better way is to use something more like radar - only, using light. You don't form a two-dimensional representation of the 3D world, and then try to reconstruct the 3D world from the image. Instead, you bounce light off everything, and time how long it takes to be reflected. You then have a picture of the world around you in 3D contours.
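
To put numbers on the timing idea, here is a minimal Python sketch. It is not Canesta's code. It assumes only that the light travels out and back at the speed of light, so the one-way distance is half the round trip. The function name distance_from_round_trip is invented for illustration.

```python
# Speed of light in a vacuum, in metres per second (exact by definition).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_seconds: float) -> float:
    """Return the one-way distance in metres for a measured round-trip time."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time cannot be negative")
    # The light covers the distance twice (out and back), hence the division by 2.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


if __name__ == "__main__":
    # Illustrative round-trip times in nanoseconds; not measured values.
    for nanoseconds in (1.0, 3.0, 10.0):
        metres = distance_from_round_trip(nanoseconds * 1e-9)
        print(f"{nanoseconds:5.1f} ns round trip -> {metres:.3f} m")
```

Repeating this measurement for every pixel is what turns a flat image into a contour map.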

The technology will be one of the star exhibits in the new Wireless Ventures exhibition, to be held in Burlingame, California, April 30 and May 1 this year.

Startup innovator Canesta has patented this radar-like method for forming electronic images of nearby objects in three dimensions. Unlike the sensors in digital still and video cameras that see the world in flat images, Canesta's electronic perception technology "can additionally compute the distance from the sensor of every single pixel in the image, in real time."

Intriguingly, many of the management team are from wireless backgrounds, or have been involved in mobile data; they see a wide range of powerful applications in mobile IT for this technology.

This article draws heavily on information available on the company's web site, because until the technology appeared at last month's PC Forum as one of Esther Dyson's "star picks", almost no other data was available on it. However, it's clear that there's something of a gap between the current state of the art and the company vision.

The vision is awesome: "The appeal of Canesta's breakthrough interface technology for mobile devices has attracted top industry talent to our management team," said Nazim Kareemi, Canesta CEO and co-founder. "With James Goldberger's track record in developing partnerships in the key mobile and wireless market, Michael Van Meter's proficiency at streamlining operations and James Spare's proven ability in introducing emerging technologies and products, Canesta now has a solid management foundation in place for business success."

Kareemi sees, eventually, potential for "broad application of electronic perception in three principal categories -- navigation through the environment, identification and cataloging, and human-computer interfaces. For the first time, a wide range of products - from cell phones, PDAs or video games, to automobiles, security systems, and medical instrumentation - will ultimately be able to react to and interact with individuals and the nearby environment in real time, through the medium of sight."

"Navigation applications are those where electronic perception technology enables a machine or its operator to navigate more accurately, easily or safely through its surroundings, such as a car that warns of a potentially dangerous lane change," said Kareemi.

By contrast, he said, identification and cataloging applications are those in which a machine or electronic device constantly assesses the nearby environment, identifying features, objects or changes. "A useful application might be a low cost baby room monitor that could raise an alarm if a child begins to exit a pre-defined area, such as the playpen." Other applications in this category might include building and airport security, law enforcement, or medical diagnosis.

In the human-computer interface, Kareemi imagines two teenagers in the family room playing a video game that senses their body and hand movements, relying on electronic perception rather than mechanical controllers for input. These are a class of "virtual" input devices that include virtual keyboards and gestural control of consumer electronics, two areas where Canesta has done significant research and development since its founding.

However, the reality may be that these applications are still some way in the future. Canesta's first patent, number 6,323,942, shows that the current technology may be substantially less ambitious, according to EE Times, which summarised it:

"The patent describes a chip that contains a 100 x 100-pixel sensor array, a microcontroller, high-speed clock, memory and I/O. The controller triggers regular pulses of light from an LED. The on-chip sensor array consists of pulse detectors, each associated with a high-speed counter linked to the on-board clock. When light hits an individual pixel, the counter information — essentially the light's time of flight — is saved and passed to an on-board controller or digital signal processor that calculates the image map. The patent also describes a second implementation that counts the amount of light over a fixed time, eliminating the counters and clock."

Which is fine, but: "The initial device will only detect images across a short distance of perhaps 18 inches on a flat surface. The company demonstrates the device being used to create a virtual keyboard projected onto a table by a PDA or a cellular phone."
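
The counter scheme in the patent summary, and the 18-inch range, can be put in rough numbers. The Python sketch below is illustrative only: the clock frequency is a hypothetical value (the summary does not give one), the toy counter values are made up, and the function names are invented here.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
METRES_PER_INCH = 0.0254


def round_trip_seconds(range_inches):
    """Time for light to reach a target at range_inches and come back."""
    return 2.0 * range_inches * METRES_PER_INCH / SPEED_OF_LIGHT_M_PER_S


def depth_map_from_counters(counters, clock_hz):
    """Convert per-pixel tick counts into per-pixel distances in metres.

    counters: rows of integer tick counts, or None where no echo was seen.
    clock_hz: assumed counter clock frequency (hypothetical value).
    """
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    # One tick of round-trip time corresponds to this much one-way distance.
    metres_per_tick = SPEED_OF_LIGHT_M_PER_S / (2.0 * clock_hz)
    return [
        [None if ticks is None else ticks * metres_per_tick for ticks in row]
        for row in counters
    ]


if __name__ == "__main__":
    assumed_clock_hz = 10e9  # hypothetical; not a figure from the patent
    print(f"18 in round trip: {round_trip_seconds(18) * 1e9:.2f} ns")
    toy_counters = [[10, 20, None], [30, 15, 5]]  # None = no echo detected
    for row in depth_map_from_counters(toy_counters, assumed_clock_hz):
        print(row)
```

At 18 inches the round trip is only about 3 nanoseconds, which is why the on-chip clock has to be fast: with a slower clock, each counter tick would span a larger slice of depth.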

There are other patents granted, which will take the technology further; but clearly, we aren't looking at a technology which could, say, solve the lawn-mower problem. Robot lawn-mowers are a constant dream of technology designers; they'd sit in the sun, soaking up energy, and then wander around the lawn, ignoring the flowers and trimming the grass until they ran out of power or out of grass to mow.

In reality, they get lost; they can't see where they are, and there aren't enough reference points for their limited intelligences to keep track of. Canesta technology might conceivably make them better at avoiding flower beds, but until the next generation of sensors comes along, these mowers wouldn't be able to find the garden gate, or even to cross a path of paving stones.

Canesta is readying subsets of its technology for incorporation this year by OEMs in the personal electronics area, in the form of application-specific sensor chips, interface software and support services. There's almost no public information available on this yet; products will be announced by Canesta's customers, and till then, they're under wraps.

The sensors work in a manner rather like radar, where the distance to remote objects can be calculated. In radar, it's done by measuring the time it takes a burst of radio waves to make the round trip from a transmitting antenna to a reflective object and back. In the case of these chips, however, a burst of unobtrusive (infra-red) light is transmitted instead. This is of a known frequency, and all other frequencies are ignored, so ambient light doesn't confuse the picture.

Canesta's sensors then have two ways to measure distance. Either they time how long it takes the pulse to reflect back to each "pixel", using high-speed on-chip timers, or else they simply count the number of returning photons -- an indirect measure of the distance.
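
How can a photon count stand in for distance? One common way to model it, offered here as an assumption and not as a description of Canesta's circuit, is a fixed collection window: a pulse of known length is sent, the sensor collects light only while the pulse is being emitted, and the later the echo arrives, the less of it falls inside the window. The Python sketch below assumes a rectangular pulse, a window exactly as long as the pulse, and a second reading of the total returned light to cancel out how reflective the object is. All names and values are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_gated_light(gated, total, pulse_seconds):
    """Estimate distance from the share of an echo caught in a fixed window.

    gated: light collected while the window was open.
    total: all light returned by the same pulse (normalises reflectivity).
    pulse_seconds: pulse width, assumed equal to the window length.
    """
    if pulse_seconds <= 0:
        raise ValueError("pulse_seconds must be positive")
    if total <= 0 or not 0 <= gated <= total:
        raise ValueError("need total > 0 and 0 <= gated <= total")
    fraction_caught = gated / total
    # An echo delayed by t misses t / pulse_seconds of the window.
    delay_seconds = (1.0 - fraction_caught) * pulse_seconds
    return SPEED_OF_LIGHT_M_PER_S * delay_seconds / 2.0


if __name__ == "__main__":
    # Made-up readings: half of the echo arrived inside a 10 ns window.
    print(f"{distance_from_gated_light(50, 100, 10e-9):.3f} m")
```

The appeal of this approach is that it needs no per-pixel timer, only a way of adding up light, at the cost of depending on a clean reference reading.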

The image processing part, therefore, is far simpler than conventional software, which requires huge visual processing capacity. Instead of having to generate the 3D view of the world from multiple 2D images, Canesta's software starts with this contour map, provided "for free" by the hardware.

This makes it possible to embed the application-independent portion of the processing software directly into the chips themselves, without the need for a PC attached. The result is that these chips can be built into cheap, portable devices; they don't need high-power processors, and they don't suck up electrical power either. Canesta describes them as "similar to video chips, but smarter."

Just how much smarter, you can gauge from the fact that Canesta reckons to compute 3-dimensional image maps at more than 50 frames per second. Conventional image technology "can take from several seconds to several minutes to generate a 3-dimensional representation of a single, static frame."

Questions that delegates to Wireless Ventures will be asking include:

what resolution is this visual scan?

what frequency is the infra-red beam?

does it work in a planar scan, or in a spherical one?

how does it cope with reflective surfaces like mirrors?

what standards does it support?

what are the limits of its range, in practice, and in theory?

and, of course, price?

Newswireless Net will report further after the show.