This page discusses the possibility of creating NoBat,
an economical, open source, and portable version of The Sonic Eye (TSE).
Imagine that you had a "guide bat", a companion that (like a microbat
used ultrasonic echolocation
to examine its surroundings.
It would then replay the echoes in a human-friendly fashion (e.g., filtered, slowed down),
so that you could interpret them for yourself.
We're building just such a companion, except that there's no bat
Various navigation aids for the blind are available or under development.
Typically, these gather information using cameras, ultrasonic sensors,
and/or external services (e.g., GPS).
They generally provide audible and/or tactile feedback to the user,
in the form of buzzes, clicks, tones, or spoken words.
Our current approach uses ultrasonic echolocation,
then processes the echoes into the audible range.
So, it's a technologically assisted form of human echolocation.
That said, we expect to incorporate other information sources over time.
Human echolocation is the ability of humans to detect objects in their environment
by sensing echoes from those objects. By actively creating sounds –
for example, by tapping their canes, lightly stomping their feet, snapping their fingers,
or making clicking noises with their mouths –
people trained to orient by echolocation can interpret the sound waves reflected by nearby objects,
accurately identifying their location and size.
This ability is used by some blind people for acoustic wayfinding,
or navigating within their environment using auditory rather than visual cues.
It is similar in principle to active sonar and to animal echolocation,
which is employed by bats, dolphins and toothed whales to find prey.
Most methods used for human echolocation
(e.g., as taught by World Access for the Blind)
are based on direct, audible echoes.
This is convenient, because it requires little or no external assistance.
However, these echoes have low accuracy and precision,
as well as very limited amplitude.
Finally, because they are not digitally processed,
they cannot be augmented, filtered, or transformed.
Performing emission and sensing in the ultrasonic range (e.g., 25-50 kHz)
shortens the wavelengths,
which automatically increases the resulting image resolution.
It also lets us increase the amplitude substantially,
without annoying other humans in the vicinity.
As we process the resulting echoes, we can lower their frequency range,
stretch them out in time, reduce noise, etc.
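To put rough numbers on the wavelength argument above (this is standard acoustics, not a NoBat measurement), compare the wavelength of an audible click to NoBat's ultrasonic band:

```python
# Wavelength sets the smallest feature an echo can resolve.
SPEED_OF_SOUND = 343.0  # m/s in air at ~20 C

def wavelength_mm(freq_hz):
    """Wavelength of sound in air, in millimetres."""
    return SPEED_OF_SOUND / freq_hz * 1000.0

# An audible click (~3 kHz) vs. the ultrasonic band (25-50 kHz):
print(f" 3 kHz: {wavelength_mm(3_000):6.1f} mm")   # ~114 mm
print(f"25 kHz: {wavelength_mm(25_000):6.1f} mm")  # ~13.7 mm
print(f"50 kHz: {wavelength_mm(50_000):6.1f} mm")  # ~6.9 mm
```

So the ultrasonic band can resolve features more than an order of magnitude smaller than audible-range echoes.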
The Sonic Eye
Much of our inspiration comes from The Sonic Eye,
a promising navigation aid for the blind and visually impaired.
It works by allowing the user to recognize the acoustic signatures of staircases, walls, etc.
TSE's output is amazingly informative and human-friendly;
if you haven't already done so, listen to the YouTube video
(preferably wearing stereo headphones).
TSE's inspirations and approach are biomimetic,
based on the echolocation
capabilities of microbats. TSE:
- emits a 3 ms ultrasonic chirp
- captures the reflections in stereo
- slows down the reflections by 25x
- plays them into the user's ears
Without understanding everything that goes on in the user's brain,
we can guess that:
- The temporal position of each reflection indicates its distance.
- The use of stereo allows the user to capture phase information
(and thus, left-right directionality) from the reflected ultrasound.
- The rising frequency of the chirp (25-50 kHz) allows the user to map
the pitch of a reflection to its temporal position in the initial signal.
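The chirp-and-slowdown scheme can be sketched numerically. The 3 ms duration, 25-50 kHz sweep, and 25x slowdown come from the description above; the sample rate is an assumption for illustration only:

```python
import math

SAMPLE_RATE = 250_000            # Hz; assumed, comfortably above 2 * 50 kHz
CHIRP_MS    = 3.0                # from the TSE description
F0, F1      = 25_000.0, 50_000.0 # linear sweep, 25 -> 50 kHz
SLOWDOWN    = 25                 # playback slowdown factor

def chirp():
    """Linear up-chirp: instantaneous frequency rises from F0 to F1."""
    T = CHIRP_MS / 1000.0        # duration in seconds
    n = int(SAMPLE_RATE * T)     # 750 samples at 250 kHz
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        # Phase of a linear chirp: 2*pi*(F0*t + ((F1 - F0)/(2*T))*t^2)
        phase = 2 * math.pi * (F0 * t + (F1 - F0) / (2 * T) * t * t)
        samples.append(math.sin(phase))
    return samples

# Slowing playback by 25x divides every frequency by 25, so the
# 25-50 kHz band lands at 1-2 kHz, well inside human hearing:
print(F0 / SLOWDOWN, F1 / SLOWDOWN)
```

This also shows why 25x is a convenient factor: it drops the whole band into a comfortable, speech-like pitch range.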
The original prototype for TSE was large and rather ungainly:
a backpack and a helmet with large, cup-like auricles
(modeled on bat ears).
However, the developers are working on a much more convenient version,
about the size of a pair of sunglasses, but requiring a cell phone.
In the meantime, the TSE paper
provides enough information
to create an interim platform for experimentation.
So, some friends are helping me to do just that, releasing details as we proceed.
Our hardware and software designs will be open source,
allowing (nay, encouraging!) others to jump in and try things out.
NoBat's design tries to balance minimalism against convenience and flexibility.
For example, we plan to omit the auricles,
because they are ungainly and do not appear to be particularly critical.
On the other hand,
we're using an Arduino
microcontroller and a Bluetooth module,
because they allow NoBat to be reconfigured in the field, etc.
The current physical design uses a rectangular box (e.g., 3" x 6" x 2")
to house the amplifiers, battery pack, computer, controls, emitter, and so forth.
This box would typically be suspended from a neck lanyard.
Pairs of electret microphones
and tiny speakers
can then be attached to a hat or a pair of sunglasses.
Basically, the Arduino will play a pre-recorded chirp (~3 ms)
through a high-speed DAC
and a power amplifier
into an "emitter" (piezoelectric horn loudspeaker
It will then use a high-speed ADC
to capture the reflected ultrasound waveforms.
After the end of the expected reflections (~60 ms),
it will play back the (slowed-down) reflections via another DAC.
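The ~60 ms capture window corresponds to a definite sensing radius, since an echo returning after t seconds has travelled out and back: distance ≈ c·t/2. A quick sanity check (room-temperature speed of sound assumed):

```python
# Relate the capture window to sensing range.
SPEED_OF_SOUND = 343.0  # m/s in air at ~20 C

def echo_range_m(window_s):
    """Farthest object whose echo returns within window_s seconds."""
    return SPEED_OF_SOUND * window_s / 2

def echo_delay_s(distance_m):
    """Round-trip echo delay for an object distance_m metres away."""
    return 2 * distance_m / SPEED_OF_SOUND

print(f"{echo_range_m(0.060):.2f} m")  # ~10.29 m covered by a 60 ms window
print(f"{0.060 * 25:.2f} s")           # 1.50 s of slowed-down playback
```

So each ping surveys roughly a 10 m radius, and (at 25x slowdown) occupies about a second and a half of the user's listening time, which bounds how often pings can be repeated.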
The use of an Arduino as the control device allows NoBat to be reconfigured
(and even reprogrammed entirely) "in the field".
The Bluetooth interface can serve a number of functions,
including audio output, device control, and data logging.
Open source software (e.g., Audacity) can be used to generate waveforms.
We have a hardware prototype, set up for desktop experimentation.
The control software is mostly complete, though it still needs work
in the areas of user and host system interfaces.
Once we have demonstrated end-to-end functionality,
we'll build a pair of field prototypes and start working
on parameter tuning, interaction programming, etc.
The initial NoBat prototypes (Version 0.1)
simply present a slowed-down version of the reflected ultrasound.
Although this may be all that is necessary (it works for microbats!),
it may be possible to do better.
For example, we could try to capture more information on the surrounding area,
filter out uninteresting echoes, add virtual objects to the landscape, etc.
Using local information (e.g., collected by sweeps of the user's cane)
and contextual information (e.g., downloaded from a geographic server),
a program could create a 3D model of the surrounding area.
This would then be mapped into a synthetic stereo pair of "reflected" signals.
The processing would use a variation on ray tracing,
making allowances for the differences between light and ultrasound.
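As a hypothetical sketch of such a mapping (the microphone spacing, geometry, and function names below are illustrative assumptions, not taken from the TSE paper), each point in the 3D model could be rendered as a pair of per-ear arrival times:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at ~20 C
EAR_SPACING    = 0.18   # m; assumed spacing between the two microphones

def echo_delays(x, y):
    """Arrival time (s) of a synthetic echo at the left and right ears
    for a reflector at (x, y) metres; emitter assumed at the origin,
    ears at +/- EAR_SPACING/2 along the x axis."""
    out    = math.hypot(x, y)                       # emitter -> reflector
    back_l = math.hypot(x + EAR_SPACING / 2, y)     # reflector -> left ear
    back_r = math.hypot(x - EAR_SPACING / 2, y)     # reflector -> right ear
    return (out + back_l) / SPEED_OF_SOUND, (out + back_r) / SPEED_OF_SOUND

# A reflector 2 m ahead and 0.5 m to the right reaches the right ear
# slightly earlier, giving the left-right cue described above:
t_l, t_r = echo_delays(0.5, 2.0)
print(t_r < t_l)  # True
```

A full renderer would also attenuate each echo with distance and sum many such point contributions into the stereo output, but the delay pair is the core of the stereo placement.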
Microsoft is basing an entire solution on a somewhat related approach.
They generate sounds and words, then use stereo imaging to place them
in the user's aural field.
This subweb has several relevant pages, including:
This wiki page is maintained by Rich Morin,
an independent consultant specializing in software design, development, and documentation.
Please feel free to email
comments, inquiries, suggestions, etc.