NoBat

This page discusses the possibility of creating NoBat, an economical, open source, and portable version of The Sonic Eye (TSE).

Introduction

Imagine that you had a "guide bat", a companion that (like a microbat) used ultrasonic echolocation to examine its surroundings. It would then replay the echoes in a human-friendly fashion (e.g., filtered, slowed down), so that you could interpret them for yourself. We're building just such a companion, except that there's no bat.

Various navigation aids for the blind are available or under development. Typically, these gather information using cameras, ultrasonic sensors, and/or external servers (e.g., GPS, OSM). They generally provide audible and/or tactile feedback to the user, in the form of buzzes, clicks, tones, or spoken words.

Our current approach uses ultrasonic echolocation, then processes the echoes into the audible range. So, it's a technologically-assisted form of human echolocation. That said, we expect to incorporate other information sources over time.

Human Echolocation

Human echolocation is the ability to detect objects in the environment by sensing echoes from those objects. By actively creating sounds – for example, by tapping a cane, lightly stomping a foot, snapping the fingers, or making clicking noises with the mouth – people trained to orient by echolocation can interpret the sound waves reflected by nearby objects, accurately identifying their location and size. Some blind people use this ability for acoustic wayfinding: navigating within their environment using auditory rather than visual cues. It is similar in principle to active sonar and to animal echolocation, which is employed by bats, dolphins, and toothed whales to find prey.

Most methods used for human echolocation (e.g., as taught by World Access for the Blind) are based on direct, audible echoes. This is convenient, because it requires little or no external assistance. However, these echoes have low accuracy and precision, as well as very limited amplitude. Finally, because they are not digitally processed, they cannot be augmented, filtered, or transformed.

Performing emission and sensing in the ultrasonic range (e.g., 25-50 kHz) shortens the wavelengths involved, which automatically increases the resulting image resolution. It also lets us increase the amplitude substantially, without annoying other humans in the vicinity. As we process the resulting echoes, we can lower their frequency range, stretch them out in time, reduce noise, etc.
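
For a rough sense of the scale involved (back-of-the-envelope only), the wavelength of sound in air is just the speed of sound divided by the frequency. The snippet below compares a stand-in audible click frequency (~3 kHz) with the two ends of the chirp band:

  #include <cstdio>

  // Wavelength of sound in air: lambda = c / f.  Shorter wavelengths can
  // resolve correspondingly smaller features.
  int main() {
    const double c = 343.0;                               // m/s, air at ~20 C
    const double freqs[] = { 3000.0, 25000.0, 50000.0 };  // audible click vs. chirp band
    for (double f : freqs)
      std::printf("%6.0f Hz -> wavelength %5.1f mm\n", f, c / f * 1000.0);
    // prints roughly 114.3 mm, 13.7 mm, and 6.9 mm
    return 0;
  }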

The Sonic Eye

Much of our inspiration comes from The Sonic Eye (TSE), a promising navigation aid for the blind and visually impaired. It works by allowing the user to recognize the acoustic signatures of staircases, walls, etc. TSE's output is amazingly informative and human-friendly; if you haven't already done so, listen to the YouTube video (preferably wearing stereo headphones).

TSE's inspirations and approach are biomimetic, based on the echolocation capabilities of microbats. Basically, it:

  • emits a 3 ms ultrasonic chirp
  • captures the reflections in stereo
  • slows down the reflections by 25x
  • plays them into the user's ears
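
To make the slow-down step concrete (our arithmetic; the 250 kHz capture rate is an assumption, not a figure from the TSE paper), playing the captured samples back at 1/25th of the capture rate stretches time by 25 and divides every frequency by 25:

  #include <cstdio>

  // "Slow down by 25x" as a change of playback rate: the same samples,
  // played at 1/25th of the capture rate, stretch time by 25 and divide
  // every frequency by 25.
  int main() {
    const double SLOWDOWN     = 25.0;
    const double capture_rate = 250000.0;                 // Hz, assumed ADC rate
    const double play_rate    = capture_rate / SLOWDOWN;  // 10 kHz playback

    std::printf("playback rate: %.0f Hz\n", play_rate);
    std::printf("60 ms of echoes play back over %.1f s\n", 0.060 * SLOWDOWN);
    std::printf("the 25-50 kHz chirp maps to %.0f-%.0f Hz\n",
                25000.0 / SLOWDOWN, 50000.0 / SLOWDOWN);
    return 0;
  }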

Without understanding everything that goes on in the user's brain, we can guess that:

  • The temporal position of each reflection indicates its distance.

  • The use of stereo allows the user to capture phase information
    (and thus, left-right directionality) from the reflected ultrasound.

  • The rising frequency of the chirp (25-50 kHz) allows the user to map
    the pitch of a reflection to its temporal position in the initial signal.
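
The first of these guesses is just round-trip timing. Here is a back-of-the-envelope sketch (not NoBat code) of how echo delay maps to distance, before and after the slow-down:

  #include <cstdio>

  // Round-trip timing: an echo that arrives t seconds after the chirp has
  // traveled out and back, so distance = c * t / 2.  After a 25x
  // slow-down, the listener hears that echo 25 * t after the chirp.
  int main() {
    const double c        = 343.0;   // m/s, speed of sound in air
    const double slowdown = 25.0;

    for (double range_m = 0.5; range_m <= 4.0; range_m *= 2.0) {
      double echo_ms  = 2.0 * range_m / c * 1000.0;
      double heard_ms = echo_ms * slowdown;
      std::printf("object at %.1f m: echo after %4.1f ms, heard after %5.1f ms\n",
                  range_m, echo_ms, heard_ms);
    }
    return 0;
  }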

The original prototype for TSE was large and rather ungainly: a backpack and a helmet with large, cup-like auricles (modeled on bat ears). However, the developers are working on a much more convenient version, about the size of a pair of sunglasses, but requiring a cell phone.

In the meantime, the TSE paper provides enough information to create an interim platform for experimentation. So, some friends are helping me to do just that; we'll release details as we proceed. Our hardware and software designs will be open source, allowing (nay, encouraging!) others to jump in and try things out.

Design

NoBat's design tries to balance minimalism against convenience and flexibility. For example, we plan to omit the auricles, because they are ungainly and do not appear to be particularly critical. On the other hand, we're using an Arduino microcontroller board and a Bluetooth link, because they allow NoBat to be reconfigured in the field, etc.

The current physical design uses a rectangular box (e.g., 3" x 6" x 2") to house the amplifiers, battery pack, computer, controls, emitter, and so forth. This box would typically be suspended from a neck lanyard. Pairs of electret microphones and tiny speakers can then be attached to a hat or a pair of sunglasses.

Basically, the Arduino will play a pre-recorded chirp (~3 ms) through a high-speed DAC and a power amplifier into an "emitter" (piezoelectric horn loudspeaker). It will then use a high-speed ADC to capture the reflected ultrasound waveforms. After the end of the expected reflections (~60 ms), it will play back the (slowed-down) reflections via another DAC.
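
A minimal Arduino-style sketch of that chirp / capture / replay cycle appears below. It is illustrative only: the pin assignments, buffer sizes, and the use of analogWrite()/analogRead() as stand-ins for the external high-speed DAC/ADC hardware are all assumptions, and the buffers presume a board with more SRAM than a basic Uno.

  // Illustrative NoBat control cycle: chirp, capture, slowed replay.
  // All pins, sizes, and timings are placeholder assumptions.

  const int EMIT_PIN  = 9;     // output to power amp + piezo horn (via DAC in real life)
  const int MIC_L_PIN = A0;    // left electret microphone (via preamp/ADC)
  const int MIC_R_PIN = A1;    // right electret microphone (via preamp/ADC)
  const int OUT_L_PIN = 10;    // left earphone output
  const int OUT_R_PIN = 11;    // right earphone output

  const int CHIRP_SAMPLES = 100;   // ~3 ms chirp (placeholder)
  const int ECHO_SAMPLES  = 500;   // ~60 ms echo window (placeholder)
  const int SLOWDOWN      = 25;    // playback time stretch

  byte chirp[CHIRP_SAMPLES];       // pre-computed chirp waveform
  int  echoL[ECHO_SAMPLES];        // captured reflections, left
  int  echoR[ECHO_SAMPLES];        // captured reflections, right

  void setup() {
    Serial.begin(9600);            // Bluetooth serial: control and logging
    for (int i = 0; i < CHIRP_SAMPLES; i++)       // placeholder rising-frequency table
      chirp[i] = 128 + 127 * sin(0.5 * i + 0.005 * i * i);
  }

  void loop() {
    for (int i = 0; i < CHIRP_SAMPLES; i++)       // 1. emit the chirp
      analogWrite(EMIT_PIN, chirp[i]);

    for (int i = 0; i < ECHO_SAMPLES; i++) {      // 2. capture stereo echoes
      echoL[i] = analogRead(MIC_L_PIN);
      echoR[i] = analogRead(MIC_R_PIN);
    }

    for (int i = 0; i < ECHO_SAMPLES; i++) {      // 3. replay, stretched 25x in time
      analogWrite(OUT_L_PIN, echoL[i] >> 2);      //    (10-bit ADC -> 8-bit output)
      analogWrite(OUT_R_PIN, echoR[i] >> 2);
      delayMicroseconds(120 * SLOWDOWN);          //    assumed 120 us capture spacing
    }

    delay(100);                                   // pause before the next ping
  }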

The use of an Arduino as the control device allows NoBat to be reconfigured (and even reprogrammed entirely) "in the field". The Bluetooth interface can serve a number of functions, including audio output, device control, and data logging. Open source software (e.g., Audacity) can be used to generate waveforms.
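
As an example of what "reconfigured in the field" and "device control" might look like, here is a hypothetical, standalone command-handling sketch for the Bluetooth serial link; the single-letter command set is made up for illustration:

  // Hypothetical Bluetooth command handler:  "s 10" sets the slow-down
  // factor, "r 80" sets the echo window (ms), "l 1" turns logging on.

  int  slowdown   = 25;       // playback time stretch
  int  echoWindow = 60;       // echo capture window, ms
  bool logging    = false;    // stream raw echoes back over Bluetooth?

  void setup() {
    Serial.begin(9600);       // Bluetooth module on the hardware serial port
  }

  void loop() {
    if (!Serial.available()) return;
    char cmd = Serial.read();        // single-letter command ...
    long val = Serial.parseInt();    // ... followed by a numeric argument

    switch (cmd) {
      case 's': slowdown   = val;        break;
      case 'r': echoWindow = val;        break;
      case 'l': logging    = (val != 0); break;
      default:  return;                  // ignore whitespace and noise
    }
    Serial.print(F("ok "));
    Serial.print(cmd);
    Serial.print(' ');
    Serial.println(val);
  }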

Status

We have a hardware prototype, set up for desktop experimentation. The control software is mostly complete, though it still needs work on the user and host-system interfaces. Once we have demonstrated end-to-end functionality, we'll build a pair of field prototypes and start working on parameter tuning, interaction programming, etc.

Futures

The initial NoBat prototypes (Version 0.1) simply present a slowed-down version of the reflected ultrasound. Although this may be all that is necessary (it works for microbats!), it may be possible to do better. For example, we could try to capture more information on the surrounding area, filter out uninteresting echoes, add virtual objects to the landscape, etc.

Using local information (e.g., collected by sweeps of the user's cane) and contextual information (e.g., downloaded from a geographic server), a program could create a 3D model of the surrounding area. This would then be mapped into a synthetic stereo pair of "reflected" signals. The processing would use a variation on ray tracing, making allowances for the differences between light and ultrasound.
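
Here is a toy sketch of that mapping (ours, not part of any existing implementation): treat each object in the model as a point reflector, compute its round-trip delay and a crude attenuation for each ear, and add a scaled copy of the chirp at the corresponding offset in a stereo buffer. The geometry, sample rate, and attenuation law are all illustrative assumptions.

  #include <cmath>
  #include <cstdio>
  #include <vector>

  struct Point { double x, y, z; };

  const double PI   = 3.141592653589793;
  const double C    = 343.0;       // m/s, speed of sound
  const double RATE = 250000.0;    // samples/s (assumed ultrasonic rate)

  // Each reflector contributes a delayed, attenuated copy of the chirp,
  // based on its distance from the given ear position.
  void addEchoes(const std::vector<double>& chirp,
                 const std::vector<Point>&  reflectors,
                 Point ear, std::vector<double>& out) {
    for (const Point& p : reflectors) {
      double dx = p.x - ear.x, dy = p.y - ear.y, dz = p.z - ear.z;
      double dist  = std::sqrt(dx * dx + dy * dy + dz * dz);
      double delay = 2.0 * dist / C;                 // round trip, seconds
      double gain  = 1.0 / (1.0 + dist * dist);      // crude spreading loss
      size_t start = static_cast<size_t>(delay * RATE);
      for (size_t i = 0; i < chirp.size() && start + i < out.size(); i++)
        out[start + i] += gain * chirp[i];
    }
  }

  int main() {
    std::vector<double> chirp(750, 0.0);             // ~3 ms at 250 kHz
    for (size_t i = 0; i < chirp.size(); i++)        // 25-50 kHz rising test chirp
      chirp[i] = std::sin(2 * PI * (25000.0 + 12500.0 * i / chirp.size()) * i / RATE);

    std::vector<Point> scene = { {2.0, 0.5, 0.0}, {4.0, -1.0, 0.5} };  // two reflectors
    std::vector<double> left(15000, 0.0), right(15000, 0.0);           // 60 ms buffers

    addEchoes(chirp, scene, { -0.1, 0.0, 0.0 }, left);   // ears ~20 cm apart
    addEchoes(chirp, scene, {  0.1, 0.0, 0.0 }, right);
    std::printf("synthesized %zu-sample stereo echo buffers\n", left.size());
    return 0;
  }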

Microsoft is basing an entire solution on a somewhat related approach. They generate sounds and words, then use stereo imaging to place them in the user's aural field.

Resources

Project Pages

The Utiles/Arduino subweb has several relevant pages, including:


This wiki page is maintained by Rich Morin, an independent consultant specializing in software design, development, and documentation. Please feel free to email comments, inquiries, suggestions, etc!
