Wish List - Documents

Although applications exist to support every popular document format, both the user interface and the level of accessibility can vary substantially. In addition, a given application may only be available on particular operating systems. So, it would be useful to present a variety of documents, using a single, accessible form of navigation.

Any Document

Skimming large documents can be very tedious for blind readers. So, having the ability to summarize documents can be extremely convenient. The Summarize Service has been part of Mac OS for several years, but it can only assist users of other OSes if made into a web service. The berkeley-doc-summarizer utility appears very promising over the long term, but it is still research software. So, a web service such as SMMRY may be more applicable in the short term.
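To make the "Document in, JSON out" idea concrete, here is a minimal sketch of an extractive summarizer. It is not how the Summarize Service, berkeley-doc-summarizer, or SMMRY work; it simply scores sentences by word frequency. The function name, the stopword list, and the JSON field names are all assumptions made for illustration.

```python
import json
import re
from collections import Counter

# A deliberately tiny stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "and", "are", "as", "at", "be", "by", "for", "in",
             "is", "it", "of", "on", "or", "that", "the", "this", "to", "with"}


def content_words(text):
    """Lowercased words of text, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return a JSON string with the highest-scoring sentences, in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / (len(words) or 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return json.dumps({
        "sentence_count": len(sentences),
        "summary": [sentences[i] for i in keep],
    })
```

Wrapping a function like this in a web service would make it available regardless of the user's operating system.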

| Input | Output | Brief Description of Feature | ED | SI | SU | WIP | Activities |
| Document (eg, HTML, PDF) | JSON | summarize document text | 5 | 7 | 7 |  | AUR, DGA |

Digital Video

Web sites such as YouTube contain vast amounts of digital video content, including a large number of slide-show presentations. In an ideal world, these would be accompanied by accessible versions (e.g., time-synchronized audio and structured text in DAISY format). Sadly, most of the presentations are far from this level of accessibility. Often, the video switches back and forth between the speaker and the projection screen, making the slides difficult for even sighted viewers to understand.

Fortunately, it may be possible to extract accessible content from some video streams. For example, tools already exist to extract images from video streams. If we could extract crisp images of presentation slides, we could subject them to follow-on processing (e.g., OCR-based text and format extraction).
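As one example of such a tool, ffmpeg can sample still frames from a video at a fixed interval using its fps filter. The sketch below only builds the command line and does not run it; the function name, the default interval, and the output file pattern are assumptions. Picking out the frames that actually show slides, and dropping near-duplicates, would be separate follow-on steps.

```python
def frame_sampling_command(video_path, seconds_between_frames=10,
                           out_pattern="slide_%04d.png"):
    """Build an ffmpeg argument list that writes one PNG every N seconds.

    The list is suitable for subprocess.run(); ffmpeg must be installed
    for the command to do anything.
    """
    return [
        "ffmpeg",
        "-i", video_path,                          # input video
        "-vf", f"fps=1/{seconds_between_frames}",  # keep one frame per interval
        out_pattern,                               # numbered PNG output files
    ]
```

For instance, `subprocess.run(frame_sampling_command("talk.mp4"), check=True)` would write slide_0001.png, slide_0002.png, and so on into the current directory.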

Although I don't have a great solution for deblurring the extracted images, one possibility would be to employ something like the software described in the article "UCLA researchers release open source code for powerful image detection algorithm".

| Input | Output | Brief Description of Feature | ED | SI | SU | WIP | Activities |
| Digital Video (eg, MPEG-4) | JSON, PNG | extract static images (eg, slides) | 5 | 3 | 3 |  | AUR, DGA |

File System

Most operating systems support hierarchical file systems, composed of directories (i.e., folders) and files. However, the user interfaces for navigation and searching vary widely. By providing a consistent interface, AxAp can make common tasks easier for blind users. Of course, this interface can also be used to locate documents for display, etc.
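A consistent interface could start from something as simple as a nested HTML list, which screen readers already know how to navigate. The following is a minimal sketch, assuming a small directory tree that is safe to read in full; the function name and the "(folder)" label are illustrative choices, not part of AxAp.

```python
import html
import os


def directory_to_html(root):
    """Render the tree under root as nested <ul> lists inside a <nav> landmark."""

    def render(path):
        # Folders first, then files, each group sorted case-insensitively.
        entries = sorted(os.scandir(path),
                         key=lambda e: (not e.is_dir(), e.name.lower()))
        items = []
        for entry in entries:
            name = html.escape(entry.name)
            if entry.is_dir():
                items.append(f"<li>{name}/ (folder){render(entry.path)}</li>")
            else:
                items.append(f"<li>{name}</li>")
        return "<ul>" + "".join(items) + "</ul>"

    label = html.escape(os.path.basename(os.path.abspath(root)))
    return f'<nav aria-label="Files in {label}">{render(root)}</nav>'
```

Because the output does not depend on the host's file manager, the same navigation would work on any operating system.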

| Input | Output | Brief Description of Feature | ED | SI | SU | WIP | Activities |
| File System (eg, directories) | HTML | present with accessible navigation | 5 | 7 | 7 |  | AUR, DGA |

Formatted Text

A variety of standards exist for electronic publication of documents, but their popularity varies markedly. According to this article, PDF is by far the most popular (72%), Office Open XML (aka DOCX) is a strong second (16%), and the others are all below 10%. However, certain formats (e.g., DAISY, EPUB) may be of particular importance to a blind reader.

A DAISY Digital Talking Book (DTB) is a collection of files, including audio, images, text, and a variety of metadata. This allows it to present information in assorted ways, including Braille, speech, large print text, etc.

An EPUB document is basically a "web site in a can": a renamed ZIP archive, containing assorted file types (e.g., CSS, HTML, JavaScript). So, it seems plausible that the contents could be unpacked, tweaked, and presented as a set of web pages. Also, a number of resources exist to translate EPUB to HTML or PDF, etc.
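To illustrate the "web site in a can" idea, the sketch below opens an EPUB as a ZIP archive, follows META-INF/container.xml to the package (OPF) file, and lists the content documents in reading order. The function name is an assumption, and the code ignores DRM, fallbacks, and malformed packages.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the EPUB container and package documents.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def epub_reading_order(epub_file):
    """Return archive paths of the content documents, in spine order."""
    with zipfile.ZipFile(epub_file) as archive:
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).attrib["full-path"]
        package = ET.fromstring(archive.read(opf_path))

    # The manifest maps item ids to file names; the spine lists ids in order.
    manifest = {item.attrib["id"]: item.attrib["href"]
                for item in package.findall("opf:manifest/opf:item", NS)}
    base = posixpath.dirname(opf_path)
    return [posixpath.join(base, manifest[ref.attrib["idref"]])
            for ref in package.findall("opf:spine/opf:itemref", NS)]
```

Each returned path names an (X)HTML file inside the archive, which could then be unpacked, tweaked, and served as an ordinary web page.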

PDF is a rather complicated format, but it is also very popular and well supported. So, a number of resources exist to parse it, translate it to HTML or SVG, etc. Translating PDF to HTML should be largely a problem of tool selection and wrapping.

| Input | Output | Brief Description of Feature | ED | SI | SU | WIP | Activities |
| Formatted Text (EPUB) | HTML | present with accessible navigation | 5 | 7 | 7 |  | POC |
| Formatted Text (eg, PDF) | HTML | present with accessible navigation | 5 | 7 | 7 |  | AUR, DGA |

Plain Text

Plain text files, which contain only printable characters and white space, are commonly used in code and data files. Unlike formatted documents, they tend to have little explicit structure. However, their implicit structure may be both ambiguous and complex.

Data Files

Many data files are encoded as text using standardized formats (e.g., CSV, JSON). If the format is known and a parser is available, these files can be translated into navigable tables, trees, etc. Otherwise, they can only be presented as preformatted text.
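For CSV, the translation can be quite short. The sketch below assumes that the first row holds column headings; the function name and the caption parameter are illustrative. Marking the headings with scope="col" lets a screen reader announce the column name as the user moves between cells.

```python
import csv
import html
import io


def csv_to_html_table(csv_text, caption):
    """Convert CSV text (first row = headings) into an HTML table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]

    parts = ["<table>", f"<caption>{html.escape(caption)}</caption>", "<tr>"]
    parts += [f'<th scope="col">{html.escape(cell)}</th>' for cell in header]
    parts.append("</tr>")
    for row in body:
        parts.append("<tr>")
        parts += [f"<td>{html.escape(cell)}</td>" for cell in row]
        parts.append("</tr>")
    parts.append("</table>")
    return "".join(parts)
```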

White Space

Sequences of white space are commonly used to encode structural information. For example, indentation is used to indicate columnar and/or hierarchical structure. This is quite useful to sighted programmers, but mostly inaccessible to blind ones. In fact, reading past these sequences is a substantial nuisance, and no assistance is available for recognizing, navigating, or maintaining them.
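When indentation encodes hierarchy, the structure can be recovered mechanically and handed to a tree-navigation interface. Here is a minimal sketch, assuming that deeper indentation always means "child of the nearest less-indented line above" and that tabs count as four columns; the function name and node layout are assumptions.

```python
import json


def indentation_tree(text):
    """Turn indented lines into nested {"text", "children"} nodes."""
    root = {"text": None, "children": []}
    stack = [(-1, root)]  # (indent width, node); the root sits left of column 0
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no structure
        expanded = line.expandtabs(4)
        indent = len(expanded) - len(expanded.lstrip(" "))
        node = {"text": line.strip(), "children": []}
        # Pop back to the nearest ancestor that is indented less than this line.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((indent, node))
    return root["children"]
```

The result is plain lists and dictionaries, so `json.dumps(indentation_tree(text))` yields a JSON description of the structure.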

If the user is only interested in reading a code or data file, a simple shortening mechanism (e.g., regular expression) may suffice. However, sequences in displays and quoted strings will also be modified, so shortened code isn't guaranteed to work the same. This problem can be mitigated substantially by use of a syntax-aware shortener.
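The difference between the two approaches can be sketched with a pair of regular expressions. The first function collapses every run of spaces or tabs; the second leaves quoted strings alone. The second is only a small step toward syntax awareness: it assumes single-line strings with backslash escapes and knows nothing about comments or multi-line literals. Note that both also collapse leading indentation. The function names are assumptions.

```python
import re

# Two or more spaces/tabs in a row.
RUN = re.compile(r"[ \t]{2,}")

# Either a quoted string (captured, so it can be kept) or a white space run.
STRING_OR_RUN = re.compile(
    r"""("(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')|[ \t]{2,}"""
)


def shorten_naive(line):
    """Collapse every white space run, even inside quoted strings."""
    return RUN.sub(" ", line)


def shorten_outside_strings(line):
    """Collapse white space runs, but copy quoted strings through unchanged."""
    return STRING_OR_RUN.sub(lambda m: m.group(1) or " ", line)
```

For the line `x    =  "a    b"`, the naive version changes the string's contents, while the second version shortens only the spacing around the assignment.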

Editing a code or data file, while preserving white space usage, is a more intricate problem. In some cases (e.g., Python, YAML), white space may be an integral part of the code or data format. In any case, a blind programmer working with sighted programmers will need to follow the conventions of the language, project, etc.

Assume that our programmer has acquired shortened versions of the relevant files, studied them to find the problem, and edited a proposed solution. Simply transferring the solution to the non-shortened version of the file(s) is an extremely tedious and error-prone task. Fortunately, there are various approaches to mechanizing a solution.
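One such approach is to diff the shortened file before and after editing, then replay only the changed lines onto the original. The sketch below uses Python's difflib and rests on several assumptions: shortening is line-for-line (line i of the shortened file corresponds to line i of the original), lines are passed without trailing newlines, and a new or changed line should take its indentation from the original line it replaces or, for pure insertions, from the preceding unchanged line. The function names are hypothetical.

```python
import difflib
import re


def leading_ws(line):
    """The run of spaces and tabs at the start of a line."""
    return re.match(r"[ \t]*", line).group(0)


def transfer_edits(full_lines, short_before, short_after):
    """Apply the short_before -> short_after edit to full_lines."""
    if len(full_lines) != len(short_before):
        raise ValueError("shortening must be line-for-line")
    result = []
    indent = ""
    matcher = difflib.SequenceMatcher(a=short_before, b=short_after, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Untouched lines keep their original spacing.
            result.extend(full_lines[i1:i2])
            indent = leading_ws(full_lines[i2 - 1])
        elif tag in ("replace", "insert"):
            if tag == "replace":
                indent = leading_ws(full_lines[i1])
            result.extend(indent + line.lstrip() for line in short_after[j1:j2])
        # "delete": the original lines are dropped by emitting nothing.
    return result
```

The borrowed indentation is only a guess, so edits that change nesting depth would still need review or a formatter pass.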

Similarly, program source code can be analyzed and reformatted for accessibility. For example, we have experimented with code folding, using the accordion pattern. It seems plausible that a modular IDE such as Eclipse Che could be extended to support accessibility. The Language Server Protocol (LSP), in particular, appears to be a great basis for exchanging editing metadata.
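As a taste of what LSP offers here, the protocol defines a textDocument/foldingRange request that asks a language server which regions of a file can be folded. The sketch below only builds the framed JSON-RPC message (a Content-Length header, a blank line, then the JSON body); it does not start a server or perform the required initialize handshake. The helper names and the example URI are assumptions.

```python
import json


def lsp_frame(payload):
    """Wrap a JSON-RPC payload in LSP's Content-Length framing."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def folding_range_request(request_id, document_uri):
    """Build a textDocument/foldingRange request for one document."""
    return lsp_frame({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/foldingRange",
        "params": {"textDocument": {"uri": document_uri}},
    })
```

The server's reply lists start and end lines for each foldable region, which is the kind of metadata an accordion-style presentation needs.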

| Input | Output | Brief Description of Feature | ED | SI | SU | WIP | Activities |
| Plain Text (eg, code, data) | Unicode | shorten whitespace sequences | 2 | 7 | 7 |  | AUR, MUI |
| Plain Text (eg, code) | HTML | present with accessible navigation | 2 | 7 | 7 | ... | POC |
| Plain Text (eg, data) | HTML | present with accessible navigation | 2 | 7 | 7 |  | AUR, MUI |
| Plain Text (eg, code, data) | HTML | present with accessible navigation | 2 | 7 | 7 |  | AUR, MUI |
| Plain Text (eg, code, data) | JSON | extract structure from white space | 2 | 7 | 2 |  | AUR, POC |

See Also

  • LSP - Language Server Protocol (standard)

Web Page

Making web pages more accessible is one of AxAp's major functions. The initial step involves retrieving a page, while satisfying authentication, authorization, and other requirements. We also want to edit links to add AxAp to the URL (so that AxAp gets called in for the follow-on processing), cache and pre-fetch pages to minimize latency, etc. However, that's just the starting point...
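The link-editing step might look something like the following sketch. The proxy prefix is a purely hypothetical address for a locally running AxAp; the regular expression handles only double-quoted href attributes on anchor tags, so a real implementation would want an HTML parser instead.

```python
import html
import re
from urllib.parse import quote, urljoin

# Hypothetical address of a local AxAp instance (an assumption).
PROXY_PREFIX = "http://localhost:8080/axap?url="

# Captures: text up to the opening quote, the href value, the closing quote.
HREF = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)


def rewrite_links(page_html, page_url, proxy_prefix=PROXY_PREFIX):
    """Point each anchor's href at the proxy, passing the real target along."""

    def wrap(match):
        target = html.unescape(match.group(2))
        # Leave in-page anchors and non-web schemes alone.
        if target.startswith(("#", "mailto:", "javascript:")):
            return match.group(0)
        absolute = urljoin(page_url, target)  # resolve relative links
        return match.group(1) + proxy_prefix + quote(absolute, safe="") + match.group(3)

    return HREF.sub(wrap, page_html)
```

Resolving each link against the page's own URL first means that relative links still work once the page is served from a different address.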

Emoji, Emoticons, etc.

It is a common practice to insert small graphic elements or text strings into email messages. Even if a sighted person hasn't encountered a given element before, the meaning is usually obvious upon inspection. However, a blind reader is quite likely to be mystified by the following:

Using resources such as Emojipedia, AxAp could provide terse explanations of Emoji (e.g., "Dog Face - The face of a dog, showing both eyes, both ears, nose and mouth. Shows the face of a dog smiling, with eyes open and tongue hanging out."). It could also provide links to relevant pages (e.g., Dog Face), facilitate search for and use of Emoji, etc.
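Even without an external resource, the Unicode character database supplies a short official name for each single-character Emoji. The sketch below uses Python's unicodedata module to replace each "Symbol, other" character with its name in brackets. It does not handle multi-character sequences (e.g., skin-tone modifiers or joined family Emoji) and it provides none of Emojipedia's longer descriptions; the function name is an assumption.

```python
import unicodedata


def describe_symbols(text):
    """Replace pictographic symbols with their bracketed Unicode names."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "So":  # "Symbol, other" covers most Emoji
            name = unicodedata.name(ch, "unknown symbol").title()
            out.append(f"[{name}]")
        else:
            out.append(ch)
    return "".join(out)
```

The official name for U+1F436 is DOG FACE, so a message ending in that character would be read with "[Dog Face]" in its place.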

Explicit Style Markup

Screen readers typically discard style markup (e.g., bold, italic, code), apparently under the theory that it isn't sufficiently important to warrant use of the reader's limited bandwidth. Although this may be true in general, there will certainly be exceptions. For example, style information is often used to encode semantic distinctions which a given reader might want to retain. So, we'd like to add explicit style markup (e.g., *bold*, _italic_, `code`) as an optional feature.
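A minimal sketch of this feature, using Python's standard HTML parser, is shown below. It re-emits the fragment with marker characters inserted as ordinary text just inside each styled element, so that a screen reader will speak them. The tag-to-marker table and the names are assumptions; comments, doctypes, and script content are not handled.

```python
import html
from html.parser import HTMLParser

# Which marker to speak for which tag (an illustrative choice).
MARKERS = {"b": "*", "strong": "*", "i": "_", "em": "_", "code": "`"}


class StyleMarker(HTMLParser):
    """Re-emit HTML, adding text markers inside styled elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(self.get_starttag_text() + MARKERS.get(tag, ""))

    def handle_endtag(self, tag):
        self.parts.append(MARKERS.get(tag, "") + f"</{tag}>")

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())  # e.g., <br/>

    def handle_data(self, data):
        self.parts.append(html.escape(data, quote=False))


def mark_styles(fragment):
    """Return the HTML fragment with explicit style markers added."""
    parser = StyleMarker()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)
```

Since the markers are plain text, making the feature optional is just a matter of whether this pass is applied.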


HTML5 tags and WAI-ARIA attributes can do a lot to make pages more accessible. However, client-side support for them can be inconsistent: a browser may support a tag but not the equivalent attribute. More to the point, few web sites even employ this sort of markup. By rewriting the page, we can make it more accessible to a broad range of web browsers and screen readers.

Supplementary content can be added to pages to make them more accessible. For example, the section index at the top of this page provides a useful summary and enables navigation to desired sections. CSS and JavaScript code can also be added to enhance navigation. For example, the table below can be sorted by any desired column, in either ascending or descending order.

Contextual menus can provide access to many useful features. For example, binding menu items to an embedded image could allow it to be processed by image recognition, OCR, etc. Binding menu items to Emoji, unusual words, or other content could allow the user to look up relevant information.

Information Capture

A wealth of information (i.e., page content, contextual metadata) will be available to AxAp in a typical web browsing session. Which pages were accessed when, what keywords and links did they contain, etc? By capturing and recording this (e.g., in a graph database), we can provide the user with a searchable information resource. For example, the user might ask for a list of pages she has visited, either on a given topic or during a given period of time, etc.
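The queries described above can be prototyped with a very small store. The sketch below uses SQLite as a stand-in for the graph database mentioned above; the table layout, function names, and the use of ISO 8601 timestamp strings (which sort correctly as text) are all assumptions.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) the browsing history store."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS visit (url TEXT, title TEXT, visited_at TEXT)")
    db.execute("CREATE TABLE IF NOT EXISTS keyword (url TEXT, word TEXT)")
    return db


def record_visit(db, url, title, visited_at, keywords):
    """Record one page visit along with the keywords found on the page."""
    db.execute("INSERT INTO visit VALUES (?, ?, ?)", (url, title, visited_at))
    db.executemany("INSERT INTO keyword VALUES (?, ?)",
                   [(url, word.lower()) for word in keywords])
    db.commit()


def pages_about(db, word, since="0000", until="9999"):
    """List (url, visited_at) pairs for a keyword within a time window."""
    rows = db.execute(
        "SELECT DISTINCT v.url, v.visited_at "
        "FROM visit v JOIN keyword k ON k.url = v.url "
        "WHERE k.word = ? AND v.visited_at BETWEEN ? AND ? "
        "ORDER BY v.visited_at",
        (word.lower(), since, until),
    )
    return list(rows)
```

A call such as `pages_about(db, "audio", since="2016-11-15")` would answer "which pages about audio have I visited since mid-November?".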

| Input | Output | Brief Description of Feature | ED | SI | SU | WIP | Activities |
| Web Page (HTML, etc.) | HTML | add HTML tags and ARIA attributes | 2 | 3 | 3 |  | AUR |
| Web Page (HTML, etc.) | HTML | add explicit style markup (eg, *bold*) | 2 | 3 | 3 |  | AUR, MUI |
| Web Page (HTML, etc.) | HTML | identify and explain Emoji, etc. | 3 | 1 | 1 |  | AUR, DGA |
| Web Page (HTML, etc.) | HTML | record browsing context and history | 5 | 7 | 7 |  | AUR, DGA |
| Web Page (HTML, etc.) | HTML | retrieve web page from remote site | 3 | 7 | 7 |  | AUR, DGA |
| Web Page (HTML, etc.) | HTML | rewrite links to use local versions | 3 | 7 | 7 |  | AUR, DGA |

This wiki page is maintained by Rich Morin, an independent consultant specializing in software design, development, and documentation. Please feel free to email comments, inquiries, suggestions, etc!

Topic revision: r25 - 28 Nov 2016, RichMorin