Graphs: Architecture

This page uses AxAp's software architecture diagram as the basis for experimentation in accessible graph representation.

Motivation

The data flow diagram on the Architecture page summarizes AxAp's primary data flows. Although the diagram is encoded in a textual format (SVG), its semantic payload is essentially inaccessible to most humans. In particular, blind readers such as Amanda Lacy have no convenient way to examine it or other, similar diagrams.

So, we've been experimenting with other representations for graph-based information, including English text and computer-readable specifications (e.g., Cypher patterns, HTML tables). Building on some practices we tried out on the Graphs page, we start with a data flow diagram, move on to a textual description, and finish with a variety of specification formats. Please give us your reactions and suggestions!

Note: Although the expositions all contain roughly the same information, some may "know" things that the others do not. This is a predictable consequence of the way they were created. While interpreting a diagram, a human editor can draw on resources such as domain knowledge and semiotic hints (e.g., meanings of node shapes). However, this could be a real challenge for a mechanized system.

Diagram

This diagram shows the major data flows in a typical AxAp system:

Discussion

A sighted reader will note (perhaps subconsciously) that:

  • The Cloud Resource node is vaguely "fluffy".

  • The Local File node is a squat cylinder.

  • The Audio Output and Braille Display nodes
    vaguely resemble a Cathode Ray Tube (CRT).

Domain knowledge can also play a strong part in the diagram's analysis. A reader who is familiar with flowchart symbols will recognize the cylinder as representing data storage and the CRT as representing a display device. Some readers may enjoy the visual pun used by the Cloud Resource node.

Before any of this analysis and interpretation can occur, however, the reader has to have basic information on the nodes' shapes. This will seldom be apparent in the SVG image description; OmniGraffle, for example, describes nodes using ungrouped collections of tags such as path, rect, and text. Recognizing (possibly distorted) shapes from geometric descriptions could be a substantial challenge.

Once the diagram's entities and connectivity have been analyzed and interpreted, we get to encode the results. Any significant diagram can be encoded in multiple ways. Basically, picking a "good" encoding is thus an exercise in data structure serialization, complicated by issues in disciplines such as human factors.

English Text

First, let's try for a human-friendly description, using English text:

AxAp can retrieve information from a Cloud Resource, a Local File, or a System Command. It can present results using local device resources such as Audio Output channels and Braille Displays (via BRLTTY). It can also access these resources through the combination of a Web Browser and a Screen Reader.

Specifications

Computer-readable diagram specifications (like the ones below) are a form of source code, so they need to be far more explicit than the description above. In these examples, we use minor variations on Cypher syntax, sometimes reformatted into a tabular format. Both forms can be composed and read (with a bit of practice) by most techies.

With appropriate pre-processing, they can also be used to specify DOT diagrams and/or store information in a graph database. Cypher queries can use similar syntax for pattern matching, allowing searches for specified sub-graphs, etc. These capabilities could all be useful in the creation of a graph-based knowledge exploration tool.

Entity Definitions

In order to make our patterns as concise as possible, let's define the diagram's entities first. This section can be used to define the node labels, harvested attributes such as "color" or "shape", and derived attributes such as "type". This assumes that the reader can remember our codes, so we'll try to make them mnemonic.

Monospace Display

Here is some Cypher entity definition code, presented as a "nicely formatted" monospace display. Being sighted, Rich finds that the spacing and vertical alignment make it easy for him to recognize and absorb parallelism and structure.

CREATE
 (aa { type: 'software', label: 'AxAp',         ... } ),
 (ao { type: 'device',   label: 'Audio Output', ... } ),
 ...

Simple Tabular Format

However, the code display above doesn't work at all well for Amanda:

  • Cursor positions is ignored by her screen reader.
  • Cypher syntax adds a lot of incidental complexity.
  • Vertical alignment is ignored by her screen reader.

On the other hand, most browsers support navigation of tables, so she finds the simple table below to be quite accessible. In addition, this table even provides sorting by columns, so she can order the entries by label, position, or type, if desired.

Table

ID Type Label
aa software AxAp
ao device Audio Output
bd device Braille Display
bt software BRLTTY
cr network Cloud Resource
lf storage Local File
sc software System Command
sr software Screen Reader
wb software Web Browser

Legend

  • ID - terse identification code
  • Type - entity type (mapped from node shape)
  • Label - visible label (extracted from node)

Extended Tabular Format

Careful Reader will note that the table above contains no information about the relative positions of the nodes. Given that most graph-based diagrams make heavy use of positional cues, this seems like an odd omission. So, in an experiment to assess their comparative utility, we have added several positional indicators to the table below.

  • The PB and PE fields tally the shortest paths to a beginning or ending node.
    AxAp's values (1, 1) tell us that it is only one (directed) edge in each.

  • The RC field provides a rough indication of the node's row and column.
    AxAp's RC value (2b) tells us that it is located in the row 2, column b.
    Note that some cells (e.g., 2a) may be empty.

  • The XP and YP fields indicate the X and Y offsets for the node centroid,
    in nominal tenths of an inch. Screen Reader's values (20, 40) tell us that it is
    about 2" to the right and 4" down from the upper left corner of the diagram.

Table

ID Type Label PB PE RC XO YO
aa software AxAp 1 2 2b 20 20
ao device Audio Output 2 0 5a 10 50
bd device Braille Display 3 0 5c 30 50
bt software BRLTTY 2 1 3c 30 30
cr network Cloud Resource 0 2 1a 10 10
lf storage Local File 0 2 1c 30 10
sc software System Command 0 2 1b 20 10
sr software Screen Reader 3 1 4b 20 40
wb software Web Browser 2 2 3b 20 30

Legend

  • ID - terse identification code
  • Type - entity type (mapped from node shape)
  • Label - visible label (extracted from node)
  • PB - length of shortest path to a beginning
  • PE - length of shortest path to an ending
  • RC - row and column for node (rough)
  • XO - x offset of node centroid
  • YO - y offset of node centroid

Connectivity

The graph's edges (i.e., connectivity) can also be defined in a tabular fashion, using variations on an adjacency list or an adjacency matrix.

3 Columns

Here is a reasonably concise and searchable table, using three columns. The center column ("<>") indicates the direction of the arrow. In the current example, this indicates the primary data flow.

Source <> Targets
aa < cr, lf, sc
aa > ao, bt, wb
bt > bd
sr < wb
sr > ao, bd

2 Columns

This table uses two columns, folding the sense information into the Targets. It has fewer but longer rows, which could be a problem for 40-character Braille displays:

Source Targets
aa <(cr, lf, sc), >(ao, bt, wb)
bt >(bd)
sr <(wb), >(ao, bd)

2 Columns, redundant

Here is a two-column table, using redundancy to ease searching. However, we pay for this with a larger number of rows:

Source Targets
aa <(cr, lf, sc), >(ao, bt, wb)
ao <(aa, sr)
bd <(bt, sr)
bt <(aa), >(bd)
cr >(aa)
lf >(aa)
sc >(aa)
sr <(wb), >(ao, bd)
wb <(aa), >(sr)

3 Columns, redundant

This table uses three columns and redundancy. This eases searching, while maintaining short row lengths, but it could double the number of rows.

Source <> Targets
aa < cr, lf, wb
aa > ao, bt, sc
ao < aa, sr
bd < bt, sr
bt < aa
bt > bd
cr > aa
lf > aa
sc > aa
sr < wb
sr > ao, bd
wb < aa
wb > sr

Adjacency Matrix

This table, based on an adjacency matrix, has a row and a column for each entity. Connections are indicated at the intersections.

  aa ao bd bt cr lf sc sr wb
aa . >   > < < <   >
ao < .           <  
bd     . <       <  
bt <   > .          
cr >       .        
lf >         .      
sc >           .    
sr   > >         . <
wb <             > .

Linear Paths

The following versions describe a linear "central path" through the graph, then fill in the side paths. The "Vanilla Cypher" versions use legal Cypher, but the following ones add various kinds of syntactic sugar.

Vanilla Cypher

Here is a complete CREATE command, using legal Cypher syntax. The first line after the entity definitions tells us that "(cr)" has a link of type "E" to "(aa)", which links to "(wb)", etc.

CREATE
 (bd { type: 'device',   label: 'Braille Display' } ),
 (bt { type: 'software', label: 'BRLTTY'          } ),
 (cr { type: 'network',  label: 'Cloud Resource'  } ),
 (lf { type: 'storage',  label: 'Local File'      } ),
 (sc { type: 'software', label: 'System Command'  } ),
 (sr { type: 'software', label: 'Screen Reader'   } ),
 (wb { type: 'software', label: 'Web Browser'     } ),
 (cr)-[:E]->(aa)-[:E]->(wb)-[:E]->(sr)-[:E]->(ao),
 (sc)-[:E]->(aa)-[:E]->(ao),      (sr)-[:E]->(bd),
 (lf)-[:E]->(aa)-[:E]->(bt)-[:E]->(bd);

Here is a MATCH command which retrieves the entire graph:

MATCH (n)-[r]-() RETURN n,r;

Edge Type Elision

All of our edges are the same type, so we can elide the syntax that specifies this.

 (cr)-->(aa)-->(wb)-->(sr)-->(ao),
 (sc)-->(aa)-->(ao),  (sr)-->(bd),
 (lf)-->(aa)-->(bt)-->(bd);

Alternation

Here, we introduce some syntax for alternation. Specifically, "(cr, sc)" expands to both "cr" and "sc", telling us that both nodes link to "aa".

CREATE
 (cr, sc)-->(aa)-->(wb)-->(sr)-->(ao, bd),
 (lf)    -->(aa)-->(bt)-->(bd),
            (aa)-->(ao);

Pattern Variables

Finally, we define the Cypher pattern variable "bt_bd", which expands to the string "(bt) -> (bd)".

CREATE
 bt_bd = (bt)-->(bd),
 (cr, sc, lf)-->(aa)-->(wb)-->(sr)-->(ao, bd),
                (aa)-->(ao, bt_bd);

Futures

It's pretty clear that no single format is going to meet all use cases. However, the fact that all of these tables and queries are based on Cypher plays nicely with the idea of loading the information into a graph database. User-defined (or selected) queries can then extract desired information, which can be displayed as a report (e.g., HTML table), etc.

Making all of this happen at all (let alone gracefully) will certainly involve a lot of work, but it's mostly "a simple matter of software":

  • analyze an SVG diagram
  • record entities and links
  • generate CREATE patterns
  • load up a Neo4j database
  • generate MATCH patterns
  • fold into a Cypher query
  • post-process the results


This wiki page is maintained by Rich Morin, an independent consultant specializing in software design, development, and documentation. Please feel free to email comments, inquiries, suggestions, etc!

Topic revision: r28 - 29 Aug 2016, RichMorin
This site is powered by Foswiki Copyright © by the contributing authors. All material on this wiki is the property of the contributing authors.
Foswiki version v2.1.6, Release Foswiki-2.1.6, Plugin API version 2.4
Ideas, requests, problems regarding CFCL Wiki? Send us email