Graphs

A large and ever-increasing component of digital content is devoted to describing graphs (i.e., networks of related objects). As the Neo4j (graph database) folks like to say: "graphs are everywhere". Or, expressed in their Cypher query language syntax:

(Graphs)-[:ARE]->(Everywhere)

In this context, the word "graph" has a very specific meaning:

In mathematics, and more specifically in graph theory, a graph is a representation of a set of objects where some pairs of objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices (also called nodes or points), and the links that connect some pairs of vertices are called edges (also called arcs or lines). Typically, a graph is depicted in diagrammatic form as a set of dots for the vertices, joined by lines or curves for the edges. Graphs are one of the objects of study in discrete mathematics.

Graph-based diagrams are commonly used to represent sets of related items. Software developers, for example, use diagrams to represent control flow graphs, data flow diagrams, and entity–relationship models. They also use graph-based data structures (e.g., the World Wide Web!) to store information in an organized fashion.

Example Diagram

As an example, let's consider a diagram of possible data flows through a suite of graph diagramming tools:

  • We gather some Initial Information (e.g., data, notes).

  • We create an SVG Encoding, using one of these tools:

  • Finally, we provide ways to access the diagram:
    • Static Rendering - visual or tactile image
    • Interactive Navigator - graph exploration tool
    • Generated Description - textual commentary

Image Formats

Although diagram generation programs have their own internal formats, they can typically be used to produce images in a variety of formats. The lefthand image below was captured from OmniGraffle as a raster graphics (PNG) screenshot; the righthand image was exported as a Scalable Vector Graphics (SVG) file.

Sighted readers may notice that the righthand image is a bit crisper, but that isn't the most important difference. Because the lefthand image is a rectangular array of white and black pixels, it is largely inaccessible to blind readers. Even with the aid of image recognition or sonification, much of the content would probably be lost.

The righthand image, in contrast, uses a text-based vector graphics file. Consequently, much more information is available for analysis and/or display. For example, any labels will be available as text, rather than patterns of dots. Also, the image could support interactive navigation, via a zooming user interface.

Note: The source code for the righthand (SVG) image above is hidden by the img tag. If you are interested in inspecting the code, visit AxAp_Graphs.svg and use your web browser's Developer Tools.

SVG Drawbacks

Like most XML dialects, SVG is complicated, repetitive, and verbose. This is fine for computer programs (e.g., web browsers), but confusing and tedious for human readers. More fundamentally, however, SVG's content and semantic level are totally inappropriate for graph descriptions.

Graph-based diagrams describe sets of entities and relationships in some topical domain. Each node's shape and label indicate the corresponding entity's type and identity. Similarly, lines and arrowheads can be styled to indicate assorted attributes. This makes sense: we want to learn about identities, types, and so forth.

Unfortunately, although this is what the web browser presents to a sighted reader, the SVG code doesn't describe the diagram at anything like that level. Instead, it details characteristics (e.g., color, location, shape, size, style) of the geometric objects that are used to create the image. Here, for example, is a description of a straight line, ending in a solid arrowhead:

<line
  x1="166.84698" y1="216.35362"
  x2="186.13137" y2="235.64498"
  marker-end="url(#FilledArrow_Marker)"
  stroke="black" stroke-linecap="round"
  stroke-linejoin="round" stroke-width="1" />

It would be challenging, at best, to recover a diagram's semantic content from this sort of encoding. Simply put, a lot of useful information has been discarded or obscured (e.g., which objects does this line connect?).

Fortunately, a computer program can determine which objects a line connects. Label and style information can also be extracted for each object and line. The metadata situation will also improve in SVG 2, which is expected to add a number of WAI-ARIA attributes. Some of these could encode adjacency, connectivity, and other semantic characteristics.

Alternative Formats

Even if a diagram is encoded in a vector format such as SVG, it still won't be easy to understand (even for programmers!). So, we may wish to transform it in assorted ways.

DOT (Graphviz)

Given that we want a readable representation, maybe we should start with a language which was designed to allow humans to specify diagrams for mechanized generation. Here's a machine-readable specification for our example diagram, encoded in DOT syntax:

digraph AxAp_Graphs {
  GD [ label="Generated Description" shape=ellipse ]
  IE [ label="Interactive Editor"    shape=box     ]
  II [ label="Initial Information"   shape=ellipse ]
  IN [ label="Interactive Navigator" shape=box     ]
  MG [ label="Mechanized Generator"  shape=box     ]
  SE [ label="SVG Encoding"          shape=ellipse ]
  SR [ label="Static Rendering"      shape=ellipse ]

  II -> { IE MG } -> SE -> { GD IN SR }
}

This specification is concise, explicit, and unambiguous. It names the graph, defines the node shapes and labels, and describes the connectivity. So, for example, "II -> { IE MG }" tells us that a pair of arrows go from the II (Initial Information) node to the IE (Interactive Editor) and MG (Mechanized Generator) nodes. As in the visual image, the node shapes hint at the types of objects they represent.

Because DOT is supported by the open source Graphviz project, a DOT description can be mechanically laid out as a diagram (e.g., in SVG). This could allow us to display graphs that have been filtered, etc.

Cypher (Neo4j)

Cypher is a query language for graph databases. Although it is native to Neo4j, OpenCypher is being promoted as a de facto industry standard. The syntax is (roughly speaking) a cross between SQL and DOT. So, it can be used to describe graphs for either assertions or queries. Because graph databases are highly optimized for link traversal, even multi-node patterns can be retrieved very quickly.

Although Cypher syntax differs from DOT syntax, it can be used to encode all of the same information (and more). Here's a machine-readable specification for our example diagram, encoded as a CREATE query. Note that, although it combines terms in a rather different way, the resulting connectivity is identical:

CREATE
 (gd { label: 'Generated Description', shape: 'ellipse' } ),
 (ie { label: 'Interactive Editor',    shape: 'box'     } ),
 (ii { label: 'Initial Information',   shape: 'ellipse' } ),
 (in { label: 'Interactive Navigator', shape: 'box'     } ),
 (mg { label: 'Mechanized Generator',  shape: 'box'     } ),
 (se { label: 'SVG Encoding',          shape: 'ellipse' } ),
 (sr { label: 'Static Rendering',      shape: 'ellipse' } ),

 (ii) -> (ie) -> (se) -> (gd),
 (ii) -> (mg) -> (se) -> (in),
                 (se) -> (sr).

English Text

Unfortunately, even for this relatively simple graph, these machine-readable specifications don't "tell the story" in a form that is accessible to the general public. Many "techies" (e.g., programmers) may find them easy to digest, but others will not. However, dynamically generating a textual description should be relatively trivial, e.g.:

To be continued...

Interactive Navigation

Given that SVG is designed to be scalable (and graphs can be huge!), some form of interactive navigation is clearly worth pursuing. Filtering, panning, navigation, and zooming are the most immediate goals; after that, we can consider more arcane possibilities.

To be continued...


This wiki page is maintained by Rich Morin, an independent consultant specializing in software design, development, and documentation. Please feel free to email comments, inquiries, suggestions, etc!

Topic revision: r30 - 24 Aug 2016, RichMorin
This site is powered by Foswiki Copyright © by the contributing authors. All material on this wiki is the property of the contributing authors.
Foswiki version v2.1.6, Release Foswiki-2.1.6, Plugin API version 2.4
Ideas, requests, problems regarding CFCL Wiki? Send us email