Background

This page provides background information on the issues involved in getting Elixir and Neo4j to work together.

Ecto

Ecto is the de facto standard for use of databases from Elixir in general and the Phoenix Framework in particular. It provides a number of convenient and valuable features which would be nice to have in accessing any database. So, when I first started looking into this topic, extending Ecto seemed like the obvious approach.

However, there are several areas where Ecto and Neo4j differ markedly in their notions of what a database is, how it acts, etc. Let's walk through some of these differences...

Tables vs. Graphs

Ecto is strongly tied to the relational model, as supported by RDBMS implementations such as PostgreSQL. That is, it expects information to be stored in database tables, connected by unique keys, SQL joins, inverted indexes, etc.

Neo4j, in contrast, is based on the property graph model. Information is stored in nodes and the relationships (i.e., arcs) that connect them. Specifically, each relationship has a type and a direction, and connects a pair of nodes (i.e., entities). Nodes and relationships can have any number of properties (i.e., key/value pairs), so they can act a bit like table rows with variable numbers of columns.
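To make the property graph model a bit more concrete, here is a minimal sketch in Python. The class and field names are my own inventions for illustration (this is not Neo4j's API or storage format), and the sample data is just an example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    """A graph node: an id, a set of labels, and arbitrary properties."""
    id: int
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed arc between two nodes, with its own properties."""
    id: int
    type: str
    start: int  # id of the node the relationship leaves
    end: int    # id of the node the relationship enters
    properties: Dict[str, Any] = field(default_factory=dict)


# Illustrative data only: two nodes with different "columns" (property keys).
charlie = Node(0, {"Person"}, {"name": "Charlie Sheen"})
movie = Node(1, {"Movie"}, {"title": "Wall Street", "released": 1987})
acted_in = Relationship(0, "ACTED_IN", start=charlie.id, end=movie.id,
                        properties={"role": "Bud Fox"})
```

Note that the two nodes carry entirely different property keys, which is the "table rows with variable numbers of columns" idea mentioned above.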

Neo4j's nodes, properties, and relationships are represented by sets of fixed-length data structures with known locations in virtual memory. This enables extremely rapid traversal, as long as the needed data (i.e., working set) is resident in physical memory.
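The benefit of fixed-length records is that a record's location can be computed directly from its ID, with no searching. Here is a rough sketch of the idea in Python. The record layout (two 32-bit integers) and the function names are hypothetical; Neo4j's real store format is more involved.

```python
import struct

# Hypothetical record layout: (first_relationship_id, first_property_id),
# stored as two little-endian signed 32-bit integers (8 bytes per record).
RECORD = struct.Struct("<ii")


def write_record(store: bytearray, node_id: int,
                 first_rel: int, first_prop: int) -> None:
    """Store a record at the offset implied by its ID."""
    RECORD.pack_into(store, node_id * RECORD.size, first_rel, first_prop)


def read_record(store: bytearray, node_id: int):
    """Fetch a record by computing its offset: id * record size."""
    return RECORD.unpack_from(store, node_id * RECORD.size)


store = bytearray(RECORD.size * 4)  # room for four records
write_record(store, 2, 7, 11)
```

Because the offset is just a multiplication, the cost of finding a record does not depend on how many records the store holds.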

Keys vs. Relationships

In a relational database, a foreign key in one table may be used to look up a row in another table. If the target table has an appropriate index for the key in question, this is an O(log N) operation; otherwise, it's O(N). So, searching through chains of relationships is quite expensive, particularly when multi-table joins are involved.
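The difference between the indexed and unindexed cases can be sketched as follows. This is only an illustration: a sorted Python list with binary search stands in for a real database index (typically a B-tree), and the table contents and function names are made up.

```python
from bisect import bisect_left

# A made-up "table" of (key, value) rows, sorted by key.
rows = [(i, f"row-{i}") for i in range(0, 1000, 2)]
keys = [k for k, _ in rows]  # stand-in for an index on the key column


def scan_lookup(rows, key):
    """No index: examine rows one by one, O(N)."""
    for k, value in rows:
        if k == key:
            return value
    return None


def index_lookup(keys, rows, key):
    """With a sorted index: binary search, O(log N)."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return rows[i][1]
    return None
```

Either way, each hop along a chain of foreign keys repeats this lookup, so the cost grows with both the table size and the length of the chain.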

Neo4j, in contrast, can follow a relationship to another node simply by dereferencing a memory pointer. Gaining access to node and arc properties is similarly direct. Neo4j's Cypher query language takes advantage of this speed, allowing declarative, pattern-based queries such as:

MATCH  (charlie:Person { name:'Charlie Sheen' })
       -[:ACTED_IN]->
       (movie:Movie)
RETURN movie

Neo4j also uses inverted indexes (etc.) to support various forms of global searching, such as finding the nodes where a traversal should start. In the query above, the node labels (:Person, :Movie) and the relationship type (:ACTED_IN) might be looked up in an index.
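Here is a rough Python sketch of how the query above might be evaluated: an index finds the starting node, then the traversal simply follows stored references. The data structures and names are assumptions of mine for illustration, not a description of Neo4j's internals.

```python
# Illustrative in-memory graph: node id -> labels and properties.
nodes = {
    0: {"labels": {"Person"}, "props": {"name": "Charlie Sheen"}},
    1: {"labels": {"Movie"}, "props": {"title": "Wall Street"}},
    2: {"labels": {"Person"}, "props": {"name": "Martin Sheen"}},
    3: {"labels": {"Movie"}, "props": {"title": "Apocalypse Now"}},
}

# Outgoing relationships per node: (relationship type, target node id).
outgoing = {
    0: [("ACTED_IN", 1)],
    2: [("ACTED_IN", 1), ("ACTED_IN", 3)],
}

# A simple index from (label, name) to node ids, used to find start nodes.
name_index = {}
for node_id, node in nodes.items():
    for label in node["labels"]:
        if "name" in node["props"]:
            key = (label, node["props"]["name"])
            name_index.setdefault(key, []).append(node_id)


def movies_for(name):
    """Mimic: MATCH (p:Person {name})-[:ACTED_IN]->(m:Movie) RETURN m."""
    titles = []
    for node_id in name_index.get(("Person", name), []):
        for rel_type, target in outgoing.get(node_id, []):
            if rel_type == "ACTED_IN" and "Movie" in nodes[target]["labels"]:
                titles.append(nodes[target]["props"]["title"])
    return titles
```

The index is consulted only once, to locate the start node; every later step is a direct hop, which is where the speed comes from.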

Schemas vs. Schemas

Ecto makes use of database schemas, based on data definition language (DDL). Cypher also supports (optional) schema definitions, built on concepts such as constraints, indexes, and labels. It isn't clear how these two notions of schemas will interact.

SQL vs. Cypher

Much of Cypher's syntax is borrowed from SQL, but certain elements (especially patterns) are quite alien to it. This may introduce issues for the Ecto API, query generation, etc.

TinkerPop

Apache TinkerPop™ is a graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP).

// What are the names of Gremlin's friends' friends?
g.V().has("name","gremlin").
  out("knows").out("knows").values("name")

-- Apache TinkerPop
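To show what the chained steps in that traversal are doing, here is a toy Python imitation. To be clear, this is not the gremlin-python API; the Traversal class, the V() helper, and the sample graph are all hypothetical, written only to illustrate the step-by-step style.

```python
class Traversal:
    """A toy traversal: each step maps the current node ids to new ones."""

    def __init__(self, graph, ids):
        self.graph = graph
        self.ids = list(ids)

    def has(self, key, value):
        # Keep only nodes whose property `key` equals `value`.
        keep = [i for i in self.ids
                if self.graph["props"][i].get(key) == value]
        return Traversal(self.graph, keep)

    def out(self, label):
        # Follow outgoing edges with the given label.
        nxt = [dst for i in self.ids
               for (lbl, dst) in self.graph["edges"].get(i, [])
               if lbl == label]
        return Traversal(self.graph, nxt)

    def values(self, key):
        # Collect one property from each current node.
        return [self.graph["props"][i][key] for i in self.ids
                if key in self.graph["props"][i]]


def V(graph):
    """Start a traversal from every node in the graph."""
    return Traversal(graph, graph["props"].keys())


# Made-up sample data.
graph = {
    "props": {1: {"name": "gremlin"}, 2: {"name": "alice"},
              3: {"name": "bob"}, 4: {"name": "carol"}},
    "edges": {1: [("knows", 2)], 2: [("knows", 3), ("knows", 4)]},
}

names = (V(graph).has("name", "gremlin")
         .out("knows").out("knows").values("name"))
```

Where Cypher describes a pattern to match, this style spells out the path one step at a time.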

Like Neo4j, TinkerPop is based on the notion of a property graph. And, because it is supported by a variety of graph databases, getting Elixir to play nicely with TinkerPop could provide a lot of flexibility in system design.

However, TinkerPop doesn't support database-specific features such as Neo4j's Cypher query language and unmanaged extensions. Instead, it uses the Gremlin programming language. So, YMMV...


This wiki page is maintained by Rich Morin, an independent consultant specializing in software design, development, and documentation. Please feel free to email comments, inquiries, suggestions, etc!

Topic revision: r9 - 06 Apr 2016, RichMorin