YAGO - Neo4j Mapping
YAGO is based on RDF (Resource Description Framework),
a family of specifications from the W3C (World Wide Web Consortium).
RDF uses (subject, predicate, object) "triples"
which map fairly easily to Neo4j's nodes, properties, and relations.
However, there are also some significant differences.
So, a bit of structural and syntactic mapping may be appropriate.
Direct Mapping
The SPARQL Plugin for Neo4j maps each RDF triple
to a Neo4j relation, generating supporting nodes as needed.
Subjects and objects become nodes, with value and kind properties, eg:
{ value='joe', kind='literal' }
{ value='http://neo4j.org#sara', kind='uri' }
Predicates become relations, with cp, c, and p properties, eg:
{ cp='U http://neo4j.org U http://neo4j.org#knows',
c='U http://neo4j.org',
p='U http://neo4j.org#knows' }
This allows the plugin to support SPARQL in Neo4j,
but the encoding doesn't seem very convenient, let alone idiomatic.
It also raises questions about encoding efficiency,
the handling of literal types, etc.
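To make the encoding above concrete, here is a minimal Python sketch
that reproduces the property maps shown in the examples.
It is not the plugin's code: the function names are hypothetical,
and the rules that "U " marks a URI and that cp is c and p joined by a space
are assumptions read off the examples above.

```python
def encode_term(value, kind):
    """Build the property map for a subject or object node."""
    if kind not in ("uri", "literal"):
        raise ValueError(f"unexpected kind: {kind!r}")
    return {"value": value, "kind": kind}


def encode_predicate(context_uri, predicate_uri):
    """Build the property map for a relation.

    Assumption: 'U ' marks a URI, and cp is c and p joined by a space,
    as the example in the text suggests.
    """
    c = f"U {context_uri}"
    p = f"U {predicate_uri}"
    return {"cp": f"{c} {p}", "c": c, "p": p}


def encode_triple(subject_uri, predicate_uri, obj, obj_kind, context_uri):
    """Map one triple to (subject node, relation, object node)."""
    return (
        encode_term(subject_uri, "uri"),
        encode_predicate(context_uri, predicate_uri),
        encode_term(obj, obj_kind),
    )
```

Even in this small form, the drawback is visible:
every query has to match on URI strings stored in generic value and p properties,
rather than on meaningful property names and relation types.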
Idiomatic Mapping
Let's see if we can create an idiomatic mapping for YAGO
that is also convenient and efficient.
We'll start with first principles, then suggest accommodations...
- Any component of a triple may be a unique name (ie, URI).
- The object component may also be a literal (eg, "joe", "1.85"^^<m>).
- Most triples that have literals as objects become node properties.
- The remaining triples generate relations (and supporting nodes).
- URI prefixes (eg, owl, rdfs) have their own property (ns).
- The default prefix (yago) is not recorded.
See Syntax for details on handling of concrete syntax
and Predicates for a discussion of predicate mapping.
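The principles listed above can be sketched in Python as follows.
The input format (tuples of prefixed names, with literals wrapped
in a hypothetical Literal class), the helper names, and the output shapes
are all assumptions made for illustration.
The sketch also turns every literal-valued triple into a property,
whereas the text says only that most of them are handled this way,
and it drops the prefix from predicate names, which the text does not specify.

```python
from dataclasses import dataclass

DEFAULT_PREFIX = "yago"  # the default prefix is not recorded


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper marking an object as a literal."""
    value: object


def split_prefix(name):
    """Split 'rdfs:label' into ('rdfs', 'label'); bare names get the default."""
    prefix, sep, local = name.partition(":")
    return (prefix, local) if sep else (DEFAULT_PREFIX, name)


def new_node(name):
    """Create a node map, recording ns only for non-default prefixes."""
    prefix, local = split_prefix(name)
    node = {"node": local}
    if prefix != DEFAULT_PREFIX:
        node["ns"] = prefix
    return node


def map_triples(triples):
    """Return (nodes, relations) for an iterable of (s, p, o) triples."""
    nodes, relations = {}, []
    for s, p, o in triples:
        subject = nodes.setdefault(s, new_node(s))
        _, prop = split_prefix(p)  # assumption: predicate prefix is dropped
        if isinstance(o, Literal):
            subject[prop] = o.value           # literal object -> node property
        else:
            nodes.setdefault(o, new_node(o))  # supporting node
            relations.append((s, prop, o))    # everything else -> relation
    return nodes, relations
```

For example, a triple such as (fred_frobisher, has_given_name, Literal("fred"))
adds a has_given_name property to the fred_frobisher node,
while (fred_frobisher, has_predecessor, joe_bloggs) yields a relation
and a supporting node for joe_bloggs.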
Nodes
To allow Cypher to make numeric comparisons,
we remove the metadata from some predicate-based properties (eg, has_height).
( {
node: 'fred_frobisher',
has_given_name: 'fred',
has_height: 1.85,
...
} )
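One way to strip the metadata is to split a typed literal such as "1.85"^^<m>
into its value and its datatype, then convert the value to a number where possible.
The following Python sketch assumes that literal syntax;
the parse_literal name and the fall-back-to-string behavior are illustrative choices,
not part of the mapping described here.

```python
import re

# Matches "value" optionally followed by ^^<datatype>, e.g. "1.85"^^<m>
LITERAL_RE = re.compile(r'^"(?P<value>.*)"(?:\^\^<(?P<dtype>[^>]*)>)?$')


def parse_literal(text):
    """Return (value, datatype), converting typed numeric values to int or float."""
    match = LITERAL_RE.match(text)
    if match is None:
        raise ValueError(f"not a literal: {text!r}")
    value, dtype = match.group("value"), match.group("dtype")
    if dtype is None:
        return value, None  # plain literal: keep it as a string
    for convert in (int, float):
        try:
            return convert(value), dtype
        except ValueError:
            continue
    return value, dtype  # non-numeric typed literal: keep it as a string
```

Storing only the converted value keeps comparisons simple,
but it discards the datatype (here, the unit),
so a real mapping would need some convention for recording it elsewhere.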
Relations
Relations look something like this:
-[ :Has_predecessor ]->
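As a rough illustration of how such a relation might be queried,
the Python sketch below builds a Cypher query string around the pattern.
The capitalization rule (has_predecessor becomes Has_predecessor) is inferred
from the example above, and the function names, the node property used for lookup,
and the $name parameter are assumptions.
Because Cypher does not accept a parameter in the relation type position,
the type is validated and then spliced into the query text.

```python
import re

# Only plain identifiers are allowed, since the name is spliced into the query.
IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def relation_type(predicate):
    """Capitalize the first letter: 'has_predecessor' -> 'Has_predecessor'."""
    if not IDENTIFIER_RE.match(predicate):
        raise ValueError(f"unsafe predicate name: {predicate!r}")
    return predicate[0].upper() + predicate[1:]


def follow_query(predicate):
    """Build a Cypher query that follows one relation from a named node."""
    return (
        "MATCH (a {node: $name})"
        f"-[:{relation_type(predicate)}]->(b) "
        "RETURN b.node"
    )
```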
Helpers
In some cases, it may make sense to create helper nodes and/or relations.
For example, some kinds of pattern-based date comparisons
can be supported by collections of year, month, and day nodes.
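A minimal Python sketch of the date helper idea follows.
It assumes that each date part is represented by a shared helper node
and that relation names are formed by adding a _year, _month, or _day suffix
to the predicate; both conventions, and the was_born_on predicate
in the usage note, are hypothetical.

```python
from datetime import date


def date_helpers(node_name, predicate, when):
    """Return helper nodes and relations linking a node to its date parts.

    Assumption: helper nodes are keyed as ('year', 1952), ('month', 3), etc.,
    and relation names are the predicate plus a part-name suffix.
    """
    parts = {"year": when.year, "month": when.month, "day": when.day}
    nodes = [{"kind": kind, "value": value} for kind, value in parts.items()]
    relations = [
        (node_name, f"{predicate}_{kind}", (kind, value))
        for kind, value in parts.items()
    ]
    return nodes, relations
```

Calling date_helpers("fred_frobisher", "was_born_on", date(1952, 3, 11))
links the node to year, month, and day helpers.
Any other node linked to the same year helper can then be found
by a pattern match, without comparing date strings.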
This wiki page is maintained by Rich Morin,
an independent consultant specializing in software design, development, and documentation.
Please feel free to email comments, inquiries, suggestions, etc!