RDF Format
InputFormat: com.thinkaurelius.faunus.formats.edgelist.rdf.RDFInputFormat
The Semantic Web community is one of the original promoters of the graph as an approach to data modeling. Their efforts have led to the development of the RDF data model. While there are many serialization formats for RDF, an RDF graph is composed of RDF triples, in which a subject is connected to an object by a predicate. For instance:
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .
In this way, RDF is an edge list data model. Faunus, on the other hand, uses an adjacency list in its representation. Therefore, for these two formats to interoperate, the RDFInputFormat provided by Faunus contains a MapReduce job that converts an edge list into an adjacency list.
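The edge-list-to-adjacency-list conversion can be illustrated with a minimal in-memory sketch. This is not the Faunus MapReduce job itself, only the grouping idea behind it. The function name `to_adjacency_list`, the shortened vertex names, and the `mother` and `brother` edges are hypothetical, and only outgoing edges are grouped.

```python
from collections import defaultdict


def to_adjacency_list(triples):
    """Group (subject, predicate, object) edges by their subject vertex."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        # Each edge is stored under the vertex it leaves from.
        adjacency[subject].append((predicate, obj))
        # Vertices that only ever appear as objects still get an (empty) entry.
        adjacency.setdefault(obj, [])
    return dict(adjacency)


# Hypothetical edge list; URIs are shortened to bare names for readability.
edges = [
    ("hercules", "father", "jupiter"),
    ("hercules", "mother", "alcmene"),
    ("jupiter", "brother", "neptune"),
]
adjacency = to_adjacency_list(edges)
```

In a MapReduce setting the same grouping would typically happen in the shuffle phase, with the subject acting as the key; that mapping is an assumption about the general technique rather than a description of the Faunus source.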
faunus.graph.input.rdf.format
There are numerous RDF serialization formats. Faunus currently supports the following formats:
NOTE: Faunus makes use of LineRecordReader to read statements from an RDF file. If a line (\n) does not contain a complete legal RDF fragment, then an exception is thrown by the RDF parser.
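Because statements are read one line at a time, it can help to check a file for statements that span several lines before loading it. The sketch below is a rough pre-flight check for a simplified N-Triples-like syntax, not the parser Faunus uses. The pattern and the helper name `incomplete_lines` are assumptions; blank nodes, language tags, and escaped quotes are deliberately not handled.

```python
import re

# Simplified shape: <subject> <predicate> (<object> | "literal"[^^<datatype>]) .
STATEMENT = re.compile(
    r'^\s*<[^<>\s]+>\s+<[^<>\s]+>\s+'
    r'(?:<[^<>\s]+>|"[^"]*"(?:\^\^<[^<>\s]+>)?)\s*\.\s*$'
)


def incomplete_lines(text):
    """Return the 1-based numbers of non-empty lines that are not a full statement."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if line.strip() and not STATEMENT.match(line)
    ]


good = (
    "<http://thinkaurelius.com#hercules> "
    "<http://thinkaurelius.com#father> "
    "<http://thinkaurelius.com#jupiter> ."
)
# The same statement broken across two lines, as a line-based reader would see it.
broken = (
    "<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father>\n"
    "<http://thinkaurelius.com#jupiter> ."
)
```

Here `incomplete_lines(good)` reports nothing, while `incomplete_lines(broken)` flags both halves of the split statement.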
faunus.graph.input.rdf.literal-as-property
There are two types of triples to be aware of — one in which the object is a URI or blank node, and one in which the object is a literal value. The two types of triples are exemplified below.
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#age> "32"^^<http://www.w3.org/2001/XMLSchema#int> .
If the above Faunus property is set to true, then the Hercules vertex has an age property with an integer value of 32.
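The effect of this setting can be sketched as follows. This is an illustration only, not Faunus code: the function `apply_triple`, the tuple encoding of literals, and the shortened names are hypothetical. The text does not say what happens to literal triples when the property is false, so the sketch simply skips them, which is an assumption.

```python
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


def apply_triple(vertices, edges, subject, predicate, obj, literal_as_property):
    """Add one triple to the graph.

    obj is a URI string, or a (lexical_value, datatype) tuple for a literal.
    """
    vertex = vertices.setdefault(subject, {})
    if isinstance(obj, tuple):
        if literal_as_property:
            lexical, datatype = obj
            # Only xsd:int is converted here; other datatypes stay as strings.
            vertex[predicate] = int(lexical) if datatype == XSD_INT else lexical
        # Assumption: with the flag off, literal triples are ignored.
        return
    # A URI object becomes a vertex connected by an edge.
    vertices.setdefault(obj, {})
    edges.append((subject, predicate, obj))


vertices, edges = {}, []
apply_triple(vertices, edges, "hercules", "father", "jupiter", True)
apply_triple(vertices, edges, "hercules", "age", ("32", XSD_INT), True)
```

After both calls, the `hercules` entry carries an integer `age` property and the only edge is the `father` edge to `jupiter`.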
faunus.graph.input.rdf.use-localname
The theoretically infinite RDF graph is embedded within the infinite address space of URIs. In many situations, the full URI is not desired. If the above property is set to true, then
<http://thinkaurelius.com#hercules> <http://thinkaurelius.com#father> <http://thinkaurelius.com#jupiter> .
generates vertices with name hercules and jupiter connected by a father edge.
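A minimal sketch of local-name extraction is given below. The exact splitting rule Faunus applies is not stated in the text; taking everything after the last `#`, or after the last `/` when there is no `#`, is an assumption, and the helper `local_name` and the `example.org` URI are hypothetical.

```python
def local_name(uri):
    """Return the part of the URI after the last '#', else after the last '/'."""
    for separator in ("#", "/"):
        if separator in uri:
            return uri.rsplit(separator, 1)[1]
    # No separator found: fall back to the full string.
    return uri


triple = (
    "http://thinkaurelius.com#hercules",
    "http://thinkaurelius.com#father",
    "http://thinkaurelius.com#jupiter",
)
# Shorten subject, predicate, and object alike.
short = tuple(local_name(term) for term in triple)
```

One consequence of dropping the namespace is that two URIs from different namespaces with the same local name become indistinguishable, so the setting suits data drawn from a single vocabulary best.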
faunus.graph.input.rdf.as-properties
RDF is a triple data model: there are no properties, only vertices and edges. In some situations, an object URI should be treated as a property of the vertex. For instance, when http://www.w3.org/1999/02/22-rdf-syntax-ns#type is specified in the comma-separated String list of the property above, then the triple
<http://thinkaurelius.com#hercules> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://thinkaurelius.com#demigod> .
yields a Hercules vertex with type-property demigod. A typical setting for this property is below.
faunus.graph.input.rdf.as-properties=http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://www.w3.org/2000/01/rdf-schema#label
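How such a comma-separated setting might be applied can be sketched as below. This is an illustration under stated assumptions, not the Faunus implementation: `parse_as_properties` and `apply_triple` are hypothetical helpers, and URIs are stored verbatim rather than shortened to local names.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def parse_as_properties(value):
    """Split the comma-separated setting into a set of predicate URIs."""
    return {item.strip() for item in value.split(",") if item.strip()}


def apply_triple(vertices, edges, subject, predicate, obj, as_properties):
    """Store the object as a vertex property if the predicate is listed, else add an edge."""
    vertex = vertices.setdefault(subject, {})
    if predicate in as_properties:
        vertex[predicate] = obj
    else:
        vertices.setdefault(obj, {})
        edges.append((subject, predicate, obj))


as_properties = parse_as_properties(RDF_TYPE + "," + RDFS_LABEL)
vertices, edges = {}, []
apply_triple(
    vertices,
    edges,
    "http://thinkaurelius.com#hercules",
    RDF_TYPE,
    "http://thinkaurelius.com#demigod",
    as_properties,
)
```

With the type predicate listed, no `demigod` vertex and no edge are created; the value lands on the Hercules vertex instead.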