This repository comprises tests for the DYDRA RDF cloud service:
- The Sesame HTTP communication protocol,
- The SPARQL graph store HTTP protocol,
- The SPARQL query protocol,
- The DYDRA account administration HTTP API,
- DYDRA extension tests for:
  - language-specific collation
  - request meta-data
  - provenance
  - sort precedence
  - temporal operators
  - values request parameter
  - xpath operators
These tests are implemented as shell scripts and arranged according to topic. The root directory contains several utility scripts which establish the test environment, administer the target repositories and execute tests.
- `define.sh` : defines the shell environment variables and operators to be employed by the test scripts.
- `initialize.sh` : creates the test target repositories with respective meta-data and content.
- `reset.sh` : resets test target repository content.
- `run.sh` : runs a given collection of test scripts and reports the outcomes. It observes known failures from `known-to-fail.txt`, records new failures in the file `failed.txt`, and returns the error count as its result.
The scripts are arranged in directories which reflect the protocol resource paths.
The account (`openrdf-sesame`) and repository (`mem-rdf`) are defined such that, for the Sesame protocol tests, the examples given in the OpenRDF documentation should apply.
In order to execute scripts manually:
- Establish values for the shell variables:
  - `STORE_URL` : the HTTP URI which specifies the remote host.
  - `STORE_ACCOUNT` : the account name.
  - `STORE_REPOSITORY` : the repository name, e.g. `mem-rdf`.
  - `STORE_TOKEN` : an authentication token, if authentication is required.
- Define the shell environment: `source define.sh`
- Run the desired script(s): `run_tests <pathnames>` or `run.sh <directory>`
For example:
```sh
export STORE_URL="https://dydra.com"
export STORE_ACCOUNT="openrdf-sesame"
export STORE_REPOSITORY="mem-rdf"
export STORE_TOKEN="1234567890"
source define.sh
bash run.sh extensions/sparql-protocol/temporal-data
```
Note that numerous scripts modify the shell variable bindings to correspond to particular variations in repository, graph, or user. Such scripts must therefore be run in a distinct sub-shell so that the modifications do not persist in the invoking shell.
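For instance, a minimal sketch of such an invocation (the script name here is hypothetical):
```sh
# run the script in a sub-shell so that its rebindings of, e.g.,
# STORE_REPOSITORY do not persist in the invoking shell
( source define.sh && bash some-variant-test.sh )
echo "${STORE_REPOSITORY}"   # still bound to the original value
```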
The test environment includes a range of repositories and users, as described in the `initialize.sh`
script, in order to account for variations in access and authorization. As a rule, the default
repository, that is "${STORE_ACCOUNT}/${STORE_REPOSITORY}", is treated as read-only, so that
most tests need no set-up and/or tear-down.
Any modification is restricted to "${STORE_ACCOUNT}/${STORE_REPOSITORY}-write" and every test which
modifies that repository also initializes it to the required state.
The tests are coded as bash shell scripts. They depend on several utility programs:
- `jq` : `apt-get install jq`
- `json_reformat` : `apt-get install yajl-tools`
- `rapper` : `apt-get install raptor2-utils`
- `tidy` : `apt-get install tidy`
- `json_diff` : `pip install json-delta`
To test and, if necessary, resolve all dependencies, you may run a simple check.
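A minimal sketch which verifies that each of the tools listed above is on the `PATH` (the exact check command is an assumption):
```sh
# verify that each required utility is available
for tool in jq json_reformat rapper tidy json_diff; do
  command -v "$tool" >/dev/null || echo "missing dependency: $tool"
done
```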
These tests exercise the Sesame REST API, as per the OpenRDF "HTTP communication protocol" description (archival link), or (archival link). For the v2.0 Sesame protocol, the concrete resources are as follows, with reference to the described overview:
```
${STORE_URL}/${STORE_ACCOUNT}
  /protocol              : protocol version (GET)
  /repositories          : overview of available repositories (GET)
    /${STORE_REPOSITORY} : query evaluation and administration tasks on
                           a repository (GET/POST/DELETE)
      /statements        : repository statements (GET/POST/PUT/DELETE)
      /contexts          : context overview (GET)
      /size              : #statements in repository (GET)
      /rdf-graphs        : named graphs overview (GET)
        /service         : Graph Store operations on indirectly referenced named graphs
                           in repository (GET/PUT/POST/DELETE);
                           includes the query argument graph=${STORE_IGRAPH}
        /${STORE_RGRAPH} : Graph Store operations on directly referenced named graphs
                           in repository (GET/PUT/POST/DELETE)
      /namespaces        : overview of namespace definitions (GET/DELETE)
        /${STORE_PREFIX} : namespace-prefix definition (GET/PUT/DELETE)
```
The compact graph store patterns provide an alternative, less encumbered means to address the resource and its content:
```
${STORE_URL}/${STORE_ACCOUNT}
  /${STORE_REPOSITORY}/service
  /${STORE_REPOSITORY}/service?default               : the default graph
  /${STORE_REPOSITORY}/service?graph=${STORE_IGRAPH} : an arbitrary indirect graph
  /service?graph=urn:dydra:service-description       : the repository SPARQL endpoint service description
  /${STORE_REPOSITORY}/${STORE_RGRAPH}               : graph relative to the repository base url
```
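For illustration, a sketch of a retrieval through the compact pattern, assuming the environment variables established above (the bearer-token authorization is an assumption):
```sh
# retrieve the default graph of the test repository as N-Triples
curl -H "Accept: application/n-triples" \
     -H "Authorization: Bearer ${STORE_TOKEN}" \
     "${STORE_URL}/${STORE_ACCOUNT}/${STORE_REPOSITORY}/service?default"
```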
In addition to these paths, the account and repository metadata is located along a path distinct from possible repository linked-data resources:
```
${STORE_URL}/accounts/${STORE_ACCOUNT}
  /repositories
    /${STORE_REPOSITORY}
      /settings                     : name, homepage, summary, description, and license url
      /collaborations               : enumerated collaborator account read/write privileges
      /context_terms                : respective extent of the default and named graphs
      /describe_settings            : description mode and navigation depth
      /prefixes                     : default namespace prefix bindings (cf. sesame namespaces)
      /privacy                      : repository privacy setting
      /provenance_repository        : respective provenance repository identifier
      /service_description          : the repository SPARQL endpoint service description
      /undefined_variable_behaviour : disposition for queries with unbound variables
```
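As a sketch (the JSON accept type and bearer-token authorization are assumptions), the settings resource could be retrieved with:
```sh
# hypothetical retrieval of the repository settings document
curl -H "Accept: application/json" \
     -H "Authorization: Bearer ${STORE_TOKEN}" \
     "${STORE_URL}/accounts/${STORE_ACCOUNT}/repositories/${STORE_REPOSITORY}/settings"
```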
The scripts test a subset of the accept formats:
- For repository content
  - RDF/XML : `application/rdf+xml`
  - N-Triples : `text/plain`, `application/n-triples`
  - TriX : `application/trix`
  - JSON : `application/json`
  - N-Quads : `application/n-quads`
- For query results and metadata
  - XML : `application/sparql-results+xml`
  - JSON : `application/sparql-results+json`
The scripts cover variations of access privileges, content- and accept-type, and resource existence. Test successes are judged either against the HTTP status code or, for requests with response content, against result prototypes as canonicalized per xmllint and json_reformat. Test failures match against the HTTP status code.
The graph store support under Sesame (archival link) provides two resource patterns:
```
<SESAME_URL>/repositories/<ID>/rdf-graphs/service
<SESAME_URL>/repositories/<ID>/rdf-graphs/<NAME>
```
The first, for which the path ends in `service`, requires an additional `graph` query argument to designate the referenced graph indirectly, while in the second case the request url itself designates that graph.
Note that, given the discussion on the openrdf topic, the designator for a directly referenced named graph in a sesame request URI is the literal URL. That is, it includes the "/repositories" text.
The SPARQL 1.1 Graph Store HTTP Protocol is supported on a per-repository basis. The functionality is accessible at <SESAME_URL>/repositories/<ID>/rdf-graphs/service (for indirectly referenced named graphs), and <SESAME_URL>/repositories/<ID>/rdf-graphs/<NAME> (for directly referenced named graphs). A request on a directly referenced named graph entails that the request URL itself is used as the named graph identifier in the repository.
For a repository on a DYDRA host, the sesame request patterns manifest in terms of the host authority, the user account and the repository name:
```
<HTTP-HOST>/<ACCOUNT-NAME>/repositories/<REPOSITORY-NAME>/rdf-graphs/service
<HTTP-HOST>/<ACCOUNT-NAME>/repositories/<REPOSITORY-NAME>/rdf-graphs/<GRAPH-NAME>
```
The consequence is that, in order to designate the repository as a whole, the sesame request URL must take the form
```
<HTTP-HOST>/<ACCOUNT-NAME>/repositories/<REPOSITORY-NAME>/rdf-graphs/service?graph=<HTTP-HOST>/<ACCOUNT-NAME>/<REPOSITORY-NAME>
```
and the default graph is designated as
```
<HTTP-HOST>/<ACCOUNT-NAME>/repositories/<REPOSITORY-NAME>/rdf-graphs/service?default
```
while a request of the form
```
<HTTP-HOST>/<ACCOUNT-NAME>/repositories/<REPOSITORY-NAME>/rdf-graphs/<GRAPH-NAME>
```
designates exactly that named graph in the store.
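For illustration, a sketch of a retrieval of the default graph through this sesame-style pattern, assuming the test environment variables (the bearer-token authorization is an assumption):
```sh
# retrieve the default graph via the sesame-style graph store pattern
curl -H "Accept: application/rdf+xml" \
     -H "Authorization: Bearer ${STORE_TOKEN}" \
     "${STORE_URL}/${STORE_ACCOUNT}/repositories/${STORE_REPOSITORY}/rdf-graphs/service?default"
```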
The "SPARQL 1.1 Graph Store HTTP Protocol", is supported as per the W3C recommendation, with the several additions and restrictions. Each DYDRA repository constitutes a Graph Store Protocol endpoint which is identified by the resource
<HTTP-HOST>/<ACCOUNT-NAME>/<REPOSITORY-NAME>/service
The tests for this facility are present in the directories `sparql-graph-store-http-protocol` and `extensions/graph-store-protocol`.
A graph store request may include as content, or specify as its response, any of the following RDF document encodings:
- application/n-triples
- application/n-quads
- application/turtle
- application/rdf+xml

Several forms are restricted:
- application/rdf+json : supported for responses only
- application/trix : supported for responses only

Several forms are no longer supported, as they have been supplanted by registered media types:
- text/plain
- application/rdf-triples
- text/x-nquads
- application/x-turtle
The `multipart/form-data` request media type described in the graph store protocol is not supported. Each request must comprise a single document.

The `application/x-www-form-urlencoded` request type is not supported by the graph store protocol. It applies to SPARQL `POST` requests only, as described in the SPARQL protocol for query and update operations.
A request which omits a graph designator is understood to apply to the entire repository.
For a repository on a Dydra host, the native request patterns comprise just the host authority, the user account and the repository name, with the `service` path extension:
```
<HTTP-HOST>/<ACCOUNT-NAME>/<REPOSITORY-NAME>/service
```
with respect to which the default graph is designated as
```
<HTTP-HOST>/<ACCOUNT-NAME>/<REPOSITORY-NAME>/service?default
```
and an indirect graph reference takes the form
```
<HTTP-HOST>/<ACCOUNT-NAME>/<REPOSITORY-NAME>/service?graph=<graph>
```
In addition to the root repository graph, it is also possible to address an arbitrary directly designated graph whose path extends beyond the root:
```
<HTTP-HOST>/<ACCOUNT-NAME>/<REPOSITORY-NAME>/<FURTHER>/<PATH>/<STEPS>
```
The graph store management operations which involve an RDF payload (`PATCH`, `POST`, and `PUT`) permit a request to target a specific graph as described above, as well as to transfer graph content as TriX or N-Quads in order to stipulate the target graph for statements in the payload document itself. The protocol and document specifications are not exclusive. When both appear, the protocol graph specifies which graph is to be cleared by a `PUT`, and that graph supersedes any specified in the document content with respect to the destination graph. Where no protocol graph is specified for a `POST` request, a new graph is generated. Where none is specified for other methods, the entire repository is the target.

The possible values for a graph are:
- `default` : the default graph
- `post` : a unique UUID generated for a `POST` request
- `statement` : the graph specified in the statement, or the default graph for triples

The combinations yield the following effects for `PATCH`, `POST` and `PUT`:
| protocol graph designator | content type | effective graph |
|---|---|---|
| - | n-triples, rdf+xml | PATCH : default; POST : post; PUT : default |
| - | n-quads, trix | statement |
| default | n-triples, rdf+xml | default |
| default | n-quads, trix | default |
| graph=protocol | n-triples, rdf+xml | protocol |
| graph=protocol | n-quads, trix | protocol |
| protocol (direct) | (any) | not supported |
The results for `DELETE` and `GET` operations are analogous to `PUT` with respect to repository modifications or response content. A `PATCH` operation without a protocol graph, in distinction to a `PUT`, clears just the graphs present in the content.
In order to validate the results, one script exists for the `POST` and `PUT` operations for each of the combinations, named according to the pattern `<method>-<contentTypes>-<protocolGraph>.sh`, which performs a request of the respective content type and graph combination and validates the content of a subsequent `GET` as a reflection of the expected store content.
The combination features are indicated as:
- method : `POST`, `PUT`
- protocolGraph : none, direct, default, graph (indirect)
- contentType : n-triples, n-quads, rdf+xml, turtle, trix
whereby just the combinations for `PUT-ntriples+nquads` validate the full target graph complement and, among these, cases like `PUT-ntriples+nquads-default` intend to demonstrate the effect when the payload or request content type differs from the protocol target graph. In addition, for n-triples and n-quads content types, the actual document contains both triples and quads in order to demonstrate the consequence of the statement's given content on its destination.
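As an illustrative sketch of such a mixed payload (the subject IRIs, graph IRI, and file name are hypothetical, and the bearer-token authorization is an assumption), a `PUT` against the default graph, whose designator then suppresses the quad's graph term, might be performed as:
```sh
# illustrative mixed payload: one triple (default graph in content)
# and one quad whose graph term the protocol designator supersedes
cat > mixed.nq <<'EOF'
<http://example.org/s> <http://example.org/p> "in the default graph" .
<http://example.org/s> <http://example.org/p> "in a named graph" <http://example.org/g> .
EOF
# PUT against the default graph of the writable test repository
curl -X PUT "${STORE_URL}/${STORE_ACCOUNT}/${STORE_REPOSITORY}-write/service?default" \
     -H "Content-Type: application/n-quads" \
     -H "Authorization: Bearer ${STORE_TOKEN}" \
     --data-binary @mixed.nq
```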
| script | requirement |
|---|---|
| POST-ntriples+nquads-default.sh | Each statement is added to the default graph. Graph terms in content are suppressed. |
| POST-ntriples+nquads-direct.sh | not supported |
| POST-ntriples+nquads-graph.sh | Each statement is added to the target graph. Graph terms in content are supplanted. |
| POST-ntriples+nquads.sh | When no protocol graph is specified, for declared triple media each statement is added to a new, generated graph, and for declared quad content each statement is added to its respective graph. |
| PUT-ntriples+nquads-default.sh | The default graph is cleared. Each statement is added to the default graph. Graph terms in content are suppressed. |
| PUT-ntriples+nquads-direct.sh | not supported |
| PUT-ntriples+nquads-graph.sh | The protocol graph is cleared. Each statement is added to the target graph. Graph terms in content are supplanted. |
| PUT-ntriples+nquads.sh | The entire repository is cleared. Each statement is added to the target graph. Graph terms in content are supplanted. |
In addition to the status code, each response includes several headers and a content body.

| Header | content |
|---|---|
| Request-Id | the service request UUID |
| Etag | the identifier for the new revision which resulted from the request operation |
The response content is a SPARQL result document which specifies
- the graph store endpoint url
- the service request UUID
- the client request id.
The standard processing mode for graph store requests involves a synchronous request/response exchange. This requires that the client serialize its requests in order to avoid conflicting write operations, and any request which involves a large dataset introduces delays for unrelated requests. In order to avoid these limitations, a request can specify asynchronous processing. To invoke this mode, it should include the following headers:

| Header | content |
|---|---|
| Accept-Asynchronous | notify |
| Asynchronous-Location | the url to which the response is to be sent upon completion |
| Asynchronous-Method | the HTTP method to be used to send the response |
| Asynchronous-Content-Type | the media type to be used to encode the status message. It must be a SPARQL result content type. The default is application/sparql-results+json |
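A hypothetical invocation of this mode (the callback URL, the method value, and the bearer-token authorization are assumptions):
```sh
# request asynchronous processing of a graph store PUT; the response
# is delivered to the given callback URL upon completion
curl -X PUT "${STORE_URL}/${STORE_ACCOUNT}/${STORE_REPOSITORY}-write/service?default" \
     -H "Authorization: Bearer ${STORE_TOKEN}" \
     -H "Content-Type: application/n-triples" \
     -H "Accept-Asynchronous: notify" \
     -H "Asynchronous-Location: https://example.org/notifications" \
     -H "Asynchronous-Method: POST" \
     -H "Asynchronous-Content-Type: application/sparql-results+json" \
     --data-binary @statements.nt
```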
Each DYDRA repository constitutes a SPARQL endpoint which is identified by the resource
```
<HTTP-HOST>/<ACCOUNT-NAME>/<REPOSITORY-NAME>/sparql
```
Requests which conform to the terms of a SPARQL request described in the "SPARQL 1.1 Protocol" recommendation are processed as SPARQL requests.
The tests for this facility are present in the directories `sparql-protocol` and `extensions/sparql-protocol`.
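For illustration, a minimal sketch of a query against such an endpoint, assuming the environment variables established above (the bearer-token authorization is an assumption):
```sh
# evaluate a simple SELECT query via the SPARQL protocol
curl -G "${STORE_URL}/${STORE_ACCOUNT}/${STORE_REPOSITORY}/sparql" \
     -H "Accept: application/sparql-results+json" \
     -H "Authorization: Bearer ${STORE_TOKEN}" \
     --data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
```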
Test scripts for account and repository management operations are present under the directory `accounts-api`.
The DYDRA service provides several extensions to standard SPARQL facilities:
- It implements the temporal datatypes `xsd:date`, `xsd:dayTimeDuration`, `xsd:time`, `xsd:yearMonthDuration` and the atomic Gregorian datatypes, and implements the respective constructor, accessor and combination operators as described in "XPath and XQuery Functions and Operators 3.0".
- It implements the math operators from the XPath recommendation.
- It implements native statement reification and provides operators to identify and locate statements and statement terms by content.
- It provides access to query operation meta-data.
- It affords access to repository revisions.
- It implements IRI component accessors.

Test scripts for these capabilities are present under the directory `extensions`.