This service can perform data completeness and geometry validation operations on CityJSON / CityGML datasets.
This is a FastAPI application which uses TU Delft's val3dity to validate geometries, SHACL-based profiles to validate data completeness, and CityGML tools to convert CityGML files to CityJSON.
The resulting service is compatible with OGC API - Processes and conforms to the following conformance classes:
- Core
- JSON
- OGC Process Description
A separate process is defined for each of the validation profiles found in the data source. A list of the available processes can be retrieved by querying the /processes endpoint.
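As a rough illustration, the process list could be fetched and summarized with the standard library alone. This is a minimal sketch: the helper names are hypothetical, and it assumes that the response follows the OGC API - Processes process list layout (a `processes` array whose items carry an `id` and, optionally, a `title`).

```python
import json
from urllib.request import Request, urlopen


def summarize_processes(process_list: dict) -> list:
    """Return (id, title) pairs from a process list document."""
    return [
        (process["id"], process.get("title", ""))
        for process in process_list.get("processes", [])
    ]


def fetch_process_list(base_url: str) -> dict:
    """GET {base_url}/processes and decode the JSON body."""
    request = Request(
        base_url.rstrip("/") + "/processes",
        headers={"Accept": "application/json"},
    )
    with urlopen(request) as response:
        return json.load(response)
```

Calling `summarize_processes(fetch_process_list(base_url))` with the base URL of a running instance would then yield one pair per available validation profile.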
If you want to run the application locally, you need to have val3dity and CityGML tools installed as well.
# Create and activate virtual environment
python -m venv venv
. venv/bin/activate
# Install dependencies
python -m pip install -r requirements.txt
# Run the application
VAL3DITY=/path/to/val3dity CITYGML_TOOLS=/path/to/citygml-tools fastapi run # or 'fastapi dev' for development mode
You may also create a .env file with the configuration instead of defining the variables.
val3dity and CityGML tools come prepackaged in the Docker image, so no dependencies are required.
docker run --pull=always -p 8080:8080 ghcr.io/ogcincubator/chek-data-completeness
The service will be available at http://localhost:8080.
If you need to serve the application from a path other than /, you can pass it as the ROOT_PATH environment variable:
docker run --pull=always -p 8080:8080 -e ROOT_PATH=/my-subpath/ ghcr.io/ogcincubator/chek-data-completeness
The application can be configured by using environment variables and/or a .env file (with the former taking precedence). The following (case-insensitive) configuration variables are available:
Variable | Default value | Description |
---|---|---|
data_source | ./data/chek-profiles.ttl | Data source for profiles. Can be a path or a URL to a Turtle file containing the definition of the profiles, or a SPARQL endpoint URL prefixed with sparql:, or a URL to an OGC Building Blocks register.json prefixed with bblocks:. Also supports a list of entries in JSON format. |
python3 | python3 | Path to the Python 3 executable |
val3dity | /opt/val3dity/val3dity | Path to val3dity executable |
citygml_tools | /opt/citygml-tools/citygml-tools | Path to CityGML tools executable |
temp_dir | ./tmp | Directory where temporary files will be stored |
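The forms accepted by data_source can be told apart by their prefix. The following is only a sketch of that idea, not the application's actual parsing code: the function name and the `turtle` / `sparql` / `bblocks` labels are hypothetical, and it assumes that a value starting with `[` is a JSON list of entries.

```python
import json


def classify_data_source(value: str) -> list:
    """Split a data_source setting into (kind, location) pairs."""
    value = value.strip()
    # A JSON list holds several entries; anything else is a single entry.
    entries = json.loads(value) if value.startswith("[") else [value]
    result = []
    for entry in entries:
        if entry.startswith("sparql:"):
            result.append(("sparql", entry[len("sparql:"):]))
        elif entry.startswith("bblocks:"):
            result.append(("bblocks", entry[len("bblocks:"):]))
        else:
            # No prefix: a path or URL to a Turtle file.
            result.append(("turtle", entry))
    return result
```

For instance, `classify_data_source("./data/chek-profiles.ttl")` treats the default value as a single Turtle file entry.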
A profile is composed of:
- An RDF description.
- One or more artifacts for validation (SHACL files).
The RDF descriptions for the profiles can be stored in a file (local or remote) or in a SPARQL endpoint (see Configuration). Each profile can then declare one or more resources (files with SHACL shapes) for validation.
The following example shows how to describe a profile:
@prefix chekp: <urn:chek:profiles/> .
@prefix prof: <http://www.w3.org/ns/dx/prof/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix role: <http://www.w3.org/ns/dx/prof/role/> .
@prefix sd: <https://w3id.org/okn/o/sd#> .
@prefix hydra: <http://www.w3.org/ns/hydra/core#> .
chekp:sample a prof:Profile, chekp:Profile ; # Only instances of chekp:Profile are processed, so this is required
dct:title "Sample profile for CHEK" ; # A title for the profile
dct:hasVersion "0.1" ; # Profile version
prof:isProfileOf chekp:chek ; # prof:isProfileOf can be used to declare inheritance
prof:hasToken "chek-ascoli-piceno" ; # A token is required and will be used to identify the profile
prof:hasResource [ # At least one resource must be described
a prof:ResourceDescriptor ;
prof:hasRole role:validation ; # The role must be role:validation
dct:format <https://w3id.org/mediatype/text/turtle> ; # Optional
dct:conformsTo <https://www.w3.org/TR/shacl/> ; # Conforming to https://www.w3.org/TR/shacl/ is *mandatory*
prof:hasArtifact <./ap-shapes.shacl> ; # Path or URL to SHACL shapes file
] ;
sd:hasParameter [ # Zero or more parameters can also be declared
dct:identifier "myParameter" ; # Identifier that will be used when running validations
dct:description "Sample argument" ; # An optional description for the parameter
sd:hasDataType "string" ; # Data type of the parameter
hydra:required false ; # Whether the parameter is required (true) or optional (false)
] ;
.
When defining Building Block profiles:
- The RDF description must be provided in the data.ttl file for the building block (or as a URI inside the rdfData array field in bblock.json).
- The SHACL shapes must be included in the rules.shacl file (or the shaclRules field in bblock.json).
- The building block must include chek-validation-profile among its tags.
Building Block register resolution is enabled, so registers will be loaded recursively according to their imports (bblocks-config.yaml).
prof:isProfileOf can be used to define an inheritance chain. If a profile is declared to be the profile of another, the validation SHACL shapes in the latter will be included any time that the former is used. This allows defining fine-grained rules for specific cases (e.g., cities, areas, building types, etc.) while leveraging already existing sets of rules. For example, given the following profile hierarchy:
- Italy
  - Marche region
    - Ascoli Piceno province
      - Ascoli Piceno municipality
        - Ascoli Piceno Old town
    - Ancona province
A validation run against the "Ascoli Piceno Old town" profile will include the rules for "Ascoli Piceno municipality", "Ascoli Piceno province", "Marche region" and "Italy".
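The lookup described above amounts to walking up the chain of parents. The sketch below illustrates it with a plain dictionary; it is not the service's implementation, the names are hypothetical, and it assumes that each profile declares at most one parent.

```python
def inherited_profiles(profile: str, parent_of: dict) -> list:
    """Return the profile followed by all of its ancestors, nearest first."""
    chain = []
    current = profile
    while current is not None:
        if current in chain:
            raise ValueError(f"Cyclic inheritance at {current!r}")
        chain.append(current)
        # Profiles without a declared parent end the chain.
        current = parent_of.get(current)
    return chain


# Parent of each profile in the example hierarchy.
PARENT_OF = {
    "Ascoli Piceno Old town": "Ascoli Piceno municipality",
    "Ascoli Piceno municipality": "Ascoli Piceno province",
    "Ascoli Piceno province": "Marche region",
    "Ancona province": "Marche region",
    "Marche region": "Italy",
}
```

The rules applied in a validation run would be the union of the SHACL shapes of every profile in the list returned by `inherited_profiles("Ascoli Piceno Old town", PARENT_OF)`.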
In the case of Building Block profiles, profile inheritance is signaled by using the dependsOn field in bblock.json.
There are situations in which some information may be required at runtime to perform some type of validation. For example, the geographical extent of a dataset may be defined by using a radius from a point of interest, but while the radius is fixed and thus can be declared in the SHACL shape, the central point of interest depends on the specific area that needs to be validated.
Profiles can define parameters that will be used when running validations. Every parameter needs to have, at least:
- an identifier (dct:identifier) that will be used as a variable name when executing validation processes.
- a description (dct:description) for users to know the purpose of the parameter.
- a data type (sd:hasDataType).
- optionally, a flag to mark the parameter as required (hydra:required).
When performing validations, parameter values are added to the RDF input data with the following format:
@prefix dct: <http://purl.org/dc/terms/> .
@prefix sd: <https://w3id.org/okn/o/sd#> .
[] a sd:Parameter ; # Instance of Parameter
dct:identifier "myParameter" ; # Identifier declared in the profile definition also with dct:identifier
sd:hasFixedValue "Value for the parameter" ; # The value provided by the user
.
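To make the format concrete, the following sketch renders a dictionary of parameter values as Turtle in the shape shown above. It is an illustration only, not the code used by the service: the function name is hypothetical, all values are assumed to be strings, and `json.dumps` is used as a convenient way to quote and escape literals for common cases.

```python
import json


def parameters_to_turtle(values: dict) -> str:
    """Render parameter values as sd:Parameter blank nodes in Turtle."""
    lines = [
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "@prefix sd: <https://w3id.org/okn/o/sd#> .",
        "",
    ]
    for identifier, value in values.items():
        # ensure_ascii=False keeps non-ASCII characters literal instead of
        # emitting JSON surrogate-pair escapes.
        quoted_id = json.dumps(identifier, ensure_ascii=False)
        quoted_value = json.dumps(value, ensure_ascii=False)
        lines += [
            "[] a sd:Parameter ;",
            f"    dct:identifier {quoted_id} ;",
            f"    sd:hasFixedValue {quoted_value} ;",
            ".",
            "",
        ]
    return "\n".join(lines)
```

SHACL shapes can then read these nodes back by matching on sd:Parameter and the declared dct:identifier.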
Example: Checking that a building exists with a given identifier passed as a string:
<#PointOfInterest>
a sh:NodeShape ;
sh:targetNode _:dummy ;
sh:not [
sh:sparql [
sh:select """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sd: <https://w3id.org/okn/o/sd#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX city: <http://example.com/vocab/city/>
SELECT $this (rdf:type as ?path) (?buildingOfInterestValue as ?value) WHERE {
# Extract the parameter value from the graph
?buildingOfInterestParam a sd:Parameter ;
dct:identifier "buildingOfInterest" ;
sd:hasFixedValue ?buildingOfInterestValue ;
.
# Use the parameter value
?dataset city:hasObject/dct:identifier ?objectIdentifier .
FILTER(?objectIdentifier = ?buildingOfInterestValue)
}
""" ;
sh:message "Invalid point of interest" ;
sh:severity sh:Violation ;
]
]
.
When executing processes, parameter values are passed along with the other inputs:
POST /processes/my-profile/execution
{
"inputs": {
"cityFiles": [
{
"name": "dataset1",
"data_str": "..."
}
],
"myParameter": "Value for the parameter",
"myOtherParameter": "Value for the second parameter"
}
}
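The same request can be assembled programmatically. This is a minimal sketch using only the standard library: the function name and its arguments are hypothetical, while the endpoint path and the cityFiles, name and data_str keys are taken from the example above.

```python
import json
from urllib.request import Request


def build_execution_request(base_url, process_id, city_files, parameters):
    """Build a POST request for /processes/{process_id}/execution."""
    inputs = {
        "cityFiles": [
            {"name": name, "data_str": content}
            for name, content in city_files.items()
        ],
        # Profile parameters sit next to cityFiles, keyed by identifier.
        **parameters,
    }
    return Request(
        f"{base_url.rstrip('/')}/processes/{process_id}/execution",
        data=json.dumps({"inputs": inputs}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running instance.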
The work has been co-funded by the European Union and the United Kingdom under the Horizon Europe CHEK project (GA 101058559).