
Validation reports

David Megginson edited this page May 15, 2018 · 7 revisions

Client software can download a validation report in the JSON format described on this page (a link appears at the top of the Validation page). A client can use the information in the JSON report to display the error status to the user, and to interactively guide the user through the various issues. It is also suitable for data analytics and other non-interactive applications.

A validation report uses a two-level hierarchy for each issue:

  1. General information about the issue type.
  2. A list of specific locations where the issue was found.

This arrangement lets a client present a short summary first, then let the user drill down as desired. It is especially useful for very long error lists, where there may be only 5 or 10 issue types but hundreds of issue locations.
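The summary-then-drill-down pattern can be sketched in Python as follows. This is only an illustration: the function names `summarize` and `drill_down` and the small `sample_report` dictionary are made up for this sketch and are not part of any HXL library.

```python
def summarize(report):
    """Return one summary line per issue type (first level of the hierarchy)."""
    lines = []
    for issue in report.get("issues", []):
        severity = issue.get("severity", "unknown")
        description = issue.get("description", "(no description)")
        # Fall back to counting the locations if location_count is absent.
        count = issue.get("location_count", len(issue.get("locations", [])))
        lines.append(f"[{severity}] {description}: {count} location(s)")
    return lines


def drill_down(issue):
    """Return one line per location for a single issue (second level)."""
    lines = []
    for loc in issue.get("locations", []):
        # Only show the positional properties that are actually present.
        parts = [f"{key}={loc[key]!r}" for key in ("row", "col", "hashtag") if key in loc]
        lines.append(", ".join(parts))
    return lines


# Hypothetical report, trimmed to the properties used above.
sample_report = {
    "issues": [
        {
            "description": "Possible misspelling",
            "severity": "warning",
            "location_count": 2,
            "locations": [
                {"row": 20, "col": 3, "hashtag": "#adm1+name+i_en"},
                {"row": 22, "col": 3, "hashtag": "#adm1+name+i_en"},
            ],
        }
    ]
}

summary_lines = summarize(sample_report)
detail_lines = drill_down(sample_report["issues"][0])
```

A client would typically show `summary_lines` first and call `drill_down` only for the issue the user selects.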

JSON format

(Property names in boldface are required to appear; others are optional.)

Top-level object

The top-level JSON object includes the following properties:

| Property | Description | Type |
| --- | --- | --- |
| validator | Title of the validation engine. | text |
| timestamp | When the validation was run. | text (ISO 8601 timestamp) |
| data_url | Dataset being validated. | text (URL) |
| schema_url | Validation schema used (if any). See HXL schemas. | text (URL) |
| is_valid | True if the validation succeeded with no issues. | boolean |
| stats | Total number of locations where issues were found. | Stats object |
| issues | List of all top-level issues found (may be an empty list). | array of Issue objects |
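As a rough sketch of how a client might read the top-level properties, the Python below parses a report and converts the timestamp into a `datetime`. The helper name `read_top_level` and the sample text are assumptions for illustration; `.get()` is used throughout because not every property is required to appear.

```python
import json
from datetime import datetime


def read_top_level(report_text):
    """Parse a report and pull out a few top-level properties."""
    report = json.loads(report_text)
    timestamp = report.get("timestamp")
    return {
        "validator": report.get("validator"),
        # The timestamp is ISO 8601 text; convert it when present.
        "run_at": datetime.fromisoformat(timestamp) if timestamp else None,
        "is_valid": report.get("is_valid"),
        "issue_count": len(report.get("issues", [])),
    }


# Hypothetical minimal report with no issues.
sample_text = (
    '{"validator": "HXL Proxy", '
    '"timestamp": "2018-04-11T18:29:13.483789", '
    '"is_valid": true, "issues": []}'
)
top = read_top_level(sample_text)
```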

Stats object

| Property | Description | Type |
| --- | --- | --- |
| info | Number of info-level issues (advisory only). | int |
| warning | Number of warning-level issues (non-fatal errors). | int |
| error | Number of error-level issues (fatal errors). | int |
| total | Total number of issues (info + warning + error). | int |
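A client may want to sanity-check the Stats object before trusting it. The sketch below checks that total equals info + warning + error, and recomputes the per-severity counts from a list of issues. It assumes the counts tally issue locations (via location_count), which is what the example on this page suggests; both function names are hypothetical.

```python
LEVELS = ("info", "warning", "error")


def stats_are_consistent(stats):
    """Check that total equals info + warning + error."""
    return stats.get("total") == sum(stats.get(level, 0) for level in LEVELS)


def count_by_severity(issues):
    """Recompute a Stats-like dict from Issue objects.

    Assumption: each issue contributes its location_count to its severity.
    """
    counts = {level: 0 for level in LEVELS}
    for issue in issues:
        severity = issue.get("severity")
        if severity in counts:
            counts[severity] += issue.get("location_count", 0)
    counts["total"] = sum(counts[level] for level in LEVELS)
    return counts


example_stats = {"info": 0, "warning": 2, "error": 0, "total": 2}
consistent = stats_are_consistent(example_stats)
recomputed = count_by_severity([{"severity": "warning", "location_count": 2}])
```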

Issue object

| Property | Description | Type |
| --- | --- | --- |
| rule_id | Unique identifier for the rule used (if available). | text |
| tag_pattern | The HXL tag pattern the rule was searching (if available). | text (HXL tag pattern; see tag patterns) |
| description | Error message to display to the user. | text |
| severity | Severity level of the issue. | text ("info", "warning", or "error") |
| location_count | Number of locations where this error was found. | int |
| scope | Applicability range of the error: does it apply to a specific row, column, or cell, or to the dataset in general? | text ("dataset", "column", "row", or "cell") |
| locations | List of all locations where the issue occurred. | array of Location objects |

Location object

(*) At least one of the row, col, and hashtag properties must be present in each object.

| Property | Description | Type |
| --- | --- | --- |
| row (*) | 0-based number of the row where the issue occurred, counting from the first row after the HXL hashtags. Required for "row" and "cell" scope; optional otherwise. | int |
| source_row | 0-based number of the raw row in the source dataset, including headers and the hashtag row. | int |
| col (*) | 0-based number of the column where the issue occurred, counting from the left. Required for "column" and "cell" scope; optional otherwise. | int |
| hashtag (*) | HXL hashtag related to the issue (could match zero or multiple columns). Required for "dataset" scope; optional otherwise. | text (HXL hashtag + attributes) |
| error_value | The error value that appeared in the output, if available. | text |
| suggested_value | The value that the validation engine expected to find, if known. | text |
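The presence rules for row, col, and hashtag can be checked mechanically. The following Python is a minimal sketch of such a check, not part of the validation engine itself. It assumes that hashtag is the required property for the "dataset" scope value; the names `REQUIRED_BY_SCOPE` and `check_location` are invented for this example.

```python
# Which Location properties each scope requires.
# Assumption: "dataset" scope is the one that requires a hashtag.
REQUIRED_BY_SCOPE = {
    "row": ("row",),
    "column": ("col",),
    "cell": ("row", "col"),
    "dataset": ("hashtag",),
}


def check_location(scope, location):
    """Return a list of problems found in one Location object."""
    problems = []
    # Every Location needs at least one of the starred properties.
    if not any(key in location for key in ("row", "col", "hashtag")):
        problems.append("needs at least one of row, col, hashtag")
    # Scope-specific requirements.
    for key in REQUIRED_BY_SCOPE.get(scope, ()):
        if key not in location:
            problems.append(f"{key} is required for {scope} scope")
    return problems


ok = check_location("cell", {"row": 20, "col": 3, "hashtag": "#adm1+name+i_en"})
bad = check_location("cell", {"hashtag": "#adm1+name"})
```

An empty list means the Location satisfies the rules for the given scope.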

Example

{
    "validator": "HXL Proxy",
    "timestamp": "2018-04-11T18:29:13.483789",
    "data_url": "http://example.org/data.csv",
    "is_valid": false,
    "stats": {
        "info": 0,
        "warning": 2,
        "error": 0,
        "total": 2
    },
    "issues": [
        {
            "tag_pattern": "#adm1+name",
            "description": "Possible misspelling",
            "severity": "warning",
            "location_count": 2,
            "scope": "cell",
            "locations": [
                {
                    "row": 20,
                    "source_row": 22,
                    "col": 3,
                    "hashtag": "#adm1+name+i_en",
                    "error_value": "Kievv",
                    "suggested_value": "Kiev"
                },
                {
                    "row": 22,
                    "source_row": 24,
                    "col": 3,
                    "hashtag": "#adm1+name+i_en",
                    "error_value": "Ki ev",
                    "suggested_value": "Kiev"
                }
            ]
        }
    ]
}
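For data analytics and other non-interactive uses, it can help to flatten a report into one record per location. The Python below is a sketch of one way to do that using only the standard library. The function names `flatten` and `to_csv`, the chosen column order, and the trimmed `example_report` are assumptions made for this illustration.

```python
import csv
import io

FIELDS = ["severity", "description", "row", "col",
          "hashtag", "error_value", "suggested_value"]


def flatten(report):
    """Return one flat dict per (issue, location) pair."""
    rows = []
    for issue in report.get("issues", []):
        for loc in issue.get("locations", []):
            rows.append({
                "severity": issue.get("severity", ""),
                "description": issue.get("description", ""),
                "row": loc.get("row", ""),
                "col": loc.get("col", ""),
                "hashtag": loc.get("hashtag", ""),
                "error_value": loc.get("error_value", ""),
                "suggested_value": loc.get("suggested_value", ""),
            })
    return rows


def to_csv(rows):
    """Serialize the flat rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Trimmed copy of the example report, keeping only the properties used here.
example_report = {
    "issues": [{
        "description": "Possible misspelling",
        "severity": "warning",
        "locations": [
            {"row": 20, "col": 3, "hashtag": "#adm1+name+i_en",
             "error_value": "Kievv", "suggested_value": "Kiev"},
            {"row": 22, "col": 3, "hashtag": "#adm1+name+i_en",
             "error_value": "Ki ev", "suggested_value": "Kiev"},
        ],
    }]
}

flat_rows = flatten(example_report)
csv_text = to_csv(flat_rows)
```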