JSON recipes

(For coders)

As an alternative to using the web forms linked from the Filters article, advanced users can pass a full recipe to the HXL Proxy as a URL-encoded JSON specification via the &recipe HTTP parameter. A recipe is simply a JSON list of filters, like this:

[
    {
        "filter": "with_rows",
        "queries": "org=unicef"
    },
    { 
        "filter": "count",
        "tags": "#adm1+name"
    }
]

For the Process JSON spec service, you embed the list of filters as the value of the recipe property in a JSON processing spec:

{
    "input": "http://example.org/dataset.csv",
    "recipe": [
        {
            "filter": "with_rows",
            "queries": "org=unicef"
        },
        { 
            "filter": "count",
            "tags": "#adm1+name"
        }
    ]
}

In both cases, the HXL Proxy will apply the filters in the order you list them, using the output from one as the input to the next. The output of the last filter will be the result of the processing chain.

Filter properties

Every filter specification includes a property filter set to a short name for the filter (e.g. "count"). The other properties, if any, depend on the filter type.

Add column filter properties

Unlike the web form, the JSON version allows you to add multiple columns in a single operation, so the name is add_columns (plural).

{ "filter": "add_columns",
  "specs": ["Country#country+name=Ghana"] }

Property	Required	Description	Example
filter	yes	The filter type	"add_columns"
specs	yes	A specification or list of specifications in the format header:tagspec=fixed value	"Country#country+code=SYR". Note that fixed values may include substitution patterns inside double braces to pull values from other columns, e.g. "X-{{#country+code}}"
before	no	If true, place the new column at the left rather than right of existing columns	True

Append datasets filter properties

{ "filter": "append",
  "append_source": "http://example.org/data.csv" }

Property	Required	Description	Example
filter	yes	The filter type	always "append"
append_source	yes	The URL or JSON source (see JSON data sources) for the dataset to append	"http://example.org/data.csv"
add_columns	no	If true, include columns that are in the append_source but not the original (default False).	True
queries	no	One or more Row queries to filter rows added from the append_source	"date>2010"

Append datasets filter (external list) properties

{ "filter": "append_external_list",
  "source_list_url": "http://example.org/list-of-sources.csv" }

Property	Required	Description	Example
filter	yes	The filter type	always "append_external_list"
source_list_url	yes	The URL or JSON source (see JSON data sources) for the list of dataset URLs to append (in a column tagged #x_source)	"http://example.org/list-of-sources.csv"
add_columns	no	If true, include columns that are in the append_source but not the original (default False).	True
queries	no	One or more Row queries to filter rows added from the append_source	"date>2010"

Cache filter properties

(This filter caches the complete dataset for repeated processing. Not generally used inside the HXL proxy.)

{ "filter": "cache" }

Property	Required	Description	Example
filter	yes	The filter type	Always "cache"

Clean data filter properties

{ "filter": "clean_data",
  "date": "date+start" }

Property	Required	Description	Example
filter	yes	The filter type	Always "clean_data"
whitespace	no	Tag patterns for columns that will have whitespace normalised (list of strings, or single string with patterns comma-separated).	"org+name"
upper	no	Tag patterns for columns that will be converted to all upper case.	"adm1+code,adm2+code"
lower	no	Tag patterns for columns that will be converted to all lower case.	["sector+code", "subsector+code"]
date	no	Tag patterns for columns that will have dates normalised.	"date"
date_format	no	strftime format string for output dates.	"%Y-%m-%d" (ISO 8601 date, e.g. "2018-01-05").
number	no	Tag patterns for columns that will have numbers normalised.	"inneed,affected,population"
number_format	no	printf-style format string for output numbers. Do not include a leading "%"	"0.2f" (two decimal places, e.g. 3.00).
latlon	no	Tag patterns for columns that will have lat/lon normalised. The attributes +lat, +lon, and +coord will control the normalisation type.	"geo"
purge	no	If true, remove any dates, numbers, or lat/lon that can't be cleaned during parsing.	True
queries	no	One or more Row queries to filter rows that will be cleaned.	"sector=wash"

Count rows filter properties

{ "filter": "count",
  "patterns": "adm1+name,org",
  "aggregators": "sum(#targeted) as Total targeted#targeted#total",
  "queries": "sector=WASH'  }

Property	Required	Description	Example
filter	yes	The filter type	Always "count"
patterns	yes	Tag patterns for the rows to count (list, or single string with patterns comma-separated).	"adm1,org"
aggregators	no	A single string or list of strings describing Aggregators. If omitted, the filter will use the aggregator "count() as Count#meta+count" by default, simply counting unique combinations of patterns.	"sum(#targeted)"
queries	no	One or more Row queries to filter rows that will be counted (ignore any that don't match at least one query).	"sector=wash"

Deduplicate rows filter properties

{ "filter": "dedup" }

Property	Required	Description	Example
filter	yes	The filter type	Always "dedup"
patterns	no	Tag patterns for the columns to consider for deduplication (if omitted, consider all columns).	"org,country,adm1,sector"
queries	no	One or more Row queries to filter rows that will be ignored during deduplication.	"sector=wash"

Explode data filter properties

{ "filter": "explode" }

Property	Required	Description	Example
filter	yes	The filter type	Always "explode"
header_attribute	no	The attribute to add for the former header (instead of "+header").	"age"
value_attribute	no	The attribute to add for the former dataset value (instead of "+value").	"total"

Fill data filter properties

{ "filter": "fill",
  "pattern": "#affected",
  "queries": "sector=wash"}

Property	Required	Description	Example
filter	yes	The filter type	Always "fill"
pattern	no	A tag pattern (see Tag patterns) matching the column(s) that should be filled (if omitted, fill all).	"#affected"
queries	no	One or more Row queries to filter rows that will be ignored during filling.	"sector=wash"

Explode data filter properties

{ "filter": "implode",
  "label_pattern": "#date+year",
  "value_pattern": "#affected" }

Property	Required	Description	Example
filter	yes	The filter type	Always "implode"
label_pattern	yes	Tag pattern to match the column holding labels.	"#date+year"
value_pattern	yes	Tag pattern to match the column holding values.	"#affected"

JSONPath filter properties

{ "filter": "jsonpath",
  "path": "a.b[2]",
  "pattern": "#affected",
  "queries": "sector=wash"}

Property	Required	Description	Example
filter	yes	The filter type	Always "jsonpath"
path	yes	A JSONPath expression	"data.values[3]"
pattern	no	A tag pattern (see Tag patterns) matching the column(s) where the JSONPath should be applied (if omitted, apply to all).	"#affected"
queries	no	One or more Row queries to filter rows that will be ignored during JSONPath extraction.	"sector=wash"

Merge columns filter properties

{ "filter": "merge_data",
  "keys": "adm1+code",
  "tags": ["affected", "population"] }

Property	Required	Description	Example
filter	yes	The filter type	Always "merge_data"
merge_source	yes	The URL or JSON source (see JSON data sources) for the dataset to from which to merge extra columns.	"http://example.org/data.csv"
keys	yes	Tag patterns for the columns to use as shared keys between the two datasets.	"adm1+code,adm2+code"
tags	yes	Tag patterns matching the columns that will be copied over from the merge_source.	"affected,population"
replace	no	If True, attempt to replace existing columns in the source dataset, instead of adding new ones.	False
overwrite	no	If True (and replace is also True), overwrite fields in the source dataset even when there's already a value in them (otherwise, write only when the field is empty).	True
queries	no	One or more Row queries to filter rows that considered from the merge_source.	"sector=wash"

Rename column filter properties

Unlike the web form, the JSON version allows you to rename multiple columns in a single operation. As a result, it is called rename_columns (plural).

{ "filter": "rename_columns",
  "specs": "meta+count:Number affected#affected" }

Property	Required	Description	Example
filter	yes	The filter type	Always "rename_columns"
specs	yes	A single specification or list of rename specifications in the format #old_tagspec:New header#new_tagspec	"#meta+count:Number affected#affected"

Replace data filter properties

{ "filter": "replace_data",
  "original": "Oxfm",
  "replacement": "Oxfam",
  "pattern": "org+name" }

Property	Required	Description	Example
filter	yes	The filter type	Always "replace_data"
original	yes	The original string or regular expression to replace.	"Oxfm"
replacement	yes	The new string to substitute.	"Oxfam"
pattern	no	a tag pattern (see Tag patterns) for the column(s) where the replacement should take place. If omitted, replace in all columns.
use_regex	no	If True, interpret pattern as a regular expression.	False
queries	no	One or more Row queries to filter rows where replacement can take place.	"sector=wash"

Replace data (mapping table) filter properties

{ "filter": "replace_data_map",
  "map_source": "http://example.org/replacement-map.csv" }

Property	Required	Description	Example
filter	yes	The filter type	Always "replace_data_map"
map_source	yes	The URL or JSON source (see JSON data sources) for the dataset containing a replacement map (see Replacement maps).	"http://example.org/replacement-map.csv"
queries	no	One or more Row queries to filter rows where replacement can take place.	"sector=wash"

Sort rows filter properties

{ "filter": "sort",
  "tags": "date,adm1+name" }

Property	Required	Description	Example
filter	yes	The filter type	Always "sort"
tags	no	Tag patterns for the columns to use as sort keys.	"date,adm1+name"
reverse	no	If True, reverse the sort order (defaults to False).	True

With columns filter properties

{ "filter": "with_columns",
  "includes": "org,sector,adm1+name" }

Corresponds to the Cut columns filter with the includes parameter specified.

Property	Required	Description	Example
filter	yes	The filter type	Always "with_columns"
includes	yes	Tag patterns for the columns to include.	"org,sector,adm1+name"

With rows filter properties

{ "filter": "with_rows",
  "queries": ["org=unicef", "date+year>2010"] }

Corresponds to the Select rows filter without the "Invert query" box checked.

Property	Required	Description	Example
filter	yes	The filter type	Always "with_rows"
queries	yes	Row queries for the rows to include (list or single string).	["org=unicef", "date+year>2010"]

Without columns filter properties

{ "filter": "without_columns",
  "excludes": "contact+email" }

Corresponds to the Cut columns filter with the excludes parameter specified.

Property	Required	Description	Example
filter	yes	The filter type	Always "without_columns"
excludes	yes	Tag patterns for the columns to exclude.	"contact+email"
skip_untagged	no	also remove all columns without HXL hashtags.	True

Without rows filter properties

{ "filter": "without_rows",
  "queries": "status!=active" }

Corresponds to the Select rows filter with the "Invert query" box checked.

Property	Required	Description	Example
filter	yes	The filter type	Always "without_rows"
queries	yes	Row queries for the rows to exclude (list or single string).	"status!=active"

Learn more about the HXL standard at http://hxlstandard.org

Home
Use cases
Core workflow
- Source page
- Tagger page
- Recipe page
  - Filters
- View page
Special topics
For coders
Additional API services

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JSON recipes

Filter properties

Add column filter properties

Append datasets filter properties

Append datasets filter (external list) properties

Cache filter properties

Clean data filter properties

Count rows filter properties

Deduplicate rows filter properties

Explode data filter properties

Fill data filter properties

Explode data filter properties

JSONPath filter properties

Merge columns filter properties

Rename column filter properties

Replace data filter properties

Replace data (mapping table) filter properties

Sort rows filter properties

With columns filter properties

With rows filter properties

Without columns filter properties

Without rows filter properties

Clone this wiki locally