Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for extendable layer schema #377

Open
nyurik opened this issue Sep 30, 2021 · 13 comments
Open

Add support for extendable layer schema #377

nyurik opened this issue Sep 30, 2021 · 13 comments

Comments

@nyurik
Copy link
Member

nyurik commented Sep 30, 2021

As discussed in openmaptiles/openmaptiles#1252, we need a way for users to extend which fields to include with a layer without modifying the actual code. Instead, users will modify the main yaml file (i.e. openmaptiles.yaml) to specify needed changes.

tileset:
  layers:
    # ...
    - file: layers/transportation/transportation.yaml
      add_tags:
          maxspeed:
            description: Original value of [`maxspeed`](http://wiki.openstreetmap.org/wiki/Key:maxspeed) tag, if exists.
          maxheight:
            description: Original value of [`maxheight`](http://wiki.openstreetmap.org/wiki/Key:maxheight) tag, if exists.
          tracktype:
            description: Original value of [`tracktype`](http://wiki.openstreetmap.org/wiki/Key:tracktype) tag, if exists.
  # ...

The above would only work for layers that expose tags HSTORE field at the top SQL layer, and have a magic keyword as part of its SQL statement. Resulting layer modifications:

Please update the following sections with the exact specification, or add comments with proposals:

Imposm mapping file

TODO: what changes should tools do to the mapping file(s), and how should tools determine which file/table/key should be modified

Add field declarations

Layer specification is updated dynamically: new fields with their descriptions are added to the layer.fields map.
TODO: decide if we should support values param, and if so, how.
TODO: decide if we want to support updates to existing fields, for example adding more enum values to class. Is this needed at all? E.g. if we add a value to transportation class, it will not be straightforward to change which of them are included in z12

Update main SQL statement

TODO: We currently use %%FIELD_MAPPING: class %% for SQL modifications. We could use something similar here, e.g. introduce a new %%CUSTOM_TAGS%%? Replacement might be in the form NULLIF(tags->'maxspeed', '') AS "maxspeed",, or we may want to allow users to provide their own SQL.

@zstadler
Copy link
Contributor

zstadler commented Oct 1, 2021

The requirements from the tools would be derived from the new structure of defining a layer.

The structure could be based on the following principals. The transition of the layers to the new structure can be done on a layer-by-layer basis, once the tool support is available.

  • Since the tags column stores all OSM tag information, it will be used as the primary source for their values.
    • The SQL of many layers can be simplified. This simplification is not mandatory and may be performed on a layer-by-layer basis.
  • All entries in the tags column will be added to the vector tile.
  • The use of the columns entry in mapping.yaml files will be restricted.
    • In all layers:
      • name: geometry: type: geometry or type: validated_geometry
      • name: tags: type: hstore_tags
    • Other non-type: string columns, when needed. For example:
      • name: osm_id: type: id
      • name: z_order: type: wayzorder
      • name: area: type: area
      • name: mapping_key: type: mapping_key
      • name: <tag>: type: mapping_value
    • As an exception:
      • If and when specific OSM tags are needed for SQL efficiency

The following tool-related enhancements could help implementing the above:

  • A {layer_tags} macro that expands all entries in tags
    • This macro will be used in the query entry of <layer>.yaml files.
    • Its output will be similar to the output of {name_languages}.
    • It will include the name-related entries in tags, so {name_languages} need not be used with it.
  • A delete_language_tags(tags) SQL macro for removing name-related tags from tags
    • It will probably be used only in the SQL of the transportation and water layers where names are not included in the layers themselves, but are extracted from OSM to be included in the transportation_name and waterway layers, resp.
    • It will probably be implemented using DELETE(tags, ARRAY['name', 'int_name', 'loc_name', 'wikidata', 'wikipedia', 'name:<lang>', ...]) to efficiently compute tags - slice_language_tags(tags).

@zstadler
Copy link
Contributor

zstadler commented Oct 1, 2021

A further improvement over the direct use of a {layer_tags} macro would be that a missing query entry of a layer will be computed from the fields defined in openmaptiles.yaml, the <layer>.yaml file.

It would roughly be:

    query:  (SELECT geometry, {layer_non_tags}, {layer_tags} FROM layer_<layer>(!bbox!, z(!scale_denominator!))) AS t

Where {layer_non_tags} is a macro that expands to all fields, such as class and brunnel, that are defined in the <layer>.yaml file but are not covered by {layer_tags}.

To be discussed: Should such SQL-generated fields be stored as separate columns, as entries in the tags column, or either way with an indication included in their <layer>.yaml entry.

Note: The {layer_tags} macro will still be useful when the query entry must be explicitly provided.

@ZeLonewolf
Copy link
Contributor

One of the challenges I see is that there is a fundamental limit of imposm mappings - you can only specify a single tags block that applies for the ENTIRE mapping. Thus, if you decide that a tag should be added to the tags hstore for one layer, it will get added for every single layer. This was fine when we're only using the tags hstore for name and name:xx value, but it could cause unexpected results such as fragmentation or unwanted extra fields when adding a tag to a layer.

Now, in general, the information domain helps us here. If we add maxspeed because we want that for road rendering, it's unlikely to show up in landuse because maxspeed is specific to the transportation domain.

Given this limitation of imposm, we would need to introduce a computed column (probably in a separate table, accessed via JOIN over primary keys to ensure a proper update triggering sequence) in between the table that imposm creates, and the rest of the layer logic. The purpose of column would be to hold a minimized version of the tags object that contains only the tags that the layer actually cares about. The column could be computed by a function which is generated by openmaptiles-tools as determined by information in the layer yaml. The table would also need an update trigger to recompute the column each time a row changes.

This would allow us to build arbitrary tags objects that can get replicated up through the table sequence and into the layers that is minimized to contain just the data that the layer needs, and prevent object fragmentation, as described in
openmaptiles/openmaptiles#1252 (comment).

@ZeLonewolf
Copy link
Contributor

We should have the ability in openmaptiles.yaml to specify:

  1. That a layer is disabled. For optional layers, this is better than simply commenting out option lines because it enables automated testing of optional layers by changing the yaml at runtime.
  2. That a layer should have its tables/SQL generated but not produce tiles. This would allow for us to have "base" layers that can be extended without creating unwanted entries in the tiles. Currently, if I want to render ONLY the transportation_name layer, for example, I can't do that without also generating the transportation layer in the tiles since one depends on the other.

@zstadler
Copy link
Contributor

zstadler commented Oct 1, 2021

One of the challenges I see is that there is a fundamental limit of imposm mappings - you can only specify a single tags block that applies for the ENTIRE mapping. Thus, if you decide that a tag should be added to the tags hstore for one layer, it will get added for every single layer. This was fine when we're only using the tags hstore for name and name:xx value, but it could cause unexpected results such as fragmentation or unwanted extra fields when adding a tag to a layer.

It seems to be more than that. The tags column contains every value of every key used by any imposm table, not just keys in the include section!

For example, the height key is used as a column only by the building layer in tables that do not have a tags column. Nevertheless, some rows of the tags column of the osm_highway_linestring SQL table have height keys and values.

An hstore-type column in the SQL seems to be very handy for adding fields to OMT layers. On the other hand, we don't really need to use the imposm-generated tags column. Instead, we can use another hstore column, say omt_tags, that will only store the tags of the specific layer. The tools could create a layer-specific macro, similar to the class macro, that will be used to populate the omt_tags column from the imposm-generated table.

@ZeLonewolf
Copy link
Contributor

An hstore-type column in the SQL seems to be very handy for adding fields to OMT layers. On the other hand, we don't really need to use the imposm-generated tags column. Instead, we can use another hstore column, say omt_tags, that will only store the tags of the specific layer. The tools could create a layer-specific macro, similar to the class macro, that will be used to populate the omt_tags column from the imposm-generated table.

Hmm. So you want to map them as regular columns, pack them into an hstore, and then explode the hstore in the layer function? We could make that work.

@zstadler
Copy link
Contributor

zstadler commented Oct 1, 2021

Yes!

@zstadler
Copy link
Contributor

My only concern with this approach is that once the new syntax is available in openmaptiles.yaml it will be applicable to all layers. It implies that the tags/hstore mechanism will need to be implemented in all of them - not a trivial task. This includes layers like aeroway which are rarely updated or extended.

@nyurik, @ZeLonewolf,
Any ideas on how to minimize such a migration effort?

@ZeLonewolf
Copy link
Contributor

ZeLonewolf commented Oct 13, 2021

Any ideas on how to minimize such a migration effort?

I suppose that work just needs to get done, one layer at a time.

@nyurik
Copy link
Member Author

nyurik commented Oct 13, 2021

@zstadler we don't need to migrate anything at first -- only the layers that we want to support such functionality will implement it, and others should raise an error if a user tries to extend them.

@ZeLonewolf
Copy link
Contributor

transportation is probably the layer that is the most desired for extension so far. I'll take a stab at injecting the tags hstore in the layer tables.

@ZeLonewolf
Copy link
Contributor

One issue I've discovered is that tags that are injected into the tags hstore don't get imposm's type processing. If you have something mapped as a boolean in imposm, it automates the conversion of yes/no/true/false etc to a boolean. Same presumably applies to other types, such as int.

@ZeLonewolf
Copy link
Contributor

Okay, actually this is not a problem at all thanks to postgres's typing:
https://www.postgresql.org/docs/9.2/datatype-boolean.html

So SELECT 'yes'::boolean for example will return true. So this might work after all.

nyurik added a commit that referenced this issue Feb 3, 2022
* implements openmaptiles/openmaptiles#1345
* partially implements #377

### Layer file changes
* Added support for an optional `min_buffer_size` value, which is only used in case of a global override, but if set, it must be less than or equal to `buffer_size`.
* Layer's `buffer_size` is now optional if `min_buffer_size` is set.

```yaml
layer:
  id: housenumber
  # Only one of these is mandatory
  min_buffer_size: 1
  buffer_size: 8
```

### Tileset file changes
* Added support for an optional `overrides` section (similar to `defaults`).
  * Only supports optional `overrides.buffer_size` int value for now. If set, this value will override the `buffer_size` set in the layer.  If layer has `min_buffer_size`, the largest of the two will be used.
* Added support for per layer overrides as described in #377. Only supports `buffer_size` and `min_buffer_size`
* An environment variable `TILE_BUFFER_SIZE` can be used instead of `overrides.buffer_size`, and it takes precedence.

This example will load two layers. Mountain peak will be used as is, and house numbers's min buffer zoom is modified. The global override=0 will set mountain peaks buffer to 0 (because it has no min_buffer_size defined inside the layer file), but the house number buffer size will be 5.

```yaml
tileset:
  layers:
    - mountain_peak/mountain_peak.yaml
    - file: housenumber/housenumber.yaml
      min_buffer_size: 5
  overrides:
    buffer_size: 0
```

### Breaking
* Remove deprecated tileset and layer properties (they have been deprecated for many years, and are not used)
francois2metz pushed a commit to indoorequal/openmaptiles-tools that referenced this issue Feb 7, 2022
* implements openmaptiles/openmaptiles#1345
* partially implements openmaptiles#377

### Layer file changes
* Added support for an optional `min_buffer_size` value, which is only used in case of a global override, but if set, it must be less than or equal to `buffer_size`.
* Layer's `buffer_size` is now optional if `min_buffer_size` is set.

```yaml
layer:
  id: housenumber
  # Only one of these is mandatory
  min_buffer_size: 1
  buffer_size: 8
```

### Tileset file changes
* Added support for an optional `overrides` section (similar to `defaults`).
  * Only supports optional `overrides.buffer_size` int value for now. If set, this value will override the `buffer_size` set in the layer.  If layer has `min_buffer_size`, the largest of the two will be used.
* Added support for per layer overrides as described in openmaptiles#377. Only supports `buffer_size` and `min_buffer_size`
* An environment variable `TILE_BUFFER_SIZE` can be used instead of `overrides.buffer_size`, and it takes precedence.

This example will load two layers. Mountain peak will be used as is, and house numbers's min buffer zoom is modified. The global override=0 will set mountain peaks buffer to 0 (because it has no min_buffer_size defined inside the layer file), but the house number buffer size will be 5.

```yaml
tileset:
  layers:
    - mountain_peak/mountain_peak.yaml
    - file: housenumber/housenumber.yaml
      min_buffer_size: 5
  overrides:
    buffer_size: 0
```

### Breaking
* Remove deprecated tileset and layer properties (they have been deprecated for many years, and are not used)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants