Use tags = NULL in middle tables if object doesn't have any tags #2099
This doesn't make much of a difference for the ways and rels tables, but if we store all nodes in the database it makes a huge difference, because most nodes don't have any tags. For a current planet, disk usage for the nodes table goes from 476 GB down to 409 GB, saving 67 GB or about 14%.
Additionally, it makes using these tables simpler. If you want to run any queries on tags, you need an index on the tags column of the nodes/ways/rels tables, like this:
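To see how much this matters for a particular database, something like the following could be run before the change. It is only a sketch under assumptions not stated above: that the nodes middle table is called planet_osm_nodes (the default prefix) and that objects without tags are still stored with an empty JSON object.

```sql
-- Assumption: the nodes middle table is named planet_osm_nodes and
-- stores tags as jsonb, using '{}' for objects without tags.
SELECT count(*) FILTER (WHERE tags = '{}'::jsonb) AS without_tags,
       count(*)                                   AS total
  FROM planet_osm_nodes;

-- Total on-disk size of the table, including its indexes and TOAST data.
SELECT pg_size_pretty(pg_total_relation_size('planet_osm_nodes'));
```

On a full planet the first query scans the whole table, so it can take a while.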
CREATE INDEX ON planet_osm_ways USING gin (tags);
But that is wasteful, because the index then also covers all the rows with empty tags. We probably want to create it as a partial index instead:
CREATE INDEX ON planet_osm_ways USING gin (tags) WHERE tags != '{}'::jsonb;
But now all queries on those tables have to include that extra condition, otherwise the query planner will not use the index:
SELECT * FROM planet_osm_ways WHERE tags ? 'highway' AND tags != '{}'::jsonb;
If we use NULLs, the index can be created as:
CREATE INDEX ON planet_osm_ways USING gin (tags) WHERE tags IS NOT NULL;
And now the query becomes simpler, because the query planner takes the IS NOT NULL condition into account automatically (a NULL tags value can never match the ? operator anyway):
SELECT * FROM planet_osm_ways WHERE tags ? 'highway';
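Whether the planner really picks the partial index can be checked with EXPLAIN. This is a sketch, assuming the `WHERE tags IS NOT NULL` index from above exists. The actual plan depends on table statistics and settings, so no expected output is shown.

```sql
-- Look for a Bitmap Index Scan on the partial GIN index in the plan
-- instead of a sequential scan over the whole table.
EXPLAIN
SELECT * FROM planet_osm_ways WHERE tags ? 'highway';

-- The same should apply to containment queries, which the default
-- jsonb GIN operator class also supports.
EXPLAIN
SELECT * FROM planet_osm_ways WHERE tags @> '{"highway": "primary"}'::jsonb;
```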
Note that this is an incompatible change to the new-format middle tables, but they are still marked as experimental, so we can do this.
This PR also contains a second commit that future-proofs the members list of the rels middle table in case we want to make a similar change to that column later.
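For anybody reading these tables directly, the practical consequence is that the tags column can now be NULL. A query that needs the old behaviour could wrap the column like this. It is a sketch only: the id column name and the way id 1234 are assumptions for illustration.

```sql
-- Assumption: the ways table has an id column; 1234 is a made-up way id.
-- COALESCE turns a NULL tags value back into an empty JSON object.
SELECT id, COALESCE(tags, '{}'::jsonb) AS tags
  FROM planet_osm_ways
 WHERE id = 1234;
```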