Commit
0 parents
commit 2373624
Showing 39 changed files with 6,520 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
/node_modules |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,120 @@ | ||
# Import regions from OpenStreetMap | ||
|
||
The total process takes about 1 hour. | ||
|
||
## Prerequisites | ||
|
||
### Land polygons | ||
|
||
Download "WGS 84, Large polygons not split" from osmdata, unpack and store the shapefile in | ||
|
||
- `00-static-data/land-polygons-complete-4326` | ||
|
||
Direct link: https://osmdata.openstreetmap.de/download/land-polygons-complete-4326.zip | ||
|
||
Landing page: https://osmdata.openstreetmap.de/data/land-polygons.html | ||
|
||
You may want to open the file in mapshaper and check that the polygons do not self-intersect, because clipping with self-intersecting polygons will corrupt data. For example, clipping with these polygons removes most of Japan from the output: | ||
|
||
<img src="screenshot_mapshaper.png" alt="Screenshot" width="640"> | ||
|
||
You can also use | ||
[this snapshot of land polygons](https://nzz-q-assets-stage.s3.amazonaws.com/q-locator-map/land-polygons-complete-4326_2019-11-18.zip) | ||
with correct polygons. | ||
|
||
### Natural Earth | ||
|
||
Natural Earth data (1:10m Cultural Vectors) is used for zoom levels 0 to 4 for compatibility with OpenMapTiles. | ||
|
||
Download "countries" and "states and provinces", then unpack and store the shapefiles in | ||
|
||
- `00-static-data/ne_10m_admin_0_countries` | ||
- `00-static-data/ne_10m_admin_1_states_provinces` | ||
|
||
Direct links: | ||
|
||
- https://naciscdn.org/naturalearth/10m/cultural/ne_10m_admin_0_countries.zip | ||
- https://naciscdn.org/naturalearth/10m/cultural/ne_10m_admin_1_states_provinces.zip | ||
|
||
Landing page: https://www.naturalearthdata.com/downloads/10m-cultural-vectors/ | ||
|
||
## Steps | ||
|
||
Run this script to execute all steps listed below: | ||
|
||
```bash | ||
import-osm/import-osm.sh | ||
``` | ||
|
||
#### 1. Query list of countries (Overpass) | ||
|
||
Input: Nothing. | ||
Output: List of countries with ISO3166-1 codes. | ||
|
||
#### 2. Query regions by country (Overpass) | ||
|
||
Input: List of countries with ISO3166-1 codes. | ||
Output: For every country, one GeoJSON file with country and subdivision polygons. | ||
|
||
The raw Overpass response is also stored. Data is downloaded only if no raw data is available yet. | ||
|
||
#### 3. Clip with land polygons | ||
|
||
Input: For every country, one GeoJSON file with country and subdivision polygons. | ||
Output: For every country, one GeoJSON file with country and subdivision polygons. | ||
|
||
#### 4. Reduce regions (remove small disconnected parts, e.g. remove French Guiana from France) | ||
|
||
Input: For every country, one GeoJSON file with country and subdivision polygons. | ||
Output: For every country, one GeoJSON file with country and subdivision polygons. | ||
|
||
#### 5. Split by region | ||
|
||
Input: For every country, one GeoJSON file with country and subdivision polygons. | ||
Output: For every region, one GeoJSON file. | ||
|
||
#### 6. Simplify regions | ||
|
||
Input: For every region, one GeoJSON file. | ||
Output: For every region, one GeoJSON file. | ||
|
||
#### 7. Merge regions | ||
|
||
Input: For every country, one GeoJSON file with country and subdivision polygons. | ||
Output: One GeoJSON file with all countries, one GeoJSON file with all subdivisions. | ||
|
||
#### 8. Generate vector tiles | ||
|
||
Input: One GeoJSON file with all countries, one GeoJSON file with all subdivisions. | ||
Output: mbtiles file with 2 layers (countries, subdivisions). | ||
|
||
#### 9. Convert Natural Earth data to GeoJSON | ||
|
||
Input: Shapefiles with countries and states/provinces. | ||
Output: GeoJSON files with countries and states/provinces. | ||
|
||
#### 10. Generate vector tiles (Natural Earth) | ||
|
||
Input: GeoJSON files with countries and states/provinces. | ||
Output: mbtiles file with 2 layers (countries, subdivisions). | ||
|
||
#### 11. Join tiles | ||
|
||
Input: mbtiles files from steps 8/10. | ||
Output: mbtiles file with 2 layers (countries, subdivisions), using Natural Earth data for zoom levels 0-4 and OpenStreetMap data for zoom levels 5-10. | ||
|
||
### Clean up | ||
|
||
Run this to remove all `output` folders: | ||
|
||
```bash | ||
import-osm/remove-outputs.sh | ||
``` | ||
|
||
# Preview vector tiles | ||
|
||
Run this script to preview the vector tiles generated in step 11. | ||
|
||
```bash | ||
import-osm/preview-tiles.sh | ||
``` |
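Step 11 combines the two tile sets by zoom level. As a rough illustration only (the join script itself is not shown in this part of the diff), the rule stated in the README can be written as a small lookup. The function name `tileSourceForZoom` and the source labels are hypothetical.

```javascript
// Hypothetical helper: which data set backs a given zoom level,
// following the README (Natural Earth for 0-4, OpenStreetMap for 5-10).
function tileSourceForZoom(zoom) {
  if (!Number.isInteger(zoom) || zoom < 0 || zoom > 10) {
    throw new RangeError(`Zoom level out of range: ${zoom}`);
  }
  return zoom <= 4 ? "natural-earth" : "openstreetmap";
}

console.log([0, 4, 5, 10].map(tileSourceForZoom));
```

Zoom levels outside 0-10 are rejected because the README describes no tiles for them.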
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
* | ||
!.gitignore |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
/output |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
#!/bin/bash | ||
set -o errexit | ||
set -o nounset | ||
|
||
step_root=$(dirname "$0") | ||
output_dir="$step_root/output" | ||
|
||
mkdir -p "$output_dir" | ||
|
||
curl https://overpass-api.de/api/interpreter \ | ||
--compressed \ | ||
--data 'data=[out:csv(::"id", "ISO3166-1", wikidata, name, "name:de", "name:en")]; relation[boundary=administrative][admin_level=2]["ISO3166-1"]; out;' \ | ||
| npx tsv2json \ | ||
| npx prettier \ | ||
--parser json \ | ||
> "$output_dir/countries.json" |
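The script above pipes Overpass CSV output (tab-separated by default, with a header line) into a converter that emits JSON. A minimal sketch of that conversion follows, assuming plain tab-separated text with one header row. It is not the implementation of the `tsv2json` package; `tsvToObjects` and the sample row are made up for illustration.

```javascript
// Assumption: tab-separated text whose first non-empty line holds the column names.
function tsvToObjects(text) {
  const [headerLine, ...rows] = text.split("\n").filter(line => line !== "");
  if (headerLine === undefined) {
    return []; // no header, nothing to convert
  }
  const columns = headerLine.split("\t");
  return rows.map(row => {
    const cells = row.split("\t");
    const record = {};
    columns.forEach((column, index) => {
      // Missing trailing cells become empty strings
      record[column] = cells[index] === undefined ? "" : cells[index];
    });
    return record;
  });
}

// Made-up sample data, shaped like the columns requested in the query
const sample = "id\tISO3166-1\tname\n1\tCH\tSchweiz\n";
console.log(tsvToObjects(sample));
```

Every value stays a string in this sketch, so later steps that need numbers would have to convert them explicitly.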
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
/output |
import-osm/02-query-regions/query-regions-by-country.js (171 additions, 0 deletions)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,171 @@ | ||
const fs = require("fs"); | ||
const { geoBounds } = require("d3-geo"); | ||
const queryOverpassWithCallback = require("query-overpass"); | ||
const turf = require("@turf/turf"); | ||
|
||
async function queryRegionsByCountry(countryCode, overpassResult) { | ||
const query = ` | ||
[out:json]; | ||
( | ||
relation | ||
[boundary=administrative] | ||
["ISO3166-1"="${countryCode}"]; | ||
relation | ||
[boundary=administrative] | ||
["ISO3166-2"~"^${countryCode}"]; | ||
); | ||
out; >; out skel;`; | ||
const keepTags = [ | ||
"ISO3166-1", | ||
"ISO3166-2", | ||
"admin_level", | ||
"wikidata", | ||
"name", | ||
"name:de", | ||
"name:en" | ||
]; | ||
|
||
if (overpassResult) { | ||
console.log("Reuse existing data"); | ||
} else { | ||
overpassResult = await queryOverpass(query); | ||
} | ||
|
||
const geojson = await parseOverpassResult( | ||
overpassResult, | ||
keepTags, | ||
countryCode | ||
); | ||
|
||
return { | ||
geojson, | ||
rawData: overpassResult | ||
}; | ||
} | ||
|
||
function queryOverpass(query) { | ||
return new Promise((resolve, reject) => { | ||
const runQuery = () => { | ||
queryOverpassWithCallback(query, (error, data) => { | ||
if (error) { | ||
if (error.statusCode === 429) { | ||
console.log("Too many requests, will retry in 30 seconds..."); | ||
sleep(30).then(runQuery); | ||
} else if (error.statusCode === 504) { | ||
console.log("Gateway timeout, will retry in 30 seconds..."); | ||
sleep(30).then(runQuery); | ||
} else { | ||
// Reject instead of throwing: an exception thrown inside this callback would not reach the returned promise. | ||
reject(error); | ||
} | ||
} else { | ||
resolve(data); | ||
} | ||
}); | ||
}; | ||
runQuery(); | ||
}); | ||
} | ||
|
||
function sleep(seconds) { | ||
return new Promise(resolve => setTimeout(resolve, seconds * 1000)); | ||
} | ||
|
||
async function parseOverpassResult(overpassResult, keepTags, countryCode) { | ||
const geojson = turf.clone(overpassResult); | ||
|
||
// Add bounding box | ||
geojson.bbox = getBbox(geojson); | ||
|
||
// Keep only Polygon and MultiPolygon features | ||
geojson.features = geojson.features.filter(feature => { | ||
const { type } = feature.geometry; | ||
return type === "Polygon" || type === "MultiPolygon"; | ||
}); | ||
|
||
geojson.features.forEach(feature => { | ||
// Keep only a subset of tags | ||
const { tags } = feature.properties; | ||
const properties = {}; | ||
keepTags.forEach(keepTag => { | ||
if (tags[keepTag] === undefined) { | ||
properties[keepTag] = null; | ||
} else { | ||
properties[keepTag] = tags[keepTag]; | ||
} | ||
}); | ||
|
||
// Add OSM relation id as property and remove feature id | ||
properties.osmRelationId = parseInt(feature.id.split("/")[1], 10); | ||
delete feature.id; | ||
|
||
// Set type to country / subdivision | ||
// --- | ||
// Some regions (usually "dependent territories") are both countries and subdivisions and also | ||
// have a separate ISO3166-1 country code, in addition to the ISO3166-2 subdivision code. | ||
// For example American Samoa has these codes: | ||
// ISO3166-1: AS, ISO3166-2: US-AS | ||
// These regions will be labeled as "subdivision" here. | ||
// See https://en.wikipedia.org/wiki/ISO_3166-2#Subdivisions_included_in_ISO_3166-1 | ||
if (properties["ISO3166-1"] === countryCode) { | ||
properties.type = "country"; | ||
} else if (properties["ISO3166-2"]) { | ||
properties.type = "subdivision"; | ||
properties["ISO3166-1"] = countryCode; | ||
} | ||
|
||
feature.properties = properties; | ||
}); | ||
|
||
// Remove duplicate wikidata entries | ||
const featuresByWikidata = {}; | ||
geojson.features.forEach(feature => { | ||
const { wikidata } = feature.properties; | ||
if (wikidata) { | ||
if (!featuresByWikidata[wikidata]) { | ||
featuresByWikidata[wikidata] = []; | ||
} | ||
featuresByWikidata[wikidata].push(feature); | ||
} else { | ||
console.warn( | ||
"Discarded feature without wikidata tag", | ||
JSON.stringify(feature.properties) | ||
); | ||
} | ||
}); | ||
geojson.features = []; | ||
Object.values(featuresByWikidata).forEach(features => { | ||
features.sort( | ||
(a, b) => a.properties.admin_level - b.properties.admin_level | ||
); | ||
geojson.features.push(features[0]); | ||
if (features.length > 1) { | ||
console.log( | ||
`Discarded ${features.length - 1} features with duplicate wikidata tags` | ||
); | ||
} | ||
}); | ||
|
||
return geojson; | ||
} | ||
|
||
function getBbox(geojson) { | ||
// D3 requires the opposite of the standard (RFC 7946) GeoJSON winding order: | ||
// The exterior ring for polygons must be clockwise. | ||
const geojsonClockwise = turf.rewind(geojson, { | ||
reverse: true | ||
}); | ||
// D3 instead of turf is used to get a correct bounding box for countries that | ||
// cross the antimeridian (180° east/west), for example Russia, United States | ||
return geoBounds(geojsonClockwise).flat(); | ||
} | ||
|
||
if (require.main === module) { | ||
queryRegionsByCountry("CH").then(({ geojson }) => { | ||
fs.writeFileSync("CH-regions.json", JSON.stringify(geojson)); | ||
}); | ||
} | ||
|
||
module.exports = queryRegionsByCountry; |
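The comment in `getBbox` notes that a plain min/max bounding box goes wrong for countries crossing the antimeridian. The sketch below illustrates the problem on bare longitude values. It is a simplification and not what `d3-geo` does internally (`geoBounds` works on spherical geometry); `naiveLonExtent`, `wrappedLonExtent`, and the sample points are hypothetical.

```javascript
// Naive extent: smallest and largest longitude, ignoring the wrap at 180°.
function naiveLonExtent(lons) {
  return [Math.min(...lons), Math.max(...lons)];
}

// Wrapped extent: leave out the largest empty gap around the globe.
// Returns [west, east]; west > east means the extent crosses the antimeridian.
function wrappedLonExtent(lons) {
  const sorted = [...lons].sort((a, b) => a - b);
  let west = sorted[0];
  let east = sorted[sorted.length - 1];
  // Gap that spans the antimeridian, from the easternmost to the westernmost point
  let largestGap = sorted[0] + 360 - sorted[sorted.length - 1];
  for (let i = 1; i < sorted.length; i++) {
    const gap = sorted[i] - sorted[i - 1];
    if (gap > largestGap) {
      largestGap = gap;
      west = sorted[i];
      east = sorted[i - 1];
    }
  }
  return [west, east];
}

const lons = [170, 175, -175, -170]; // made-up points on both sides of 180°
console.log(naiveLonExtent(lons)); // spans almost the whole globe
console.log(wrappedLonExtent(lons)); // narrow extent across the antimeridian
```

A west value greater than the east value matches the convention `geoBounds` uses for features that cross the antimeridian.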
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
const queryRegionsByCountry = require("./query-regions-by-country"); | ||
const fs = require("fs"); | ||
const path = require("path"); | ||
|
||
async function queryRegions(countriesFile, outputDir) { | ||
const countries = JSON.parse(fs.readFileSync(countriesFile)); | ||
const allRegions = new Set(); | ||
for (const country of countries) { | ||
const countryCode = country["ISO3166-1"]; | ||
console.log(`Querying regions for country ${countryCode}...`); | ||
|
||
const rawDataPath = path.join(outputDir, "raw", `${countryCode}.json`); | ||
let oldRawData; | ||
if (fs.existsSync(rawDataPath)) { | ||
oldRawData = JSON.parse(fs.readFileSync(rawDataPath)); | ||
} | ||
|
||
const { geojson, rawData } = await queryRegionsByCountry( | ||
countryCode, | ||
oldRawData | ||
); | ||
if (!oldRawData) { | ||
fs.writeFileSync(rawDataPath, JSON.stringify(rawData)); | ||
} | ||
const outputFile = path.join(outputDir, `${countryCode}.json`); | ||
fs.writeFileSync(outputFile, JSON.stringify(geojson)); | ||
|
||
geojson.features.forEach(({ properties: { wikidata } }) => { | ||
allRegions.add(wikidata); | ||
}); | ||
} | ||
const listFile = path.join(outputDir, `list/list.json`); | ||
fs.writeFileSync(listFile, JSON.stringify(Array.from(allRegions).sort())); | ||
} | ||
|
||
const [countriesFile, outputDir] = process.argv.slice(2); | ||
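queryRegions(countriesFile, outputDir).catch(error => { | ||
// Exit with a non-zero status so the calling shell script (errexit) stops on failure. | ||
console.error(error); | ||
process.exit(1); | ||
}); |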
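The loop above reuses raw Overpass data from disk when it exists and only queries the API otherwise. That pattern can be isolated into a small helper, sketched here under assumptions: `loadOrFetch`, `demo`, and `fakeFetch` are hypothetical names, and the fake fetch function stands in for the real Overpass call.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: return cached JSON if the file exists,
// otherwise call fetchData() once and store its result.
async function loadOrFetch(cachePath, fetchData) {
  if (fs.existsSync(cachePath)) {
    const data = JSON.parse(fs.readFileSync(cachePath, "utf8"));
    return { data, cached: true };
  }
  const data = await fetchData();
  fs.mkdirSync(path.dirname(cachePath), { recursive: true });
  fs.writeFileSync(cachePath, JSON.stringify(data));
  return { data, cached: false };
}

async function demo() {
  // Temporary directory so the sketch does not touch real output folders
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "raw-cache-"));
  const cachePath = path.join(dir, "raw", "CH.json");
  const fakeFetch = async () => ({ elements: [] }); // stands in for the Overpass call
  const first = await loadOrFetch(cachePath, fakeFetch);
  const second = await loadOrFetch(cachePath, fakeFetch);
  return [first.cached, second.cached];
}

demo().then(result => console.log(result));
```

The first call fetches and writes the file, and the second call reads it back, which is why a long import can be restarted without repeating finished queries.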
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
#!/bin/bash | ||
set -o errexit | ||
set -o nounset | ||
|
||
step_root=$(dirname "$0") | ||
countries_file="$step_root/../01-list-countries/output/countries.json" | ||
output_dir="$step_root/output" | ||
|
||
mkdir -p "$output_dir" | ||
mkdir -p "$output_dir/raw" | ||
mkdir -p "$output_dir/list" | ||
|
||
node "$step_root/query-regions.js" "$countries_file" "$output_dir" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
/output |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
#!/bin/bash | ||
set -o errexit | ||
set -o nounset | ||
|
||
step_root=$(dirname "$0") | ||
input_dir="$step_root/../02-query-regions/output" | ||
output_dir="$step_root/output" | ||
|
||
land_polygons="$step_root/../00-static-data/land-polygons-complete-4326/land_polygons.shp" | ||
|
||
mkdir -p "$output_dir" | ||
|
||
# Split by admin_level before clipping to work around a bug (?) with overlapping geometries. | ||
npx mapshaper \ | ||
-i "$input_dir"/*.json combine-files no-topology -merge-layers \ | ||
-split admin_level \ | ||
-clip "$land_polygons" \ | ||
-merge-layers \ | ||
-split ISO3166-1 \ | ||
-o format=geojson "$output_dir" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
/output |