Parser issue for some features, in particular communes #22

Open
ThomasG77 opened this issue Nov 9, 2021 · 6 comments

ThomasG77 commented Nov 9, 2021

Some communes are not parsed correctly (at least the commune polygon). We confirmed this with another parser, GDAL with its EDIGEO driver.

Steps to reproduce the issue: https://gist.github.com/ThomasG77/f75f50356d50b9e428dc01c076f6574a

Only 49 communes are affected, but other layer types, such as TLINE, are affected too.

We probably need to combine the approaches of the current parser and the GDAL EDIGEO driver, since the current parser was written to work around some GDAL limitations: https://blog.geo.data.gouv.fr/cadastre-millesime-janvier-2018-nouveautes-perspectives-a657d471a178

See also https://github.com/DoFabien/edigeoToGeojson


ThomasG77 commented May 11, 2022

Solved for communes with some GDAL post-processing (https://twitter.com/datagouvfr/status/1521067883022979072) instead of touching the parser.

Sections have the same issue, which is currently unsolved.

It affects the DVF application, as there is an empty section: https://www.data.gouv.fr/fr/datasets/demandes-de-valeurs-foncieres-geolocalisees/#discussion-627b860b8ac61b099fc46a86
As a result, the "Section Cadastrale" dropdown list does not offer the section (it is not available), and the display does not show the section either.

Found by comparing the current GeoJSON output https://cadastre.data.gouv.fr/data/etalab-cadastre/2022-04-01/geojson/communes/80/80695/ with the output obtained from https://cadastre.data.gouv.fr/data/dgfip-pci-vecteur/2022-04-01/edigeo/feuilles/80/80695/ after running GDAL on the THF file within edigeo-806950000D01.tar.bz2 with the command ogr2ogr -f GeoJSON section-d.geojson -dialect SQLite -sql "SELECT * FROM SECTION_id" -lco RFC7946=YES E0000D01.THF

Recipe

wget https://cadastre.data.gouv.fr/data/dgfip-pci-vecteur/2022-04-01/edigeo/feuilles/80/80695/edigeo-806950000D01.tar.bz2
unp edigeo-806950000D01.tar.bz2
ogr2ogr -f GeoJSON section-d.geojson -dialect SQLite -sql "SELECT * FROM SECTION_id" -lco RFC7946=YES E0000D01.THF

We can find the issues by parsing the output logs of edigeo-parser with paste <(cut -c1-5 nohup.out) <(cut -c14- nohup.out) | grep SECTION | sort | uniq

We received feedback about Saint-Just-Luzac (INSEE 17351), where we have the same issue.

nohup.out is the file produced by running the following processing in the background: https://github.com/etalab/cadastre#extraction-des-donn%C3%A9es-du-pci-vecteur-et-production-des-fichiers-communaux
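The log-filtering one-liner above can also be sketched in Python, which may be easier to extend. This is only a rough equivalent: it assumes each log line starts with a 12-character sheet identifier whose first 5 characters are the commune INSEE code, followed by a colon, as in the log lines quoted in this thread. The function name and the sample list are illustrative, not part of edigeo-parser.

```python
def distinct_section_errors(lines):
    """Rough equivalent of:
    paste <(cut -c1-5 f) <(cut -c14- f) | grep SECTION | sort | uniq
    """
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if "SECTION" not in line:
            continue
        commune = line[:5]    # characters 1-5: commune INSEE code
        message = line[13:]   # character 14 onward: object and error message
        seen.add((commune, message))
    return sorted(seen)


# Sample lines copied from the logs quoted in this issue.
SAMPLE_LOG = [
    "571510002201:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)",
    "571510002202:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)",
    "08339000ZV01:Objet_567765(SECTION) => geometry ignored (has-crossing-holes)",
    "010100000A01:Objet_2512020(TLINE) => geometry ignored (Too many linked arcs to build a single LineString)",
]

for commune, message in distinct_section_errors(SAMPLE_LOG):
    print(commune, message, sep="\t")
```

Because the sheet suffix is dropped, the same object reported on two sheets of one commune collapses into a single entry, which is what the `uniq` step does in the shell version.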


ThomasG77 commented Jun 8, 2022

To fix sections:

  • Identify a commune with section issues
  • Get its area
  • Get its sections
  • Merge the sections
  • Compute the area difference to get an idea of the size of the issue
  • Compute the geometric difference and get all parcelles within it
  • Get the section letter from the parcelles within the difference and check whether there is only one.
    • If so, fill in the missing section
    • If there is more than one, more work is needed to work out how to solve it.
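The two numeric steps of this plan (the area difference and the single-letter check) can be sketched as follows. This is a simplified illustration under strong assumptions: polygons are simple rings without holes in planar coordinates, sections do not overlap each other, and the list of section letters for the parcelles inside the gap is supplied by the caller. A real implementation would rely on a geometry library for the difference and the containment test. All function names here are hypothetical.

```python
def ring_area(ring):
    """Shoelace area of a simple ring given as [(x, y), ...] in planar units."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def missing_area(commune_ring, section_rings):
    """Commune area minus the summed section areas (assumes no overlaps)."""
    return ring_area(commune_ring) - sum(ring_area(r) for r in section_rings)


def deduce_missing_section(parcel_sections):
    """Return the section letter if all parcelles in the gap agree, else None."""
    distinct = set(parcel_sections)
    if len(distinct) == 1:
        return next(iter(distinct))
    return None  # empty or ambiguous: needs manual work


# Toy data: a 2 x 2 commune where only the left half is covered by section "A".
commune = [(0, 0), (2, 0), (2, 2), (0, 2)]
section_a = [(0, 0), (1, 0), (1, 2), (0, 2)]

print(missing_area(commune, [section_a]))
print(deduce_missing_section(["B", "B", "B"]))
print(deduce_missing_section(["B", "C"]))
```

A non-zero missing area flags the commune for inspection, and the letter check covers only the easy case where every parcelle in the gap belongs to the same section.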


ThomasG77 commented Sep 6, 2022

To fix the issues, I've chosen to work through them by error type. At the moment, sections show 6 types of errors:

  1. has-crossing-holes
  2. has-exterior-holes
  3. has-self-intersection
  4. ring-has-duplicate-vertices
  5. (The input polygon may not have duplicate vertices (except for the first and last vertex of each ring)). This seems to be a side effect of a turf check: the message does not come from the core code of the lib but from the turf code.
  6. (Unable to build valid polygon coordinates)

For has-exterior-holes, I've mostly solved it in branch https://github.com/etalab/edigeo-parser/tree/fix-section-reading-1, but some cases still seem not to be fixed. I then use the cadastre branch https://github.com/etalab/cadastre/tree/update-pkg to check in production whether the fix is effective.

Matching test cases (the numbers refer to the list above)

# 1
08339000ZV01:Objet_567765(SECTION) => geometry ignored (has-crossing-holes)
15231000ZB01:Objet_1442135(SECTION) => geometry ignored (has-crossing-holes)

# 2
571510002201:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)
571510002202:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)

# 3
52432111ZK01:Objet_675114(SECTION) => geometry ignored (ring-has-duplicate-vertices, has-self-intersection)
571510002201:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)

# 4
52432111ZK01:Objet_675114(SECTION) => geometry ignored (ring-has-duplicate-vertices, has-self-intersection)
577320003301:Objet_1479958(SECTION) => geometry ignored (ring-has-duplicate-vertices, has-self-intersection)

# 5
274485100B01:Objet_463563(SECTION) => geometry ignored (The input polygon may not have duplicate vertices (except for the first and last vertex of each ring))

# 6
395110000U01:Objet_1314984(SECTION) => geometry ignored (Unable to build valid polygon coordinates)
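Grouping log lines by error type, as done by hand above, could be automated along these lines. The regular expression is an assumption based only on the log lines quoted in this thread (12-character sheet identifier, object name, layer in parentheses, then a comma-separated list of reasons); it is not taken from the edigeo-parser source.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the samples in this issue.
LINE_RE = re.compile(
    r"^(?P<sheet>\w{12}):(?P<obj>Objet_\d+)\((?P<layer>\w+)\)"
    r" => geometry ignored \((?P<reasons>.*)\)$"
)


def error_kinds(line):
    """Return the list of error types reported on one log line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return []
    # The greedy group keeps nested parentheses inside a single message.
    return [reason.strip() for reason in match.group("reasons").split(", ")]


def tally(lines):
    """Count how often each error type appears across the given lines."""
    counts = Counter()
    for line in lines:
        counts.update(error_kinds(line))
    return counts


SAMPLE = [
    "08339000ZV01:Objet_567765(SECTION) => geometry ignored (has-crossing-holes)",
    "571510002201:Objet_649933(SECTION) => geometry ignored (has-exterior-holes, has-self-intersection)",
    "52432111ZK01:Objet_675114(SECTION) => geometry ignored (ring-has-duplicate-vertices, has-self-intersection)",
    "274485100B01:Objet_463563(SECTION) => geometry ignored (The input polygon may not have duplicate vertices (except for the first and last vertex of each ring))",
]

for kind, count in tally(SAMPLE).most_common():
    print(count, kind)
```

A line with several reasons counts once per reason, which matches the way the same object is listed under more than one numbered group above.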

ThomasG77 mentioned this issue Sep 23, 2022

Related issues with parcelles (parsing issues, so they are not visible and not provided in our etalab-cadastre delivery)


ThomasG77 commented Oct 3, 2022

Overall list of issues (including polygons, labels, and linestrings)

| errors | count | % |
| --- | --- | --- |
| 010100000A01:Objet_2512020(TLINE) => geometry ignored (Too many linked arcs to build a single LineString) | 2429550 | 91.9608666210689 |
| 06088000OL01:Objet_126251(SUBDFISC) => geometry ignored (Unable to build valid polygon coordinates) | 107689 | 4.07613499024769 |
| Impossible de relier la subdivision fiscale à sa parcelle | 84167 | 3.18580406284929 |
| ring-has-duplicate-vertices | 15030 | 0.568900341756566 |
| has-exterior-holes | 2303 | 0.087170824156046 |
| Impossible de relier parcelle et numéro de voie | 1204 | 0.0455725889204861 |
| Failed to deintersect polygon: significant secondary polygon | 977 | 0.0369804147635506 |
| The input polygon may not have duplicate vertices | 424 | 0.0160488186896064 |
| has-crossing-holes | 282 | 0.0106739784680873 |
| deintersectPolygon: unexpected error | 180 | 0.00681317774558762 |
| Too many linked faces to build a single Polygon | 62 | 0.00234676122348018 |
| arc.left.endsWith is not a function | 45 | 0.0017032944363969 |
| found non-noded intersection between | 10 | 0.000378509874754868 |
| JSTS union has failed: retrying with mapshaper | 8 | 0.000302807899803894 |
| Missing required files in EDIGÉO bundle | 8 | 0.000302807899803894 |
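The percentage column is consistent with each count divided by the sum of all counts. A minimal sketch reproducing it from the counts listed above (the dictionary keys are shortened labels chosen for this example):

```python
# Counts copied from the list above; keys are shortened labels.
COUNTS = {
    "Too many linked arcs to build a single LineString": 2429550,
    "Unable to build valid polygon coordinates": 107689,
    "Impossible de relier la subdivision fiscale à sa parcelle": 84167,
    "ring-has-duplicate-vertices": 15030,
    "has-exterior-holes": 2303,
    "Impossible de relier parcelle et numéro de voie": 1204,
    "Failed to deintersect polygon: significant secondary polygon": 977,
    "The input polygon may not have duplicate vertices": 424,
    "has-crossing-holes": 282,
    "deintersectPolygon: unexpected error": 180,
    "Too many linked faces to build a single Polygon": 62,
    "arc.left.endsWith is not a function": 45,
    "found non-noded intersection between": 10,
    "JSTS union has failed: retrying with mapshaper": 8,
    "Missing required files in EDIGÉO bundle": 8,
}

TOTAL = sum(COUNTS.values())

# Share of each error type, in percent of all logged errors.
SHARES = {label: 100 * count / TOTAL for label, count in COUNTS.items()}

for label, share in SHARES.items():
    print(f"{share:9.4f}  {label}")
```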


New example of the problem: parcelle ZB 170 overlapping ZB 170 in commune 14191
