Skip to content

Commit

Permalink
bump
Browse files Browse the repository at this point in the history
  • Loading branch information
paleolimbot committed Oct 25, 2023
1 parent 9178527 commit 14a9902
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion R/read.R
Original file line number Diff line number Diff line change
Expand Up @@ -239,7 +239,8 @@ process_cpl_read_ogr_stream = function(x, default_crs, num_features, fid_column_
st_set_crs(x, crs)
})

# Prefer "geometry" as the geometry column name instead of "wkb_geometry"
# Prefer "geometry" as the geometry column name instead of "wkb_geometry",
# which is what happens when the geometry column doesn't have a name
if (any(is_geometry_column) && !("geometry" %in% names(df))) {
geometry_column_name <- names(df)[which(is_geometry_column)[1]]
if (identical(geometry_column_name, "wkb_geometry")) {
Expand Down

4 comments on commit 14a9902

@edzer
Copy link
Member

@edzer edzer commented on 14a9902 Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what happens when the geometry column does not have a name, but when it has a name it does not happen: https://github.com/r-spatial/sf/blob/main/src/gdal_read.cpp#L76-L84

So with this PR, e.g. GPKG or PostGIS tables are read with a geometry column called geom or wkb_geometry when use_stream=FALSE (see code mentioned), but called geometry when use_stream=TRUE (this PR). I don't think that is good design, or why users can/should expect that.

The reason to keep geometry names the way they come is the ability to do roundrips file -> sf -> file, without loss of information.

@paleolimbot
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that is good design, or why users can/should expect that.

I agree that the geometry column shouldn't have its name changed, but if I don't include the changes in this PR, users will get a different answer when reading a shapefile with use_stream = TRUE and use_stream = FALSE? Is there a better way to ensure that the output of the two matches without the code I've added here?

@edzer
Copy link
Member

@edzer edzer commented on 14a9902 Oct 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is currently being set to wkb_geometry, where does that happen? AFAICT, GDAL returns no name on shapefiles (link above), so we should be able to check this in CPL_read_gdal_stream and set it there to geometry?

@paleolimbot
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is currently being set to wkb_geometry, where does that happen?

That's the name that comes through on the stream (I didn't put it there!). I have an idea about that though (basically: I can check the geometry column name in the same way in C++ and add a less fragile name substitution).

Please sign in to comment.