Results change with sorting order for IntersectNodes that share the same X&Y? #126

alexisnaveros · 2022-07-13T08:13:24Z

alexisnaveros
Jul 13, 2022

Hi Angus, thanks for the nice library!

I have noticed that when you have multiple IntersectNodes sharing the same X and Y values in your ->intersect_nodes_ list, then the sorted order for these is undefined (sort isn't stable and IntersectListSort() doesn't express a preference), but the results differ depending on the order being picked for these IntersectNodes sharing the same X and Y values.

The output appears significantly different, like a whole contour being absent.

I wonder, could it reflect a deeper problem? Intuitively, if your algorithm doesn't care about the sorting order of intersections sharing the same X and Y, then the result shouldn't change with a different order for these (you can switch between std::sort and std::stable_sort and occasionally catch a difference).

In case you are curious why I would bother with these internal details, I'm rewriting the code in C, as a preamble before seeing if I can reasonably make it SIMD-friendly and malloc-free, for a later port to CUDA (GPUs). If my sorts (radix, merge, etc.) pick a different order than std::sort for IntersectNodes with identical X and Y, then final results differ, and red flags are raised (I use your library as reference implementation, expecting identical results).

If you need more details, I'm basically doing:
#define FACTORMUL (1000)
srand( 3 );
for( i = 0 ; i < 512 ; i++ )
{
  x = rand() & 63;
  /* y is refreshed only on even i, so each consecutive pair of points shares a Y */
  if( !( i & 1 ) )
    y = rand() & 63;
  input0.push_back( BuildPoint64( x*FACTORMUL, y*FACTORMUL ) );
}
subject.push_back( input0 );
clipper.AddSubject( subject );
clipper.Execute( Clipper2Lib::ClipType::Union, Clipper2Lib::FillRule::Negative, solution, open_paths );

And, for example, Clipper2's final results differ depending on whether:
std::sort(intersect_nodes_.begin(), intersect_nodes_.end(), IntersectListSort);
or
std::stable_sort(intersect_nodes_.begin(), intersect_nodes_.end(), IntersectListSort);
was called in clipper.engine.cpp

The sample above assumes the Linux glibc's rand() implementation, but I'm sure you can reproduce that easily just by picking a different srand() seed until results change.
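One way to make the outcome independent of the sort routine is to give the comparator a total order, for instance by breaking ties on a creation index. The sketch below is not Clipper2 code: `Node`, `seq`, and both comparators are hypothetical stand-ins, and the ordering direction (larger Y first, then smaller X) is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an intersection record; not Clipper2's IntersectNode.
struct Node {
    int64_t x;
    int64_t y;
    std::size_t seq;  // creation order, assumed unique and increasing with input position
};

// Partial order: larger Y first, then smaller X. Nodes with equal (x, y)
// compare as equivalent, so an unstable sort may leave them in any order.
inline bool ByYThenX(const Node& a, const Node& b) {
    if (a.y != b.y) return a.y > b.y;
    return a.x < b.x;
}

// Total order: same keys, with the creation index as the final tie-break.
inline bool ByYThenXThenSeq(const Node& a, const Node& b) {
    if (a.y != b.y) return a.y > b.y;
    if (a.x != b.x) return a.x < b.x;
    return a.seq < b.seq;
}

// Creation indices in sorted order, using std::sort with the total order.
inline std::vector<std::size_t> OrderWithTotalSort(std::vector<Node> nodes) {
    std::sort(nodes.begin(), nodes.end(), ByYThenXThenSeq);
    std::vector<std::size_t> out;
    for (const Node& n : nodes) out.push_back(n.seq);
    return out;
}

// Creation indices in sorted order, using std::stable_sort with the partial order.
// Matches OrderWithTotalSort only when the input arrives in seq order.
inline std::vector<std::size_t> OrderWithStableSort(std::vector<Node> nodes) {
    std::stable_sort(nodes.begin(), nodes.end(), ByYThenX);
    std::vector<std::size_t> out;
    for (const Node& n : nodes) out.push_back(n.seq);
    return out;
}
```

With a total order, any correct sort (std::sort, std::stable_sort, radix, merge) has only one valid output, which makes a port easier to compare against a reference implementation.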

Thanks for your time, let me know if I can help with anything!

AngusJohnson · 2022-07-13T08:56:41Z

AngusJohnson
Jul 13, 2022
Maintainer

Hi Alexis.

Firstly, good luck with your C port. Please let me know if you get it working reliably as I may well display a link to it from here.

WRT differing results with stable vs unstable IntersectNode sorts, I'm OK with that as long as there's no significant difference in solution coverage (filling). If there is a significant difference, then this suggests there's a bug somewhere that needs fixing.

Cheers A.

alexisnaveros · 2022-07-13T09:39:17Z

alexisnaveros
Jul 13, 2022
Author

Hi Angus, thanks for the quick reply.

Interesting. I think I see what you mean by "no significant difference in solution coverage"... Between std::sort and std::stable_sort, I can see a bunch of vertices are missing from a contour, but then they appear to have been stitched to another nearby contour instead. That would be fine, I'll keep an eye out if it's always the case.
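A quick way to check that two solutions cover the same region, even when vertices have moved from one contour to a neighbouring one, is to compare their total signed areas. This is only a sketch under assumptions: `Pt`, `Contour`, and the helper names are hypothetical, and equal area is a necessary condition for equal coverage, not a sufficient one. A stricter check would XOR the two solutions and confirm that the result is empty.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical point and contour types, used only for this illustration.
struct Pt { int64_t x; int64_t y; };
using Contour = std::vector<Pt>;

// Signed area of a closed contour (shoelace formula); the sign follows the winding.
inline double SignedArea(const Contour& c) {
    double twice = 0.0;
    const std::size_t n = c.size();
    for (std::size_t i = 0; i < n; ++i) {
        const Pt& p = c[i];
        const Pt& q = c[(i + 1) % n];
        twice += static_cast<double>(p.x) * static_cast<double>(q.y)
               - static_cast<double>(q.x) * static_cast<double>(p.y);
    }
    return 0.5 * twice;
}

// Sum of signed areas over all contours of a solution.
inline double TotalSignedArea(const std::vector<Contour>& solution) {
    double total = 0.0;
    for (const Contour& c : solution) total += SignedArea(c);
    return total;
}

// True when the two solutions enclose the same total area within a tolerance.
inline bool SameCoverageArea(const std::vector<Contour>& a,
                             const std::vector<Contour>& b, double tol) {
    return std::fabs(TotalSignedArea(a) - TotalSignedArea(b)) <= tol;
}
```

For example, two squares that touch at a corner have the same total area whether they are reported as two contours or as one contour passing twice through the shared vertex.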

The C code seems to be working reliably (results are identical to Clipper2 if I switch your sorts to std::stable_sort). Optimization is next, along with seeing how CUDA-friendly it can get. It's already a little faster than the C++ version, thanks to details like switching to a radix sort (especially when millions of items are involved), which you could also certainly do in C++/Delphi/others.

Thanks!

alexisnaveros · 2022-07-19T03:19:04Z

alexisnaveros
Jul 19, 2022
Author

Hi Angus, I'm posting under the same discussion as it's still mostly the same topic.

I think we have established that the order of intersections sharing the same X and Y slightly alters the results, but it's (probably) inconsequential. Now, is it possible that the sorting order on X (when Y is equal) is also irrelevant, as long as the Y ordering is respected? I feel like I now understand the whole code, and my current understanding tells me that it should never matter. (And that makes a big difference: radix sort on only 64 bits instead of 128, or a branchless merge sort that can use conditional move instructions)
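To illustrate the 64-bit case, here is a minimal sketch of a stable LSD radix sort keyed on Y alone. It is not code from Clipper2 or from the C port: `Item`, `KeyFromY`, and `RadixSortByY` are hypothetical names, and the sort is ascending for simplicity.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record; only y takes part in the sort key.
struct Item { int64_t x; int64_t y; };

// Maps a signed value to an unsigned key with the same ordering
// by flipping the sign bit.
inline uint64_t KeyFromY(int64_t y) {
    return static_cast<uint64_t>(y) ^ (uint64_t{1} << 63);
}

// Stable LSD radix sort: 8 passes of 8 bits each, ascending by y.
// Items with equal y keep their input order.
inline void RadixSortByY(std::vector<Item>& items) {
    std::vector<Item> scratch(items.size());
    for (int pass = 0; pass < 8; ++pass) {
        const int shift = pass * 8;
        std::array<std::size_t, 257> count{};
        // Histogram, stored one slot to the right so that the prefix sum
        // below turns count[b] into the first output index of bucket b.
        for (const Item& it : items)
            ++count[((KeyFromY(it.y) >> shift) & 0xFF) + 1];
        for (std::size_t b = 0; b < 256; ++b)
            count[b + 1] += count[b];
        // Scatter in input order, which is what keeps the pass stable.
        for (const Item& it : items)
            scratch[count[(KeyFromY(it.y) >> shift) & 0xFF]++] = it;
        items.swap(scratch);
    }
}
```

A descending order can be obtained by inverting the key bits. Sorting on X as well would double the key to 128 bits and the number of passes to 16, which is the cost the question above is trying to avoid.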

I'm still crunching on optimization for the C/CUDA rewrite/port as it would be used in a time-sensitive context. I am a bit annoyed to see bug fixes being committed, which I now also have to fix, but that's life ;)

You may still be hunting/fixing bugs, but... here are some optimizations done in the C rewrite:

- Switching to a radix sort whenever the sort count is large (keyed only on the 64 bits of Y for IntersectNodes).
- After sorting all the local minima, the overhead of insertion into a heap (the priority queue) can be avoided: the values can be stored in order, as the input is already sorted.
- Avoiding a bunch of heavy pointers everywhere by storing uint32_t indices for active edges. The input size is slightly limited, to stay below 4 billion active edges, but this reduces memory usage and is therefore faster (less bandwidth, more data in cache, etc.).
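One possible reading of the heap remark above: an array sorted in descending order already satisfies the max-heap property, so it can seed the priority queue without any per-item insertion. The sketch below assumes a max-heap on Y values; the helper names are hypothetical and this is not how Clipper2 or the C port is necessarily written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Adopts an already descending-sorted vector as a max-heap.
// No push_heap calls are needed: every parent precedes its children
// in the array, so it is greater than or equal to them.
inline std::vector<int64_t> HeapFromDescending(std::vector<int64_t> sorted_desc) {
    assert(std::is_sorted(sorted_desc.begin(), sorted_desc.end(),
                          std::greater<int64_t>()));
    return sorted_desc;
}

// Later insertions (Y values discovered during the sweep) use the normal heap push.
inline void PushY(std::vector<int64_t>& heap, int64_t y) {
    heap.push_back(y);
    std::push_heap(heap.begin(), heap.end());
}

// Removes and returns the largest Y. The heap must not be empty.
inline int64_t PopY(std::vector<int64_t>& heap) {
    assert(!heap.empty());
    std::pop_heap(heap.begin(), heap.end());
    const int64_t top = heap.back();
    heap.pop_back();
    return top;
}
```

The saving applies only to the initial fill; values added during the sweep still pay the usual logarithmic cost.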

With a random vertex soup (not a realistic scenario), >50% of the time is spent in BuildIntersectList(), so I want to see if anything can be done about that. Something really bugs me about building, sorting and flushing all the intersection nodes at every scanbeam... Any thoughts?

Thanks!

13 replies

alexisnaveros Jul 20, 2022
Author

> The solutions from these batched ops (presumably union, but can also work with intersection) should now have very many fewer edges.

Ah right, that's the source of the misunderstanding. When merging these 12 million shapes from OpenStreetMap, the input has 950 million edges, and the final output holds 850 million edges. So it's not that much lighter... And I would be worried about performance in that very last step of merging two shapes of 425 million edges each.

To improve scalability, I have had other ideas but I think the best one so far was described above under Scalability brainstorming v0.02. I don't see how it could not work... Ideally, I would prefer you to try it out (since I'll make a bunch of mistakes with unforeseen interactions with the rest of the code), but if not, I think I'll give it a go.

AngusJohnson Jul 20, 2022
Maintainer

> v0.02 ... (in other words, it's impossible for the edge to "switch places" with neighbors before that point)
>
> To improve scalability, I have had other ideas but I think the best one so far was described above under Scalability brainstorming v0.02.

I honestly can't see how the overhead of determining when an edge will be crossed (not just by neighbours but by any edge in the AET, and those not yet in the AET) would improve performance.

Edit: OK, I can see that in your particular case, edge crossing is likely only from nearest neighbours. But I want the library to work under all conditions.

> the input has 950 million edges, and the final output holds 850 million edges.

OK, now I also understand 😁. But now I don't understand why you'd need to clip all 850M edges. Surely, when displaying the whole map you can cull polygons below a given area? Isn't this more an issue of data storage and retrieval (which I imagine is in an SQL database) where polygons are stored together with a host of other information including their rectangular bounds?

alexisnaveros Jul 20, 2022
Author

> I honestly can't see how the overhead of determining when an edge will be crossed (not just by neighbours but by any edge in the AET, and those not yet in the AET) would improve performance.

Simply because, when the data set is huge, the vast majority of scanbeams cause no intersection for the vast majority of active edges. So if we can skip checking these entirely (as the scanbeam hasn't yet reached the Y where they might get an intersection), that would be a big performance boost.

> Edit: OK, I can see that in your particular case, edge crossing is likely only from nearest neighbours. But I want the library to work under all conditions.

The "nearest Y of possible intersection" of all edges would have to be updated whenever there are local changes, so that it keeps working in all conditions. But this is a local update, so scalability is preserved (assuming you have a fast way to grab all edges overlapping the span between minx and maxx). I really think this should work.
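As a rough illustration of the "nearest Y of possible intersection" idea, the sketch below computes the Y at which two non-horizontal edges would cross, if they cross at all within their shared Y range. Everything here is hypothetical (`Edge`, `CrossingY`, the use of doubles) and not taken from Clipper2. A real implementation would also have to recompute this value whenever an edge's neighbours change.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical edge given by its two endpoints; must not be horizontal.
struct Edge { double x0, y0, x1, y1; };

// If edges a and b cross within the Y range they share, stores the
// crossing Y in y_out and returns true. Returns false for parallel
// edges or when the crossing lies outside the shared range.
inline bool CrossingY(const Edge& a, const Edge& b, double& y_out) {
    assert(a.y0 != a.y1 && b.y0 != b.y1);
    // Each edge as x(y) = x0 + (y - y0) * s, with s = dx/dy.
    const double sa = (a.x1 - a.x0) / (a.y1 - a.y0);
    const double sb = (b.x1 - b.x0) / (b.y1 - b.y0);
    if (sa == sb) return false;  // parallel: never cross
    // Solve xa(y) == xb(y) for y.
    const double y = ((b.x0 - a.x0) + a.y0 * sa - b.y0 * sb) / (sa - sb);
    const double lo = std::max(std::min(a.y0, a.y1), std::min(b.y0, b.y1));
    const double hi = std::min(std::max(a.y0, a.y1), std::max(b.y0, b.y1));
    if (y < lo || y > hi) return false;
    y_out = y;
    return true;
}
```

An edge whose earliest crossing Y (over its current neighbours) has not yet been reached by the sweep would not need to be examined in the current scanbeam.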

> the input has 950 million edges, and the final output holds 850 million edges.
>
> OK, now I also understand 😁. But now I don't understand why you'd need to clip all 850M edges. Surely, when displaying the whole map you can cull polygons below a given area? Isn't this more an issue of data storage and retrieval (which I imagine is in an SQL database) where polygons are stored together with a host of other information including their rectangular bounds?

Yeah, but it's not for display... All the world's water coverage is unified as one gigantic fully connected triangle mesh (kind-of Delaunay), height/depth information is added, and it's used for hydrography analysis, flow/current predictions, path finding, and other stuff. I also prepare little tiles of triangle mesh for display, but that's very secondary.

AngusJohnson Jul 20, 2022
Maintainer

> Edit: OK, I can see that in your particular case, edge crossing is likely only from nearest neighbours. But I want the library to work under all conditions.
>
> The "nearest Y of possible intersection" of all edges would have to be updated whenever there are local changes, so that it keeps working in all conditions. But this is a local update, so scalability is preserved (assuming you have a fast way to grab all edges overlapping the span between minx and maxx). I really think this should work.

Yes, thanks for persisting, I'm now seeing the possibility there 😁.
This is something I'll look at again once I've gotten the current code out of beta.

> Yeah, but it's not for display... All the world's water coverage is unified as one gigantic fully connected triangle mesh (kind-of Delaunay), height/depth information is added, and it's used for hydrography analysis, flow/current predictions, path finding, and other stuff. I also prepare little tiles of triangle mesh for display, but that's very secondary.

OK, not for display 😁. But however you're presenting this data, no one can process that level of detail all at once. So I still don't understand why you can't cull smaller polygons when presenting the whole map.

alexisnaveros Jul 20, 2022
Author

> Yes, thanks for persisting, I'm now seeing the possibility there 😁. This is something I'll look at again once I've gotten the current code out of beta.

🥳🥳 Woohoo! 🥳🥳 Thank you. I very much look forward to it. 😁

> Yeah, but it's not for display... All the world's water coverage is unified as one gigantic fully connected triangle mesh (kind-of Delaunay), height/depth information is added, and it's used for hydrography analysis, flow/current predictions, path finding, and other stuff. I also prepare little tiles of triangle mesh for display, but that's very secondary.
>
> OK, not for display 😁. But however you're presenting this data, no one can process that level of detail all at once. So I still don't understand why you can't cull smaller polygons when presenting the whole map.

Well, one billion vertices really isn't that bad... (it gets a lot worse after inserting all the worldwide height/depth information, but then it gets simplified a bit). Let's just say the processing isn't done on a laptop. 😉 And the details matter. A couple hundred gigabytes of RAM is advised.

(My stated goal of porting Clipper2 to CUDA is for a different project, where the geometry is more modest, but I need answers in milliseconds. In the end, I would love to use the same code base for all polygon processing everywhere.)
