Refactor transform handling in flatten & support miter join #410
Conversation
// See https://www.iquilezles.org/www/articles/ellipses/ellipses.htm
// This is the correct bounding box, but we're not handling rendering
// in the isotropic case, so it may mismatch.
stroke = 0.5 * linewidth * vec2(length(transform.mat.xz), length(transform.mat.yw));
bbox += vec4(-stroke, stroke);
stroke = offset * vec2(length(transform.mat.xz), length(transform.mat.yw));
`stroke` is now unused. I've been thinking about removing this field from the `Cubic` struct or possibly replacing `Cubic` with `CubicPoints` + just the `path_ix`.
Does this also need to be reflected in the CPU shaders, or will those continue to use the kurbo-based stroke expansion?
Otherwise, a couple of tiny style questions.
shader/flatten.wgsl
Outdated
let c = tan_prev.x * tan_next.y - tan_prev.y * tan_next.x;
let d = dot(tan_prev, tan_next);
let hypot = length(vec2f(c, d));
let miter_limit = unpack2x16float(style_flags & STYLE_MITER_LIMIT_MASK).x;
I'd probably choose to use the indexing form of vector access (i.e. `unpack2x16float(...)[0]`) here, as this isn't semantically an `x` point but just the first 16 bits.
But that's not based on checking what the style is elsewhere.
We have some uses of the `unpack` intrinsics in the fine shader that perform a swizzle using the `xyzw` accessors, though I suppose changing those to use the color (`rgba`) accessors would be semantically more correct.
Using `[0]` here sounds good to me.
p10: vec2f, p11: vec2f,
transform: Transform
) {
let line_ix = atomicAdd(&bump.lines, 2u);
Is this semantically equivalent to `output_line_with_transform` called twice, and only written this way to reduce atomic traffic?
That's exactly right. The only reason I added this variant is to reduce the atomic traffic; I was outputting a pair of lines frequently enough for joins that adding it seemed worthwhile.
Thanks, that makes sense.
I'd probably add a comment to that effect, or possibly add a function (`output_line_into_allocated_space`?) which these both call.
I did notice one case where three lines were handled in a row (when handling miter joins). Handling that case all in one bump could make sense (and could additionally avoid creating some `var`iables). But that would seem to add quite a lot of noise for tiny gains.
I think this is a good idea. I'm not sure yet how significant the impact of the additional atomic here is, but I suspect it might matter. At any rate, I like the idea of adding an `output_line_into_allocated_space` helper, as I need something to that effect for round caps and joins, where the line count is typically a lot higher.
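For readers following along, here is a rough Rust sketch of the "allocate once, write many" pattern being discussed; the struct layout and field names are assumptions in the spirit of the CPU shaders (not vello's actual code), `output_line_into_allocated_space` is the name suggested above, and the transform application is elided:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

#[derive(Clone, Copy, Default)]
struct LineSoup {
    path_ix: u32,
    p0: [f32; 2],
    p1: [f32; 2],
}

struct Bump {
    lines: AtomicU32,
}

/// Write one line into a slot that has already been reserved by a bump.
fn output_line_into_allocated_space(
    lines: &mut [LineSoup],
    line_ix: u32,
    path_ix: u32,
    p0: [f32; 2],
    p1: [f32; 2],
) {
    lines[line_ix as usize] = LineSoup { path_ix, p0, p1 };
}

/// Emit two lines with a single bump of the atomic counter, mirroring the
/// two-line variant in the diff above: one fetch_add instead of two.
fn output_two_lines(
    bump: &Bump,
    lines: &mut [LineSoup],
    path_ix: u32,
    p00: [f32; 2],
    p01: [f32; 2],
    p10: [f32; 2],
    p11: [f32; 2],
) {
    let line_ix = bump.lines.fetch_add(2, Ordering::Relaxed);
    output_line_into_allocated_space(lines, line_ix, path_ix, p00, p01);
    output_line_into_allocated_space(lines, line_ix + 1, path_ix, p10, p11);
}

fn main() {
    let bump = Bump { lines: AtomicU32::new(0) };
    let mut lines = vec![LineSoup::default(); 16];
    output_two_lines(&bump, &mut lines, 0, [0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0]);
    assert_eq!(bump.lines.load(Ordering::Relaxed), 2);
}
```

The single-line and two-line entry points both funnel the actual store through one helper, which keeps the atomic traffic down without duplicating the write logic.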
Yes, I'm planning to make the changes to the CPU shaders in a follow-up PR very soon (over the next couple of weeks). The kurbo-based stroke expansion will only be used in the "CPU stroke mode" (for lack of a better term) and for dashing in the short term. The CPU shaders themselves are a CPU incarnation of the GPU shaders with a better test environment, so I'm planning to mirror these changes there pretty much 1:1.
- Miter join style is now supported by the GPU stroker
- Slightly tweaked the miter limit test cases to exercise a miter limit boundary condition and to test inserting the miter join on both sides of a stroke within the same path.
* Flattening (including for fills, offset curves, caps, and joins) is now performed in a curve's local coordinate space (pre-transform) and the transform is applied at the time a line is output. This may be unnecessary for fills but it keeps the code mostly uniform.
* The line count estimate for subdivision factors in the scale factor which is decomposed from the transform matrix.
* The bounding box computation is now precise and purely based on the union of output line segments.
Force-pushed from 13fe722 to a86e827
I'm commenting on the transforms only, as I've done a deep dive. I'll dig into the rest a bit later.
In addition to what's inline:
The handling of `scale_factor` and transform should be different in the fill and stroke case. In the fill case, the `scale_factor` should be 1 and the cubic should be transformed before going into `flatten_cubic`. That gets near-optimal flattening even in the presence of a non-angle-preserving transform.
In the stroke case, scale_factor should be computed as stated, and the transform applied post-stroking. That gets correct computation of miters, elliptical pen for round, etc. It's also not optimal flattening (more so the more eccentric the transform), but we've decided the extra complexity for optimal flattening in that case is not worth it.
Let me know if any of this is not clear.
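To restate the suggested split in code, here is a hedged Rust sketch in the style of the CPU shaders; the types, field layout, and names (`flatten_cubic_local`, `CubicPoints`, the column-major `mat` layout) are illustrative assumptions, not vello's actual API:

```rust
// Sketch of the fill/stroke transform handling described above.

#[derive(Clone, Copy)]
struct Transform {
    mat: [f32; 4], // assumed column-major 2x2 linear part: columns (a, b) and (c, d)
    translate: [f32; 2],
}

impl Transform {
    const IDENTITY: Transform = Transform {
        mat: [1.0, 0.0, 0.0, 1.0],
        translate: [0.0, 0.0],
    };

    fn apply(&self, p: [f32; 2]) -> [f32; 2] {
        let [a, b, c, d] = self.mat;
        [
            a * p[0] + c * p[1] + self.translate[0],
            b * p[0] + d * p[1] + self.translate[1],
        ]
    }

    /// Upper bound on the scaling of the 2x2 part (its largest singular
    /// value), matching the formula discussed further down in this review.
    fn scale_factor(&self) -> f32 {
        let [a, b, c, d] = self.mat;
        0.5 * ((a + d).hypot(b - c) + (a - d).hypot(b + c))
    }
}

#[derive(Clone, Copy)]
struct CubicPoints {
    p0: [f32; 2],
    p1: [f32; 2],
    p2: [f32; 2],
    p3: [f32; 2],
}

// Stand-in for the real flattening routine; body elided.
fn flatten_cubic_local(_cubic: CubicPoints, _scale: f32, _transform: &Transform) {}

fn flatten(cubic: CubicPoints, transform: Transform, is_stroke: bool) {
    if is_stroke {
        // Stroke: flatten and offset in the curve's local space; the
        // decomposed scale factor feeds the subdivision estimate, and the
        // transform is applied only when each line segment is output. This
        // keeps miters and the elliptical pen for round joins correct.
        let scale = transform.scale_factor();
        flatten_cubic_local(cubic, scale, &transform);
    } else {
        // Fill: transform the control points up front and flatten in device
        // space with scale factor 1, which gives near-optimal flattening
        // even under non-angle-preserving transforms.
        let device = CubicPoints {
            p0: transform.apply(cubic.p0),
            p1: transform.apply(cubic.p1),
            p2: transform.apply(cubic.p2),
            p3: transform.apply(cubic.p3),
        };
        flatten_cubic_local(device, 1.0, &Transform::IDENTITY);
    }
}

fn main() {
    let c = CubicPoints { p0: [0.0, 0.0], p1: [1.0, 0.0], p2: [1.0, 1.0], p3: [2.0, 1.0] };
    flatten(c, Transform::IDENTITY, true);
    flatten(c, Transform::IDENTITY, false);
}
```

The stroke branch deliberately accepts somewhat sub-optimal flattening under eccentric transforms in exchange for correct miters and round pens, as described above.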
shader/flatten.wgsl
Outdated
@@ -52,15 +52,15 @@ fn approx_parabola_inv_integral(x: f32) -> f32 {
return x * sqrt(1.0 - B + (B * B + 0.5 * x * x));
}

fn estimate_subdiv(p0: vec2f, p1: vec2f, p2: vec2f, sqrt_tol: f32) -> SubdivResult {
fn estimate_subdiv(p0: vec2f, p1: vec2f, p2: vec2f, sqrt_tol: f32, transform_scale: f32) -> SubdivResult {
I don't think this needs an additional `transform_scale` argument, and the scaling is subtly incorrect with it added.
The `val` return has this meaning: an optimal subdivision of the quadratic into `val` / 2 subdivisions (which may be fractional) will have an error of sqrt_tol^2 (i.e. the desired tolerance). There are two cases. In the non-cusp case, the error scales as the inverse square of `val` (doubling `val` causes the error to be one fourth), so the tolerance is actually not needed for the calculation (which is why it's applied in the caller). In the cusp case, this scaling breaks down and the tolerance parameter is needed to compute the correct result.
In both cases, it's better to handle the scaling in the caller; it's not helpful to do it here.
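Restating the non-cusp case in symbols (my paraphrase, not part of the original comment): if $E(n)$ is the flattening error of the quadratic at $n$ subdivisions, then

$$
E(n) \approx \mathrm{sqrt\_tol}^2 \left(\frac{\mathrm{val}/2}{n}\right)^{2},
$$

so $n = \mathrm{val}/2$ hits the target error $\mathrm{sqrt\_tol}^2$ and doubling $n$ quarters the error. The caller can therefore solve for whatever $n$ its own (possibly scaled) tolerance requires without `estimate_subdiv` knowing about it; only the cusp branch needs the tolerance internally.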
> I don't think this needs an additional `transform_scale` argument, and the scaling is subtly incorrect with it added.
>
> The `val` return has this meaning: an optimal subdivision of the quadratic into `val` / 2 subdivisions (which may be fractional) will have an error of sqrt_tol^2 (i.e. the desired tolerance). There are two cases. In the non-cusp case, the error scales as the inverse square of `val` (doubling `val` causes the error to be one fourth), so the tolerance is actually not needed for the calculation (which is why it's applied in the caller). In the cusp case, this scaling breaks down and the tolerance parameter is needed to compute the correct result.
Thank you, this is very illuminating as to how `val` gets computed (as I didn't understand before why the tolerance was only considered when the signs are different). I copied your comment into the code as it's a very useful read.
shader/flatten.wgsl
Outdated
@@ -139,7 +139,9 @@ fn flatten_cubic(cubic: Cubic) {
let Q_ACCURACY = ACCURACY * 0.1;
let REM_ACCURACY = ACCURACY - Q_ACCURACY;
let MAX_HYPOT2 = 432.0 * Q_ACCURACY * Q_ACCURACY;
var n_quads = max(u32(ceil(pow(err * (1.0 / MAX_HYPOT2), 1.0 / 6.0))), 1u);
let scale = vec2(length(transform.mat.xz), length(transform.mat.yw));
The correct math for this is 0.5 * (length(vec2(mat.x + mat.w, mat.y - mat.z)) + length(vec2(mat.x - mat.w, mat.y + mat.z))) (using a bit of shorthand). This is not substantially more complicated or inefficient than what you have, so I'd like to see the real thing.
To see the difference, consider the skew matrix [[1 0] [1 1]]. The correct answer is the golden mean ≈1.618, while the formula given yields sqrt(2) ≈1.414.
The formula given is accurate for uniform scaling and rotation, and also for nonuniform axis-aligned scaling, so it's not hard to see why the incorrect equation persists.
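As a quick check of the numbers quoted above (my own working, not part of the original comment), treating the skew matrix as having diagonal entries $1, 1$ and off-diagonal entries $0, 1$:

$$
M = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad
M^\mathsf{T} M = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
\sigma_{\max}(M) = \sqrt{\tfrac{3 + \sqrt{5}}{2}} = \tfrac{1 + \sqrt{5}}{2} \approx 1.618,
$$

which matches $\tfrac{1}{2}\bigl(\sqrt{(1+1)^2 + (0-1)^2} + \sqrt{(1-1)^2 + (0+1)^2}\bigr) = \tfrac{1}{2}(\sqrt{5} + 1)$ from the formula above. The column lengths of $M$ are $\sqrt{2}$ and $1$, which is where the $\sqrt{2} \approx 1.414$ from the per-axis-length estimate comes from.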
That makes sense. I had this initially written similarly to the correct version but the existing formula looked right under my test cases (as backed up by your explanation) and this slipped through the cracks. Fixed this now to do the correct thing.
shader/flatten.wgsl
Outdated
}
let params = estimate_subdiv(qp0, qp1, qp2, tol);
// TODO: Estimate an accurate subdivision count for strokes, handling cusps.
let tol = sqrt(REM_ACCURACY);
I believe the correct way to do scaling is:
let scaled_sqrt_tol = sqrt(REM_ACCURACY / scale_factor);
... apply fudge factor here, which will later be reverted...
let params = estimate_subdiv(qp0, qp1, qp2, scaled_sqrt_tol);
...
let n = max(u32(ceil(val * (0.5 / scaled_sqrt_tol))), 1u);
Daniel and Raph already looked at this, so I just did an additional sanity check and this all looks reasonable to me. Let's land it!
Thanks!
This PR implements the miter join style and adds additional test scenes for stroking under non-uniform scale and skew transforms. This includes some changes to transform handling in the `flatten` stage to keep stroking correct under these transforms:

- Flattening (including for fills, offset curves, caps, and joins) is now performed in a curve's local coordinate space (pre-transform) and the transform is applied at the time a line gets output (i.e. LineSoup time). This may be unnecessary for fills but the shader currently uses the same code for flattening fills and strokes.
- The line count estimate for subdivision uses local control point coordinates and makes use of the scale factor which is decomposed from the transform matrix.
- The bounding box computation is now based on the union of output line segments, which get accumulated at LineSoup time.
This PR is based on #408