Refactor transform handling in flatten & support miter join #410
Conversation
// See https://www.iquilezles.org/www/articles/ellipses/ellipses.htm
// This is the correct bounding box, but we're not handling rendering
// in the isotropic case, so it may mismatch.
stroke = 0.5 * linewidth * vec2(length(transform.mat.xz), length(transform.mat.yw));
bbox += vec4(-stroke, stroke);
stroke = offset * vec2(length(transform.mat.xz), length(transform.mat.yw));
`stroke` is now unused. I've been thinking about removing this field from the `Cubic` struct or possibly replacing `Cubic` with `CubicPoints` + just the `path_ix`.
Does this also need to be reflected in the CPU shaders, or will those continue to use the kurbo-based stroke expansion?
Otherwise, a couple of tiny style questions.
shader/flatten.wgsl
Outdated
let c = tan_prev.x * tan_next.y - tan_prev.y * tan_next.x;
let d = dot(tan_prev, tan_next);
let hypot = length(vec2f(c, d));
let miter_limit = unpack2x16float(style_flags & STYLE_MITER_LIMIT_MASK).x;
I'd probably choose to use the indexing form of vector access (i.e. `unpack2x16float(...)[0]`) here, as this isn't semantically an `x` point but just the first 16 bits.
But that's not based on checking what the style is elsewhere.
We have some uses of the `unpack` intrinsics in the fine shader that perform a swizzle using the `xyzw` accessors, though I suppose changing those to use the color (`rgba`) accessors would be semantically more correct.
Using `[0]` here sounds good to me.
p10: vec2f, p11: vec2f,
transform: Transform
) {
let line_ix = atomicAdd(&bump.lines, 2u);
Is this semantically equivalent to `output_line_with_transform` called twice, and only written this way to reduce atomic traffic?
That's exactly right. The only reason I added this variant is to reduce the atomic traffic; I was outputting a pair of lines frequently enough for joins that adding it seemed worthwhile.
Thanks, that makes sense.
I'd probably add a comment to that effect, or possibly add a function (`output_line_into_allocated_space`?) which these both call.
I did notice one case where three lines were handled in a row (when handling miter joins). Handling that case all in one bump could make sense (and could additionally avoid creating some `var`iables). But that would seem to add quite a lot of noise for tiny gains.
I think this is a good idea. I'm not sure yet how significant the impact of the additional atomic here is, but I suspect it might matter. At any rate, I like the idea of adding an `output_line_into_allocated_space` helper, as I need something to that effect for round caps and joins, where the line count is typically a lot higher.
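For readers following along, here is a rough Rust sketch of the "allocate once, write many" pattern being discussed; the struct layout and field names are assumptions in the spirit of the CPU shaders (not vello's actual code), `output_line_into_allocated_space` is the name suggested above, and the transform application is elided:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

#[derive(Clone, Copy, Default)]
struct LineSoup {
    path_ix: u32,
    p0: [f32; 2],
    p1: [f32; 2],
}

struct Bump {
    lines: AtomicU32,
}

/// Write one line into a slot that has already been reserved by a bump.
fn output_line_into_allocated_space(
    lines: &mut [LineSoup],
    line_ix: u32,
    path_ix: u32,
    p0: [f32; 2],
    p1: [f32; 2],
) {
    lines[line_ix as usize] = LineSoup { path_ix, p0, p1 };
}

/// Emit two lines with a single bump of the atomic counter, mirroring the
/// two-line variant in the diff above: one fetch_add instead of two.
fn output_two_lines(
    bump: &Bump,
    lines: &mut [LineSoup],
    path_ix: u32,
    p00: [f32; 2],
    p01: [f32; 2],
    p10: [f32; 2],
    p11: [f32; 2],
) {
    let line_ix = bump.lines.fetch_add(2, Ordering::Relaxed);
    output_line_into_allocated_space(lines, line_ix, path_ix, p00, p01);
    output_line_into_allocated_space(lines, line_ix + 1, path_ix, p10, p11);
}

fn main() {
    let bump = Bump { lines: AtomicU32::new(0) };
    let mut lines = vec![LineSoup::default(); 16];
    output_two_lines(&bump, &mut lines, 0, [0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0]);
    assert_eq!(bump.lines.load(Ordering::Relaxed), 2);
}
```

The single-line and two-line entry points both funnel the actual store through one helper, which keeps the atomic traffic down without duplicating the write logic.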
Yes, I'm planning to make the changes to the CPU shaders in a follow-up PR very soon (over the next couple of weeks). The kurbo-based stroke expansion will only be used in the "CPU stroke mode" (for lack of a better term) and for dashing in the short term. The CPU shaders themselves are a CPU incarnation of the GPU shaders with a better test environment, so I'm planning to mirror these changes there pretty much 1:1.
- Miter join style is now supported by the GPU stroker
- Slightly tweaked the miter limit test cases to exercise a miter limit boundary condition and to test inserting the miter join on both sides of a stroke within the same path.
* Flattening (including for fills, offset curves, caps, and joins) is now performed in a curve's local coordinate space (pre-transform) and the transform is applied at the time a line is output. This may be unnecessary for fills but it keeps the code mostly uniform.
* The line count estimate for subdivision factors in the scale factor which is decomposed from the transform matrix.
* The bounding box computation is now precise and purely based on the union of output line segments.
Force-pushed from 13fe722 to a86e827
I'm commenting on the transforms only, as I've done a deep dive. I'll dig into the rest a bit later.
In addition to what's inline:
The handling of `scale_factor` and transform should be different in the fill and stroke case. In the fill case, the `scale_factor` should be 1 and the cubic should be transformed before going into `flatten_cubic`. That gets near-optimal flattening even in the presence of a non-angle-preserving transform.
In the stroke case, scale_factor should be computed as stated, and the transform applied post-stroking. That gets correct computation of miters, elliptical pen for round, etc. It's also not optimal flattening (more so the more eccentric the transform), but we've decided the extra complexity for optimal flattening in that case is not worth it.
Let me know if any of this is not clear.
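To restate the suggested split in code, here is a hedged Rust sketch in the style of the CPU shaders; the types, field layout, and names (`flatten_cubic_local`, `CubicPoints`, the column-major `mat` layout) are illustrative assumptions, not vello's actual API:

```rust
// Sketch of the fill/stroke transform handling described above.

#[derive(Clone, Copy)]
struct Transform {
    mat: [f32; 4], // assumed column-major 2x2 linear part: columns (a, b) and (c, d)
    translate: [f32; 2],
}

impl Transform {
    const IDENTITY: Transform = Transform {
        mat: [1.0, 0.0, 0.0, 1.0],
        translate: [0.0, 0.0],
    };

    fn apply(&self, p: [f32; 2]) -> [f32; 2] {
        let [a, b, c, d] = self.mat;
        [
            a * p[0] + c * p[1] + self.translate[0],
            b * p[0] + d * p[1] + self.translate[1],
        ]
    }

    /// Upper bound on the scaling of the 2x2 part (its largest singular
    /// value), matching the formula discussed further down in this review.
    fn scale_factor(&self) -> f32 {
        let [a, b, c, d] = self.mat;
        0.5 * ((a + d).hypot(b - c) + (a - d).hypot(b + c))
    }
}

#[derive(Clone, Copy)]
struct CubicPoints {
    p0: [f32; 2],
    p1: [f32; 2],
    p2: [f32; 2],
    p3: [f32; 2],
}

// Stand-in for the real flattening routine; body elided.
fn flatten_cubic_local(_cubic: CubicPoints, _scale: f32, _transform: &Transform) {}

fn flatten(cubic: CubicPoints, transform: Transform, is_stroke: bool) {
    if is_stroke {
        // Stroke: flatten and offset in the curve's local space; the
        // decomposed scale factor feeds the subdivision estimate, and the
        // transform is applied only when each line segment is output. This
        // keeps miters and the elliptical pen for round joins correct.
        let scale = transform.scale_factor();
        flatten_cubic_local(cubic, scale, &transform);
    } else {
        // Fill: transform the control points up front and flatten in device
        // space with scale factor 1, which gives near-optimal flattening
        // even under non-angle-preserving transforms.
        let device = CubicPoints {
            p0: transform.apply(cubic.p0),
            p1: transform.apply(cubic.p1),
            p2: transform.apply(cubic.p2),
            p3: transform.apply(cubic.p3),
        };
        flatten_cubic_local(device, 1.0, &Transform::IDENTITY);
    }
}

fn main() {
    let c = CubicPoints { p0: [0.0, 0.0], p1: [1.0, 0.0], p2: [1.0, 1.0], p3: [2.0, 1.0] };
    flatten(c, Transform::IDENTITY, true);
    flatten(c, Transform::IDENTITY, false);
}
```

The stroke branch deliberately accepts somewhat sub-optimal flattening under eccentric transforms in exchange for correct miters and round pens, as described above.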
shader/flatten.wgsl
Outdated
@@ -52,15 +52,15 @@ fn approx_parabola_inv_integral(x: f32) -> f32 {
return x * sqrt(1.0 - B + (B * B + 0.5 * x * x));
}

fn estimate_subdiv(p0: vec2f, p1: vec2f, p2: vec2f, sqrt_tol: f32) -> SubdivResult {
fn estimate_subdiv(p0: vec2f, p1: vec2f, p2: vec2f, sqrt_tol: f32, transform_scale: f32) -> SubdivResult {
I don't think this needs an additional `transform_scale` argument, and the scaling is subtly incorrect with it added.
The `val` return has this meaning: an optimal subdivision of the quadratic into `val` / 2 subdivisions (which may be fractional) will have an error of sqrt_tol^2 (i.e. the desired tolerance). There are two cases. In the non-cusp case, the error scales as the inverse square of `val` (doubling `val` causes the error to be one fourth), so the tolerance is actually not needed for the calculation (which is why it's applied in the caller). In the cusp case, this scaling breaks down and the tolerance parameter is needed to compute the correct result.
In both cases, it's better to handle the scaling in the caller; it's not helpful to do it here.
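Restating the non-cusp case in symbols (my paraphrase, not part of the original comment): if $E(n)$ is the flattening error of the quadratic at $n$ subdivisions, then

$$
E(n) \approx \mathrm{sqrt\_tol}^2 \left(\frac{\mathrm{val}/2}{n}\right)^{2},
$$

so $n = \mathrm{val}/2$ hits the target error $\mathrm{sqrt\_tol}^2$ and doubling $n$ quarters the error. The caller can therefore solve for whatever $n$ its own (possibly scaled) tolerance requires without `estimate_subdiv` knowing about it; only the cusp branch needs the tolerance internally.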
> I don't think this needs an additional `transform_scale` argument, and the scaling is subtly incorrect with it added.
>
> The `val` return has this meaning: an optimal subdivision of the quadratic into `val` / 2 subdivisions (which may be fractional) will have an error of sqrt_tol^2 (i.e. the desired tolerance). There are two cases. In the non-cusp case, the error scales as the inverse square of `val` (doubling `val` causes the error to be one fourth), so the tolerance is actually not needed for the calculation (which is why it's applied in the caller). In the cusp case, this scaling breaks down and the tolerance parameter is needed to compute the correct result.
Thank you, this is very illuminating as to how `val` gets computed (as I didn't understand before why the tolerance was only considered when the signs are different). I copied your comment into the code as it's a very useful read.
shader/flatten.wgsl
Outdated
@@ -139,7 +139,9 @@ fn flatten_cubic(cubic: Cubic) {
let Q_ACCURACY = ACCURACY * 0.1;
let REM_ACCURACY = ACCURACY - Q_ACCURACY;
let MAX_HYPOT2 = 432.0 * Q_ACCURACY * Q_ACCURACY;
var n_quads = max(u32(ceil(pow(err * (1.0 / MAX_HYPOT2), 1.0 / 6.0))), 1u);
let scale = vec2(length(transform.mat.xz), length(transform.mat.yw));
The correct math for this is 0.5 * (length(vec2(mat.x + mat.w, mat.y - mat.z)) + length(vec2(mat.x - mat.w, mat.y + mat.z))) (using a bit of shorthand). This is not substantially more complicated or inefficient than what you have, so I'd like to see the real thing.
To see the difference, consider the skew matrix [[1 0] [1 1]]. The correct answer is the golden mean ≈1.618, while the formula given yields sqrt(2) ≈1.414.
The formula given is accurate for uniform scaling and rotation, and also for nonuniform axis-aligned scaling, so it's not hard to see why the incorrect equation persists.
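As a quick check of the numbers quoted above (my own working, not part of the original comment), treating the skew matrix as having diagonal entries $1, 1$ and off-diagonal entries $0, 1$:

$$
M = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad
M^\mathsf{T} M = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
\sigma_{\max}(M) = \sqrt{\tfrac{3 + \sqrt{5}}{2}} = \tfrac{1 + \sqrt{5}}{2} \approx 1.618,
$$

which matches $\tfrac{1}{2}\bigl(\sqrt{(1+1)^2 + (0-1)^2} + \sqrt{(1-1)^2 + (0+1)^2}\bigr) = \tfrac{1}{2}(\sqrt{5} + 1)$ from the formula above. The column lengths of $M$ are $\sqrt{2}$ and $1$, which is where the $\sqrt{2} \approx 1.414$ from the per-axis-length estimate comes from.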
That makes sense. I had this initially written similarly to the correct version but the existing formula looked right under my test cases (as backed up by your explanation) and this slipped through the cracks. Fixed this now to do the correct thing.
shader/flatten.wgsl
Outdated
}
let params = estimate_subdiv(qp0, qp1, qp2, tol);
// TODO: Estimate an accurate subdivision count for strokes, handling cusps.
let tol = sqrt(REM_ACCURACY);
I believe the correct way to do scaling is:
let scaled_sqrt_tol = sqrt(REM_ACCURACY / scale_factor);
... apply fudge factor here, which will later be reverted...
let params = estimate_subdiv(qp0, qp1, qp2, scaled_sqrt_tol);
...
let n = max(u32(ceil(val * (0.5 / scaled_sqrt_tol))), 1u);
Daniel and Raph already looked at this, so I just did an additional sanity check and this all looks reasonable to me. Let's land it!
Thanks!
This PR implements the miter join style and adds additional test scenes for stroking under non-uniform scale and skew transforms. This includes some changes to transform handling in the `flatten` stage to keep stroking correct under these transforms:

- Flattening (including for fills, offset curves, caps, and joins) is now performed in a curve's local coordinate space (pre-transform) and the transform is applied at the time a line gets output (i.e. LineSoup time). This may be unnecessary for fills but the shader currently uses the same code for flattening fills and strokes.
- The line count estimate for subdivision uses local control point coordinates and makes use of the scale factor which is decomposed from the transform matrix.
- The bounding box computation is now based on the union of output line segments, which get accumulated at LineSoup time.
This PR is based on #408