
KAFKA-17792: header parsing times out and uses large quantities of memory if the string looks like a number #17510

Open
wants to merge 7 commits into base: trunk
Conversation

msillence

Jira KAFKA-17792

We have trace headers such as:

"X-B3-SpanId": "74320e6e26adc8f8"

If, however, the value happens to be "407127e212797209", it is treated as a numeric value: the parser tries to convert it to an exact numeric representation using BigDecimal.
Specifically, the calls to setScale with floor and ceiling rounding seem to cause the problem.
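To illustrate why such ids slip through (a standalone sketch, not Kafka code): a hex string whose only letter is a single 'e' is valid scientific notation, so any schema-inferring parser accepts it as a number, while ids containing other hex letters fail fast.

```java
import java.math.BigDecimal;

public class SpanIdLookalike {
    public static void main(String[] args) {
        // All-digit hex with a single 'e' parses as scientific notation:
        // 407127e212797209 means 407127 * 10^212797209.
        BigDecimal lookalike = new BigDecimal("407127e212797209");
        System.out.println(lookalike.precision()); // 6 significant digits

        // A span id containing other letters fails fast instead:
        try {
            new BigDecimal("74320e6e26adc8f8");
        } catch (NumberFormatException e) {
            System.out.println("not a number");
        }
    }
}
```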

Testing the behaviour of BigInteger and the setScale calls shows there are already limits on the possible number range; sadly, my particular number falls in a range that takes about 25 minutes and 4 GB of memory rather than failing quickly:

1e+1 time 0.0 totalMemory 532676608
1e+10 time 0.0 totalMemory 532676608
1e+100 time 0.0 totalMemory 532676608
1e+1000 time 0.0 totalMemory 532676608
1e+10000 time 0.005 totalMemory 532676608
1e+100000 time 0.035 totalMemory 532676608
1e+1000000 time 0.228 totalMemory 532676608
1e+10000000 time 4.308 totalMemory 926941184
1e+100000000 time 117.119 totalMemory 3221225472
1e+1000000000 time 0.0 totalMemory 3221225472 BigInteger would overflow supported range
1e+10000000000 time 0.001 totalMemory 3221225472 Too many nonzero exponent digits.
1e+100000000000 time 0.0 totalMemory 3221225472 Too many nonzero exponent digits.
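The blow-up in the table can be reproduced outside Kafka with a few lines (a minimal sketch; timings vary by machine): setScale(0, RoundingMode.FLOOR) must materialise every digit of the expanded value, so cost grows with the exponent.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class SetScaleCost {
    // Returns the seconds spent rounding the value to scale 0.
    static double secondsToFloor(String literal) {
        BigDecimal bd = new BigDecimal(literal);
        long start = System.nanoTime();
        bd.setScale(0, RoundingMode.FLOOR);
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) {
        // Exponents kept modest so this finishes quickly; per the table
        // above, 1e+100000000 is where the minutes/GB behaviour appears.
        for (String literal : new String[] {"1e+10", "1e+10000", "1e+1000000"}) {
            System.out.printf("%s time %.3f%n", literal, secondsToFloor(literal));
        }
    }
}
```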

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@github-actions github-actions bot added streams core Kafka Broker tools connect build Gradle build or GitHub Actions clients labels Oct 15, 2024
@mimaison
Member

Thanks for the PR. Please make sure it only contains changes related to the issue this is addressing. Currently I see a lot of other changes including some non-Apache Kafka files.

@msillence
Author

Thanks for the PR. Please make sure it only contains changes related to the issue this is addressing. Currently I see a lot of other changes including some non-Apache Kafka files.

I'm not quite sure what I've done wrong. I cloned the repo, applied my change again, and force-pushed it to my fork, and it still looks wrong.

The change is here msillence@3720951

@msillence
Author

Oh, now I see the problem: I've mixed up repos. This change is against github.com:confluentinc/kafka.

@github-actions github-actions bot added the small Small PRs label Oct 16, 2024
@msillence
Author

Actually, there is a simpler solution: https://stackoverflow.com/a/12748321

@msillence msillence force-pushed the KAFKA-17792 branch 2 times, most recently from f8af0e1 to 4b4a8f1 Compare October 17, 2024 19:27
@msillence
Author

I feel like the correct fix is the Stack Overflow test:
bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0
Sadly, there are quite a few tests that expect the perhaps-incorrect behaviour: Float.MAX_VALUE, which is 3.4028235E38, looks like an integer to me, but the ceil/floor calls failed due to the number of zeros, so parsing fell through to another code path and returned a float.
To further confound the issue, very small numbers such as 1e-100000000 will, with the new integer test, fall through to the float conversion, which only checks the upper range and will simply convert them to zero, since float doesn't have the precision. Fixing that breaks yet more tests.
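The trade-off described above can be seen directly (a standalone sketch of the Stack Overflow check, not the PR code):

```java
import java.math.BigDecimal;

public class IntegerCheck {
    // The Stack Overflow test: true when the value has no fractional
    // part, computed without expanding huge exponents.
    static boolean isIntegerValued(BigDecimal bd) {
        return bd.signum() == 0
                || bd.scale() <= 0
                || bd.stripTrailingZeros().scale() <= 0;
    }

    public static void main(String[] args) {
        // Float.MAX_VALUE is mathematically an integer, so the check says
        // true, even though the existing tests expect FLOAT32 parsing.
        System.out.println(isIntegerValued(new BigDecimal("3.4028235E38"))); // true
        // Tiny values are rejected instantly, with no setScale blow-up...
        System.out.println(isIntegerValued(new BigDecimal("1e-100000000"))); // false
        // ...though converting them to float would underflow to zero.
        System.out.println(isIntegerValued(new BigDecimal("66000.0")));      // true
    }
}
```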

@mimaison mimaison requested a review from gharris1727 October 22, 2024 13:01
Contributor

@gharris1727 gharris1727 left a comment


@msillence Thanks for the PR!

I did not know that setScale with very large numbers was so costly. I see that the old logic (before #15469) used RoundingMode.ROUND_UNNECESSARY, which probably had some guards in place to avoid consuming resources for an impossible operation.

sadly there are quite a few tests that expect the perhaps incorrect behavior where Float.MAX_VALUE which is 3.4028235E38 looks like a integer to me but the ceil/floor calls failed due to the number of zeros so it fell though to another part of the parsing and returned a float.

That number is indeed an integer, but it's too large to fit in an INT64, which has ~19 decimal digits of precision. The tests are correct that this should be parsed as a FLOAT32.

While I understand that having mostly STRINGs with an occasional number-look-alike makes it natural to suggest changing this into a STRING, I don't think that's backwards compatible or safe. Another user could currently have a flow which parses all Decimals, and the current implementation would have some of their decimals fall back to STRING.

To avoid accidental Decimal parsing, I would recommend either an alternative HeaderConverter implementation, or avoiding using IDs that can be confused for numbers, such as with an alphabetic prefix. Even with this patch, you will still get some accidental Decimal, Float, or Double parsing when the e lands in some part of the value.

@@ -1041,6 +1043,10 @@ private static SchemaAndValue parseAsNumber(String token) {
}

private static SchemaAndValue parseAsExactDecimal(BigDecimal decimal) {
BigDecimal abs = decimal.abs();
if (abs.compareTo(TOO_BIG) > 0 || (abs.compareTo(TOO_SMALL) < 0 && BigDecimal.ZERO.compareTo(abs) != 0)) {
throw new NumberFormatException("outside efficient parsing range");
Contributor


At this stage in parsing, we've successfully parsed a BigDecimal. This method tries to losslessly cast the BigDecimal to a primitive integer type. If the cast would be lossy, it returns null and lets the caller figure out whether it's a float32, float64, or has to be an arbitrary precision decimal.

Rather than throwing NumberFormatException and making this a STRING, we should return null, and let the other fallback logic happen, so this shows up as an arbitrary precision Decimal.

I also suspect that the comparison could be a little tighter in that case: TOO_BIG could be LONG_MAX and TOO_SMALL could be ONE, eliminating the setScale call for values like 1e20 which cannot possibly fit in a LONG.
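The tighter bound suggested here might look like the following (hypothetical names; a sketch of the suggestion, not the actual patch):

```java
import java.math.BigDecimal;

public class ExactDecimalGuard {
    static final BigDecimal LONG_MAX = BigDecimal.valueOf(Long.MAX_VALUE);

    // Cheap pre-check before any setScale probe: values larger than
    // Long.MAX_VALUE or strictly between 0 and 1 can never be a lossless
    // primitive integer, so the caller can fall back to float/decimal
    // handling immediately. BigDecimal.compareTo compares adjusted
    // exponents first, so it stays fast even for 1e+100000000.
    static boolean mightBeExactLong(BigDecimal decimal) {
        BigDecimal abs = decimal.abs();
        if (abs.compareTo(LONG_MAX) > 0) {
            return false; // e.g. 1e20: cannot fit in a long
        }
        if (abs.signum() != 0 && abs.compareTo(BigDecimal.ONE) < 0) {
            return false; // e.g. 1e-100000000: has a fractional part
        }
        return true;
    }
}
```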

@mjsax mjsax removed streams core Kafka Broker tools build Gradle build or GitHub Actions clients labels Nov 2, 2024
@showuon
Member

showuon commented Jan 21, 2025

@gharris1727 , do you have any further comments for this?

Contributor

@gharris1727 gharris1727 left a comment


@msillence Thanks for changing this to fallback to the Float/Double/Decimal logic! We're interested in getting this into the upcoming release, so please let us know if you're able to update the PR soon.

@showuon This PR is currently not mergeable and would cause build/test failures. But the changes needed to get it to mergeable seem minor.

Comment on lines 74 to 75
private static BigDecimal TOO_BIG = new BigDecimal("1e1000000");
private static BigDecimal TOO_SMALL = new BigDecimal("1e-1000000");
Contributor


These constants are unused.

public void avoidCpuAndMemoryIssuesConvertingExtremeBigDecimals() {
long s = System.currentTimeMillis();
String veryBig = "1e+100000000"; // new BigDecimal().setScale(0, RoundingMode.FLOOR) takes around two minutes and uses 3GB;
System.out.println("time taken " + (System.currentTimeMillis() - s));
Contributor


I think we should remove this debug print, and have a Timeout annotation on this method to cause fails on regressions.


String verySmall = "1e-100000000";
assertEquals(new SchemaAndValue(Schema.STRING_SCHEMA, verySmall), Values.parseString(verySmall));
assertFalse(true);
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always fail.


private static SchemaAndValue parseAsExactDecimal(BigDecimal decimal) {
BigDecimal abs = decimal.abs();
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value is unused.


I think I've addressed this now

@@ -1039,30 +1041,34 @@ private static SchemaAndValue parseAsNumber(String token) {
return new SchemaAndValue(schema, decimal);
}
}

public static boolean isBigInteger(BigDecimal bd) {
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses tabs, which is disallowed by our linter.

long s = System.currentTimeMillis();
String veryBig = "1e+100000000"; // new BigDecimal().setScale(0, RoundingMode.FLOOR) takes around two minutes and uses 3GB;
System.out.println("time taken " + (System.currentTimeMillis() - s));
assertEquals(new SchemaAndValue(Schema.BYTES_SCHEMA, veryBig), Values.parseString(veryBig));
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a plain BYTES schema, it should be a connect Decimal type. And veryBig needs to be parsed into a BigDecimal.


String verySmall = "1e-100000000";
assertEquals(new SchemaAndValue(Schema.STRING_SCHEMA, verySmall), Values.parseString(verySmall));
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually gets parsed as a FLOAT32. Arguably that's a bug and we're losing precision, but it's the legacy behavior and we shouldn't change it, or at least have a separate change for this. For now, we should just update this test to match the behavior.
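For reference, the silent precision loss mentioned here is plain Java float underflow (a minimal standalone demonstration):

```java
public class FloatUnderflow {
    public static void main(String[] args) {
        // Float.parseFloat does not throw on underflow; the value is
        // simply rounded to 0.0f, discarding the magnitude entirely.
        float f = Float.parseFloat("1e-100000000");
        System.out.println(f == 0.0f); // true
    }
}
```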

@@ -45,6 +41,8 @@
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.junit.jupiter.api.Assertions.fail;
import org.junit.jupiter.api.Test;
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you run into problems with the imports, run ./gradlew spotlessApply

@@ -1039,30 +1041,34 @@ private static SchemaAndValue parseAsNumber(String token) {
return new SchemaAndValue(schema, decimal);
}
}

public static boolean isBigInteger(BigDecimal bd) {
return bd.signum() == 0 || bd.scale() <= 0 || bd.stripTrailingZeros().scale() <= 0;
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stripTrailingZeros covers the cases that the ceil/floor comparison was supposed to cover, great find!

@@ -1154,7 +1152,17 @@ public void shouldParseFractionalPartsAsIntegerWhenNoFractionalPart() {
assertEquals(new SchemaAndValue(Schema.INT32_SCHEMA, 66000), Values.parseString("66000.0"));
assertEquals(new SchemaAndValue(Schema.FLOAT32_SCHEMA, 66000.0008f), Values.parseString("66000.0008"));
}
@Test
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing a newline above this test.

@sillencem

Sorry, I'd been busy with other things. I took another look and I think I've worked out the issues with the long durations now, and eliminated all the ceil/floor calls.

@sillencem

I rebased and the build is now clean
