Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

don't tolerate wrong te headers #3260

Merged
merged 1 commit into from
Aug 6, 2024
Merged

don't tolerate wrong te headers #3260

merged 1 commit into from
Aug 6, 2024

Conversation

benoitc
Copy link
Owner

@benoitc benoitc commented Aug 6, 2024

Just follow the new specification here and accept to introduce a breaking change. Also support multiple encoding on same line.

@benoitc
Copy link
Owner Author

benoitc commented Aug 6, 2024

cc @pajod @tilgovi

@benoitc benoitc changed the title don't tolerate wrong te heade don't tolerate wrong te headers Aug 6, 2024
changes:

- Just follow the new TE specification (https://datatracker.ietf.org/doc/html/rfc9112#name-transfer-encoding)
 here and accept to introduce a breaking change.
- gandle multiple TE on one line

** breaking changes ** : invalid  headers and position will now return
an error.
@benoitc benoitc merged commit ff2109e into master Aug 6, 2024
46 checks passed
if chunked:
raise InvalidHeader("TRANSFER-ENCODING", req=self)
self.force_close()
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now empty headers (or leading/trailing comma) are still reported as Unsupported, while you deliberately changed the other cases where the spec sanctions our outright refusal to Invalid. Maybe worth adding back the extra branch to avoid the (not wrong, but also not immediately clear) Unsupported transfer coding: "" message?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean we need to support empty value for Transfer-Encoding?

Ie. Put back this:

              elif value.lower() == "":
                     # lacking security review on this case
                     # offer the option to restore previous behaviour, but refuse by default, for now
                     self.force_close()

Shouldn't we rather handle a special parsing for trailers instead ? Though why an empty value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not suggest a change in what we accept and what we refuse (here). I merely suggest revisiting the choice of Exception.

Some cases could be (theoretically) valid per the spec, yet unsupported by us. Others are Invalid. I reckon the first instance of encountering an empty val is always a case of Invalid, as the rfc9112 ABNF for Transfer-Encoding does not allow for empty headers (or consecutive commas outside quoting). So that branch (otherwise processed in else) should be added back. Not changing the fact that we refuse those. Merely swapping the exception.

@benoitc
Copy link
Owner Author

benoitc commented Aug 7, 2024

we will ditch compatibility. It's a major version. Either we follow the spec or not imo. I don't think it will create a major issue especially if people uses a proxy in front of gunicorn , which is encouraged. I prefer the simplicity there/. We need to be clear about the change in the changelog about it.

For "", it's still handled since when you split by "," it can then either be supported (chuunked, identity, deflate, ... or not. Do you mean we need to handle the return diffreently for this case? To which part of the spec does it refers? We can add back this case if needed .

if not self.cfg.tolerate_dangerous_framing:
# T-E can be a list
# https://datatracker.ietf.org/doc/html/rfc9112#name-transfer-encoding
vals = [v.strip() for v in value.split(',')]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was wrong earlier. This is not immediately incorrect (as everything containing quotes is rejected below). Might send a patch later to re-verbosify the comments warning about the complex nature of the list.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pajod o for me. I think it would be goo dif we can make a release later this week :)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what complex nature of the list though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benoitc we split the list in unexpected places, because of the comma in the quoted-string inside transfer-parameter inside transfer-coding, I suggested comments in #3273

# safe option: nuke it, its never needed
if chunked:
raise InvalidHeader("TRANSFER-ENCODING", req=self)
elif val.lower() in ('compress', 'deflate', 'gzip'):
Copy link
Contributor

@pajod pajod Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still do not see how this leads to anything but trouble. We are reading a list of encodings, handle one (even though they all are, as hop-by-hop headers our responsibility only), act like the other does not matter.. then leave it up to the app to figure out how to interpret the body. How could an application possibly correctly deal with this behavior?

WSGI servers must handle any supported inbound “hop-by-hop” headers on their own

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants