-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
don't tolerate wrong te headers #3260
Conversation
changes: - Just follow the new TE specification (https://datatracker.ietf.org/doc/html/rfc9112#name-transfer-encoding) here and accept to introduce a breaking change. - gandle multiple TE on one line ** breaking changes ** : invalid headers and position will now return an error.
if chunked: | ||
raise InvalidHeader("TRANSFER-ENCODING", req=self) | ||
self.force_close() | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now empty headers (or leading/trailing comma) are still reported as Unsupported, while you deliberately changed the other cases where the spec sanctions our outright refusal to Invalid. Maybe worth adding back the extra branch to avoid the (not wrong, but also not immediately clear) Unsupported transfer coding: ""
message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean we need to support empty value for Transfer-Encoding?
Ie. Put back this:
elif value.lower() == "":
# lacking security review on this case
# offer the option to restore previous behaviour, but refuse by default, for now
self.force_close()
Shouldn't we rather handle a special parsing for trailers instead ? Though why an empty value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not suggest a change in what we accept and what we refuse (here). I merely suggest revisiting the choice of Exception.
Some cases could be (theoretically) valid per the spec, yet unsupported by us. Others are Invalid. I reckon the first instance of encountering an empty val is always a case of Invalid, as the rfc9112 ABNF for Transfer-Encoding does not allow for empty headers (or consecutive commas outside quoting). So that branch (otherwise processed in else
) should be added back. Not changing the fact that we refuse those. Merely swapping the exception.
we will ditch compatibility. It's a major version. Either we follow the spec or not imo. I don't think it will create a major issue especially if people uses a proxy in front of gunicorn , which is encouraged. I prefer the simplicity there/. We need to be clear about the change in the changelog about it. For "", it's still handled since when you split by "," it can then either be supported (chuunked, identity, deflate, ... or not. Do you mean we need to handle the return diffreently for this case? To which part of the spec does it refers? We can add back this case if needed . |
if not self.cfg.tolerate_dangerous_framing: | ||
# T-E can be a list | ||
# https://datatracker.ietf.org/doc/html/rfc9112#name-transfer-encoding | ||
vals = [v.strip() for v in value.split(',')] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I was wrong earlier. This is not immediately incorrect (as everything containing quotes is rejected below). Might send a patch later to re-verbosify the comments warning about the complex nature of the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pajod o for me. I think it would be goo dif we can make a release later this week :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what complex nature of the list though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# safe option: nuke it, its never needed | ||
if chunked: | ||
raise InvalidHeader("TRANSFER-ENCODING", req=self) | ||
elif val.lower() in ('compress', 'deflate', 'gzip'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still do not see how this leads to anything but trouble. We are reading a list of encodings, handle one (even though they all are, as hop-by-hop headers our responsibility only), act like the other does not matter.. then leave it up to the app to figure out how to interpret the body. How could an application possibly correctly deal with this behavior?
- https://peps.python.org/pep-3333/#other-http-features
- https://datatracker.ietf.org/doc/html/rfc9110#section-7.6.1
WSGI servers must handle any supported inbound “hop-by-hop” headers on their own
Just follow the new specification here and accept to introduce a breaking change. Also support multiple encoding on same line.