org.eclipse.jetty.http converts incoming content type "application/json; charset=utf-8" to uppercase charset=UTF-8 #12267

Open
gjoshi86 opened this issue Sep 12, 2024 · 8 comments

@gjoshi86

Jetty 9.4.50.v20221201

OpenJDK 8u292 (1.8.0_292-b10)

When a client sends a POST request with Content-Type "application/json; charset=utf-8", the request reaches our application, which uses Jetty 9.4.50.v20221201, and Jetty converts the header value to "application/json; charset=UTF-8", with the charset in uppercase.

I debugged the jetty-http project and found that the org.eclipse.jetty.http.HttpParser class has a CACHE field. While parsing Content-Type, it uses the getBest() method to find the best match and returns charset=UTF-8 in uppercase.

I know I am using an older version of Jetty that has reached end of support. I just need your input on the following questions.

  1. Why does it return uppercase UTF-8 even though the client sent lowercase utf-8?
  2. What are the implications of setting org.eclipse.jetty.http.HttpParser.STRICT to true, which is compliance mode = LEGACY?
  3. Is there any other way to get UTF-8 in the same format in which the client sent it in the request?

@olamy
Member

olamy commented Sep 12, 2024

Hi,
Just to let you know:
Jetty 9.x is EOL, see #7958.
Jetty 10/11 is EOL as well, see #10485.

Can you try to reproduce your issue with Jetty 12?

For commercial support of Jetty, see above listed issues.

@gregw
Contributor

gregw commented Sep 13, 2024

I think you have answered your own question. It is a case-insensitive cache of common header values. There are compliance modes that you can use to bypass the cache and keep the case. But you should not need to, as charsets should be case insensitive.

Note there are fine-grained compliance mode controls, so you don't need to go all the way to full Legacy mode.

That's about all we can say for an end-of-life release.
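
To illustrate why the case change should be harmless: in Java, charset names are resolved case-insensitively, so an application that compares parsed charsets rather than raw header strings sees no difference between the two spellings. A minimal sketch using only the JDK; the class name `CharsetCaseDemo` and the helper `resolve` are made up for illustration and are not part of Jetty.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCaseDemo {
    // Resolve a charset label taken from a header; Charset.forName ignores case.
    static Charset resolve(String label) {
        return Charset.forName(label.trim());
    }

    public static void main(String[] args) {
        Charset lower = resolve("utf-8");
        Charset upper = resolve("UTF-8");
        System.out.println(lower.equals(upper));                  // true
        System.out.println(lower.equals(StandardCharsets.UTF_8)); // true
    }
}
```

Comparing `Charset` objects (or using `String.equalsIgnoreCase` on the label) instead of exact header strings would make the application indifferent to which spelling the parser hands back.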

@joakime
Contributor

joakime commented Sep 13, 2024

Also note that the MIME type application/json has no charset parameter, and using a charset on it has no meaning.
It is always UTF-8, 100% of the time, in all cases.
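
One way to act on this in application code is to look only at the media type and always decode JSON bodies as UTF-8, ignoring whatever `charset` parameter (and whatever case) arrives. A minimal sketch, assuming the application has the raw header value and the body bytes at hand; `JsonBodyDecoder`, `mediaType`, and `decodeJson` are hypothetical names, not Jetty API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class JsonBodyDecoder {
    // Return the media type without parameters, lowercased, e.g. "application/json".
    static String mediaType(String contentType) {
        int semi = contentType.indexOf(';');
        String type = semi < 0 ? contentType : contentType.substring(0, semi);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Decode a JSON body as UTF-8 regardless of any charset parameter.
    static String decodeJson(String contentType, byte[] body) {
        if (!"application/json".equals(mediaType(contentType))) {
            throw new IllegalArgumentException("not JSON: " + contentType);
        }
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] body = "{\"ok\":true}".getBytes(StandardCharsets.UTF_8);
        // Both spellings of the charset parameter decode identically.
        System.out.println(decodeJson("application/json; charset=utf-8", body));
        System.out.println(decodeJson("application/json; charset=UTF-8", body));
    }
}
```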

@gjoshi86
Author

@gregw @joakime Thank you for your response! This is helpful.

I have a couple of questions before I close this ticket.
1. I just need confirmation that the CACHE implementation in org.eclipse.jetty.http.HttpParser is a performance optimization. Is that right?
2. Regarding "Note there are fine grained compliance mode controls, so you don't need to go all the way to full Legacy mode": I tried different compliance modes such as RFC7230 and RFC2616, but it works only with the LEGACY compliance mode. I think the property (org.eclipse.jetty.http.HttpParser.STRICT = true) kicks in only with the LEGACY compliance mode. Also, is it possible to set LEGACY mode only for a specific header such as Content-Type?

@gregw
Contributor

gregw commented Sep 23, 2024

@gjoshi86 The cache is indeed an optimization to avoid many copies of the same string being created and also to allow fast lookup of the actual semantics.
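
The effect of such a cache can be pictured with a toy version: lookups ignore case, and a hit returns the single canonical string stored in the cache, which is why `charset=utf-8` comes back as `charset=UTF-8`. This is only a sketch built on a JDK `TreeMap`; it is not Jetty's implementation, and the class name `HeaderValueCache` and the cached entry are illustrative assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

public class HeaderValueCache {
    // Keys compare case-insensitively; values hold the canonical spelling.
    private final Map<String, String> cache = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    void add(String canonical) {
        cache.put(canonical, canonical);
    }

    // On a hit, return the shared canonical instance; on a miss, keep the value as sent.
    String lookup(String received) {
        return cache.getOrDefault(received, received);
    }

    public static void main(String[] args) {
        HeaderValueCache c = new HeaderValueCache();
        c.add("application/json; charset=UTF-8");
        // Hit: the cached spelling replaces the one the client sent.
        System.out.println(c.lookup("application/json; charset=utf-8"));
        // Miss: an uncached value passes through unchanged.
        System.out.println(c.lookup("text/x-custom; charset=utf-8"));
    }
}
```

Returning the cached instance saves allocating a fresh string per request, at the cost of losing the sender's original case for cached values.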

For fine-grained compliance in jetty-9, you will need to use one of the CUSTOM modes configured with a system property. See the HttpCompliance class for more detail.

@seemasjoshi

@gregw We have a similar situation where we need to support LEGACY mode only for HttpComplianceSection.CASE_INSENSITIVE_FIELD_VALUE_CACHE. How can we use the CUSTOM mode to support this? If possible, please share an example.

Also, just for my understanding, could you please share the reason for choosing uppercase rather than lowercase to store content types in the cache? I have observed that most older APIs use lowercase for content types, so I am looking for the reason, if any.

@joakime
Contributor

joakime commented Oct 9, 2024

@seemasjoshi

Thank you! I will try these examples.

It would be helpful if you could also share the reasoning behind the design choice of storing uppercase values in the cache instead of lowercase. This will help us communicate the change to our customers and ensure we align with best practices.
