Encoding issue throwing exception on 1.15.3 #1873

Closed
dorfri opened this issue Dec 26, 2022 · 1 comment

dorfri commented Dec 26, 2022

Also posted on Stack Overflow: https://stackoverflow.com/questions/74917912/url-encoding-in-jsoup-not-working-properly

After upgrading from 1.11.3 to 1.15.3, I started getting a MalformedURLException when fetching URLs that contain characters which need encoding, for example: https://im-creation-assets.s3-us-west-2.amazonaws.com/CelebrityCars[DE]/20221208JuliaRobertsCarJvo/juliayoung-1___native_1200-627.jpg
The '[' and ']' in the URL trigger it.

The exception comes from org.jsoup.helper.CookieUtil#asUri, and was added somewhere between those versions.
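
The failure mode can be reproduced with the JDK alone. The sketch below is not jsoup's code; it assumes the helper converts a java.net.URL to a java.net.URI, and it uses example.com as a placeholder host. java.net.URL accepts raw brackets in the path, while the stricter URI parser rejects them.

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

public class BracketUrlRepro {
    // Returns true when the URL string survives a URL -> URI conversion.
    public static boolean convertsToUri(String rawUrl) throws MalformedURLException {
        URL url = new URL(rawUrl); // URL does not validate characters in the path
        try {
            url.toURI();           // URI parsing follows RFC 2396 and is stricter
            return true;
        } catch (URISyntaxException e) {
            return false;          // raw '[' and ']' are illegal in a URI path
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder URLs: one with raw brackets, one with them percent-encoded.
        System.out.println(convertsToUri("https://example.com/CelebrityCars[DE]/photo.jpg"));
        System.out.println(convertsToUri("https://example.com/CelebrityCars%5BDE%5D/photo.jpg"));
    }
}
```

Percent-encoding the brackets as %5B and %5D is enough for the conversion to succeed.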

I see that your code tries to encode in org.jsoup.helper.HttpConnection#encodeUrl, but that encoding does not handle this URL (and many others).
I can encode the URL myself before calling org.jsoup.Jsoup#connect, but if a redirect leads to such a URL, I hit the same error again.

We do the encoding, by the way, with Spring Framework, and it works pretty well; something like:

import org.springframework.web.util.UriComponentsBuilder;

// Parse the raw URL, percent-encode each component, and render it back to a string.
final String encodedUrl = UriComponentsBuilder
        .fromUriString(url)
        .build()
        .encode()
        .toUri()
        .toString();
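
For readers without Spring on the classpath, a similar pre-encoding step can be sketched with the JDK alone. This is an illustrative workaround, not what jsoup does internally; the class and method names are hypothetical and example.com is a placeholder. It relies on the multi-argument java.net.URI constructor, which quotes illegal characters in each component. One caveat: that constructor always quotes '%', so an input that is already percent-encoded would be double-encoded.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class EncodeBeforeConnect {
    // Hypothetical helper: percent-encodes illegal characters in a raw URL string.
    public static String encode(String rawUrl) throws MalformedURLException, URISyntaxException {
        URL url = new URL(rawUrl); // lenient parse, splits the string into components
        // The multi-argument constructor quotes characters that are illegal
        // in each component (for example '[' and ']' in the path).
        URI uri = new URI(url.getProtocol(), url.getAuthority(),
                url.getPath(), url.getQuery(), url.getRef());
        return uri.toASCIIString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(encode("https://example.com/CelebrityCars[DE]/photo.jpg"));
        // prints https://example.com/CelebrityCars%5BDE%5D/photo.jpg
    }
}
```

The encoded string can then be passed to org.jsoup.Jsoup#connect, although this does not help when the server redirects to an unencoded URL.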

(I know I can disable redirects, follow them myself, and encode every URL, but maybe this encoding issue can be fixed for everyone.)

Thanks!

jhy self-assigned this on Jan 5, 2023
jhy added the "bug" label (Confirmed bug that we should fix) on Jan 5, 2023
jhy added this to the 1.15.4 milestone on Jan 5, 2023
jhy added the "fixed" label on Jan 5, 2023
jhy (Owner) commented Jan 5, 2023

Thanks for the report! Fixed with 45ed002, will be in the next release.
