Should "+" in query parameters be parsed as space? #32

Closed
EvenAR opened this issue Jul 2, 2019 · 11 comments
Labels
breaking (would require a MAJOR release), request

Comments

@EvenAR

EvenAR commented Jul 2, 2019

Our front page has a standard HTML search form. If the user types a search string, e.g. "hello world", and submits the form, the user is taken to /search?query=hello+world, where the Elm application is located. When parsing the query parameter I expect the output to be "hello world"; however, the actual output is "hello+world". Is this intentional, or is this something that should be fixed? I haven't found specific documentation on this, but it seems to be common practice to handle + in query parameters as space.


Example (21 Mar 2021):

let
    queryParser = 
        Url.Parser.s "search" <?> Url.Parser.Query.string "q"
in
Url.fromString "https://www.example.com/search?q=how+much+is+1%2B1%3F"
    |> Maybe.andThen (Url.Parser.parse queryParser)

-- Result:    Just (Just "how+much+is+1+1?")
-- Expected:  Just (Just "how much is 1+1?")

@rlefevre
Member

rlefevre commented Aug 2, 2019

In the query part, it exactly means a space:
https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1

jamieklassen pushed a commit to concourse/concourse that referenced this issue Oct 22, 2019
work around elm/url#32.

#4313

Signed-off-by: Jamie Klassen <[email protected]>
@malaire

malaire commented Jun 30, 2020

> In the query part, it exactly means a space:
> https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1

That refers to forms, not to URLs. The parsing section of the URL Standard does not mention that + should be handled specially in the query.

And if you look at URL equivalence, it's clear that + and space are not equivalent and must be handled as being different.

So this issue should be closed.

@emarthinsen

Standard or no standard, if you create a vanilla HTML form and use it to submit some text (via GET) with a space in it, then you'll get a + in your URL. If you go to Google and enter a search string that has a space, you'll also see the +. We can debate whether the standard allows a plus or whether it should be handled differently, but the default behavior of the most popular browsers is to encode a space as a plus in a querystring. It would be nice if the Url library took that into account and parsed querystrings accordingly. I think the main reason it does not is that the underlying JS function that powers this doesn't decode a + into a space, but this feels like a good opportunity to add this feature and create something more useful and consistent.

@malaire

malaire commented Jul 1, 2020

When a form is submitted with the built-in <form action="...">, a space is encoded as +.

But that is just one way to use query strings, not the only one. Query strings can also be used for other things that have nothing to do with submitting forms, and in those cases the URL Standard says that space is not equivalent to +.

Forms can also be submitted without using <form action="...">, and in that case space is not equivalent to + either.
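
For illustration, the non-form case can be seen in elm/url itself. As far as I understand it, Url.Builder percent-encodes values rather than form-encoding them, so a space is written as %20 and a literal + as %2B. A minimal sketch (the module name and the searchQuery helper are made up for this example):

```elm
module QueryEncoding exposing (searchQuery)

import Url.Builder


{-| Build a query string the way elm/url does outside of form submission.
Values are percent-encoded, so a space is written as %20, never as "+".
-}
searchQuery : String -> String
searchQuery term =
    Url.Builder.toQuery [ Url.Builder.string "q" term ]
```

Here searchQuery "hello world" should produce "?q=hello%20world", which is the encoding that Url.Parser.Query already decodes as expected.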

@malaire

malaire commented Jul 1, 2020

> We can debate whether the standard allows a plus or whether it should be handled differently, but the default behavior of the most popular browsers is to encode a space as a plus in a querystring.

COMPLETELY WRONG. There isn't a single browser that encodes a space as a plus in a query string when not submitting a form. Also, the standard does not allow that; it specifically forbids it when not submitting a form.

evancz added the "breaking" (would require a MAJOR release) and "request" labels on Feb 9, 2021
@CSDUMMI

CSDUMMI commented Mar 19, 2021

How is it problematic or breaking?
I mean, it is not very hard to write a program that implements compatibility
with this behavior.

toForm : String -> String
toForm query = String.replace "+" " " query

@EvenAR
Author

EvenAR commented Mar 21, 2021

> I mean, it is not very hard to write a program that implements compatibility
> with this behavior.
>
> toForm : String -> String
> toForm query = String.replace "+" " " query

Note that this should be done before Url.Parser.parse, when the query string hasn't been "URL-decoded" yet. Otherwise you will also replace "+" signs that are supposed to be there.

Url.Parser.parse queryParser { url | query = Maybe.map (String.replace "+" "%20") url.query }
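
A fuller sketch of this workaround, assuming the URL is known to come from a form submission (or another producer that writes spaces as +). The module name and the helpers plusAsSpace and parseSearch are hypothetical names for this example; the parser is the one from the example at the top of the issue.

```elm
module PlusAsSpace exposing (parseSearch, plusAsSpace)

import Url exposing (Url)
import Url.Parser exposing ((<?>))
import Url.Parser.Query


{-| Rewrite every raw "+" in the query string to "%20".
This must run before parsing, while the query is still percent-encoded,
so that an encoded "%2B" is left untouched.
-}
plusAsSpace : Url -> Url
plusAsSpace url =
    { url | query = Maybe.map (String.replace "+" "%20") url.query }


{-| Parse /search?q=... and treat "+" in the query as a space. -}
parseSearch : Url -> Maybe (Maybe String)
parseSearch url =
    Url.Parser.parse
        (Url.Parser.s "search" <?> Url.Parser.Query.string "q")
        (plusAsSpace url)
```

Because the replacement happens on the raw query string, a query such as q=how+much+is+1%2B1%3F should come out as "how much is 1+1?": the + signs become spaces, while %2B still decodes to a literal +.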

@CSDUMMI

CSDUMMI commented Mar 23, 2021

Nonetheless, this does not seem to be an issue with elm/url.

@CSDUMMI

CSDUMMI commented Apr 1, 2021

@EvenAR is this still an issue?
Do you have the problem still?

@EvenAR
Author

EvenAR commented Apr 1, 2021

I think this issue can be closed. If this way of encoding spaces is specific to HTML forms and not part of standard URL encoding, I agree this is not an issue with elm/url. It's not too hard to handle it explicitly when needed.

EvenAR closed this as completed on Apr 7, 2021
@maxime-didier

maxime-didier commented Jul 15, 2022

This issue should be reopened.

The browser is not the only software that implements the URL spec rather than the raw URI RFC. In my case, the URLs that point to my Elm app are generated by a Java web server using the standard javax.ws.rs.core.UriBuilder class.

I also cannot correctly work around this on the Elm side since:

  • I do not have access to the original URL String or JS URL object; Browser.application only gives me the Url.Url Elm value, which only contains fully decoded values.
  • Since I only have access to the value as decoded by this package, I cannot distinguish bar+baz from bar%2Bbaz in a query string, which should be decoded as bar baz and bar+baz respectively.
