Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numeric array key handling #37

Open
benagricola opened this issue Apr 19, 2012 · 6 comments
Open

Numeric array key handling #37

benagricola opened this issue Apr 19, 2012 · 6 comments

Comments

@benagricola
Copy link

Noticed an interesting issue with trying to parse a query string from Facebook - when an app is added to a Facebook tab, Facebook redirects the user to a url you give, and appends something like:

?tabs_added[209631315815975]=1&tabs_added[2096313158545932]=0

And... well this simply doesn't parse - the output is logged as { tabs_added: [ ] }

Have looked into why, and of course it's to do with the way integer keys are handled by querystring.

A querystring of ?foo[10]=1&foo[20]=0 will produce
{ foo: [ , , , , , , , , , , '1', , , , , , , , , , '0' ] }.

Fine - but no use with the numbers returned by Facebook (which I guess are too big to be represented), and if you specify smaller, but still valid integers, you can basically cause node to crash by allocating too much memory, for example: ?foo[429496729]=1 will just cause it to chew up more and more memory until it fatals (on my test machine w/ default max-stack-size, I don't know how much memory that will actually use allocating an array of that length).

If a non-numeric array key is specified alongside the numeric array keys, then an object is used to represent the query string which, while not ideal in every situation, will not cause behaviour that can cause node to crash.

I don't really know what the right option is with regards to fixing this, but needs to be mentioned anyway due to the fact that you can basically pass that query string to any express app and have it either fall over or take ages just to parse the query string from the request.

edit: If the isint regexp is changed to var isint = /^[0-9]{1,5}$/;, then you can essentially represent 99,999 items in an array, but longer keys (which may be liable to cause issues with memory usage / speed) will be treated as string keys and represented as an object. It sucks, but at least it means the query string is parsed without causing node to explode.

@tj
Copy link
Owner

tj commented Apr 19, 2012

yeah that's no good, sparse arrays ftw. The support for foo[n] should definitely be capped, I cant imagine any reasonable query using more than 10 (maybe cap around 100 to be safe?), others we could consider string keys

@benagricola
Copy link
Author

Another option would be to assume that given ?foo[]=1&foo[]=2&foo[]=10 then an array is used to represent, but if any keys are specified, regardless of type (integer or string) then an object representation will be used instead? (I don't think it makes much sense to have a 200 item array just to represent 1 value from a query string because its' key is 200).

Then again, if you then went on to mix keyed and non-keyed (?foo[]=1&foo[]=2&foo[a]=3), I have no idea how that would even be represented (or if it should even be allowed / valid).

Edit: the second case with keyed and non-keyed might simply be represented as { 0: 1, 1:2, a: 3}, which is better than losing the non-numeric key.

@tj
Copy link
Owner

tj commented Apr 19, 2012

that's what we originally had but quite a few libs/devs use numerics expecting them to be arrays

@benagricola
Copy link
Author

How about if both modes of operation were possible (and a limit were put in place for array keys), the current one by default, with a switch available which would make everything work in string-key mode unless that specific var had no keys?

I guess it's a pretty app / lib / dev-centric problem in that different approaches work in different situations, so exposing the two ways to do it makes sense when thinking just about node-querystring on its own (its' use in connect / express is another matter since connect.query is implicit in the express application, i.e. how would one go about choosing the query parsing mode?).

@tj
Copy link
Owner

tj commented Apr 19, 2012

yeah in express it's implied. hmm.. pretty tough call, if others were not already using it I would say the object representation is fine

@benagricola
Copy link
Author

Implemented a fix for this in my fork, obviously doesn't solve the express issue but has an extra benefit of allowing the array-with-keys functionality to be turned completely off if required (by setting the max allowed key to 0). When off it will represent everything except unkeyed arrays as an object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants