Input name validation RegEx makes some pretty rigid assumptions #81
Comments
Thanks for your excellent post.
Well put, and you reverse-engineered the original intent perfectly. When I originally wrote this, I didn't see the broader use case of the utility and got sucked into the mindset of writing a "sensible" default behavior. Since then, I've learned about the HTML JSON form submission spec, which also suggests that the priority should be ensuring that the form data is serialized entirely. I haven't had a lot of time to continue work on this plugin, and that spec is sadly no longer maintained, though I still believe it to be a very good goal for this plugin. You seem like you have a good head on your shoulders and I'd appreciate any help you could offer on getting this plugin some of the critical updates it's been needing for a while.
Hey @macek Glad to hear we're on the same page on this.
That was an interesting read. I was not aware of that project and I agree that it would be nice to have a JS lib that would allow you to do just that. Dealing with nested data structures in HTML forms is far more tedious than it needs to be.
Thanks. I am going to add a fork of your repo to my project, so I can shoot you a PR if I come up with something I'd consider useful. Incidentally, I am writing a form builder plugin, so your plugin will probably be pretty handy. So what do you think should happen now? Is there any harm in not enforcing the JS object syntax in future versions? I am tempted to just strip out most of the constraints in the regex, since I cannot see a good reason to keep them in. I have noticed some oddities with nesting (and nested array keys) when using symbols, so the regex patterns for those will probably need some testing as well.
Oh well, I hadn't noticed that you're allowing custom regex patterns through your API, so I could probably do anything I need without even touching your code. I should have scrolled down your readme a bit further, I guess. Are you even interested in changing the default behaviour with that in mind?
Correct. However, this API was added after some people were unhappy with the default behavior. It should've been an indication that there was a more serious issue to deal with. Hacking complex regexp patterns is not very friendly and shouldn't be a burden of the common user.
With 3.x I did intend on changing the default behavior of the plugin. And on that note...
I still intended (my mind is open at this point) for this plugin to support building a nested structure, but I would prefer that data the 2.x branch finds "invalid" to somehow be represented in the 3.x output. No data should be lost.
I was tinkering with a set of regexps last night, but the test suite showed a lot of failures, which proved this issue is not easily solved by manipulating the internal regexps alone. Some other internals will have to be reworked, and some tests might get thrown away as they were written to enforce old behavior.
To get us on the same page, I think you should show me an example HTML form and the JS(ON) representation of it. What features do you think the plugin should support?
Lateral idea: do we separate concerns within this project? I can see two distinct concerns.
Removing the "compliance" tag to keep this open to any ideas. I don't want you to feel like our discussion is already cornered with this label. I'm totally open to discussing any possible future for this plugin. And as it stands, there really is nothing for us to comply with as there is no official standard.
Yes, as long as there is no specific standard for JSON serialization of HTML forms, the HTML spec for forms is the closest we're going to get. Don't get me wrong, though. I'm not saying those should be followed to a tee. But I think it is not too far-fetched to assume that your plugin is expected to be usable for AJAX form submission, especially in cases when jQuery's builtin tools don't provide enough flexibility. I'm certainly not against adding sugar on top, but the base requirement should be ensuring that the generated data is identical to what you'd get from submitting the selected inputs traditionally.
So to comment on the feature ideas:
Yep
Yep
No, at least not at the most basic level of functionality. HTML does not do this, so neither should we. But there might be a "JSON mode" that does this
No, unless the mentioned enhanced mode is used
So what about using data attributes to trigger enhanced options of the plugin?
<form data-json-enabled>
  <input type="number" name="foo" value="123" data-type="number">
  <input type="checkbox" name="bar" value="1" data-type="bool">
</form>
To be perfectly honest, though, I think there's limited use for the data type stuff. Number and Boolean are the only really interesting types supported by JSON, and those might even be deduced from the input elements and their values, without any further configuration. A form attribute triggering the use of extended functionality like automatic Array creation for duplicate keys sounds pretty interesting though. It would also draw a direct connection between the HTML and the JS, meaning that we can be quite sure that the user _knows_ that magic stuff happens when they use your plugin with that specific form.
For reference, here's the custom regex I used to get my nested input names full of symbols working. Nothing special, just putting it here for safe-keeping:
$.extend(FormSerializer.patterns, {
  validate: /^.*(?:\[(?:\d*|.+)\])*$/i,
  key: /[^\[\]]+|(?=\[\])/gi,
  named: /^.+$/i
});
Since you asked, I extracted an example of my inputs:
<input
name="_objects[children][1][children][0][components][0][Foo\\Bar\\Baz][name]" type="text"
value="Name" placeholder="dummy_name">
<input type="hidden"
name="_objects[children][1][children][0][components][0][Foo\\Bar\\Baz][__id]"
value="56be3f6c26130">
<input
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][options][]"
type="text" placeholder="Default value" value="Foo">
<input
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][options][]"
type="text" placeholder="Default value" value="Bar">
<input
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][options][]"
type="text" placeholder="Default value" value="Baz">
<input
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][name]"
type="text">
<input
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][label]"
type="text" value="Dropdown:">
<input
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][placeholder]"
type="text" value="My Input" placeholder="My Input">
<input type="hidden"
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][active]"
value="false">
<input
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][active]"
type="checkbox" value="true">
<input type="hidden"
name="_objects[children][1][children][0][components][1][Foo\\Bar\\Baz\\Inputs\\Dropdown][__id]"
value="56be3f6c26e20">
<select>
<option>Foo</option>
<option>Bar</option>
<option>Baz</option>
</select>
As you can see, it makes heavy use of nesting and uses namespaces in array keys (none of this will ever see the frontend, no worries :D ).
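For anyone following along, here is a minimal sketch of what the custom `key` pattern above does to names like these. Only the pattern itself comes from this comment; the helper names `tokenize` and `assign` and the shortened sample name are hypothetical, and this is not the plugin's internal code. The sketch handles only a trailing `[]` and treats numeric segments as plain object keys.

```javascript
// The `key` pattern quoted above (the `i` flag is dropped, it has no effect here).
const KEY_PATTERN = /[^\[\]]+|(?=\[\])/g;

// Hypothetical helper: split an input name into its key segments.
// A trailing "[]" shows up as an empty string, meaning "push onto an array".
function tokenize(name) {
  return name.match(KEY_PATTERN) || [];
}

// Hypothetical helper: write `value` into `target` along the tokenized path.
function assign(target, name, value) {
  const keys = tokenize(name);
  let node = target;
  keys.forEach((key, i) => {
    if (i === keys.length - 1) {
      if (key === '') node.push(value); // trailing "[]": append to the array
      else node[key] = value;           // plain key: the last write wins
      return;
    }
    // Create the next container: an array if the following segment is "[]".
    if (node[key] === undefined) node[key] = keys[i + 1] === '' ? [] : {};
    node = node[key];
  });
  return target;
}

// Shortened version of the names above; String.raw keeps the backslashes as typed.
const sample = String.raw`_objects[components][1][Foo\\Bar][options][]`;
const tokens = tokenize(sample);

const out = {};
assign(out, sample, 'Foo');
assign(out, sample, 'Bar');
assign(out, String.raw`_objects[components][1][Foo\\Bar][name]`, 'Dropdown');
```

Backslashes pass through untouched because the pattern excludes only square brackets from a key segment.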
@Biont sorry to leave you hanging. Some stuff came up and I've barely had time at my desk. I'll follow up here soon.
@Biont I'm happy with your write-up and agree with the decisions you're making. I like that the defaults would be very sensible/basic and any extended features would be enabled by running the serializer with specific options.
As for the data types, I also agree it's not remarkably interesting, but I'd still like it. I'd like to avoid type inference where <input name="username" value="1234"> would be serialized as a number rather than a string. If this isn't part of the default behavior, it could be passed as a typeEncoder option:
// rough idea for typeEncoder option
$('#myForm').serializeObject({
  typeEncoder: (input, value) => {
    // prefer an explicit data-type, then the input's type attribute
    switch (input.data('type') || input.attr('type') || 'text') {
      case 'number': return parseInt(value, 10);
      case 'boolean': return value === '1' || value === 'true';
      default: return value;
    }
  }
});
As for duplicate keys, a similar technique could be used with a mergeDuplicateKey option:
// rough idea for mergeDuplicateKey option
$('#myForm').serializeObject({
  mergeDuplicateKey: (existingValue, newValue) => (
    Array.isArray(existingValue)
      ? existingValue.concat(newValue)
      : [existingValue, newValue]
  )
});
I'm open to implementing these however you want, but at first glance, I like that it lets the end user decide how these things are handled instead of us forcing a convention on them. Thoughts?
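To make the idea a bit more tangible, here is a rough, jQuery-free sketch of how a serializer core might consume those two callbacks. The option names `typeEncoder` and `mergeDuplicateKey` come from the comment above; everything else (the `serializeFields` function, the plain `{ name, type, value }` field objects standing in for jQuery inputs, and the "last value wins" default) is an assumption made for illustration, not the plugin's actual behavior.

```javascript
// Hypothetical serializer core. Both callbacks are optional.
function serializeFields(fields, options = {}) {
  // Assumed default: leave every value as the string the form provided.
  const typeEncoder = options.typeEncoder || ((field, value) => value);
  // Assumed default: a later field with the same name replaces the earlier one.
  const mergeDuplicateKey = options.mergeDuplicateKey || ((existing, next) => next);

  const result = {};
  for (const field of fields) {
    const value = typeEncoder(field, field.value);
    const seen = Object.prototype.hasOwnProperty.call(result, field.name);
    result[field.name] = seen ? mergeDuplicateKey(result[field.name], value) : value;
  }
  return result;
}

// Plain objects stand in for form inputs here.
const fields = [
  { name: 'username', type: 'text', value: '1234' },
  { name: 'age', type: 'number', value: '42' },
  { name: 'tag', type: 'text', value: 'a' },
  { name: 'tag', type: 'text', value: 'b' },
];

// No options: nothing is inferred, so "1234" stays a string.
const plain = serializeFields(fields);

// Opt-in behavior supplied by the caller.
const enhanced = serializeFields(fields, {
  typeEncoder: (field, value) => (field.type === 'number' ? parseInt(value, 10) : value),
  mergeDuplicateKey: (existing, next) =>
    Array.isArray(existing) ? existing.concat(next) : [existing, next],
});
```

With this shape, the basic call never changes a value's type, and number conversion or array merging only happens when the caller asks for it.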
I was trying to figure out why serializeObject() always returned an empty object when trying to serialize my form data. It turned out to be an issue with the name validation, which works with the following regular expression:
My input naming scheme is the following: _foo[bar\\baz]. This causes the validation to fail, resulting in an empty object. The reason for that is the use of backslashes.
Now, I know using backslashes could be considered nonstandard and I might think about replacing them, but I can't help but wonder why this RegEx is written the way it is to begin with. I tried to find some HTML specs on valid input names and they don't seem to correspond with that expression.
According to HTML4 spec, ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
You're allowing an underscore as the first character, which would be invalid in HTML4. At the same time, the underscore is the only non-letter character allowed, which is too rigid.
HTML5 seems to be free-for-all: Any non-empty value for name is allowed, but the names "charset" and "isindex" are special, so even my backslashes should be fine.
My conclusion is that you are using the regular expression to ensure that the JS object syntax can still be used (obj.attr as opposed to obj['attr'], which would allow symbols other than underscores), which, to me, should have a far lower priority than ensuring the form data is serialized entirely. In fact, I don't think peculiarities of JavaScript syntax should have any say in what and how form data is handled as long as the form data itself is valid HTML.
I'd like to hear your thoughts on this. Do you think dropping support for JS object syntax is justifiable in favor of proper input name support? Should there be an argument that permits symbols when serializing data?
As an ugly temporary workaround, I have added backslashes to the RegEx patterns so that I can continue developing for now. Thank you for creating this library.
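A small sketch of the trade-off described here. The plugin's actual validation pattern is not reproduced in this post, so `identifierLike` below is an illustrative stand-in (an assumption) for any pattern that only admits letters, digits and underscores. `permissive` is the custom `validate` pattern quoted in the comments.

```javascript
// ASSUMPTION: stand-in for an identifier-style validation pattern.
// This is not the plugin's real regex, just something of the same flavor.
const identifierLike = /^[a-z_][a-z0-9_]*(?:\[(?:\d*|[a-z0-9_]+)\])*$/i;

// The custom `validate` pattern quoted in the comments.
const permissive = /^.*(?:\[(?:\d*|.+)\])*$/i;

const plainName = '_foo[bar]';
const backslashName = String.raw`_foo[bar\\baz]`; // backslashes kept as typed

const results = {
  strictPlain: identifierLike.test(plainName),         // accepted
  strictBackslash: identifierLike.test(backslashName), // rejected because of the backslashes
  permissiveBackslash: permissive.test(backslashName), // accepted
};

// Bracket notation takes any string as a key, so nothing about the
// resulting object requires names to be valid JS identifiers.
const data = {};
data[String.raw`bar\\baz`] = 'value';
const roundTrip = data[String.raw`bar\\baz`];
```

The only thing lost with symbol-heavy keys is dot access (`obj.attr`), while bracket access (`obj['attr']`) keeps working.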