Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escape characters on xml attributes #217

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dt-jean-baptiste-lemee
Copy link
Contributor

@dt-jean-baptiste-lemee dt-jean-baptiste-lemee commented Aug 18, 2023

Issue: #216

@@ -7,26 +7,31 @@ const Mime = {
'text/html': {
docType: '<!DOCTYPE html>',
ignoreCase: true,
isXml: false,
voidElements: /^(?:area|base|br|col|embed|hr|img|input|keygen|link|menuitem|meta|param|source|track|wbr)$/i
},
'image/svg+xml': {
docType: '<?xml version="1.0" encoding="utf-8"?>',
ignoreCase: false,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ignoreCase could be remove in favor of isXml

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this MR is just about using isXml instead of ignoreCase ? I am not sure what I am looking at in here, but I am sure the MR could be way simpler without adding oddly-cased fields (XML is XML, not Xml) and slow/XSS-prone transformers as commented

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nop, this PR is about fixing this issue : #216 I think the test speaks for itself. I commented this, just to say that we could remove ignoreCase boolean since it has to ignoreCase only on non-xml so it's redondant here

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've named it as such due XHTML but again I am not sure why we need to change that ... it could be breaking if anyone out there brand-check that property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed isXml to isXML and XmlAttr to XMLAttr

Tell me if there's anything else I could improve on this MR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a regression on &quot;. I've push a solution but again I'm chaining escape and a replace. I wondering why don't we escape all html entities (https://www.w3schools.com/html/html_entities.asp) on Attr.toString through the escape function ? This way we do not need XMLAttr nor Mime.isXML

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we escape all html entities

I think you are better off with JSDOM there ... it's 100% standard, and 100% slower than LinkeDOM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😆 I'm trying to switch from JSDom to linkedom because it is 100 000% slower (JSDom takes sometime 5minutes to manage xml document when Linkedom takes 1second on the same document !) But yes, still we need a bit more "standardization". I don't think it's a bad idea for linkedom, but your the boss, it's your choice. We can use our fork. Thank for this lib and the work, it's huge.

Copy link
Owner

@WebReflection WebReflection Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be fair, XML is not the main target here and you are suggesting XML related changes and paths that would slow the common use case by far, as example suggesting to escape all html entities ... I understand this is valuable for your business but this project is about perf ... perf for most common use cases, which is not XML, so apologies if my answers are not the most welcoming one, but I am trying to preserve the original idea of this project which is: work for most common cases and as fast as possible ... your quest feels a bit against that initial/original goal, hence my nitpicking in here. Hope you can understand, and hope your fork will work great for your use case too!

@dt-jean-baptiste-lemee dt-jean-baptiste-lemee marked this pull request as ready for review August 18, 2023 19:52
@dt-jean-baptiste-lemee dt-jean-baptiste-lemee force-pushed the escape-charaters-on-xml-attributes branch 2 times, most recently from 3e2579b to 0ce8770 Compare August 21, 2023 20:23
@dt-jean-baptiste-lemee dt-jean-baptiste-lemee force-pushed the escape-charaters-on-xml-attributes branch 6 times, most recently from 2a7512b to eb4d4e6 Compare August 28, 2023 17:59
@dt-jean-baptiste-lemee dt-jean-baptiste-lemee marked this pull request as draft August 28, 2023 19:12
@dt-jean-baptiste-lemee dt-jean-baptiste-lemee marked this pull request as ready for review August 28, 2023 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants