Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

report: truncate long attribute values in HTML snippets #10984

Merged
merged 9 commits into from
Jul 15, 2020

Conversation

Beytoven
Copy link
Contributor

@Beytoven Beytoven commented Jun 17, 2020

Addresses #10717.

Before:
Screen Shot 2020-07-14 at 10 49 09 AM

After:
Screen Shot 2020-07-14 at 10 41 18 AM

@Beytoven Beytoven requested a review from a team as a code owner June 17, 2020 22:02
@Beytoven Beytoven requested review from connorjclark and removed request for a team June 17, 2020 22:02
@@ -123,6 +123,14 @@ function getOuterHTMLSnippet(element, ignoreAttrs = []) {
ignoreAttrs.forEach(attribute =>{
clone.removeAttribute(attribute);
});
for (const attributeName of clone.getAttributeNames()) {
let attributeValue = clone.getAttribute(attributeName);
if (attributeValue.length > 100) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd pull this 100 up to a const towards the top of the method

for (const attributeName of clone.getAttributeNames()) {
let attributeValue = clone.getAttribute(attributeName);
if (attributeValue.length > 100) {
attributeValue = attributeValue.slice(0, 97) + '...';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protip: option-; on mac is which is a single character ellipsis.

@brendankenny
Copy link
Member

before/after screenshots? :)

@brendankenny
Copy link
Member

before/after screenshots? :)

oh, I see, not limiting which attributes, just making sure none of them are ridiculously long. I guess that's also not that hard to imagine :)

@connorjclark
Copy link
Collaborator

This is technically a breaking change.

Any way we could truncate in just the report but leave snippet untouched?

@Beytoven
Copy link
Contributor Author

This is technically a breaking change.

Any way we could truncate in just the report but leave snippet untouched?

I could look into making this change in the snippet renderer instead. I think eventually we'd want to do it in page-functions for the next major release.

@connorjclark
Copy link
Collaborator

connorjclark commented Jun 23, 2020 via email

Copy link
Collaborator

@connorjclark connorjclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. pending approval on seeing a before/after (cnn.com has nice big ones).

@connorjclark
Copy link
Collaborator

Maybe we should also have a maxTotalAttributeLength (random pick: 600) and once we hit it we elide more/all of the value?

@Beytoven
Copy link
Contributor Author

Maybe we should also have a maxTotalAttributeLength (random pick: 600) and once we hit it we elide more/all of the value?

That would be nice, but then we run the risk of more important identifiers being cut off because they fall behind the other super long data attributes. I think the reason we opted for truncating per attribute was so that wouldn't happen. An alternative is that we could omit data-* attributes. They tend to be long and I don't know if they're actually useful in regards to element identification.

@connorjclark
Copy link
Collaborator

spoke offline:

  • let's count the length of the attribute name + value as we go, and once it hit's ~500-600, we stop and just elide the rest of the node with
  • considered making sure we process class and id (most important for identification) first, but we figure that those attributes are typically already first in the markup, so we can just rely on the order of attributes correlating to importance.
  • As a follow up PR, @Beytoven wants to explore making the node detail type expandable in the report, such that we could show the entire HTML, unelided.

@paulirish
Copy link
Member

  • let's count the length of the attribute name + value as we go, and once it hit's ~500-600, we stop and just elide the rest of the node with

do we need to? since this adds so much more complexity i'd rather defer until we across a real example that necessitates it.

@connorjclark
Copy link
Collaborator

  • let's count the length of the attribute name + value as we go, and once it hit's ~500-600, we stop and just elide the rest of the node with

do we need to? since this adds so much more complexity i'd rather defer until we across a real example that necessitates it.

Can you elaborate? the example in the OP is exactly why this is necessary. Just having a cap on the size of individual attributes does not prevent a lengthy node snippet when there are many attributes.

@Beytoven
Copy link
Contributor Author

So the approach I ended up taking here is to establish an "attribute character budget" of sorts. We simply keep count of the number of characters used for attribute names/values and once we've exceeded that budget we remove the remaining attributes. I'm seeing 50-60% reductions in snippet length for some of the longer snippets (1200+ characters).

const reOpeningTag = /^[\s\S]*?>/;
const match = clone.outerHTML.match(reOpeningTag);
if (match && match[0] && charCount > SNIPPET_CHAR_LIMIT) {
return match[0].slice(0, match[0].length - 1) + '…>';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some tests for this part?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, currently working on it

charCount += attributeName.length + attributeValue.length;
}
}

const reOpeningTag = /^[\s\S]*?>/;
const match = clone.outerHTML.match(reOpeningTag);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe replace with

const [match] = clone.outerHTML.match(reOpeningTag) || [];

and then drop all the match && match[0] for just match ?

Copy link
Collaborator

@connorjclark connorjclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits, but marking my approval anyway

@connorjclark connorjclark changed the title core(page-functions): truncate long attribute values in HTML snippets report: truncate long attribute values in HTML snippets Jul 14, 2020
@Beytoven Beytoven merged commit bce9930 into master Jul 15, 2020
@Beytoven Beytoven deleted the shorten-html-snippet branch July 15, 2020 02:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants