Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML JS Beautify incorrectly adds newlines to list items #736

Closed
garretwilson opened this issue Jan 8, 2016 · 14 comments
Closed

HTML JS Beautify incorrectly adds newlines to list items #736

garretwilson opened this issue Jan 8, 2016 · 14 comments
Assignees
Milestone

Comments

@garretwilson
Copy link
Contributor

Go to http://jsbeautifier.org/ and paste the following in the Beautify text box:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html><html  lang="en-us"  xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Test</title>
  </head>
  <body>
    <ul>
      <li><code>foo:bar</code></li>
    </ul>
</html>

The official js-beautify will give you the following:

<?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html>
    <html lang="en-us" xmlns="http://www.w3.org/1999/xhtml">

    <head>
        <title>Test</title>
    </head>

    <body>
        <ul>
            <li><code>foo:bar</code></li>
        </ul>

    </html>

But paste the same starting source code into Atom 1.3.3, install atom-beautify and make no changes, and hit Ctrl+Alt+B. atom-beautify will give you:

<?xml version="1.0" encoding="utf-8"?>
  <!DOCTYPE html>
  <html lang="en-us" xmlns="http://www.w3.org/1999/xhtml">

  <head>
    <title>Test</title>
  </head>

  <body>
    <ul>
      <li>
        <code>foo:bar</code>
      </li>
    </ul>

  </html>

Note how atom-beautify has incorrectly added newlines to the <li>. atom-beautify must be sending different options to js beautifier and/or doing extra processing.

@garretwilson
Copy link
Contributor Author

I really don't know where you're getting this behavior. This doesn't happen at http://jsbeautifier.org/, so @Glavin001 , I don't see how you can blame it on the underlying library.

@Glavin001
Copy link
Owner

Please follow the guidelines outlined at https://github.com/Glavin001/atom-beautify/blob/master/CONTRIBUTING.md#new-issues-bugs-questions-etc specifically the steps regarding running command Atom Beautify: Help Debug Editor and providing a Gist with your results. This will help us diagnose the true cause of these problems, by showing what your beautifier options are and much more.

JS Beautify is an interesting case because it accepts all of the options and passes them directly to JS Beautify: https://github.com/Glavin001/atom-beautify/blob/master/src/beautifiers/js-beautify.coffee#L31
Atom Beautify does little to get in the way of the beautification process of JS Beautify, and so more often than not, it is a JS Beautify bug.
So if using JS Beautify with Atom Beautify creates different results than usage at jsbeautifier.org it can be because:

  1. your options are different than what is being used at jsbeautifier.org
  2. JS Beautify dependency could be out of date for Atom Beautify, in which case uninstall and reinstall Atom Beautify will trigger a fresh install of the latest dependencies, including JS Beautify and Pretty Diff.

Hope that helps.

@garretwilson
Copy link
Contributor Author

I installed atom-beautify just this morning. No other settings were changed.

Look, I'll update the description to show you how to exactly reproduce the issue with a fresh atom-beautify installation. If you can't reproduce it, sure I can go through all this stuff of debug, gist, debug.md, etc. But there should be no need. It should be reproducible fresh out of the box. I'll update the description with a small test case.

@garretwilson
Copy link
Contributor Author

I've updated the bug description.

I've also realized that it is possible to supply a special .jsbeautifyrc configuration to the plugin, so in the meantime I'll explore which setting may be causing the problem.

@garretwilson
Copy link
Contributor Author

@Glavin001 , seriously, this is not working. Tell me what I'm doing wrong. I've created the following .jsbeautifyrc file:

{
    "eol": "\r\n",
    "indent_size": 2,
    "indent_with_tabs": true,
    "preserve_newlines": true,
    "max_preserve_newlines": 2,
    "indent_handlebars": true,
    "wrap_line_length": 0
}

Unfortunately the weird newlines I reported are still being added. (In fact my .jsbeautifyrc should be a moot issue---according to the site, wrap_line_length should default to 0 anyway.)

It seems to think that things like <code> are block elements and should be wrapped. This does not happen on the site! What am I missing?

@Glavin001
Copy link
Owner

What am I missing?

This comment: #736 (comment) 😜
Run the Atom Beautify: Help Debug Editor command and follow the contribution guidelines that were wrote for helping to diagnose cases exactly like this.

atom-beautify must be sending different options to js beautifier and/or doing extra processing.

Atom Beautify passes the options as-is directly to js-beautify. See code at
https://github.com/Glavin001/atom-beautify/blob/master/src/beautifiers/js-beautify.coffee#L41-L42
And see https://github.com/Glavin001/atom-beautify/blob/master/src/beautifiers/js-beautify.coffee#L8 HTML: true which indicates that all options are supported and passed through without modifications.

If you run Atom Beautify: Help Debug Editor, you could see what the final options are before being passed to js-beautify. That is why I spent the time to implement the Help Debug Editor command, for this very scenario.

After you have the final options as shown by Help Debug Editor:

The conclusion will be one or more of the following:

  • Atom Beautify is overriding your options somewhere and it will be very clearly shown in the results of Atom Beautify: Help Debug Editor.
  • The wrong option is being used. JS-Beautify supported options found at https://github.com/beautify-web/js-beautify#options
  • JS Beautify has a bug / does not support some feature. Submit an Issue to them.

@garretwilson
Copy link
Contributor Author

Right, OK, fair enough. I was hoping it would just be something easy you could point out that I was missing.

It's been a long, long, long day. I'll attack this tomorrow---after all the other work. Good night.

@garretwilson
Copy link
Contributor Author

OK, here you go @Glavin001 . https://gist.github.com/garretwilson/20f83510591908ba4c87

(A maybe-related question: is the atom-beautify wrap line length overriding that in my .jsbeautifyfc file? I hope not. But it shouldn't affect the result here.)

@Glavin001
Copy link
Owner

I can confirm the final options:

{
    "indent_size": 2,
    "indent_char": " ",
    "indent_with_tabs": true,
    "eol": "\r\n",
    "preserve_newlines": true,
    "max_preserve_newlines": 2,
    "indent_handlebars": true,
    "wrap_line_length": 0
}

So I think it could be that an option is missing being set.

Try the option HTML Unformatted. See docs: https://github.com/Glavin001/atom-beautify/blob/master/docs/options.md#html---unformatted

{
    "html": {
        "unformatted": [
            "a",
            "sub",
            "sup",
            "b",
            "i",
            "u",
            "li" // <-- try adding this
        ]
    }
}

@garretwilson
Copy link
Contributor Author

Try ...

Your suggestion essentially to start randomly pushing buttons to see what changes is missing the bigger point here.

I can take the original HTML presented in this issue (and in the Gist) and paste it at http://jsbeautifier.org/ and it formats correctly. atom-beautify does not format it correctly. Your assertion I thought was that if I don't send something strange in the config, then I should get the default beautify js output. But I'm not.

So it remains to be seen why your plugin is getting different formatting results than the default beautify js output. There is no evidence that http://jsbeautifier.org/ is using any special "unformatted tags" settings.

Moreover this is not an issue of turning off formatting for <li>. The <li> formats correctly until it comes to a <code> element, and there it adds an incorrect linefeed. This also happens when <abbr> is encountered. Something is causing those things to get extra linefeeds. The problem is not turning off wrapping on the outer element.

Please investigate and find out what settings http://jsbeautifier.org/ is using and find out what settings atom-beautify is erroneously sending (or not sending) to break the beautify output. This works on the http://jsbeautifier.org/ site with no indication that there they are using special lists of excluded elements. It breaks on atom-beautify. Without further evidence it would appear that atom-beautify is doing something wrong. Please investigate.

@Glavin001
Copy link
Owner

Your suggestion essentially to start randomly pushing buttons to see what changes is missing the bigger point here.

I am in the middle of class right now and cannot should not investigate thoroughly. I have provided you with suggestions and links to code and other resources to help your own investigation. I am trying to help you the best I can given I have other priorities going on right now.
The alternative is that you wait and we discuss this upcoming weekend if I have more time then. I had hoped that my quick suggestions were more beneficial than radio silence however this thought may have been ill-conceived and I will remember that for next time.


Moreover this is not an issue of turning off formatting for <li>. The <li> formats correctly until it comes to a <code> element, and there it adds an incorrect linefeed.

The description of the option unformatted is: List of tags (defaults to inline) that should not be reformatted.
See https://github.com/beautify-web/js-beautify#css--html
This sure sounds like the option we want to investigate.

A quick search turns up the following js-beautify code: https://github.com/beautify-web/js-beautify/blob/bc5b7b5a3310df19cfe44489f3d0731d0c9b49cf/js/lib/beautify-html.js#L120-L122

 unformatted = options.unformatted || ['a', 'span', 'img', 'bdo', 'em', 'strong', 'dfn', 'code', 'samp', 'kbd',
            'var', 'cite', 'abbr', 'acronym', 'q', 'sub', 'sup', 'tt', 'i', 'b', 'big', 'small', 'u', 's', 'strike',
            'font', 'ins', 'del', 'pre', 'address', 'dt', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6'];

So the default options for unformatted option is:

['a', 'span', 'img', 'bdo', 'em', 'strong', 'dfn', 'code', 'samp', 'kbd',
            'var', 'cite', 'abbr', 'acronym', 'q', 'sub', 'sup', 'tt', 'i', 'b', 'big', 'small', 'u', 's', 'strike',
            'font', 'ins', 'del', 'pre', 'address', 'dt', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']`

Take note of this as we will get back to it later.

I then looked at their website and found output = html_beautify(source, opts);: https://github.com/beautify-web/js-beautify/blob/bc5b7b5a3310df19cfe44489f3d0731d0c9b49cf/index.html#L271

I ran Chrome debugger and it showed that opts was set to:

brace_style: "collapse"
break_chained_methods: false
comma_first: false
e4x: false
end_with_newline: false
indent_char: " "
indent_inner_html: false
indent_scripts: "normal"
indent_size: "4"
jslint_happy: false
keep_array_indentation: false
max_preserve_newlines: "5"
preserve_newlines: true
space_before_conditional: true
unescape_strings: false
wrap_line_length: "0"

So the final piece to check would be the final options being passed to JS-Beautify. This happens at https://github.com/Glavin001/atom-beautify/blob/master/src/beautifiers/js-beautify.coffee#L38

            text = beautifyHTML(text, options)

Now let's assuming there is a bug in Atom Beautify -- I would agree there is something strange going on.
Let's debug that line and find what the true value of options is.

Guess what! The results are:

{
  "indent_size": 2,
  "indent_char": " ",
  "indent_with_tabs": false,
  "indent_inner_html": false,
  "brace_style": "collapse",
  "indent_scripts": "normal",
  "wrap_line_length": 250,
  "wrap_attributes": "auto",
  "wrap_attributes_indent_size": 2,
  "preserve_newlines": true,
  "max_preserve_newlines": 10,
  "unformatted": [
    "a",
    "sub",
    "sup",
    "b",
    "i",
    "u"
  ],
  "end_with_newline": false,
  "extra_liners": [
    "head",
    "body",
    "/html"
  ]
}

It turns out that the Help Debug Editor had a bug and it was not properly calculating the Final Options -- that's what I get for quickly adding that feature. It was working for the most part, however the default values were not being taken into consideration. This would not have been a problem except that the default value for the unformatted option was out-of-date in Atom Beautify!

Furthermore, it also turns out that the real issue is the default option for unformatted option.
The default option was determined from https://github.com/beautify-web/js-beautify/blob/master/js/lib/beautify-html.js#L66

      'unformatted': ['a', 'sub', 'sup', 'b', 'i', 'u'],

And the default option for unformatted for Atom Beautify is set at https://github.com/Glavin001/atom-beautify/blob/master/src/languages/html.coffee#L77

Note the difference between:

['a', 'sub', 'sup', 'b', 'i', 'u'],

and

['a', 'span', 'img', 'bdo', 'em', 'strong', 'dfn', 'code', 'samp', 'kbd',
            'var', 'cite', 'abbr', 'acronym', 'q', 'sub', 'sup', 'tt', 'i', 'b', 'big', 'small', 'u', 's', 'strike',
            'font', 'ins', 'del', 'pre', 'address', 'dt', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']`

The <li> formats correctly until it comes to a <code> element, and there it adds an incorrect linefeed. This also happens when <abbr> is encountered.

See above that code and abbr are both included in JS-Beautify's default for unformatted and not in Atom Beautify's default.

In summary, the bug in Atom Beautify is the defaults which were extracted from JS-Beautify's documentation at https://github.com/beautify-web/js-beautify/blob/master/js/lib/beautify-html.js#L66
The bug can be found at https://github.com/Glavin001/atom-beautify/blob/master/src/languages/html.coffee#L77

The temporary solution until I publish a new version?

Try [Use] the option HTML Unformatted. See docs: https://github.com/Glavin001/atom-beautify/blob/master/docs/options.md#html---unformatted

And set it to:

['a', 'span', 'img', 'bdo', 'em', 'strong', 'dfn', 'code', 'samp', 'kbd',
            'var', 'cite', 'abbr', 'acronym', 'q', 'sub', 'sup', 'tt', 'i', 'b', 'big', 'small', 'u', 's', 'strike',
            'font', 'ins', 'del', 'pre', 'address', 'dt', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']`

When a new release is published, this default option will be updated to reflect the new default of JS-Beautify's unformatted option.

Your suggestion essentially to start randomly pushing buttons to see what changes is missing the bigger point here.

If you think you have a better suggestion, then please feel free to debug on your own -- which you should be attempting already.

My suggestion was to try an educated guess which takes little time to prove it's effectiveness. After you rejected my suggestion I tested it for myself, recognized that the assumption was correct and the unformatted option was a culprit. Furthermore, it was in fact adding code to unformatted that generated the appropriate results. This lead me to conclude that default options were out of sync between Atom Beautify and JS Beautify and lead me to investigate all of the above which confirmed my assumptions.

Anyway, I have to get back to my class.

@Glavin001 Glavin001 added the bug label Jan 12, 2016
@Glavin001 Glavin001 added this to the v0.29.0 milestone Jan 12, 2016
@Glavin001 Glavin001 self-assigned this Jan 12, 2016
@Glavin001
Copy link
Owner

Published fix to v0.28.21

@garretwilson
Copy link
Contributor Author

Hi, @Glavin001 . First I want to say I greatly appreciate your taking the time to address this. I can certainly empathize with your being busy with classes. In fact the reason why I am overwhelmed myself is that I am teaching a class on Java, and moreover I am writing all the lesson materials. (The lesson materials are in XHTML5 and I have been struggling for weeks just to get the source code formatted, because I run into bugs in everyone's libraries at every turn.)

As I had provided you sufficient information for you to reproduce the bug in your library, I imagined it would be more efficient for you to investigate your own code (as it was in fact) than for me to go on a "fishing expedition" trying changes in a codebase and library unfamiliar to me. I'm not being lazy---I would have looked into it more, but I was writing the lesson materials for the class I will give in two hours.

Anyway, it looks like you found your bug, so I will give it a spin and see how far it goes. Thanks again and good luck with your class.

@garretwilson
Copy link
Contributor Author

(FYI, it looks like even js-beautify needs to update its list of inline elements; see beautifier/js-beautify#840 .)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants