Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Maintainers]: Please update your 3rd party grammar repo to the latest spec #3008

Closed
joshgoebel opened this issue Feb 15, 2021 · 24 comments
Closed
Assignees
Labels
3rd_party enhancement An enhancement or new feature good first issue Should be easier for first time contributors help welcome Could use help from community language package/build Issues relating to npm or packaging

Comments

@joshgoebel
Copy link
Member

joshgoebel commented Feb 15, 2021

To ensure your grammar continues to function smoothly with future releases of Highlight.js you may need to make a few small changes. The good news is you can make all these changes today - and your grammar will continue to work with the v10 build system as well. You'll just be ready for v11 early.

Following our official spec makes it easy to generate a CDN ready dist folder and also allows advanced users/integrators to build custom versions of Highlight.js that include your grammar in the main minified bundle with very little effort.

If you have some disagreement with the official spec or build process (and would also like to invest your time in finding a better solution) please continue that discussion here: Improving testing / integration / usage for 3rd party grammars This is not the thread for that larger discussion.

That topic died because at the time no one seemed interested in discussing or working on it.

So is v11 changing how 3rd party grammars work or are built?

Not really. Some early 3rd party grammars (perhaps your own) were invented side-by-side with the official spec and have always done things a "bit differently". The v10 build system thus has several hacks to "accommodate" these early grammars - allowing them to work as-is. These hacks will be removed from the v11 build system, requiring all grammar modules to provide a single export that can be understood by the Highlight.js build system (rollup).

Summary

  • Your primary JS file should set only the default export, nothing further.
    • ie, either export default or module.exports = ...
    • this means no more definer exports, your primary JS file should not worry about the web browser context, that is what the dist CDN builds are for
    • no more shims: var module = module ? module : {};
  • Your repository should include a dist folder with a CDN ready build of your grammar (built using the Highlight.js build system) See 3RD_PARTY_QUICK_START.md.md

An example grammar already meeting all requirements:

Spec Checklist

You can copy this template into your own project's issues area on GitHub:

- [ ] Your primary JS file should only set a default export, no others
- [ ] Remove any other definer and/or other additional exports
- [ ] Your repository should include a `dist` folder with pre-built CDN assets. Reference: [3RD_PARTY_QUICK_START.md.md](https://github.com/highlightjs/highlight.js/blob/master/extra/3RD_PARTY_QUICK_START.md)
- [ ] Remove any browser shims from your raw JS source
- [ ] Consider adding a `package.json` and publishing your grammar on [NPM](https://www.npmjs.com) if you aren't already
- [ ] Please update your `README.md` if necessary to reflect these changes
- [ ] Remove `charset="UTF-8"` from your examples, it should no longer be necessary with `dist` builds produced by our build tools

Browser shims

If your grammar has any shims like:

var module = module ? module : {};     // shim for browser use

Remove the shims entirely. The dist CDN distributable is meant to solve this issue.


Simply export default

If you have a very specifically named define function:

function hljsDefineRpmSpecfile(hljs) {
  return {
    name: "RPM specfile"

This isn't necessary. It's actually not necessary to name this function at all.

What a simple export looks like (ES modules):

export default function(hljs) {
  return {
    name: "RPM specfile"

What a simple export looks like (CJS):

module.exports = function (hljs) {
  return {
    name: "RPM specfile"

Whether you use ES or CJS may depend on how you expect node users to use your library (and whether you still want to support Node v10). The Highlight.js build system can build CDN distributable regardless of whether your source is in ES module or CJS format.


Exporting other things

If your grammar main JS file includes code like:

module.exports = function(hljs) {
    hljs.registerLanguage('rpm-specfile', hljsDefineRpmSpecfile);
};

module.exports.definer = hljsDefineRpmSpecfile;

All this boilerplates can be removed. This is what the CDN ready distributable in dist does for you, handles browser concerns. Just remove all this and export the single function, as shown above.

@joshgoebel joshgoebel added language enhancement An enhancement or new feature package/build Issues relating to npm or packaging labels Feb 15, 2021
@joshgoebel joshgoebel added this to the 11.0 milestone Feb 15, 2021
@joshgoebel joshgoebel self-assigned this Feb 15, 2021
@joshgoebel joshgoebel changed the title [Grammar Maintainers]: Update your 3rd party grammar to v11 standard [Grammar Maintainers]: Please update your 3rd party grammar to v11 standard Feb 15, 2021
@joshgoebel joshgoebel changed the title [Grammar Maintainers]: Please update your 3rd party grammar to v11 standard [Grammar Maintainers]: Please update your 3rd party grammar to the latest standard Feb 15, 2021
@joshgoebel joshgoebel changed the title [Grammar Maintainers]: Please update your 3rd party grammar to the latest standard [Maintainers]: Please update your 3rd party grammar to the latest standard Feb 15, 2021
@frangio
Copy link

frangio commented Feb 17, 2021

Is it really good practice to include build artifacts like dist in the repository? I have personally always avoided this in the projects I maintain.

@joshgoebel
Copy link
Member Author

joshgoebel commented Feb 17, 2021

Some 3rd party grammar authors are just learning Git/GitHub so I think the idea was "what's simple" not what's "best practice". I agree in general (that it's suboptimal), but I also don't think it's terrible in this case (of 3rd party grammars). You could also perhaps use a GitHub Action to upload it as a file under Releases (when a new release is tagged). If someone wanted to put together a GitHub Action that did a full build and then uploaded the built files in dist as a Release - that'd be awesome.

I'm open to other suggestions also, but many 3rd party grammars are already following this dist convention, so there is some momentum there already...

The goal here is making the final build product (a standard CDN ready JS file that registers itself) easier to download and avoiding weird tricks attempting to make a single JS file work in npm and browser contexts.

@frangio
Copy link

frangio commented Feb 24, 2021

Allowing you to maintain a 3rd party module WITHOUT it being inside of a highlight-js checkout (this requires discussion though)

I personally don't think it's a reasonably requirement to maintain 3rd party modules inside a highlight-js checkout. It makes the setup too complex. A 3rd party module repository should be self contained. If using highlight-js's build tools is a requirement, they should be published to npm so 3rd party modules can declare them as dev dependencies.

@joshgoebel
Copy link
Member Author

don't think it's a reasonably requirement

The goal here is to make all 3rd party grammars integrate consistently - therefore making them easier to use by end users. I'm not quite sure we're speaking in terms of "requirements" yet... but if we were I should be clear that the requirement would be including a standard minified CDN module (like all those we include in our CDN build), in addition to a npm library. The fastest way to do that - is use the build system we provide for free. One could of course setup their own build scripts, etc...

It makes the setup too complex.

Complex? How so? The "setup" is literarily two commands cd and git checkout (for both repos). It takes less than a minute to get the structure setup on a modern system.

If using highlight-js's build tools is a requirement, they should be published to npm so 3rd party modules can declare them as dev dependencies.

The main goal of the library build system is to make it easier to build "monolith" builds of our own library... and when doing that (building Highlight.js) you obviously need a checkout of Highlight.js. Building 3rd party CDN dist modules (as a convenience) was added because it was nice, and easy to do. Building a bunch of separate build tools and then publishing those to npm is probably out of scope at this time - considering how easy the existing setup is to use.

If someone else wants to put in all the time and effort to figure out what that might look like then we could take a closer look, I'm not 100% opposed - but I'm also not sure it's "clearly better" (for our use case). You could get the CDN dist probably with just a rollup build, but that wouldn't give you the built-in ReDOS testing that our build system also provides for free.

@joshgoebel
Copy link
Member Author

joshgoebel commented Mar 1, 2021

CC @highlightjs/grammar-maintainers

Pinging all grammar maintainers on changes you may need to make to come into alignment with our official 3rd party grammar spec and continue to be compatible with version 11. For some of you that may mean using our build process for the first time to provide a dist CDN build with your repo. That's not the only way you could generate such a file, but it's likely the easiest.

Using our build process to build your dist also runs security scans on your code to guard against run-away regular expressions that can lead to ReDOS attacks, etc... it's very easy to code these types of bugs without intending to. When we first added these checks we found over 20 of these issues in the core library itself which led to our 2nd security disclosure: GHSA-7wwv-vh3v-89cq

The spec is essentially encapsulated in the copy-and-paste checklist above and you can see our example repo as well.

Please keep this thread focused on specific questions about the HOW of implementation. If you'd like to discuss the WHY or have your own better ideas about the 3rd party development process please see my comment above:

If you have any disagreement with the official spec or build process (and would also like to invest your time in finding a better solution) please continue that discussion here: Improving testing / integration / usage for 3rd party grammars. This is not the thread for that larger discussion.

@joshgoebel
Copy link
Member Author

For a while now when a new PR comes in to add a repo to SUPPORTED_LANGUAGES we will first check that it follows our official spec, includes dist, solid README, etc... this issue is just our finally turning our focus to 3rd party grammars that were added to SUPPORTED_LANGUAGES in the more distant past, before we knew what the spec was and exactly what a grammar repo should look like - and now also bringing those repos current as well.

Thank you all for helping out. It's greatly appreciated.

@joshgoebel joshgoebel changed the title [Maintainers]: Please update your 3rd party grammar to the latest standard [Maintainers]: Please update your 3rd party grammar repo to the latest spec Mar 1, 2021
@gusbemacbe
Copy link
Contributor

In response to the last two comments, do I need to update my repo?

@joshgoebel
Copy link
Member Author

@gusbemacbe You're looking pretty good (if we're talking cypher). You shouldn't need charset="UTF-8" in your examples anymore if you're building your dist with the latest version of Highlight.js. We properly downsample everything to ASCII now during build and have for a few versions I think.

@joshgoebel
Copy link
Member Author

The disagreement is about whether the repository is the means of distribution, along with what I consider the normal distribution channels like GitHub Releases or packages on the npm registry.

No disagreement here about GitHub Releases vs dist... The only goal is:

  • Make it trivial to download CDN ready build of your grammar that self-registers

I'm not sure npm qualifies as "distribution" though (in this context). Many people use us outside the entire Node ecosystem. I suppose if the README linked to the very latest https://unpkg.com version for a download URL that might suffice...

Honestly I don't think anyone yet has tried solving this problem other than dist. So the real requirement I think is to make a CDN ready distributable trivially simple to download.

@StoneCypher
Copy link
Contributor

StoneCypher commented Mar 17, 2021 via email

@joshgoebel
Copy link
Member Author

joshgoebel commented Mar 17, 2021

The npm norm has always been, since day 1, to have any
materialized deliverables in dist.

Sure, but even so - that doesn't ALSO necessarily imply that build assets must be checked into version control. That was the original question I believe and I still agree with the intention of it... I just don't think for 3rd party language repos it's "that bad" of a problem. :)

@jf990
Copy link
Contributor

jf990 commented Mar 17, 2021

I think the issue is what are the requirements for distribution of a 3rd party grammar integrated with highlights, and that is provided by this dist set up. I think this is a reasonable requirement of the way this tooling works, and is in addition to any other package distribution a language contributor thinks makes sense.

@Serhioromano
Copy link

Can you explain how to test it? I created markup and detect folders just like in robots-txt example. But how to run test command just for my language I cannot find.

@joshgoebel
Copy link
Member Author

The same way that you do for Highlight.js (and from inside the HLJS top-level folder).

npm run test :-)

Your project of could must be in the extra folder inside of our source tree (or symlinked, etc).

@Serhioromano
Copy link

@joshgoebel it tests all built-in languages, I want only test my lanuage.

Also extension of files in the markup and detect folders have to be .txt or extension of my language?

@joshgoebel
Copy link
Member Author

All the files are txt, correct.

I want only test my lanuage.

Set the ONLY_EXTRA environmental variable to anything. Works for markup tests.

@joshgoebel joshgoebel removed this from the 11.0 milestone Mar 18, 2021
@Serhioromano
Copy link

@joshgoebel I've added my own test JS file. That was simpler. Now I'll PR for description file.

@joshgoebel
Copy link
Member Author

I've added my own test JS file.

Ok. Running the built-in test suite (in full) also gives you the regex protection tests for free to make sure you aren't using any runaway regex, etc...

@joshgoebel
Copy link
Member Author

Closing this as we've decided to have a slightly hands-off approach with 3rd party grammars going forward. If the grammar isn't in core and isn't causing any problems then we plain to be less concerned with how these repos are setup, structured, etc. There may be exceptions (some unusual circumstance, support burden, security issue, etc) but for now do as you feel best...

Feel free to close the issues I've opened (or not). The "official" approach is still recommended (for allowing users to easily build your grammar into a monolith using our build tools - and for the additional security scanning our build system provides), but I'm not going to try and force it on anyone anymore. If any of you might be interested in working on a GitHub action to make it easier to automate builds (with our security scanning) for those desiring them, get it touch.

CC @highlightjs/grammar-maintainers
CC @highlightjs/core

Thank you all again so very much for contributing in the first place. :-)


For PR requests to add a new 3rd party grammar to ADDITIONAL_LANGUAGES I think we'll only enforce some simple usability requirements:

  • A minimal useful README showing how to use the grammar (or linking to such docs)
  • A mention of which versions of Highlight.js the grammar it works with

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3rd_party enhancement An enhancement or new feature good first issue Should be easier for first time contributors help welcome Could use help from community language package/build Issues relating to npm or packaging
Projects
None yet
Development

No branches or pull requests

7 participants