Force cheerio to use our own custom htmlparser2 #582

Closed
yamgent opened this issue Jan 15, 2019 · 2 comments · Fixed by #948
Labels
a-Build s.UnderDiscussion The team will evaluate this issue to decide whether it is worth adding

Comments

@yamgent
Member

yamgent commented Jan 15, 2019

This occurs when using Node v8.15.0 (which bundles npm v6.4.1).

When setting up the repo for the first time and using npm install, npm will install cheerio, one of our dependencies. cheerio in turn is dependent on htmlparser2.

Because of issues with how the original htmlparser2 parses angle brackets, we forked htmlparser2 and maintain our own custom version that contains the fix.

When running npm install under npm v6.4.1, npm does not use our custom htmlparser2 and instead installs the official release. This causes the following error in MarkBind v1.16.0 (log provided by @Xenonym):

PS C:\Users\pzy5a\Documents\GitHub\markbind\docs> markbind build
  __  __                  _      ____    _               _
 |  \/  |   __ _   _ __  | | __ | __ )  (_)  _ __     __| |
 | |\/| |  / _` | | '__| | |/ / |  _ \  | | | '_ \   / _` |
 | |  | | | (_| | | |    |   <  | |_) | | | | | | | | (_| |
 |_|  |_|  \__,_| |_|    |_|\_\ |____/  |_| |_| |_|  \__,_|
 v1.16.0
info: Website generation started at 1:54:34 AM
info: Building assets...
info: Assets built
info: Generating pages...
[------------------] 0 / 18 pages builterror: TypeError: Cannot set property 'src' of null
warning: Error: Error while generating C:\Users\pzy5a\Documents\GitHub\markbind\docs\userGuide\makingTheSiteSearchable.md
error: Error: Empty src attribute in include in: C:\Users\pzy5a\Documents\GitHub\markbind\docs\userGuide\reusingContents.md
error: TypeError: Cannot set property 'src' of null
[================--] 16 / 18 pages builtwarning: Error: Error while generating C:\Users\pzy5a\Documents\GitHub\markbind\docs\userGuide\makingTheSiteSearchable.md
error: Error while generating C:\Users\pzy5a\Documents\GitHub\markbind\docs\userGuide\makingTheSiteSearchable.md

Solution: Use npm shrinkwrap to force cheerio to use our own version of htmlparser2.

Temporary workaround: If you face this issue, discard the changes npm made to package-lock.json, then run npm install again.


Additional info: package-lock-new.json is the lockfile in which cheerio uses the official version of htmlparser2.

diff --git a/package-lock.json b/package-lock-new.json
index 5856b37..8c5a77d 100644
--- a/package-lock.json
+++ b/package-lock-new.json
@@ -828,6 +828,39 @@
         "lodash.reduce": "^4.4.0",
         "lodash.reject": "^4.4.0",
         "lodash.some": "^4.4.0"
+      },
+      "dependencies": {
+        "htmlparser2": {
+          "version": "3.10.0",
+          "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-3.10.0.tgz",
+          "integrity": "sha512-J1nEUGv+MkXS0weHNWVKJJ+UrLfePxRWpN3C9bEi9fLxL2+ggW94DQvgYVXsaT30PGwYRIZKNZXuyMhp3Di4bQ==",
+          "requires": {
+            "domelementtype": "^1.3.0",
+            "domhandler": "^2.3.0",
+            "domutils": "^1.5.1",
+            "entities": "^1.1.1",
+            "inherits": "^2.0.1",
+            "readable-stream": "^3.0.6"
+          }
+        },
+        "readable-stream": {
+          "version": "3.1.1",
+          "resolved": "https://registry.npmjs.org/readable-stream/-/readable-stream-3.1.1.tgz",
+          "integrity": "sha512-DkN66hPyqDhnIQ6Jcsvx9bFjhw214O4poMBcIMgPVpQvNy9a0e0Uhg5SqySyDKAmUlwt8LonTBz1ezOnM8pUdA==",
+          "requires": {
+            "inherits": "^2.0.3",
+            "string_decoder": "^1.1.1",
+            "util-deprecate": "^1.0.1"
+          }
+        },
+        "string_decoder": {
+          "version": "1.2.0",
+          "resolved": "https://registry.npmjs.org/string_decoder/-/string_decoder-1.2.0.tgz",
+          "integrity": "sha512-6YqyX6ZWEYguAxgZzHGL7SsCeGx3V2TtOTqZz1xSTSWnqsbWwbptafNyvf/ACquZUXV3DANr5BDIwNYe1mN42w==",
+          "requires": {
+            "safe-buffer": "~5.1.0"
+          }
+        }
       }
     },
     "chokidar": {
@@ -2332,6 +2365,7 @@
         "combined-stream": {
           "version": "1.0.5",
           "bundled": true,
+          "optional": true,
           "requires": {
             "delayed-stream": "~1.0.0"
           }
@@ -2386,7 +2420,8 @@
         },
         "delayed-stream": {
           "version": "1.0.0",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "delegates": {
           "version": "1.0.0",
@@ -2633,11 +2668,13 @@
         },
         "mime-db": {
           "version": "1.27.0",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "mime-types": {
           "version": "2.1.15",
           "bundled": true,
+          "optional": true,
           "requires": {
             "mime-db": "1.27.0"
           }
@@ -2703,7 +2740,8 @@
         },
         "number-is-nan": {
           "version": "1.0.1",
-          "bundled": true
+          "bundled": true,
+          "optional": true
         },
         "oauth-sign": {
           "version": "0.8.2",
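
The lockfile difference in the diff above could also be checked with a small script. The following is a minimal sketch, not part of MarkBind: the function name `findCheerioHtmlparser2` is hypothetical, and it assumes the npm v6 lockfile layout (lockfileVersion 1), in which a nested copy of a package appears under its parent's `dependencies` key and registry packages carry a `resolved` URL on registry.npmjs.org. It also assumes, without having verified it against this repository, that a fork installed from GitHub would not have such a registry URL.

```javascript
// Sketch: report which htmlparser2 cheerio would load, given a parsed
// npm v6 lockfile (lockfileVersion 1). All names here are illustrative.
function findCheerioHtmlparser2(lock) {
  const deps = lock.dependencies || {};
  const cheerio = deps.cheerio;
  if (!cheerio) return null;

  // A nested entry means cheerio gets its own private copy of htmlparser2;
  // otherwise it falls back to the top-level entry.
  const nested = cheerio.dependencies && cheerio.dependencies.htmlparser2;
  const entry = nested || deps.htmlparser2;
  if (!entry) return null;

  const resolved = entry.resolved || '';
  return {
    nested: Boolean(nested),
    version: entry.version,
    // Assumption: only the official package resolves to the public registry.
    fromRegistry: resolved.startsWith('https://registry.npmjs.org/'),
  };
}

// Sample shaped like the added lines in package-lock-new.json.
const sampleLock = {
  dependencies: {
    cheerio: {
      dependencies: {
        htmlparser2: {
          version: '3.10.0',
          resolved: 'https://registry.npmjs.org/htmlparser2/-/htmlparser2-3.10.0.tgz',
        },
      },
    },
  },
};

console.log(findCheerioHtmlparser2(sampleLock));
```

On a real checkout, the lockfile could be loaded with `JSON.parse(fs.readFileSync('package-lock.json', 'utf8'))` and passed to the same function; a result with `nested: true` and `fromRegistry: true` would correspond to the broken state described in this issue.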
@acjh
Contributor

acjh commented Jan 15, 2019

This does not seem to be solved by npm shrinkwrap.

$ npm shrinkwrap
$ rm -rf node_modules/
$ npm i

Could this be caused by upstream (fb55/htmlparser2) publishing v3.10.0 in October 2018?
cheerio specifies "htmlparser2": "^3.9.1" and v3.10.0-markbind.1 < v3.10.0 (https://semver.org).
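
The ordering claimed above follows from the precedence rules on semver.org: when major, minor, and patch are equal, a version with a pre-release tag ranks below the plain release. Below is a minimal sketch of that comparison, written by hand for illustration. It is not the npm `semver` package, and the function names are hypothetical.

```javascript
// Sketch of semver precedence (semver.org, "Precedence" rules).
// Hand-rolled for illustration; not the npm `semver` package.
function parseVersion(version) {
  // Drop a leading "v" and any "+build" metadata, which precedence ignores.
  const cleaned = version.replace(/^v/, '').split('+')[0];
  const dash = cleaned.indexOf('-');
  const core = dash === -1 ? cleaned : cleaned.slice(0, dash);
  const prerelease = dash === -1 ? [] : cleaned.slice(dash + 1).split('.');
  return { core: core.split('.').map(Number), prerelease };
}

function compareIdentifiers(a, b) {
  const aNum = /^\d+$/.test(a);
  const bNum = /^\d+$/.test(b);
  if (aNum && bNum) return Math.sign(Number(a) - Number(b));
  if (aNum) return -1; // numeric identifiers rank below alphanumeric ones
  if (bNum) return 1;
  return a < b ? -1 : a > b ? 1 : 0;
}

function compareVersions(left, right) {
  const a = parseVersion(left);
  const b = parseVersion(right);
  for (let i = 0; i < 3; i += 1) {
    if (a.core[i] !== b.core[i]) return a.core[i] < b.core[i] ? -1 : 1;
  }
  // Same major.minor.patch: the version with a pre-release tag ranks lower.
  if (a.prerelease.length === 0 && b.prerelease.length === 0) return 0;
  if (a.prerelease.length === 0) return 1;
  if (b.prerelease.length === 0) return -1;
  const len = Math.min(a.prerelease.length, b.prerelease.length);
  for (let i = 0; i < len; i += 1) {
    const c = compareIdentifiers(a.prerelease[i], b.prerelease[i]);
    if (c !== 0) return c;
  }
  return Math.sign(a.prerelease.length - b.prerelease.length);
}

console.log(compareVersions('v3.10.0-markbind.1', 'v3.10.0')); // -1
console.log(compareVersions('3.10.0', '3.9.1')); // 1
```

Since "^3.9.1" accepts any 3.x release at or above 3.9.1, upstream v3.10.0 both satisfies cheerio's range and ranks above v3.10.0-markbind.1, which is consistent with npm preferring the official package.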

@yamgent
Member Author

yamgent commented Jan 17, 2019

> Could this be caused by upstream (fb55/htmlparser2) publishing v3.10.0 in October 2018?
> cheerio specifies "htmlparser2": "^3.9.1" and v3.10.0-markbind.1 < v3.10.0 (https://semver.org).

You may be right. To be honest, I have not fully explored the settings and capabilities of npm shrinkwrap. I was hoping it would be able to force cheerio to use a specific version of htmlparser2 (i.e. MarkBind/htmlparser2#v3.10.0-markbind.1), rather than allowing any later version to be used.

Also, I am not sure whether such a solution is the best idea going forward, but short of pretty disruptive measures (e.g. forcing ourselves to incorporate all the latest changes in htmlparser2, or switching to a completely different parsing library), I am out of good ideas for resolving this issue. Any thoughts?
