Output referrer when there is a crawl error #127
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
🦋 Changeset detected. Latest commit: 5aee276. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Hey @exortech, I am planning to release a new major version pretty soon (the next few days), on the next branch. Any chance you could reimplement this but targeting the next branch?
Thanks for taking the time to send a PR by the way!
Sure. No problem.
I have another change that I'd like to propose, which is to
expose crawler.parseScriptTags as a configuration parameter. The built-in
parser for simplecrawler is pretty basic and generally does a poor job of
trying to pull URIs out of script tags. This creates a lot of false
positives, especially if I'm trying to also use simplecrawler to detect
broken links. Changing the code to stop parsing script tags would be
simplest, but would break backwards compatibility. So the intention is to
make disabling script parsing configurable.
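
As a rough sketch of the proposal above (not code from this PR): the option could be threaded through to the crawler's `parseScriptTags` property, defaulting to `true` so existing behavior is preserved. The names `CrawlerLike`, `CrawlOptions`, and `applyCrawlOptions` are hypothetical, and the plain object below only stands in for a real simplecrawler instance.

```typescript
// Hypothetical minimal view of the crawler: only the property this sketch touches.
// It mirrors the crawler.parseScriptTags property mentioned in the proposal.
interface CrawlerLike {
  parseScriptTags: boolean;
}

// Hypothetical user-facing options; the option name is an assumption.
interface CrawlOptions {
  parseScriptTags?: boolean;
}

// Copy the option onto the crawler. When the caller does not set it,
// fall back to true so script tags are still parsed (backwards compatible).
function applyCrawlOptions<T extends CrawlerLike>(
  crawler: T,
  options: CrawlOptions = {}
): T {
  crawler.parseScriptTags = options.parseScriptTags ?? true;
  return crawler;
}

// Stand-in object instead of a real crawler instance.
const crawler: CrawlerLike = { parseScriptTags: true };
applyCrawlOptions(crawler, { parseScriptTags: false });
console.log(crawler.parseScriptTags); // false
```

Keeping the default at `true` means nobody who relies on the current script-tag parsing sees a change unless they opt out.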
What do you think? Does that align with functionality that you would want
to support for lighthouse-parade?
Cheers,
Owen.
On Mon, Dec 19, 2022 at 3:06 PM Caleb Eby ***@***.***> wrote:
Hey @exortech <https://github.com/exortech>, I am planning to release a
new major version pretty soon (the next few days), on the next branch.
Any chance you could reimplement this but targeting the next branch?
--
Owen Rogers | Exortech Consulting
@exortech <https://twitter.com/exortech> | http://exortech.com/
To be honest, I am not a big fan of the simplecrawler library (and the library is now deprecated as well). I would definitely be open to using a different library that may avoid some of simplecrawler's issues, and I'd also be open to adding parameters to configure the behavior of that new crawler. But I think I will release the next version before I make that change, in a future major version.
Makes sense. I noticed that you have replacing simplecrawler on your task list for the next version in #117. So I guess it makes sense to hold off on this change until an alternate crawler is in place. Closing this PR to submit a new PR for the next branch.
Resolves #126