Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclude specific line #316

Open
jakobod opened this issue Jul 30, 2021 · 14 comments
Open

Exclude specific line #316

jakobod opened this issue Jul 30, 2021 · 14 comments
Labels
enhancement Improve the expected

Comments

@jakobod
Copy link

jakobod commented Jul 30, 2021

Hi!

I'm wondering why there is no directive that can be used to ignore a specific line? I have some files that have intentional typos in them that should be ignored, while the rest of the file should still be checked!
I'm thinking of a solution like clang-formats:

// clang-format off
...
// clang-format on

Did I not find that feature or has it been left out intentionally?

greetings!

@epage epage added the enhancement Improve the expected label Jul 30, 2021
@epage
Copy link
Collaborator

epage commented Jul 30, 2021

Just hasn't been implemented left.

Different approaches

  • File-level exclude
    • Already supported, does block spell checking the file name
    • Would be better with layered config, see Layered config #193
  • File-wide ignore directive
  • Line-trailing ignore directive
  • Ignore block directives, like mentioned above
  • External ignore directives
    • Looks like codespell uses this, seems like it'd be hard to maintain

Considerations for inline directives

  • Comment styles are language-specific
  • For global and trailing directives, we need to scan ahead

@davidsneighbour
Copy link

How about an ignore system the way tools like stylelint implements it:

  • ignore following line
  • ignore on and ignore off for blocks (multiple lines) of code

Sometimes there is an requirement of having "wrong" code and adding all these items (see #544) to the ignore list might lead to missed fixes down the road.

@Blacksmoke16
Copy link
Contributor

Following up on #613 (comment), my main use case is wanting to allow a specific word as allowed, while not preventing it from being flagged elsewhere.

Inline comment would be the most focused. From an implementation perspective, different comment syntax between languages shouldn't really matter if you just look for some unique string exists in a given line. That way each lang can start the comment with whatever syntax they use. Granted this would be harder if parsing is done token by token, not line by line however.

The start/end directive, or even as mentioned later a "ignore next line" could also work well, allowing the parser to be aware of what is coming up, versus trying to act upon something it already parsed.

Scoping extended-words in the config file to a specific line would be another solution that would provide the same benefit, but possibly easier to implement. However, it would be more likely to get out of sync from the code so 😕. Raising it up to specific file, could be a good middle ground, similar to #193, but a bit less verbose since you wouldn't need to add extra files in those contexts.

This is definitely something I'd love to see. Feels so hacky globally ignoring a misspelling when it only matters in a handful of contexts. E.g. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer

@neiljp
Copy link

neiljp commented Mar 9, 2023

I just started looking at using typos today, and it's very fast :)

Line-specific ignores are the main feature I'd look for that isn't present already. For me this mainly applies to tests, which can include intentional errors. I agree with others that excluding an entire file (or splitting content out into another file), or an entire word, is rather a large hammer right now.

I found this is also missing in codespell, though being worked on, and various options were brought up to integrate with multiple languages too (codespell-project/codespell#1212).

@epage
Copy link
Collaborator

epage commented Mar 9, 2023

Huh, seeing the discussion about using cspell's syntax is interesting. I like the idea of a cross-tool syntax, much like burntsushi helped create a cross-search tool ignore file.

I'm still not a fan of supporting different comment styles. If anything, I'd prefer a common shibboleth that works independent of comment styles.

epage added a commit to epage/typos that referenced this issue Mar 22, 2023
Typos primarily works off of identifiers and words.  We have built-in
support to detect constructs that span identifiers that should not be
spell checked, like UUIDs, emails, domains, etc.  This opens it up for
for user-defined identifier-spanning constructs using regexes via
`extend-ignore-re`.

This works differently than any of the previous ways of ignoring thing
because the regexes require extra parse passes.  Under the assumption
that (1) actual typos are rare and (2) number of files relying on
`extend-ignore-re` are rare, we only do these extra parse passes when a
typo is found, causing almost no performance hit in the expected case.

While this could be used for more generic types of ignores, it isn't the
most maintainable because it is separate from the source files in
question.  Ideally, we'd implement document settings / directives for
these cases (crate-ci#316).
@epage
Copy link
Collaborator

epage commented Mar 22, 2023

FYI #695 provides a new workaround for false positives

epage added a commit to epage/typos that referenced this issue Mar 22, 2023
Typos primarily works off of identifiers and words.  We have built-in
support to detect constructs that span identifiers that should not be
spell checked, like UUIDs, emails, domains, etc.  This opens it up for
for user-defined identifier-spanning constructs using regexes via
`extend-ignore-re`.

This works differently than any of the previous ways of ignoring thing
because the regexes require extra parse passes.  Under the assumption
that (1) actual typos are rare and (2) number of files relying on
`extend-ignore-re` are rare, we only do these extra parse passes when a
typo is found, causing almost no performance hit in the expected case.

While this could be used for more generic types of ignores, it isn't the
most maintainable because it is separate from the source files in
question.  Ideally, we'd implement document settings / directives for
these cases (crate-ci#316).
@Delgan
Copy link
Contributor

Delgan commented Sep 24, 2023

I believe that using inline exclusion is not a workable approach for a rather simple reason: it would be ineffective with file types that do not support comments, such as .json and .txt.

I think we need some kind of ignore configuration list that includes elements in the format "<file>:<line>:<word>" for example.

It's true that maintaining such a list will be cumbersome. However, to mitigate this problem, we could consider introducing a command-line option like --update-ignore-list that would conveniently regenerate the configuration file based on the detected typos.

I agree that inline comments have high appeal, but unfortunately they may not be universally applicable (although they could certainly complement the proposed approach).

@epage
Copy link
Collaborator

epage commented Sep 25, 2023

An "ignore all" mode sounds intriguing. I hesitate slightly because external excludes seems like a path of last resort and making it easy will likely incentivize people to not improve things, either within typos or in their repos.

I think in normal modes we should warn if we check a file and the ignore is invalid (not an error in case its reasonable for two people to run different typos versions on the same code base). If the file for an external exclude is within scope of the run and doesn't exist, we should probably make that a warning (not an error in case it is a "sometimes there" file).

If we can, the syntax for this should be easy for git add -p to split and add incrementally as our "interactive mode" just like with typos -w.

@Blacksmoke16
Copy link
Contributor

Blacksmoke16 commented Sep 26, 2023

it would be ineffective with file types that do not support comments, such as .json and .txt.

I don't think it worth sacrificing the feature just because some file types don't have comments. Supporting inline exclusion would work and be helpful for many other file types. I could see supporting something like an external list in addition to inline exclusions for those kinds of files. But I don't really want to have to deal with it when 99% of files I use support comments.

@epage
Copy link
Collaborator

epage commented Sep 26, 2023

To be clear, I do not see them as mutually exclusive but that we'd benefit from both

phip1611 pushed a commit to phip1611/typos that referenced this issue Dec 5, 2023
Typos primarily works off of identifiers and words.  We have built-in
support to detect constructs that span identifiers that should not be
spell checked, like UUIDs, emails, domains, etc.  This opens it up for
for user-defined identifier-spanning constructs using regexes via
`extend-ignore-re`.

This works differently than any of the previous ways of ignoring thing
because the regexes require extra parse passes.  Under the assumption
that (1) actual typos are rare and (2) number of files relying on
`extend-ignore-re` are rare, we only do these extra parse passes when a
typo is found, causing almost no performance hit in the expected case.

While this could be used for more generic types of ignores, it isn't the
most maintainable because it is separate from the source files in
question.  Ideally, we'd implement document settings / directives for
these cases (crate-ci#316).
@Lillecarl
Copy link

@epage it's a can of worms, but with tree-sitter you can run language specific parsers on filetypes and see what's actually a comment and what isn't. This could also be used for spicy options like

  • Don't spellcheck variables
  • Don't spellcheck function names
  • Don't spellcheck this scope
  • Don't spellcheck this class
  • Don't spellcheck this essentially

https://tree-sitter.github.io/tree-sitter/playground Have a look at the playground, it's amazing tech. Though it'd have to be a feature flag, tree-sitter is more correct than fast 😄

@epage
Copy link
Collaborator

epage commented Jan 30, 2024

I am also always put off from tree sitter in dealing with all of the individual plugins and allowing people to extend a program with more.

@Natim
Copy link

Natim commented Apr 3, 2024

We are also looking for this kind of feature on our side, currently, we add exceptions such as hd → has and KMS → km globally but it would be nicer to put online exceptions,

@epage
Copy link
Collaborator

epage commented Apr 3, 2024

Forgot to mention it in this issue but default.extend-ignore-re exists which can be used to make your own inline ignores.

Some suggested regexes (including line ignore and block ignore) can be found at https://github.com/crate-ci/typos/blob/master/docs/reference.md#config-fields

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improve the expected
Projects
None yet
Development

No branches or pull requests

8 participants