"ret", "prev" and "goes" are "fixed" when using en-us #592

Closed
embe-pw opened this issue Oct 13, 2022 · 8 comments
Labels
bug Not as expected

Comments

@embe-pw

embe-pw commented Oct 13, 2022

Running typos (v1.12.8, binary from GitHub release) with typos.toml containing:

[default]
locale = "en-us"

causes some interesting suggestions:

error: `ret` should be `ert`
  --> ./foo.txt:1:1
  |
1 | ret
  | ^^^
  |
error: `prev` should be `perv`
  --> ./foo.txt:2:1
  |
2 | prev
  | ^^^^
  |
error: `goes` should be `ges`
  --> ./foo.txt:3:1
  |
3 | goes
  | ^^^^
  |
@epage
Collaborator

epage commented Oct 13, 2022

varcon is an interesting thing

# ert (level 70)
A: ert / B: ret

I'm not finding an "ert" in American English.

# perv (level 55)
A: perv / B: prev

# perve (level 80)
A: perve / B: preve
A: perved / B: preved
A: perving / B: preving
A: perves / B: preves

I'm having a hard time finding these British words.

# Ges (level 80)
A: Ges / B: Goes
A: ge / B: gae
A: ge / B: goe
A: ged / B: gaed

# gessed (level 80)
A: gessed / B: gessoed

# gesses (level 80)
A: gesses / B: gessoes

Again, not finding anything.
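
To illustrate how entries like the ones above could end up as corrections, here is a minimal sketch. It assumes a simplified varcon format limited to the `A: ... / B: ...` lines quoted in this comment (the real varcon format has more tags and annotations), and the function names are hypothetical, not part of typos.

```python
def parse_varcon_line(line):
    """Parse a simplified entry such as 'A: ert / B: ret' into {'A': 'ert', 'B': 'ret'}."""
    entry = {}
    for part in line.split("/"):
        tag, _, word = part.partition(":")
        entry[tag.strip()] = word.strip()
    return entry


def american_corrections(lines):
    """Map each British (B) spelling to its American (A) spelling.

    Assumption: every B variant is treated as something to 'fix' under en-us,
    which is how a plain word like 'ret' would end up flagged.
    """
    corrections = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and '# word (level N)' headers.
        if not line or line.startswith("#"):
            continue
        entry = parse_varcon_line(line)
        if "A" in entry and "B" in entry and entry["A"] != entry["B"]:
            corrections[entry["B"].lower()] = entry["A"].lower()
    return corrections


SAMPLE = [
    "# ert (level 70)",
    "A: ert / B: ret",
    "# perv (level 55)",
    "A: perv / B: prev",
    "# Ges (level 80)",
    "A: Ges / B: Goes",
]

print(american_corrections(SAMPLE))
# {'ret': 'ert', 'prev': 'perv', 'goes': 'ges'}
```

Under these assumptions, the three reported false positives fall straight out of the data: nothing in the mapping step knows that `ret`, `prev`, or `goes` are ordinary words in their own right.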

@epage
Collaborator

epage commented Oct 13, 2022

In particular, "ret" and "goes" are American English words and we shouldn't be trying to convert them.

This also makes me wonder how many more of these we have.

Ideally, we'd have a list of universal words and force the variations support to not correct one of those words. Unsure how to fully get that ideal, so we might just have to play whack-a-mole.
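
As a rough sketch of that "universal words" idea: the snippet below drops any variant correction whose source word appears in an allow-list. The word list, the correction table, and the function name are all hypothetical, chosen only to illustrate the idea; this is not how typos implements it, nor a description of the eventual fix.

```python
def filter_corrections(corrections, universal_words):
    """Return only the corrections whose source word is not a universal word.

    corrections: dict mapping a flagged spelling to its suggested replacement.
    universal_words: iterable of words considered valid in every locale.
    """
    universal = {word.lower() for word in universal_words}
    return {
        bad: good
        for bad, good in corrections.items()
        if bad.lower() not in universal
    }


# Hypothetical inputs: three false positives from this issue plus one
# genuine British/American variant.
corrections = {
    "ret": "ert",
    "prev": "perv",
    "goes": "ges",
    "colour": "color",
}
universal_words = ["ret", "prev", "goes"]

print(filter_corrections(corrections, universal_words))
# {'colour': 'color'}
```

The hard part is the one named above: building a trustworthy list of universal words. Without it, each entry has to be added by hand as it is reported.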

@epage epage added the bug Not as expected label Oct 13, 2022
@embe-pw
Author

embe-pw commented Oct 13, 2022

A few other strange cases:

error: `axe` should be `ax`
error: `bare` should be `baer`
error: `CREAT` should be `CERAT`
error: `Ire` should be `Ier`
error: `prepend` should be `perpend`
error: `Ren` should be `Ern`

(the CREAT one does catch an error, but CREATE is much more likely to be the intended word)

@alerque

alerque commented Oct 9, 2023

I just ran across this project and was intrigued by the approach mentioned in the readme: a white-list of known-good substitutions, not just dictionary guesses. I ran it on a project (Lua) and was absolutely bombarded by false positives. Looking through the diffs I can see a few good catches, but the miss rate is something like 95%, making it utterly unusable. I was surprised the results were so bad and decided to check out the issue tracker, which landed me here. Some of the most common misses are mentioned here (prepend→perpend, ret→ert, prev→perv, etc.). If this project is supposed to operate on code then I can't understand why these sorts of things are white-listed. Programmers are probably orders of magnitude more likely to use "prev" to mean "previous" than "pervert", and so forth. How did these get listed at all?

@epage
Collaborator

epage commented Oct 10, 2023

> How did these get listed at all?

This issue is specifically for when people opt-in to a feature, like setting locale = "en-us" in their config. An easy workaround is to not do that.

The reason these made it in is that we are leveraging varcon, like other dictionaries do, as best as I understand it. As for why a lot of these went unnoticed for so long: this is an opt-in feature, and one that doesn't seem to be used all that much based on the number of concerns raised here (I also don't use it).

An easy workaround is to use the default behavior, rather than enable this feature.

@embe-pw
Author

embe-pw commented Oct 12, 2023

The main problem with the workaround of "do not enable this" is that the feature is otherwise very useful, especially for people whose native languages are different from US English – it's very easy to introduce a variant spelling by mistake.

@embe-pw
Author

embe-pw commented Aug 26, 2024

Seems like this has been fixed (probably by #1086) 🎉

@epage
Collaborator

epage commented Aug 26, 2024

#1087 verified that they are addressed.
