Improve word-matching algorithm #94

AKushWarrior · 2020-11-02T19:16:46Z

Hey there,

Like a lot of QB teams, my team has used Protobowl for practice/fun since quarantine started. I've always noticed that Protobowl produces a lot of erroneous correct markings, as well as a lot of erroneous incorrect markings. Around a week ago, I started keeping a running log of the latter via screenshots.

I know that doing serious semantic analysis of the answers is probably impossible, but I think that some of this is fairly rudimentary and could be fixed. If @antimatter15 (who seems to be the sole active maintainer) is open to it, I'd be happy to take a shot at contributing to a potential fix.

Some of the screenshots I took are attached.

AKushWarrior · 2020-11-02T19:19:10Z

Apologies for the profanity, if that's something that matters to you.

AKushWarrior · 2020-11-02T19:25:46Z

Investigating the source code a little bit, there seem to be two "checkers": checker.coffee and checker2.coffee.

I'm assuming the latter is currently used?

AKushWarrior · 2020-11-02T19:38:08Z

I think the system could benefit from some simple checks before it goes through a long series of checks that can mess things up. Maybe a case-insensitive implementation of Levenshtein distance, where a very high similarity (that is, a very small edit distance) between a bolded answer and the guess is automatically marked as correct?
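To make that concrete, here's a minimal sketch of the kind of early check I mean. It's in Python for illustration only (the project itself is CoffeeScript), and the function names and the 0.85 threshold are my own assumptions, not anything taken from checker.coffee or checker2.coffee.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(guess: str, answer: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical."""
    g = guess.strip().casefold()
    a = answer.strip().casefold()
    longest = max(len(g), len(a))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(g, a) / longest


def quick_accept(guess: str, bolded_answer: str, threshold: float = 0.85) -> bool:
    """Hypothetical early exit: accept when the guess nearly matches the bolded answer.

    The threshold is an assumed value and would need tuning against real guesses.
    """
    return similarity(guess, bolded_answer) >= threshold
```

A guess that fails this check would simply fall through to the existing checks, so the shortcut can only turn a wrong "incorrect" into a "correct", never the other way around. That also means the threshold has to stay strict enough that it doesn't create new false positives on short answers.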

AKushWarrior · 2020-11-02T19:41:55Z

Contextual awareness could also be useful, but that might be stretching it. If the question notes that synonyms are okay, maybe usage of an NPM package could help us follow the intent. If a question notes that an answer should only be accepted before a certain point, maybe that answer should not be accepted afterwards.
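For the "only accept before a certain point" case, a rough sketch of the idea, again in Python for illustration. Everything here is hypothetical: the `AcceptableAnswer` structure, the `accept_until_word` field, and the assumption that the checker knows how many words of the question have been read so far. Parsing the cutoff out of the answer line is not covered, and exact matching is used only to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AcceptableAnswer:
    """One acceptable answer, optionally valid only early in the question."""
    text: str
    # Index of the last word at which this answer is still accepted.
    # None means the answer is accepted at any point.
    accept_until_word: Optional[int] = None


def is_acceptable(guess: str, answers: List[AcceptableAnswer], words_read: int) -> bool:
    """Return True if the guess matches an answer that is still valid at this point."""
    g = guess.strip().casefold()
    for ans in answers:
        if ans.text.strip().casefold() != g:
            continue
        if ans.accept_until_word is None or words_read <= ans.accept_until_word:
            return True
    return False
```

With `answers = [AcceptableAnswer("Napoleon"), AcceptableAnswer("Bonaparte", accept_until_word=40)]`, a buzz of "Bonaparte" would be accepted at word 25 but rejected at word 60, while "Napoleon" would be accepted at either point.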

antimatter15 · 2020-11-05T02:22:49Z

Sorry for the late reply, but if that's something that you'd be interested in tackling, I'd be happy to merge any improvements. I might even have a database somewhere of reported errors in its judgments. Case- and typo-insensitive Levenshtein distance is already being employed, but there's certainly a lot that can be done to improve it.

AKushWarrior · 2020-11-05T02:27:12Z

I'll dig a bit more and maybe set up some test cases for popular errors (as well as things the current model does well). If you find that database, let me know?
