I think the results below speak for themselves: every input is English, but the predicted language (shown in the trailing comment) is not. It would be helpful to know what sort of normalisation is expected, since WTL does not seem to do any. Newlines seem to be the main thing influencing the results.
from whatthelang import WhatTheLang as WTL
p = WTL()
s = 'Chapter 14: Person of Interest\n Chapter 16: The Debt\n'
p.predict_lang(s) # fr
s = 'to actual persons, living or dead, \n business establishments, events,\n'
p.predict_lang(s) # es
s = 'Chapter 11: Freedom\n Chapter 12: Everyday Life for Five Years\n'
p.predict_lang(s) # fr
s = 'Prologue\n They were two beautiful people. Both strong and healthy, exactly what she was looking for.\n'
p.predict_lang(s) # fr
s = '###\n By Heather Graham\n'
p.predict_lang(s) # de
s = 'Raymond Stocker – Owner/operator of Nicoll’s Island amusement park\n Jasmine Stocker – wife of Raymond\n'
p.predict_lang(s) # de
s = 'Prologue\n Flirtation lasts the brief flutter of a butterfly’s wing.\n'
p.predict_lang(s) # fr
s = 'Any resemblance to places or actual persons,\n living or dead is entirely coincidental.\n'
p.predict_lang(s) # es
s = 'Dream of the Fir Bolg\n Eochaidh mac Eirc, Ard Ri, High King of Ireland and leader of the Fir Bolg, stood on the cliff edge and looked out to sea. The night was clear, the water was calm, and the moon and stars observed the scene like many bright eyes.\n'
p.predict_lang(s) # af
@RuABraun We don't do any preprocessing inside the library to remove \n. Thanks for bringing this up. Do you see the correct results once \n is removed?
Since you have found this out, would you be interested in sending a PR to fix this? You would have to update here
Removing \n and : fixes the predictions. I can do the PR. The way I would do it is by removing all newline and punctuation characters (as well as numbers, as is already done). Does that sound good to you?
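For reference, here is a rough sketch of the normalisation I have in mind. The `normalise` helper is hypothetical and not part of whatthelang. It assumes "punctuation" and "numbers" mean the Unicode P* and N* categories, so it also catches characters like ’ and – from the examples above. It is meant as a starting point for the PR, not the final implementation.

```python
import unicodedata


def normalise(text: str) -> str:
    """Hypothetical helper (not part of whatthelang).

    Replaces punctuation (Unicode categories P*) and numeric characters
    (categories N*) with spaces, then collapses all whitespace runs,
    including newlines, into single spaces.
    """
    cleaned = []
    for ch in text:
        category = unicodedata.category(ch)
        if category.startswith("P") or category.startswith("N"):
            cleaned.append(" ")
        else:
            cleaned.append(ch)
    # str.split() with no arguments splits on any whitespace (spaces,
    # tabs, newlines) and drops empty strings, so joining on a single
    # space also removes the newlines.
    return " ".join("".join(cleaned).split())


s = 'Chapter 14: Person of Interest\n Chapter 16: The Debt\n'
print(normalise(s))  # Chapter Person of Interest Chapter The Debt
```

The cleaned string can then be passed on as `p.predict_lang(normalise(s))`. Note that replacing punctuation with a space splits words like "butterfly’s" into "butterfly s"; deleting the character instead would be an equally reasonable choice. Symbols such as $ or + are left alone in this sketch.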