[Feature Request] Add Unicode normalization support #410
Comments
Thank you for opening an issue for this! 👍 I plan on eventually adding https://metacpan.org/pod/Unicode::Normalize to the dependencies and calling it when saving tags and titles to the database.
Found some time after #282 and #405 to get this done today: all UTF-8 encoded writes to the database are now also normalized to NFC. I've also added a temporary script plugin that fixes existing databases by normalizing all their data to NFC.
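For readers who want to picture what such a one-off fix-up pass does, here is a minimal Python sketch. It is not the plugin's code: the actual project is written in Perl, and the `records` dictionary, its `title`/`tags` fields, and the `normalize_records` function are hypothetical stand-ins for the real database.

```python
import unicodedata


def nfc(text: str) -> str:
    """Return the NFC-normalized form of text."""
    return unicodedata.normalize("NFC", text)


def normalize_records(records: dict) -> int:
    """Normalize the title and tags of every record in place.

    `records` is a hypothetical stand-in for the database:
    it maps an archive ID to a dict with "title" and "tags" strings.
    Returns the number of records that were actually changed.
    """
    changed = 0
    for fields in records.values():
        new_title = nfc(fields["title"])
        new_tags = nfc(fields["tags"])
        if (new_title, new_tags) != (fields["title"], fields["tags"]):
            fields["title"] = new_title
            fields["tags"] = new_tags
            changed += 1
    return changed


if __name__ == "__main__":
    db = {
        # Decomposed: U+30B7 followed by combining mark U+3099
        "id1": {"title": "\u30b7\u3099", "tags": "artist:\u30b7\u3099"},
        # Already precomposed (U+30B8), so left untouched
        "id2": {"title": "\u30b8", "tags": "artist:\u30b8"},
    }
    print(normalize_records(db), "record(s) rewritten")
```

Running the pass a second time should rewrite nothing, since NFC normalization is idempotent.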
* [ImgBot] Optimize images
  Total -- 955.60kb -> 756.84kb (20.8%)
  /tools/Documentation/.gitbook/assets/ratings.png -- 131.97kb -> 86.23kb (34.66%)
  /tools/Documentation/.gitbook/assets/index.png -- 193.08kb -> 129.82kb (32.76%)
  /tools/Documentation/.gitbook/assets/archive_thumb.jpg -- 630.55kb -> 540.78kb (14.24%)
  Signed-off-by: ImgBotApp <[email protected]>
* Do not show page select when there are no results
* Also hide table header
* Update buildx action to the official Docker one
* Update s6
* Update less, the alpine package doesn't seem to have s6-overlay-preinit yet?
* pin s6 alpine package
* Rollback alpine base for now
* Fix ARM builds (#394)
* Update push-continous-delivery.yml
* Update release-delivery.yml
* Update push-continous-delivery.yml
* Update push-continous-delivery.yml
* Improve visibility when sorting table columns + fix being able to put whitespace as a custom column
* Fix context menu applying to overlays
* (#374) Remove cooldown on auto-plugin as it's basically useless
* Fix memory leak caused by the Parallel::Loops/Storable combo in the Search API
* Add some basic retrying logic on our first Redis connection
  This avoids dying unnecessarily if Redis takes a while to load into memory.
* Accept "false" properly for pinned on category creation
* Also fix the update_cat endpoint
* Remove favtagmigration script
* Add a simple GET to /api/categories/:id
* (#335) Reading progression is now server-side!
* Fix method for progress in docs
* Fix docs thanks for nothing gitbook
* (#385) (#397) Remove the built-in Auto-Tag feature in favor of a Filename parsing plugin
* Fix tests
* Avoid uninitialized warnings if there's no progress/pagecount in DB
* Add a pageread stat and actually use the package.json description 'cause why not
* Remove unnecessary datatables cdn include
* (#282) Rewrite Shinobu filemap so it relies on Redis and keeps state between restarts
* (#405) Add job + api req to regen all thumbnails
  Also remove redis loading timeout for REALLY BIG dataset
* (#410) NFC all the things
* Add fa-solid-900.woff2 to the vendor deps so that browsers finally shut up about it, also added fa-regular
* Fix submenu arrow colors in the various themes
* autism
* Add an API endpoint to return which categories an ID belongs to
* (#375) Rework context menu so it uses the new endpoint to remove archives from categories
* More doc details on the reading progression API
* Some more documentation fixes
* MORE Documentation updates gee
* (#412) Add autofocus to the password input in login
* (#397) Add a check to RegexParse to avoid putting numbers as languages
* (#414) Fix regexparse not decoding the filesystem path + decode log in minion upload for extra clarity
* (#389) Update magick to v7 in homebrew
* backport changes from homebrew-core
  those guys write ruby way better than I do
* Fix brew test runs
* (#389) Add libheif to Dockerfile and unlock avif/heif support
* Don't recommend homebrew for linux since it doesn't work out of the box atm
* Add Mojolicious::Plugin::Status when running in debug mode
* Update docs to fix wrong json examples
* Stop using a static secret for mojo's cookie signatures
* Remove the auto-plugin toggle and matching pref
  It's a bit unnecessary considering plugins have to be toggled manually anyways.
* Rework Plugin Configuration page a bit further
* (#267) Make the thumbnail folder location an option
* [ImgBot] Optimize images (#416)
  /tools/Documentation/.gitbook/assets/thumbchange.png -- 55.64kb -> 36.35kb (34.67%)
  Signed-off-by: ImgBotApp <[email protected]>
  Co-authored-by: ImgBotApp <[email protected]>
* Add some JS to migrate local reading progression

Co-authored-by: ImgBotApp <[email protected]>
Co-authored-by: Cirno the Strongest <[email protected]>
Co-authored-by: imgbot[bot] <31301654+imgbot[bot]@users.noreply.github.com>
Before getting to the issue, I want to express my thanks. LRR has liberated me from the hell of organizing my manga libraries.
Most of the files in my LRR library have Japanese filenames. I found that LRR behaved abnormally when I searched my library for a certain artist:
The search bar suggested two entries that look the same. However, when I applied each one as a filter, they gave me different search results, which suggests they are actually different strings even though they look identical.
I found that the former is ジ (U+30B8), while the latter is a combination of シ (U+30B7) and ゛ (U+309B), the Katakana-Hiragana Voiced Sound Mark, which is comparable to U+0303, the Combining Tilde, which looks like this -> ~.
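The difference can be reproduced with a short Python sketch using the standard `unicodedata` module. One assumption is labeled here: the decomposed string below uses the combining voiced sound mark U+3099, which is the form NFC merges into ジ. The spacing mark U+309B cited above has only a compatibility decomposition, so NFC leaves it unchanged. The helper name `code_points` is made up for this illustration.

```python
import unicodedata

PRECOMPOSED = "\u30b8"        # ジ as a single code point
DECOMPOSED = "\u30b7\u3099"   # シ + combining voiced sound mark (assumed form)
SPACING = "\u30b7\u309b"      # シ + spacing voiced sound mark, as cited in the post


def code_points(text: str) -> list:
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


if __name__ == "__main__":
    # The two strings render alike but compare as different.
    print(code_points(PRECOMPOSED), code_points(DECOMPOSED))
    print(PRECOMPOSED == DECOMPOSED)

    # After NFC normalization the decomposed form matches the precomposed one.
    print(unicodedata.normalize("NFC", DECOMPOSED) == PRECOMPOSED)

    # The spacing mark is not composed by NFC.
    print(unicodedata.normalize("NFC", SPACING) == SPACING)
```

This is why two suggestions that look identical in the search bar can still filter to different result sets: the comparison happens on code points, not on rendered glyphs.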
Because Unicode normalization is not supported, when the metadata crawler extracts metadata such as the title and tags from a filename, it uses whatever is in the filename without normalizing it, which causes the issue. It would be great if you could implement some form of Unicode normalization when extracting and editing metadata.
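To illustrate the request, here is a rough Python sketch of normalizing at extraction time. It is only an illustration under stated assumptions: `title_from_filename` is a hypothetical helper, not part of LRR, and it simply treats the file stem as the title.

```python
import unicodedata
from pathlib import PurePosixPath


def title_from_filename(path: str) -> str:
    """Derive a title from a file path and normalize it to NFC.

    Hypothetical helper: the title is taken to be the file stem
    (the filename without its extension), with surrounding
    whitespace stripped.
    """
    stem = PurePosixPath(path).stem
    return unicodedata.normalize("NFC", stem).strip()


if __name__ == "__main__":
    # Same visible name, stored once decomposed and once precomposed.
    a = title_from_filename("/library/\u30b7\u3099\u30e7\u30f3.zip")
    b = title_from_filename("/library/\u30b8\u30e7\u30f3.zip")
    print(a == b)
```

With normalization applied at this point, both files would yield the same title, so the search suggestions would no longer split into two look-alike entries.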