[Feature Request] Add Unicode normalization support #410

Closed
sirius422 opened this issue Feb 18, 2021 · 2 comments

sirius422 commented Feb 18, 2021

Before getting to the issue, I want to express my thanks. LRR liberated me from the hell of organizing my manga libraries.

Most of the files in my LRR library have filenames in Japanese. I found that LRR behaved abnormally when I searched my library for a certain artist:
[Screenshot: search bar suggestions]
The search bar suggested two entries that looked the same. However, when I applied each filter in turn, they gave me different search results, which suggests they were actually different strings even though they looked identical.
I found that the former is ジ (U+30B8), while the latter is a combination of シ (U+30B7) and ゛ (U+309B), the Katakana-Hiragana Voiced Sound Mark. This mark behaves like U+0303, the Combining Tilde, which looks like this -> ~.

Because LRR lacks support for Unicode normalization, when the metadata crawler extracts metadata such as the title and tags from the filename, it uses whatever is in the filename without normalizing it, which causes the issue. I would love it if you could implement some kind of Unicode normalization when extracting and editing metadata.
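
The mismatch described above can be reproduced in a few lines. This is a minimal Python sketch, not LRR code. It assumes the second character in the filename was the combining form U+3099, which is the variant that NFC composes with シ into ジ; the spacing mark U+309B only has a compatibility decomposition, so NFC leaves it alone.

```python
import unicodedata

# ジ as a single precomposed code point.
precomposed = "\u30b8"
# シ followed by the combining voiced sound mark (assumed to be U+3099).
decomposed = "\u30b7\u3099"


def nfc(text: str) -> str:
    """Return the NFC (canonical composition) form of text."""
    return unicodedata.normalize("NFC", text)


# The two strings render alike but are different code point sequences.
print(precomposed == decomposed)

# After NFC normalization both collapse to the same single code point.
print(nfc(precomposed) == nfc(decomposed))

# Listing the code points makes the difference visible.
print([f"U+{ord(ch):04X}" for ch in decomposed])
```

Comparing or indexing tags only after a step like `nfc` would make both search suggestions resolve to the same entry.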

sirius422 changed the title from "Add Unicode normalization support" to "[Feature Request] Add Unicode normalization support" on Feb 18, 2021

Difegue (Owner) commented Feb 18, 2021

Thank you for opening an issue for this! 👍

I plan to eventually add https://metacpan.org/pod/Unicode::Normalize to the dependencies and call it when saving tags and titles to the database.

Difegue added a commit that referenced this issue on Feb 23, 2021

Difegue (Owner) commented Feb 23, 2021

Found some time after #282 and #405 to get this done today: all writes to the database that were UTF-8 encoded are now also normalized to NFC.
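
As an illustration of normalizing on write, here is a minimal Python sketch. It is not the actual change, which lives in LRR's Perl code. The function names `to_nfc` and `save_metadata` are hypothetical, and a plain dict stands in for the real database.

```python
import unicodedata


def to_nfc(raw: bytes) -> str:
    """Decode UTF-8 bytes and normalize the resulting text to NFC."""
    return unicodedata.normalize("NFC", raw.decode("utf-8"))


def save_metadata(db: dict, archive_id: str, title: bytes, tags: bytes) -> None:
    """Store title and tags, normalizing both before they are written.

    `db` is a plain dict used as a stand-in for the real database (assumption).
    """
    db[archive_id] = {"title": to_nfc(title), "tags": to_nfc(tags)}


db = {}
# Input uses the decomposed form: シ (U+30B7) + combining mark (U+3099).
raw_title = "\u30b7\u3099".encode("utf-8")
raw_tags = "artist:\u30b7\u3099".encode("utf-8")
save_metadata(db, "archive-1", raw_title, raw_tags)

# The stored values now use the precomposed ジ (U+30B8).
print(db["archive-1"])
```

Normalizing at the single point where data is written means every later read and search sees one canonical form, whatever the filename contained.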

I've also added a temporary script plugin to fix existing databases by normalizing all their data to NFC:
[Screenshot of the script plugin]
So you should be able to fix the issue quickly once the new release drops. 🎉
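
A one-off repair pass of this kind could look roughly like the following Python sketch. It is only an illustration under assumptions: the name `normalize_database` is hypothetical, and the database is modeled as a dict of per-archive field dicts rather than LRR's real storage.

```python
import unicodedata


def normalize_database(db: dict) -> int:
    """Rewrite every stored string field as NFC; return how many fields changed."""
    changed = 0
    for fields in db.values():
        # Iterate over a snapshot so fields can be reassigned safely.
        for name, value in list(fields.items()):
            normalized = unicodedata.normalize("NFC", value)
            if normalized != value:
                fields[name] = normalized
                changed += 1
    return changed


# Stand-in data: one entry stored decomposed, one already in NFC.
sample_db = {
    "archive-1": {"title": "\u30b7\u3099", "tags": "artist:\u30b7\u3099"},
    "archive-2": {"title": "\u30b8", "tags": "artist:\u30b8"},
}

first_pass = normalize_database(sample_db)
second_pass = normalize_database(sample_db)
print(first_pass, second_pass)
```

Because NFC is idempotent, running such a pass a second time changes nothing, so it is safe to repeat.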

Difegue closed this as completed on Feb 23, 2021
Difegue added a commit that referenced this issue Mar 11, 2021
* [ImgBot] Optimize images

*Total -- 955.60kb -> 756.84kb (20.8%)

/tools/Documentation/.gitbook/assets/ratings.png -- 131.97kb -> 86.23kb (34.66%)
/tools/Documentation/.gitbook/assets/index.png -- 193.08kb -> 129.82kb (32.76%)
/tools/Documentation/.gitbook/assets/archive_thumb.jpg -- 630.55kb -> 540.78kb (14.24%)

Signed-off-by: ImgBotApp <[email protected]>

* Do not show page select when there are no results

* Also hide table header

* Update buildx action to the official Docker one

* Update s6

* Update less, the alpine package doesn't seem to have s6-overlay-preinit yet?

* pin s6 alpine package

* Rollback alpine base for now

* Fix ARM builds (#394)

* Update push-continous-delivery.yml

* Update release-delivery.yml

* Update push-continous-delivery.yml

* Update push-continous-delivery.yml

* Improve visibility when sorting table columns + fix being able to put whitespace as a custom column

* Fix context menu applying to overlays

* (#374) Remove cooldown on auto-plugin as it's basically useless

* Fix memory leak caused by the Parallel::Loops/Storable combo in the Search API

* Add some basic retrying logic on our first Redis connection
This avoids dying unnecessarily if Redis takes a while to load into memory.

* Accept "false" properly for pinned on category creation

* Also fix the update_cat endpoint

* Remove favtagmigration script

* Add a simple GET to /api/categories/:id

* (#335) Reading progression is now server-side!

* Fix method for progress in docs

* Fix docs thanks for nothing gitbook

* (#385) (#397) Remove the built-in Auto-Tag feature in favor of a Filename parsing plugin

* Fix tests

* Avoid uninitialized warnings if there's no progress/pagecount in DB

* Add a pageread stat and actually use the package.json description 'cause why not

* Remove unnecessary datatables cdn include

* (#282) Rewrite Shinobu filemap so it relies on Redis and keeps state between restarts

* (#405) Add job + api req to regen all thumbnails
Also remove redis loading timeout for REALLY BIG dataset

* (#410) NFC all the things

* Add fa-solid-900.woff2 to the vendor deps
so that browsers finally shut up about it, also added fa-regular

* Fix submenu arrow colors in the various themes

* autism

* Add an API endpoint to return which categories an ID belongs to

* (#375) Rework context menu so it uses the new endpoint to remove archives from categories

* More doc details on the reading progression API

* Some more documentation fixes

* MORE Documentation updates gee

* (#412) Add autofocus to the password input in login

* (#397) Add a check to RegexParse to avoid putting numbers as languages

* (#414) Fix regexparse not decoding the filesystem path + decode log in minion upload for extra clarity

* (#389) Update magick to v7 in homebrew

* backport changes from homebrew-core
those guys write ruby way better than I do

* Fix brew test runs

* (#389) Add libheif to Dockerfile and unlock avif/heif support

* Don't recommend homebrew for linux since it doesn't work out of the box atm

* Add Mojolicious::Plugin::Status when running in debug mode

* Update docs to fix wrong json examples

* Stop using a static secret for mojo's cookie signatures

* Remove the auto-plugin toggle and matching pref
It's a bit unnecessary considering plugins have to be toggled manually anyways.

* Rework Plugin Configuration page a bit further

* (#267) Make the thumbnail folder location an option

* [ImgBot] Optimize images (#416)

/tools/Documentation/.gitbook/assets/thumbchange.png -- 55.64kb -> 36.35kb (34.67%)

Signed-off-by: ImgBotApp <[email protected]>

Co-authored-by: ImgBotApp <[email protected]>

* Add some JS to migrate local reading progression

Co-authored-by: ImgBotApp <[email protected]>
Co-authored-by: Cirno the Strongest <[email protected]>
Co-authored-by: imgbot[bot] <31301654+imgbot[bot]@users.noreply.github.com>