[Feature] Autotag Optimizations #2366
Autotag is currently too slow for me to run on my image library (just under 14 million images).
I'm requesting some specific optimizations so that autotag can be greatly sped up. Note that the old bulk autotag implementation (sqlite regex based) was much faster, but it was less configurable and used more memory.
In scope:
Not in scope:
I am unsure of how much time the tag narrowing strategy currently saves. Is tokenizing the string and querying the database really faster than just naively doing all regex comparisons, especially if we only compile them once during task lifetime?
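To make the question concrete, here is a rough Python sketch of the naive approach: compile one regex per tag a single time, then run every compiled pattern against every path. This is only an illustration. The function names `compile_tag_patterns` and `match_tags`, the separator characters, and the boundary rules are all assumptions of mine and are not taken from the actual autotag code.

```python
import re
from typing import Dict, Iterable, List, Pattern


def compile_tag_patterns(tag_names: Iterable[str]) -> Dict[str, Pattern[str]]:
    """Compile one case-insensitive pattern per tag, once per task lifetime.

    Assumed matching rule (hypothetical, for illustration only): the words of
    a tag name may be separated in the path by whitespace, '.', '_' or '-',
    and the match must not be surrounded by letters or digits.
    """
    patterns: Dict[str, Pattern[str]] = {}
    for name in tag_names:
        words = [re.escape(word) for word in name.split()]
        if not words:
            continue  # skip empty tag names
        body = r"[\s._-]*".join(words)
        patterns[name] = re.compile(
            rf"(?<![a-z0-9]){body}(?![a-z0-9])", re.IGNORECASE
        )
    return patterns


def match_tags(path: str, patterns: Dict[str, Pattern[str]]) -> List[str]:
    """Naively test every precompiled pattern against a single path."""
    return [name for name, pattern in patterns.items() if pattern.search(path)]


# Compile once, then reuse the same patterns for every path in the library.
patterns = compile_tag_patterns(["beach", "New York"])
for path in ["/photos/2020/new_york/beach-01.jpg", "/photos/beaches/img.jpg"]:
    print(path, match_tags(path, patterns))
```

The cost of this approach is (number of tags) x (number of paths) regex searches with no database round trips, so comparing it against the narrowing strategy would need a benchmark on a realistic library.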
This was partially started in #1927, but it is not necessary to reuse any of it.
I am willing to put a decent bounty on this issue, as it is somewhat large and very useful to me.