Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performer disambiguation and aliases #3113

Merged
merged 28 commits into from
Dec 1, 2022

Conversation

WithoutPants
Copy link
Collaborator

@WithoutPants WithoutPants commented Nov 10, 2022

Includes refactoring of the performer object model to use relationship fields in the same way as scenes etc.

Adds disambiguation field. This field is shown in parentheses next to the performer title. This field can be scraped.

image

Converts the Aliases field to be a list of strings, in the same way as tag aliases. Scraped aliases are assumed to be comma-delimited and are split accordingly. The migration splits aliases in the same way.

image

While performer name + disambiguation must be unique, aliases are not enforced as such.

Removes the checksum field.

Updates the auto tag logic to take aliases into account. Disambiguation is not currently used for auto-tagging.

Aliases are included when filtering with the performer selector, but only show (alias) suffix if the alias matches.

image

Resolves #2507
Resolves #1724 (I think)
Resolves #1162
Resolves #464

@WithoutPants WithoutPants added the feature Pull requests that add a new feature label Nov 10, 2022
@WithoutPants WithoutPants added this to the Version 0.18.0 milestone Nov 10, 2022
@echo6ix
Copy link
Contributor

echo6ix commented Nov 10, 2022

Two things.

I. Sorting. It's easier to parse aliases when they're case insensitive alphanumerically sorted. After each save of the performer profile automatically sort the list of aliases. This would also apply to the migration process when converting the comma delimited string of aliases into a list.

II. Bulk adding. The + button is fine for a handful of aliases, but it's not uncommon for some performers to have more than a handful of aliases. Entering every alias for these type of performers one line at a time would be tedious.

Possible solutions:

A. Add another button adjacent to the + button that displays a bulk add input form for a comma delimited list.

image

B. Pressing <enter> or <shift>+<enter> (if a single carriage return would cause too many false positives) in the alias input form automatically displays the next alias input form and moves the cursor to this new input form.

@bnkai
Copy link
Collaborator

bnkai commented Nov 10, 2022

There seems to be an issue when scraping a performer's alias (the existing aliases are not displayed)

Example
Performer has aliases
2022-11-10_20-26

When scraping no existing aliases are listed
2022-11-10_20-25

@WithoutPants
Copy link
Collaborator Author

I. Sorting. It's easier to parse aliases when they're case insensitive alphanumerically sorted. After each save of the performer profile automatically sort the list of aliases. This would also apply to the migration process when converting the comma delimited string of aliases into a list.

Thanks for the feedback. Performer aliases are now sorted in the edit panel.

II. Bulk adding. The + button is fine for a handful of aliases, but it's not uncommon for some performers to have more than a handful of aliases. Entering every alias for these type of performers one line at a time would be tedious.

Possible solutions:

A. Add another button adjacent to the + button that displays a bulk add input form for a comma delimited list.
B. Pressing <enter> or <shift>+<enter> (if a single carriage return would cause too many false positives) in the alias input form automatically displays the next alias input form and moves the cursor to this new input form.

Thanks for the suggestions. I've gone with an approach similar to solution B - please let me know what you think.

The string list input now always shows an empty text field at the end. When you start typing into the field, it automatically adds it to the list and creates a new empty field to tab into. The add button has been removed.

image
image

If you clear the last field, then it is removed from the list. Clearing a field that's not at the end will show an error feedback as per existing behaviour.

This should streamline the approach somewhat. Entering aliases involves entering a value, pressing tab twice, entering the next value and so on. It's not as streamlined as pasting in a comma-delimited list, but it is an improvement.

@echo6ix
Copy link
Contributor

echo6ix commented Nov 15, 2022

@WithoutPants Definitely a big improvement 👍

@WeedLordVegeta420
Copy link

WeedLordVegeta420 commented Nov 21, 2022

This is a huge improvement overall. It makes the performer aliases actually useful.

One minor issue is that when scraping/searching in the tagger view, the result doesn't automatically resolve names based on alias, you have to manually search. See example below - performer wasn't automatically resolved but can be found by manually searching.

image

@kermieisinthehouse
Copy link
Collaborator

I'm seeing an error while migrating in this branch:

error performing migration: UNIQUE constraint failed: performers.name in line 0: CREATE TABLE `performer_aliases` (
  `performer_id` integer NOT NULL,
  `alias` varchar(255) NOT NULL,
  foreign key(`performer_id`) references `performers`(`id`) on delete CASCADE,
  PRIMARY KEY(`performer_id`, `alias`)
);

CREATE INDEX `performer_aliases_alias` on `performer_aliases` (`alias`);

-- `performers`.`aliases` will be dropped in the post-migration

DROP INDEX `performers_checksum_unique`;
ALTER TABLE `performers` DROP COLUMN `checksum`;
ALTER TABLE `performers` ADD COLUMN `disambiguation` varchar(255);

CREATE UNIQUE INDEX `performers_name_disambiguation_unique` on `performers` (`name`, `disambiguation`) WHERE `disambiguation` IS NOT NULL;
CREATE UNIQUE INDEX `performers_name_unique` on `performers` (`name`) WHERE `disambiguation` IS NULL;

@kermieisinthehouse
Copy link
Collaborator

Above error is from having duplicate named performers, tied to #2344 .

There is another issue: filtering performers using the new "aliases" field throws a sql error.

@WeedLordVegeta420
Copy link

There is another issue: filtering performers using the new "aliases" field throws a sql error.

I see the same behavior. Looks like it's trying to do a join on the tags table instead of the performers table.

error executing count query with SQL: SELECT COUNT(*) as count FROM (SELECT DISTINCT performers.id FROM performers LEFT JOIN performer_aliases ON performer_aliases.performer_id = tags.id WHERE ((performer_aliases.alias LIKE ? OR performer_aliases.alias LIKE ?))) as temp, args: [%Arisa% %Takarada%], error: error executing `SELECT COUNT(*) as count FROM (SELECT DISTINCT performers.id FROM performers LEFT JOIN performer_aliases ON performer_aliases.performer_id = tags.id WHERE ((performer_aliases.alias LIKE ? OR performer_aliases.alias LIKE ?))) as temp` [[%Arisa% %Takarada%]]: no such column: tags.id

@kermieisinthehouse
Copy link
Collaborator

I think that the user should be able to mark aliases individually for "Do not autotag" - otherwise extremely common aliases like "Mary" can't be used.

@WithoutPants
Copy link
Collaborator Author

I think that the user should be able to mark aliases individually for "Do not autotag" - otherwise extremely common aliases like "Mary" can't be used.

This is a significant change and will mean bumping this from 0.18. The alternative is that we leave the aliases out of the auto-tag behaviour for now to address later.

@kermieisinthehouse
Copy link
Collaborator

Yes, I don't want to hold up the release for this - I think maintaining the current autotag behavior is probably a good middle ground, if that's an easy change

@WeedLordVegeta420
Copy link

I do think that leaving the aliases out of the auto-tag for now is probably the better move until there is an option to exclude them individually from auto-tag. Otherwise, people will inevitably upgrade to the latest release and run auto-tag without realizing the impact those one name aliases will have.

@kermieisinthehouse
Copy link
Collaborator

I can confirm that the most recent commits fix the duplicate performer issue!

@WithoutPants WithoutPants merged commit 4daf0a1 into stashapp:develop Dec 1, 2022
@WithoutPants
Copy link
Collaborator Author

$150 bounty placed (contribution no. 624029)

@WithoutPants WithoutPants added the bounty This issue has a bounty on it in the OpenCollective label Feb 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bounty This issue has a bounty on it in the OpenCollective feature Pull requests that add a new feature
Projects
None yet
5 participants