Many (about 30%) of these author records have no associated work record. Those are low-hanging fruit that could simply be bulk removed.
More have only work records that are misattributed to the “author” with these publisher names and the “publisher” shown as "Independently Published", “CreateSpace” or the like. For these there is often another correct work record of similar title showing the correct authorship. Some heuristics might help with these.
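One such heuristic could be a fuzzy title comparison between the suspect work and other works in the catalog. The following is only a minimal sketch under stated assumptions: titles are already available as plain strings, and the function names and the 0.85 threshold are hypothetical choices, not anything taken from the Open Library codebase.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(title.lower().split())


def find_likely_duplicates(suspect_title, candidate_titles, threshold=0.85):
    """Return (title, score) pairs whose similarity to suspect_title meets
    the threshold, best match first.

    The threshold is an assumed starting point and would need tuning
    against real records.
    """
    target = normalize(suspect_title)
    scored = [
        (title, SequenceMatcher(None, target, normalize(title)).ratio())
        for title in candidate_titles
    ]
    matches = [pair for pair in scored if pair[1] >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical usage: compare a misattributed work against other known titles.
matches = find_likely_duplicates(
    "Quantenmechanik", ["quantenmechanik", "Cooking Basics"]
)
```

A match found this way is only a candidate for merging; a human or a second check (for example on ISBN) would still need to confirm it.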
A substantial group, however, are corporate authorships by publisher staff writers with no public attribution to an individual. This is particularly common in bibliographies, reference works, study notes, and textbooks.
Suggestions?
Thank you. A few are real publishers with human names, sometimes shown where the author is unknown or staff, but in most cases you’ll want a chainsaw, not pruning shears.
This is hard going. I have removed about 2k empty publisher-as-author records that had already been cleared out and had no editions or works assigned to them, and I have removed many entirely non-book publishers along with their works, each of which had between a dozen and hundreds of junk non-book records (sometimes repeated over and over again).
This leaves many hundreds of junk publisher names with only one or two non-book items, interspersed with more legitimate publisher-recorded-as-author entries, where some kind of cleanup other than simply deleting junk is probably required. I've cleared out as much as I can easily do in an automated sweep.
I'll think about what can be done with what remains; it will probably be more a matter of generally identifying obvious non-book items so they can be removed completely.
There are tens of thousands of bogus author records with names * Publishing or * Books. Somewhat fewer with * Editions and other-language equivalents.
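To make the name patterns above concrete, here is a rough sketch of how candidate records might be flagged. The suffix list, the optional company-form suffixes, and the function name are all assumptions for illustration; "verlag" and "ediciones" stand in for the other-language equivalents, which are not enumerated in this issue.

```python
import re

# Assumed suffix list: "publishing", "books" and "editions" come from the
# patterns described above; "verlag" and "ediciones" are illustrative guesses
# at other-language equivalents.
PUBLISHER_SUFFIXES = ("publishing", "books", "editions", "verlag", "ediciones")

# Match a publisher-like word at the end of the name, optionally followed by
# a company form such as "LLC" (also an assumed list).
_PUBLISHER_PATTERN = re.compile(
    r"\b(?:%s)\b(?:\s+(?:llc|inc|ltd|gmbh))?\.?\s*$" % "|".join(PUBLISHER_SUFFIXES),
    re.IGNORECASE,
)


def looks_like_publisher(name: str) -> bool:
    """Return True if an author name ends in a publisher-like suffix."""
    return bool(_PUBLISHER_PATTERN.search(name.strip()))


# Hypothetical usage on a few author names.
flagged = [
    n
    for n in ["Perseus Books Perseus Books LLC", "Acme Publishing", "Jane Austen"]
    if looks_like_publisher(n)
]
```

A check like this would only nominate records for review. Some matches will be legitimate corporate authors or real people, so it should not drive deletion on its own.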
Many originate with the import of low-quality records from BWB (Better World Books) or AMZ (Amazon), such as https://www.betterworldbooks.com/product/detail/9783110367737
which was imported as
https://openlibrary.org/books/OL34526350M/Quantenmechanik
where the authors include
https://openlibrary.org/authors/OL9711355A/Perseus_Books_Perseus_Books_LLC.