Continued retagging of the `albumtype` field (and potentially other fields) #4715

JOJ0 · 2023-03-17T07:46:37Z

After merging of #4582 there is a related issue remaining that was intentionally not addressed in that pull request.

We still see continued retagging of the albumtype (singular) field in certain cases.

Note: This bug is not about the ['a', 'l', ...] problem! This has been fixed in #4582 already! You might still have broken data in your database and thus see it. Follow this description to clean up your library: #4582 (comment)

The problem

The MusicBrainz API provides albumtypes and albumtype separately, but beets tries to derive the albumtype when reading files as being the first element of albumtypes. This can lead to the problem where beet write tries to modify the albumtype every time.

If an album has multiple types the MusicBrainz API seems to indicate that one of those is the "primary" type, and beets stores that in the db as the albumtype, but parsing the ID3 tag of the file it seems like it just uses the first element of albumtypes as the albumtype.

Writing the albumtype of those files doesn't actually write the file's tag, so beet write will always report trying to write it. It will update the timestamp of the file though, which could confuse other tools, making them "think" the file received relevant changes.

Reproduction of the problem

Timestamp of file is 10:34

$ ls -l "/Users/jojo/Music/dev-beets/Compilations/The Dark Side of Italo Disco/01 Body Heat (vocal).flac"
-rw-r--r--  1 jojo  staff  46857531 Dec 19 10:34 /Users/jojo/Music/dev-beets/Compilations/The Dark Side of Italo Disco/01 Body Heat (vocal).flac

Albumtype file tag is album / compilation:

$ mediainfo "/Users/jojo/Music/dev-beets/Compilations/The Dark Side of Italo Disco/01 Body Heat (vocal).flac" | grep -i type
Cover type                               : Cover (front)
RELEASETYPE                              : album / compilation
MUSICBRAINZ_ALBUMTYPE                    : album / compilation

which in reality is two values, each being set to a single tag:

$ mutagen-inspect "/Users/jojo/Music/dev-beets/Compilations/The Dark Side of Italo Disco/01 Body Heat (vocal).flac" | grep -i type
RELEASETYPE=album
RELEASETYPE=compilation
MUSICBRAINZ_ALBUMTYPE=album
MUSICBRAINZ_ALBUMTYPE=compilation

Now writing shows changes but they only show a single value instead of both values:

$ beet write body heat
Fockewulf 190 - The Dark Side of Italo Disco - Body Heat (vocal) - src:$data_source |Pop
  albumtype: album -> compilation

The file's timestamp gets updated:

$ ls -l "/Users/jojo/Music/dev-beets/Compilations/The Dark Side of Italo Disco/01 Body Heat (vocal).flac"
-rw-r--r--  1 jojo  staff  46857531 Dec 19 10:56 /Users/jojo/Music/dev-beets/Compilations/The Dark Side of Italo Disco/01 Body Heat (vocal).flac

The file's tag is still the same, beet write shouldn't have bothered to update the file!

$ mediainfo "/Users/jojo/Music/dev-beets/Compilations/The Dark Side of Italo Disco/01 Body Heat (vocal).flac" | grep -i type
Cover type                               : Cover (front)
RELEASETYPE                              : album / compilation
MUSICBRAINZ_ALBUMTYPE                    : album / compilation

$ mutagen-inspect "/Users/jojo/Music/dev-beets/Compilations/The Dark Side of Italo Disco/01 Body Heat (vocal).flac" | grep -i type
RELEASETYPE=album
RELEASETYPE=compilation
MUSICBRAINZ_ALBUMTYPE=album
MUSICBRAINZ_ALBUMTYPE=compilation

Setup

OS: ---
Python version: ---
beets version: 1.6.1 (dev)
Turning off plugins made problem go away (yes/no): no

My configuration (output of beet config) is:

not relevant

Related iterations of this bug

We might have a related issue if not exactly the same issue with other multi-valued fields (mb_albumartistids, ...) that have a single representation of that same field mb_albumartistid, ...). The logic in mediafile is identical. A detailed report of a user is found here: #5045

The text was updated successfully, but these errors were encountered:

judemille · 2023-03-18T03:20:28Z

Writing the albumtype of those files doesn't do anything

That's not entirely true. When beets does this rigmarole, it changes the file's timestamp, which will cause tools like Nextcloud to go through the whole procedure as if there's a whole new version of the file (Nextcloud, and WebDAV-based tools in general IIRC, do not allow incremental file updates.)

JOJ0 · 2023-03-18T09:53:41Z

Thanks a lot, important detail! I'll add that to the description later today!!

judemille · 2023-03-20T07:08:11Z

A potential hacky fix could be to copy the albumtype from the file on disk to the tags to be written, so the comparison thinks all is well. Perhaps I'll give that a shot.

EDIT: removed question about what files are affected: inspected mediafile code, conclusion: all files affected. Also changing proposed solution.

judemille · 2023-03-20T07:20:44Z

Can confirm that solution fixes the issue. Is it the best solution? Hell if I know. Does it stop the continued retagging? Yes.

E: Really, I think the best solution would be to make the code simply ignore the albumtype field in all cases where it would be trying to write to disk, not just this command, as well as not displaying the extraneous "changes" that would be patched out.

dr-waterstorm · 2023-03-22T08:13:10Z

Thank you @JOJ0 @judemille you were right about the issue with the release type. I did not update to GIT version, but only pulled the latest version from the Linux repo before. My apologies for not trying that :). That issue is fixed.

With the release types corrected, now I do observe the issue with he albumtype.
Beets keeps showing vast amounts of

albumtype: album -> compilation

and similar. Checking the file with ffmpeg for the tag still shows the old one, so it did not update the actual tag.

gamoeri · 2023-04-24T06:20:26Z

I am running into a similar issue - it does not update the albumtype, even if reimported.

ilmc888 · 2023-07-08T12:08:59Z

I have the same issue. Knowing next to nothing about the code but after some debugging I think the problem lies in mediafile.py (version 0.12 = same as master):

def update(self, dict):
        """Set all field values from a dictionary.

        For any key in `dict` that is also a field to store tags the
        method retrieves the corresponding value from `dict` and updates
        the `MediaFile`. If a key has the value `None`, the
        corresponding property is deleted from the `MediaFile`.
        """
        for field in self.sorted_fields():
            if field in dict:
                if dict[field] is None:
                    delattr(self, field)
                else:
                    setattr(self, field, dict[field])

When the albumtype tag is set correctly in this routine (to e.g., compilation) it gets immediately reverted back to album the next iteration of that loop which happen to set albumtypeS. My guess is that somehow mediafile overrides albumtype with the first entry of albumtypes again (by some operator overloading or such, haven't digged deeper yet).

Albumtype is also defined lower down in mediafile.py as a single field of albumtypes, which I assume will be the first entry of albumtypes:['album', 'compilation'].

    albumtypes = ListMediaField(
        MP3ListDescStorageStyle('MusicBrainz Album Type', split_v23=True),
        MP4ListStorageStyle('----:com.apple.iTunes:MusicBrainz Album Type'),
        ListStorageStyle('RELEASETYPE'),
        ListStorageStyle('MUSICBRAINZ_ALBUMTYPE'),
        ASFStorageStyle('MusicBrainz/Album Type'),
    )
    albumtype = albumtypes.single_field()

ilmc888 · 2023-07-08T13:16:41Z

Reading some other issues, the cause might already been known :-)

Is there any agreement on how to handle this? Just write albumtype as is, or check whether albumtype is in albumtypes? Both reading and writing.

JOJ0 · 2023-07-28T06:06:05Z

@ilmc888 I'd love to finally get to fixing this but I'm not yet sure what's the best way. If you feel like it help me brainstorm. The following ideas and I didn't check any of it in code:

If the tagging source is MusicBrainz we should always ignore the single albumtype field. It should always be written from the multi field albumtypes
With other source, like Discogs for example, there is no multi field albumtypes, there will always only be a single albumtype field. It should be written as-is / as received from the Discogs API

Take everything with a grain of salt and double check. Probably I'm just thinking out loud, being too lazy to read code atm (other music coding things in mind right now ;-)) I'd love to get help with tackling that issue, it's been long due....

ilmc888 · 2023-08-15T00:11:18Z

I might not be aware of all the intricacies yet, but I do understand that Musicbrainz has both albumtype and albumtypes information, but we can have only one tag 'MusicBrainz Album Type'?

Does it make sense to take a look at Musicbrainz Picard and how that information is handled there (just tag one track and ffprobe it)? Those are their own tags after all:
https://github.com/metabrainz/picard
https://picard-docs.musicbrainz.org/downloads/MusicBrainz_Picard_Tag_Map.html

In general I would just think about how it will be used by music player. If albumtype tag can / would be a list (or we make an extra albumtypes tag; if this is even possible), will this actually be used by music players? I suspect not (yet), so this would not be really practically useful?

On the other hand, I agree using the first entry of albumtypes might not be always informative (e.g., often just 'album' instead of 'live', 'compilation', etc.), so I understand why there was some logic included in beets to handle this. Still, I see this more as a flaw of Musicbrainz than anything else and wouldn't mind a one-on-one mapping ignoring albumtypes all together (until there is an accepted solution in the audio community). Including such logic will always be prone to errors (I believe I saw albumtypes of album / live / compilation / soundtrack, what would be the first one?).

Actually not sure if there is extra logic to extract one type from albumtypes? Does MB expose the primary release type (or can we find how they do it in the Picard code?)?

If the tagging source is MusicBrainz we should always ignore the single albumtype field. It should always be written from the multi field albumtypes

Well this seems like a solution to make the albumtype useful, but wasn't the problem that with a beet update you would then have to compare it to the internal albumtypes data (using the same approach to extract it). Otherwise you get the infinite update / write loop. Or just ignore the whole albumtypes in the database. On beet import / mbsync, use the MB albumtypes to extract a useful albumtype for the database and write it to the tag field (ignore the MB albumtype itself). A subsequent beet update will just compare the albumtype one-on-one and thus does not need any change.

This reply turned out longer than I expected, with likely a lot of (false) assumptions, but I hope it helps.

ilmc888 · 2023-08-15T00:32:43Z

I only had time to give it a brief look, so unsure if I'm looking at the right snippet but
https://github.com/metabrainz/picard/blob/3b20cb2f2d3fb1df14e7d69a96d9280e4f219c15/picard/mbjson.py#L607

Here I see that releasetype, which is presumably written as a tag (see id3.py), is a concatenation of primary and secondary releasetypes. What would be the problem if beets is doing the exact same thing?

And on beet update, if a MB id exist (or a albumtypes is in the DB) compare the MB albumtype tag, which is a list then (again, assumption Im making), to that same concatenation.

ilmc888 · 2023-08-17T22:02:48Z

@JOJ0 I'm probably overthinking things as it seems the only problem at the moment is writing the right tag to disk. In that case your solution will work fine. I think that change has to be in mediafile.py (?) as there a single album type is extracted from album types.

However, what about just changing the order of album types when pulling the data from MB. Reorder it so the most important type is in front (compilation, live, ...) instead of just album. In that case mediafile.py will just work as is (I think it simply extracts the first element)? Other sources will also just work, and the change is contained within the MB code itself?

JOJ0 · 2023-12-12T09:56:25Z

@arogl you could try quickfixing your issue #5042 by following my ideas in above comment. I still think that (at least) part of the problem could be solved like this: #4715 (comment)

So what I'm trying to say is: Try ignoring the (single) albumtype field in the musicbrainz autotag code and see if that changes anything.

We won't loose the information since musicbrainz also provides a multi field called albumtypes (I'm not sure of the name that comes from the MB api!) which will then be handled by mediafile when we write it to disk (#4715 (comment)). It could be though, that the useless writing of the albumtype file tag will still happen, regardless of the exclusion of the single albumtype field before (when getting it from the api and assigning it to the library objects) - I don't know - It would be super helpful to experiment with that!

Even a dirty quickfix would help us to draft a proper PR at some point!

celynw · 2024-01-03T23:44:03Z

Sorry to jump in, but this is driving me crazy so I thought I'd try something to help.
If I'm following correctly, I tried commenting these lines:

beets/beets/autotag/mb.py

Lines 555 to 558 in 296f01b

    
           if "type" in release["release-group"]: 
        
               reltype = release["release-group"]["type"] 
        
               if reltype: 
        
                   info.albumtype = reltype.lower()

Behaviour before:

$ beet ls -a -f '$album [$albumtype] [$albumtypes]'
Monstercat 001 - Launch Week [compilation] [album; compilation]
$ beet write
Arion - Monstercat 001 - Launch Week - Cold Blood & Ice Cream Cones
  albumtype: album -> compilation
$ beet write
Arion - Monstercat 001 - Launch Week - Cold Blood & Ice Cream Cones
  albumtype: album -> compilation

Behaviour after:

$ beet ls -a -f '$album [$albumtype] [$albumtypes]'
Monstercat 001 - Launch Week [album] [album; compilation]
$ beet write
$

So it fixes my infinite beet write loop.
I'm on the most recent beets commit, 296f01b (after version 1.6.1)

Based on mediafile, it seems like it's returning the first field for the singular albumtypes, which I also think is supposed to be the primary field in MusicBrainz.

Oh, and in both cases, metaflac tells me that there are two RELEASETYPE tags:

    comment[15]: RELEASETYPE=album
    comment[16]: RELEASETYPE=compilation

JOJ0 · 2024-01-04T13:52:38Z

Thanks @celynw yeah it helps.

I have something very similar in the queue, which basically just ignores albumtype and never uses it anywhere in beets in favor of albumtypes.

It's coded finished but too busy to file the PR this week but hopefully during the next month.

JOJ0 · 2024-03-14T09:30:02Z

Some notes on a very high level for this issue:

We need to decide on the source of truth for albumtypes and albumtype, I think it should be saved in the beets lib/db exclusively in one place, I suggest in the multifield albumtypes
If we want to derive a single albumtype (which quite a lot of plugins actually currently are using and requiring) we need to find a way to fetch it from albumtypes.
The current approach though, if I'm not mistaken, fetches albumtype from albumtypes on a very low level, i.e. it is implemented in beetbox/mediafile.git
Another problem of the current implementation is, that we save a single-valuealbumtype from various sources (Discogs, MusicBrainz, Spotify...) in the DB separately, which clashes with the information we have in the multi-field albumtypes
Maybe fetching albumtype from albumtypes should be handled on a higher level, i.e. on the beets library level / the database level.

Some of these issues I tried to address here: #5075

but I think that getting rid of albumtype (single-field) altogether is not the best solution. It breaks existing plugins that want to use it. We still need to find a working way of fetching a single (main) albumtype from albumtypes

JOJ0 mentioned this issue Mar 17, 2023

Store albumtypes multi-value field consistently in-DB & in-tag, preventing continual file re-tagging #4582

Merged

3 tasks

This comment has been minimized.

Sign in to view

JOJ0 mentioned this issue Oct 29, 2023

beet update messes up albumtype tags under Bash for Windows #4938

Closed

JOJ0 mentioned this issue Dec 19, 2023

beets cannot write the albumtype tags to the file metadata #5045

Closed

JOJ0 changed the title ~~Continued retagging of the albumtype field.~~ Continued retagging of the albumtype field (and potentially other fields) Dec 21, 2023

This was referenced Jan 10, 2024

An opinionated fix for "the albumtype(s) problem" #5075

Draft

Beets not updating certain tracks with albumartists (and other multi-tag fields) #5059

Open

andzno1 mentioned this issue Mar 27, 2024

Add ability to specify fields with beet write #3720

Open

andrewjmetzger mentioned this issue Jul 6, 2024

beet write continuously writes mb_artistid and mb_albumartistid (1.7.0_pre20240501) #5265

Open

daniele-athome added a commit to daniele-athome/beets that referenced this issue Aug 15, 2024

Patch for beetbox#4715

eb4f558

daniele-athome added a commit to daniele-athome/beets that referenced this issue Aug 15, 2024

Patch for beetbox#4715

c3aec0a

daniele-athome added a commit to daniele-athome/beets that referenced this issue Aug 15, 2024

Patch for beetbox#4715

61a74b5

daniele-athome added a commit to daniele-athome/beets that referenced this issue Aug 20, 2024

Patch for beetbox#4715

fe78fb7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Continued retagging of the `albumtype` field (and potentially other fields) #4715

Continued retagging of the `albumtype` field (and potentially other fields) #4715

JOJ0 commented Mar 17, 2023 •

edited

Loading

judemille commented Mar 18, 2023 •

edited

Loading

JOJ0 commented Mar 18, 2023

judemille commented Mar 20, 2023 •

edited

Loading

judemille commented Mar 20, 2023 •

edited

Loading

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

dr-waterstorm commented Mar 22, 2023 •

edited

Loading

gamoeri commented Apr 24, 2023

ilmc888 commented Jul 8, 2023 •

edited

Loading

ilmc888 commented Jul 8, 2023

JOJ0 commented Jul 28, 2023 •

edited

Loading

ilmc888 commented Aug 15, 2023

ilmc888 commented Aug 15, 2023

ilmc888 commented Aug 17, 2023

JOJ0 commented Dec 12, 2023 •

edited

Loading

celynw commented Jan 3, 2024 •

edited

Loading

JOJ0 commented Jan 4, 2024 •

edited

Loading

JOJ0 commented Mar 14, 2024 •

edited

Loading

Continued retagging of the albumtype field (and potentially other fields) #4715

Continued retagging of the albumtype field (and potentially other fields) #4715

Comments

JOJ0 commented Mar 17, 2023 • edited Loading

The problem

Reproduction of the problem

Setup

Related iterations of this bug

judemille commented Mar 18, 2023 • edited Loading

JOJ0 commented Mar 18, 2023

judemille commented Mar 20, 2023 • edited Loading

judemille commented Mar 20, 2023 • edited Loading

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

dr-waterstorm commented Mar 22, 2023 • edited Loading

gamoeri commented Apr 24, 2023

ilmc888 commented Jul 8, 2023 • edited Loading

ilmc888 commented Jul 8, 2023

JOJ0 commented Jul 28, 2023 • edited Loading

ilmc888 commented Aug 15, 2023

ilmc888 commented Aug 15, 2023

ilmc888 commented Aug 17, 2023

JOJ0 commented Dec 12, 2023 • edited Loading

celynw commented Jan 3, 2024 • edited Loading

JOJ0 commented Jan 4, 2024 • edited Loading

JOJ0 commented Mar 14, 2024 • edited Loading

Continued retagging of the `albumtype` field (and potentially other fields) #4715

Continued retagging of the `albumtype` field (and potentially other fields) #4715

JOJ0 commented Mar 17, 2023 •

edited

Loading

judemille commented Mar 18, 2023 •

edited

Loading

judemille commented Mar 20, 2023 •

edited

Loading

judemille commented Mar 20, 2023 •

edited

Loading

dr-waterstorm commented Mar 22, 2023 •

edited

Loading

ilmc888 commented Jul 8, 2023 •

edited

Loading

JOJ0 commented Jul 28, 2023 •

edited

Loading

JOJ0 commented Dec 12, 2023 •

edited

Loading

celynw commented Jan 3, 2024 •

edited

Loading

JOJ0 commented Jan 4, 2024 •

edited

Loading

JOJ0 commented Mar 14, 2024 •

edited

Loading