Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty EA metadata leads to assert in ad_header_read_ea() #368

Closed
PuffyRainbowCloud opened this issue Aug 8, 2023 · 57 comments
Closed

Empty EA metadata leads to assert in ad_header_read_ea() #368

PuffyRainbowCloud opened this issue Aug 8, 2023 · 57 comments

Comments

@PuffyRainbowCloud
Copy link

I ran my server with Netatalk for over a year with no issue. Suddenly, after an update, it stopped working. At the time I didn't have much knowledge (I'm still a beginner) about running Ubuntu Server and after a reinstall of Netatalk didn't help I left it.
Today, I decided to look into it again. So far, I've been able to run systemctl status netatalk and saw the following in the log:
netatalk.service: Can't open PID file /var/lock/netatalk (yet?) after start: Operation not permitted
This is the only thing I know how to see that looks like an error. Any ideas?

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

@PuffyRainbowCloud Please share a few hundreds of lines of syslog from when you're attempting to start up netatalk. F.e. do this on the terminal: sudo journalctl -f -n 200. This should tell us what's actually going on.

The lockfile message is a red herring, I think. If I'm not mistaken it's a temporary state that resolves itself. If you see it repeatedly while netatalk is running you may want to check the permissions of that lockfile and make sure it's readable by everyone.

@PuffyRainbowCloud
Copy link
Author

PuffyRainbowCloud commented Aug 9, 2023

@PuffyRainbowCloud Please share a few hundreds of lines of syslog from when you're attempting to start up netatalk. F.e. do this on the terminal: sudo journalctl -f -n 200. This should tell us what's actually going on.

The lockfile message is a red herring, I think. If I'm not mistaken it's a temporary state that resolves itself. If you see it repeatedly while netatalk is running you may want to check the permissions of that lockfile and make sure it's readable by everyone.

So, I have an interesting update:
Initially, I tried installing via apt and following this guide: Netatalk 3.1 Manual but after following this guide: Install Netatalk 3.1.12 on Ubuntu 18.10 Cosmic it sorta works, adjusting for Netatalk release 3.1.15 and Tracker 3.0. I can't connect via AFP but the SAMBA shares now show up in Finder and no longer have to be connected to manually. I don't really understand why.

I believe I managed to comply with your request by running sudo systemctl restart netatalk && sudo journalctl -f -n 200

The output consists mainly of some error from deluged. This is everything about Netatalk:
Aug 09 01:42:39 servercluwub systemd[1]: Starting Netatalk AFP fileserver for Macintosh clients...
Aug 09 01:42:39 servercluwub systemd[1]: netatalk.service: Can't open PID file /var/lock/netatalk (yet?) after start: Operation not permitted
Aug 09 01:42:39 servercluwub systemd[1]: netatalk.service: Supervising process 111362 which is not our child. We'll most likely not notice when it exits.
Aug 09 01:42:39 servercluwub netatalk[111362]: Netatalk AFP server starting
Aug 09 01:42:39 servercluwub netatalk[111362]: Registered with Zeroconf
Aug 09 01:42:39 servercluwub systemd[1]: Started Netatalk AFP fileserver for Macintosh clients.
Aug 09 01:42:39 servercluwub sudo[111354]: pam_unix(sudo:session): session closed for user root
Aug 09 01:42:39 servercluwub cnid_metad[111364]: CNID Server listening on localhost:4700
Aug 09 01:42:39 servercluwub sudo[111365]: alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/usr/bin/journalctl -f -n 200
Aug 09 01:42:39 servercluwub sudo[111365]: pam_unix(sudo:session): session opened for user root(uid=0) by alice(uid=1000)
Aug 09 01:42:39 servercluwub afpd[111363]: Netatalk AFP/TCP listening on 192.168.1.3:548

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

One way to test if the AFP server is running is to do asip-status.pl localhost on the terminal. If the netatalk server is working you should get a bunch of info about the server.

$ asip-status.pl localhost
AFP reply from localhost:548 via IPv4
Flags: 1  Cmd: 3  ID: 57005
Reply: DSIGetStatus
Request ID: 57005
Machine type: Netatal3.1.99dev
AFP versions: AFP2.2,AFPX03,AFP3.1,AFP3.2,AFP3.3,AFP3.4
UAMs: DHX2,Cleartxt Passwrd,No User Authent
Volume Icon & Mask: Yes
Flags:
    SupportsCopyFile
    SupportsServerMessages
    SupportsServerSignature
    SupportsTCP/IP
    SupportsSrvrNotifications
    SupportsOpenDirectory
    SupportsUTF8Servername
    SupportsUUIDs
    SupportsExtSleep
    SupportsSuperClient
Server name: buntu
Signature:
d5 fa ae dd 05 2e a4 8a a9 31 a3 03 0d c9 fe 5b  .........1.....[

Network address: w.x.y.z (IPv4 address)

Also, please note that if you went through the guide for installing from source, the resulting netatalk daemon will be reading from configuration files in a different location than the deb netatalk. This is most likely why you're not seeing any AFP shares on your Mac. To check which afp.conf file it's reading you can do netatalk -V which should get you something like:

$ netatalk -V
netatalk 3.1.99dev - Netatalk AFP server service controller daemon

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version. Please see the file COPYING for further information and details.

netatalk has been compiled with support for these features:

      Zeroconf support: Avahi
     Spotlight support: Yes

                  afpd: /usr/local/sbin/afpd
            cnid_metad: /usr/local/sbin/cnid_metad
       tracker manager: /usr/bin/tracker3 daemon
           dbus-daemon: /usr/bin/dbus-daemon
              afp.conf: /usr/local/etc/afp.conf
     dbus-session.conf: /usr/local/etc/dbus-session.conf
    netatalk lock file: /var/lock/netatalk

Then make sure that your configurations are reflected in /usr/local/etc/afp.conf with the example above.

@PuffyRainbowCloud
Copy link
Author

Alright, running that gives me this:

AFP reply from localhost:548 via IPv4
Flags: 1 Cmd: 3 ID: 57005
Reply: DSIGetStatus
Request ID: 57005
Machine type: Netatalk3.1.15
AFP versions: AFP2.2,AFPX03,AFP3.1,AFP3.2,AFP3.3,AFP3.4
UAMs: DHX2,DHCAST128
Volume Icon & Mask: Yes
Flags:
SupportsCopyFile
SupportsServerMessages
SupportsServerSignature
SupportsTCP/IP
SupportsSrvrNotifications
SupportsOpenDirectory
SupportsUTF8Servername
SupportsUUIDs
SupportsExtSleep
SupportsSuperClient
Server name: servercluwub
Signature:
43 3b bd 46 85 3d ac a1 ef f6 a4 70 01 b2 75 06 C;.F.=.....p..u.

Network address: 192.168.1.3 (IPv4 address)
UTF8 Servername: servercluwub

Which looks promising, right?
However, I've already set up afp.conf according to the guide I linked, in the correct location, and that doesn't seem to change anything.

@PuffyRainbowCloud
Copy link
Author

Hold several horses! I removed the semicolons from the lines in apf.conf and now I do see shares when I connect manually vi Go > Connect to server... and I can successfully connect to and mount them via AFP. However, the shares that show up in finder are still SMB rather than AFP.

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

Is Zeroconf support enabled in netatalk?

@PuffyRainbowCloud
Copy link
Author

I'm not sure. How would I find that out?

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

If you look at my earlier comment you can see the sample output for netatalk -V complete with Zeroconf support.

@PuffyRainbowCloud
Copy link
Author

PuffyRainbowCloud commented Aug 9, 2023

Hm. Indeed, despite following the guide, which should set that up, I get this:

$ netatalk -V
netatalk 3.1.15 - Netatalk AFP server service controller daemon

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version. Please see the file COPYING for further information and details.

netatalk has been compiled with support for these features:

  Zeroconf support:	No
 Spotlight support:	Yes

              afpd:	/usr/local/sbin/afpd
        cnid_metad:	/usr/local/sbin/cnid_metad
   tracker manager:	/usr/bin/tracker3 daemon
       dbus-daemon:	/usr/bin/dbus-daemon
          afp.conf:	/usr/local/etc/afp.conf
 dbus-session.conf:	/usr/local/etc/dbus-session.conf
netatalk lock file:	/var/lock/netatalk

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

Well there's your problem right there. So either Zeroconf support wasn't built into netatalk (i.e. libraries not found at configuration time) or is dynamically disabled (i.e. turned off in afp.conf).

First you can try to explicitly enable Zeroconf in afp.conf (zeroconf = yes) although I don't think that will make a difference.

As a next step run the configure script again with the same parameters you used previously, and then observe the summary output at the end. Under "Options" you should see something like:

    Options:
         Zeroconf support:        yes
         tcp wrapper support:     no
         quota support:           no
         valid shell check:       yes
         cracklib support:        no
         ACL support:             auto
         Kerberos support:        auto
         LDAP support:            no
         AFP stats via dbus:      yes
         dtrace probes:           no

@PuffyRainbowCloud
Copy link
Author

I figured out what happened. I made a type so I literally didn't install libavahi-client-dev. I can't believe I missed the error message on that. Note to self: copy package names, even when they're listed instead of being a copy-paste-able command. Double check apt. A configure now shows these options:

Options:
     Zeroconf support:        yes
     tcp wrapper support:     no
     quota support:           no
     valid shell check:       yes
     cracklib support:        yes
     ACL support:             auto
     Kerberos support:        yes
     LDAP support:            no
     AFP stats via dbus:      yes
     dtrace probes:           no

I'm gonna go ahead and install this and see how that goes.

@PuffyRainbowCloud
Copy link
Author

That did it. I feel immensely silly now. I guess I will forever wonder why my original install stopped working but at least I don't have to worry about it for now. If the problem returns in the future I'll have to make a new issue and hopefully troubleshoot it properly.

Is there anything else I should think about or know before closing the issue?

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

Well I hope it was a learning experience for you at least!

I'm a little bit concerned about your original issue. My initial guess is that it was the same as #236 but if updating to 3.1.15 proper solved the problem then it may have been something else.

@PuffyRainbowCloud
Copy link
Author

It certainly was! I have very little experience when it comes to compiling software so every opportunity to mess around with that is good. I'm going to keep my eyes peeled for further issues, definitely. Thank you for all of your help!

@PuffyRainbowCloud
Copy link
Author

Alright. So, some further information that's arisen since I thought this was fixed yesterday:
I ended up discovering that this wasn't fixed at all. I tried changing permissions on that PID file again and... It works. The error message didn't appear the first time I ran netatalk after changing permissions and I can not only access the share but open files too. This has, however, exposed this error message which occurs every time I access a folder (even if it's accessed successfully):

Aug 09 19:26:10 servercluwub afpd[2860]: ad_header_read_ea("/media/alice/Filserver"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Aug 09 19:26:10 servercluwub afpd[2860]: ad_open_hf_ea: unexpected: Invalid argument

I was aware of this error message and tried to research it yesterday but didn't come up with a solution. I've tried to rebuild the CNID databases for my shares but that didn't make a difference. What's weird is that the share seems to work now, despite the PID error message returning after further restarts of the service as well as these EA errors. Not quite sure what to make of it.

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

This is what I expected. It means your issue is a duplicate with #236 as I first thought.

If you have some time to spare for troubleshooting, I have a PR at #363 with additional logging to help pinpoint which metadata is causing trouble. I have not been able to reproduce the issue myself. It seems to be caused by the root of the volume having EA metadata that was created with an earlier version of netatalk, which is now treated as invalid.

Also, please run getfaddr on the volume root dir and share the output here so that we have a record of what the metadata was before we start tinkering with it.

@PuffyRainbowCloud
Copy link
Author

Fascinating!

I have all of the time in the world but very little knowledge or experience so I'm gonna need some handholding. When you say you have a PR, what does that mean and what am I expected to do to help?

I don't seem to have that command nor can i find a package by that name using apt search. Is it part of another package?

@PuffyRainbowCloud
Copy link
Author

Also, I have again lost the ability to stay connected and access files. When trying to log on it fails and I now get messages like these in journalctl:

Aug 09 19:45:23 servercluwub afpd[3748]: afp_disconnect: primary reconnect failed
Aug 09 19:45:23 servercluwub afpd[3748]: ad_header_read_ea("/home/alice"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Aug 09 19:45:23 servercluwub afpd[3748]: ad_open_hf_ea: unexpected: Invalid argument
Aug 09 19:45:23 servercluwub afpd[3748]: ===============================================================
Aug 09 19:45:23 servercluwub afpd[3748]: INTERNAL ERROR: Signal 11 in pid 3748 (3.1.15)
Aug 09 19:45:23 servercluwub afpd[3748]: ===============================================================
Aug 09 19:45:23 servercluwub afpd[3748]: PANIC: internal error
Aug 09 19:45:23 servercluwub afpd[3748]: BACKTRACE: 11 stack frames:
Aug 09 19:45:23 servercluwub afpd[3748]: #0 /usr/local/lib/libatalk.so.18(netatalk_panic+0x39) [0x7f276acdf139]
Aug 09 19:45:23 servercluwub afpd[3748]: #1 /usr/local/lib/libatalk.so.18(+0x3828e) [0x7f276acdf28e]
Aug 09 19:45:23 servercluwub afpd[3748]: #2 /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f276a67f520]
Aug 09 19:45:23 servercluwub afpd[3748]: #3 /lib/x86_64-linux-gnu/libc.so.6(+0x17c818) [0x7f276a7b9818]
Aug 09 19:45:23 servercluwub afpd[3748]: #4 /usr/local/lib/libatalk.so.18(ad_copy_header+0x6c) [0x7f276acbc34c]
Aug 09 19:45:23 servercluwub afpd[3748]: #5 /usr/local/sbin/afpd(afp_exchangefiles+0x726) [0x55d596ae7926]
Aug 09 19:45:23 servercluwub afpd[3748]: #6 /usr/local/sbin/afpd(afp_over_dsi+0x51d) [0x55d596ad180d]
Aug 09 19:45:23 servercluwub afpd[3748]: #7 /usr/local/sbin/afpd(main+0x9ae) [0x55d596acf7fe]
Aug 09 19:45:23 servercluwub afpd[3748]: #8 /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f276a666d90]
Aug 09 19:45:23 servercluwub afpd[3748]: #9 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f276a666e40]
Aug 09 19:45:23 servercluwub afpd[3748]: #10 /usr/local/sbin/afpd(_start+0x25) [0x55d596acfb45]
Aug 09 19:45:25 servercluwub afpd[3761]: Login by alice (AFP3.4)
Aug 09 19:45:25 servercluwub afpd[3761]: afp_disconnect: trying primary reconnect
Aug 09 19:45:25 servercluwub afpd[3297]: Reconnect: no child[3748]
Aug 09 19:45:30 servercluwub afpd[3761]: afp_disconnect: primary reconnect failed
Aug 09 19:45:30 servercluwub afpd[3761]: ad_header_read_ea("/home/alice"): invalid metadata EA this is now being treated as a fatal error. if you see this log entry, please file a bug ticket with your upstream vendor and attach the generated core file.
Aug 09 19:45:30 servercluwub afpd[3761]: ad_open_hf_ea: unexpected: Invalid argument
Aug 09 19:45:30 servercluwub afpd[3761]: ===============================================================
Aug 09 19:45:30 servercluwub afpd[3761]: INTERNAL ERROR: Signal 11 in pid 3761 (3.1.15)
Aug 09 19:45:30 servercluwub afpd[3761]: ===============================================================
Aug 09 19:45:30 servercluwub afpd[3761]: PANIC: internal error
Aug 09 19:45:30 servercluwub afpd[3761]: BACKTRACE: 11 stack frames:
Aug 09 19:45:30 servercluwub afpd[3761]: #0 /usr/local/lib/libatalk.so.18(netatalk_panic+0x39) [0x7f276acdf139]
Aug 09 19:45:30 servercluwub afpd[3761]: #1 /usr/local/lib/libatalk.so.18(+0x3828e) [0x7f276acdf28e]
Aug 09 19:45:30 servercluwub afpd[3761]: #2 /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f276a67f520]
Aug 09 19:45:30 servercluwub afpd[3761]: #3 /lib/x86_64-linux-gnu/libc.so.6(+0x17c818) [0x7f276a7b9818]
Aug 09 19:45:30 servercluwub afpd[3761]: #4 /usr/local/lib/libatalk.so.18(ad_copy_header+0x6c) [0x7f276acbc34c]
Aug 09 19:45:30 servercluwub afpd[3761]: #5 /usr/local/sbin/afpd(afp_exchangefiles+0x726) [0x55d596ae7926]
Aug 09 19:45:30 servercluwub afpd[3761]: #6 /usr/local/sbin/afpd(afp_over_dsi+0x51d) [0x55d596ad180d]
Aug 09 19:45:30 servercluwub afpd[3761]: #7 /usr/local/sbin/afpd(main+0x9ae) [0x55d596acf7fe]
Aug 09 19:45:30 servercluwub afpd[3761]: #8 /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f276a666d90]
Aug 09 19:45:30 servercluwub afpd[3761]: #9 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f276a666e40]
Aug 09 19:45:30 servercluwub afpd[3761]: #10 /usr/local/sbin/afpd(_start+0x25) [0x55d596acfb45]
Aug 09 19:45:31 servercluwub afpd[3765]: Login by alice (AFP3.4)
Aug 09 19:45:31 servercluwub afpd[3765]: afp_disconnect: trying primary reconnect
Aug 09 19:45:31 servercluwub afpd[3297]: Reconnect: no child[3761]

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

Great, thanks for offering to help out! If you look at the PR that I linked to, it says that the git branch name is "rdmark-issue-236". Therefore you want to check out this branch. Do something like git checkout rdmark-issue-236. Then go through the configuration and build steps as you did before.

@PuffyRainbowCloud
Copy link
Author

I assume that I need to run sudo make uninstall in my previous folder first?

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

That is not necessary. However, you do want to make sure netatalk is not running when you install the new version. So if you're running it through the systemd service, do sudo systemctl stop netatalk first.

@PuffyRainbowCloud
Copy link
Author

Alright. I should have your version set up now. What do we need to do next?

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

The same steps as before: Have journalctl running to see the logs (or use any other means to collect syslog) and then go through the steps to start up and try to use netatalk which leads to the error.

@PuffyRainbowCloud
Copy link
Author

Alright. I'm running it now and actively logging everything from the netatalk unit. I'm not seeing anything different in the log so far (it hasn't stopped working yet). Is that normal?

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

You're saying that you're not getting the internal error with stacktrace again? How about invalid metadata EA messages?

Please share your logs so that I can see what they say.

@PuffyRainbowCloud
Copy link
Author

I'll leave the exported terminal log here.
Terminal Saved Output.txt

@rdmark
Copy link
Member

rdmark commented Aug 9, 2023

Hmm something must have gone wrong when you installed it. This log still has the "please file a bug ticket with your upstream vendor" error message which I removed in the branch with added logging. You should not be seeing this anymore, but rather a similar message that tells us exactly which type of metadata is affected.

Are you sure you stopped all running instances of netatalk before you installed?

@PuffyRainbowCloud
Copy link
Author

I executed sudo systemctl stop netatalk before configuring, making, and installing. Could I have gotten the download wrong somehow? git checkout rdmark-issue-236 gave me fatal: not a git repository (or any of the parent directories): .git so I went to the link of your fork, downloaded the release from there, and continued from there. Was that incorrect?

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

Wait a minute, can you please hold off on testing for a moment. I realized an inconsistency in the code.

@PuffyRainbowCloud
Copy link
Author

I will standby until further notice.

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

Ready to test now!

What I spotted was that 0 entry length is checked in ad_entry_check_size() and treated as an allowed case. After that, the execution returns to ad_entry() where it checks for 0 entry length again where it's treated as a failure and triggers the assert later on.

I think allowing 0 entry length is correct. So I changed the latter check to only check for 0 offset.

@PuffyRainbowCloud
Copy link
Author

Good and bad news!
The errors we have been looking at are gone, which we expected. However, this time the crash happened during testing. Specifically, I was closing a video file which was opened in mpv on one of my two shares. The share containing the file was unmounted from the computer on its own and the other share froze Finder and Firefox when I attempted to eject it. Journalctl is still spitting out the same error message as is in the included log at the time of writing.
netatalk rdmark-issue-236 build log 2023-08-10_0406.txt
netatalk journalctl 2023-08-10_0410.txt

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

Alright, now you have run into the other related bug which is reported in #357 !

This bug is essentially the inverse: it's trying to write metadata headers but get a zero buffer size. This one I can reproduce on my end so it should only be a matter of time (hopefully).

BTW I pushed a small change to silence the noisy "got_len entry present but empty" message. It's now maxdebug level so should only show up when you absolutely need it.

@PuffyRainbowCloud
Copy link
Author

Good luck! Let me know if you need anything else. Since we've now solved the issue in the title, should I close this issue or leave it open?

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

Let's keep it open until the fix is merged.

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

A tentative fix for the latter issue now in the same branch. Please give it a whirl when you have a chance!

There may be unintended consequences from the fix, so please try moving / copying / executing various files!

Basically, the nature of the fix is that I now check for the buffer in the destination object before attempting to copy EA metadata, and skip the process if the length is zero.

@PuffyRainbowCloud
Copy link
Author

I'll leave it running today and do some copying and executing whenever I would anyway, logging all of it with journalctl. So far it has spammed these messages when playing back a symlinked video file:

Aug 10 09:13:57 servercluwub afpd[76073]: ad_header_read([symlink-path]): malformed AppleDouble
Aug 10 09:13:57 servercluwub afpd[76073]: ad_header_read_osx(rfpath, ad, &st) failed: Input/output error
Aug 10 09:13:57 servercluwub afpd[76073]: afp_openfork(symlink-path): ad_open: Input/output error```

Playing the original video files gives no errors.

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

Apart from the log messages, are there any interruptions to the service at all? No crashes with stack traces?

From what I've learned, netatalk handling of symlinks has a long history of bugs and edge cases. There's an option in apf.conf "follow symlinks" where you can turn this on and off.

In fact, looking at this recent change 477af53 we should not attempt to read AD headers through symlinks in the first place, so the proper fix might be to add some more conditions to stop this from happening here.

@PuffyRainbowCloud
Copy link
Author

I haven't experienced any issues. I decided a decent method for stress testing was batch conversion of video files since that involves both reading and writing a lot of data, so I've been running ffmpeg on folders of files for hours and it hasn't crashed yet.

While I'm not very well versed in this type of work, what you're suggesting sounds reasonable to me.

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

I've pushed some major changes to the branch now. When you have a moment to spare, can you please see what the logs says now about that symlinked video and malformed AppleDouble?

Also, are you able to share details on how your symlinks are set up? I tried playing around with them locally but couldn't reproduce your issue.

@PuffyRainbowCloud
Copy link
Author

Absolutely! I've included build output and journalctl output as before.

As for the symlinks, I created them using a script that ChatGPT wrote for me to automate the process. I've included a copy of this script. The symlink and the original file are in separate folders on the same drive. The symlink has been renamed. That's all I really know to tell. Regrettably, I cannot test the exact same symlink as it no longer exists. However, I did try a bunch of other symlinks that were created using the same tool, from the same parent folder. I also created a symlink of that same file, using the same tool, renamed it, and played it back. As you can see, there is no error in the log. I wish I had that original symlink but regrettably it was absentmindedly overwritten as part of the converting I did to stress test earlier.
journalctl netatalk 2023-08-11_0119.txt
netatalk rdmark-issue-236 build output 2023-08-11_0111.txt
generate_symlinks.txt

I want to add that that was the only symlink that threw that error during my testing earlier. Every other one, even in the same folder, showed no errors.

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

Thanks for this. Looking at the script, you basically create symlinks in a parallel directory. And then, you say you launch the symlink to play back the video?

@PuffyRainbowCloud
Copy link
Author

Basically, yes. I do this so I can rename them for Jellyfin's sake without having to make a copy and without having to modify original file names which another application need to read as they originally were.

@rdmark
Copy link
Member

rdmark commented Aug 10, 2023

How are the video files encoded, and which client application do you use for playback? (in case the client is silently writing something back to the file meta data)

@PuffyRainbowCloud
Copy link
Author

PuffyRainbowCloud commented Aug 11, 2023

I was playing the file using mpv. ffprobe on the original file returns:

  Metadata:
    encoder         : libebml v1.4.2 + libmatroska v1.6.4
    creation_time   : 2023-04-20T14:12:54.000000Z
  Duration: 00:11:03.52, start: 0.000000, bitrate: 9645 kb/s
  Stream #0:0: Video: h264 (High), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 1k tbn (default)
    Metadata:
      BPS             : 8235708
      DURATION        : 00:11:03.440000000
      NUMBER_OF_FRAMES: 16586
      NUMBER_OF_BYTES : 682987275
      _STATISTICS_WRITING_APP: mkvmerge v65.0.0 ('Too Much') 64-bit
      _STATISTICS_WRITING_DATE_UTC: 2023-04-20 14:12:54
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
  Stream #0:1(eng): Audio: eac3, 48000 Hz, stereo, fltp, 256 kb/s (default)
    Metadata:
      BPS             : 256000
      DURATION        : 00:11:03.424000000
      NUMBER_OF_FRAMES: 20732
      NUMBER_OF_BYTES : 21229568
      _STATISTICS_WRITING_APP: mkvmerge v65.0.0 ('Too Much') 64-bit
      _STATISTICS_WRITING_DATE_UTC: 2023-04-20 14:12:54
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
  Stream #0:2(fin): Audio: eac3, 48000 Hz, stereo, fltp, 256 kb/s (default)
    Metadata:
      BPS             : 256000
      DURATION        : 00:11:03.424000000
      NUMBER_OF_FRAMES: 20732
      NUMBER_OF_BYTES : 21229568
      _STATISTICS_WRITING_APP: mkvmerge v65.0.0 ('Too Much') 64-bit
      _STATISTICS_WRITING_DATE_UTC: 2023-04-20 14:12:54
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
  Stream #0:3(nor): Audio: eac3, 48000 Hz, stereo, fltp, 256 kb/s (default)
    Metadata:
      BPS             : 256000
      DURATION        : 00:11:03.424000000
      NUMBER_OF_FRAMES: 20732
      NUMBER_OF_BYTES : 21229568
      _STATISTICS_WRITING_APP: mkvmerge v65.0.0 ('Too Much') 64-bit
      _STATISTICS_WRITING_DATE_UTC: 2023-04-20 14:12:54
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
  Stream #0:4(swe): Audio: eac3, 48000 Hz, stereo, fltp, 256 kb/s (default)
    Metadata:
      BPS             : 256000
      DURATION        : 00:11:03.424000000
      NUMBER_OF_FRAMES: 20732
      NUMBER_OF_BYTES : 21229568
      _STATISTICS_WRITING_APP: mkvmerge v65.0.0 ('Too Much') 64-bit
      _STATISTICS_WRITING_DATE_UTC: 2023-04-20 14:12:54
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
  Stream #0:5(dan): Audio: ac3, 48000 Hz, stereo, fltp, 384 kb/s (default)
    Metadata:
      BPS             : 384000
      DURATION        : 00:11:03.424000000
      NUMBER_OF_FRAMES: 20732
      NUMBER_OF_BYTES : 31844352
      _STATISTICS_WRITING_APP: mkvmerge v65.0.0 ('Too Much') 64-bit
      _STATISTICS_WRITING_DATE_UTC: 2023-04-20 14:12:54
      _STATISTICS_TAGS: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTE

However, the error no longer appears on that or any other file so far.

Edit: I apologise for extra tect appearing within the code box. Github is refusing to respect my ending characters for it.
Edit 2: I seem to have figured out what I was doing wrong with that.

@rdmark
Copy link
Member

rdmark commented Aug 11, 2023

Thanks! For some reason I just can't get macOS to follow symlinks on the afp shared volume. The symlinks are there and recognized as "alias" but macOS can't find the original file. Not sure what I'm doing wrong. Doesn't help if I enable the follow symlinks option.

I'll take a break from trying to reproduce this for now, until I think of some other approach.

@PuffyRainbowCloud
Copy link
Author

That's interesting! It's worth noting that I created the symlinks on the server as opposed to on the Mac as that never worked in the past.

@rdmark
Copy link
Member

rdmark commented Aug 11, 2023

Yep I'm doing the same. Just running ln -s directly on the server's terminal command line. This is an example:

$ ls -al /srv/netatalk/test2/video-link/
total 16
drwxr-sr-x 2 dmark dmark 4096 Aug 10 17:31 .
drwxrwxrwx 7 root  root  4096 Aug 10 17:38 ..
lrwxrwxrwx 1 dmark dmark   29 Aug 10 17:31 scooter.mkv -> video/Scooter_-_4_AM_2012.mkv

(BTW I'm not a Scooter fan. It was just the first result on archive.org when searching for mkv files ^^; )

@PuffyRainbowCloud
Copy link
Author

That's fascinating! I have no idea why it wouldn't work on your end. Are you also running an Ubuntu server? Which version of macOS are you using? I'm on Ubuntu Server 22.04.3 LTS and macOS Ventura 13.4 (22F66).

@rdmark
Copy link
Member

rdmark commented Aug 11, 2023

This is with a Debian Bookworm server (which should be 99% identical to Ubuntu) and macOS Ventura. Are you able to share your afp.conf just to check if there's some option that you're enabling that I aren't.

@PuffyRainbowCloud
Copy link
Author

Absolutely:

;
; Netatalk 3.x configuration file
;

[Global]
admin auth user = alice
mimic model = AirPort
zeroconf name = servercluwub afp

[Homes]
basedir regex = /home
home name = $u

[Filserver]
path = /media/alice/Filserver
spotlight = yes
valid users = alice ida
follow symlinks = true

@rdmark
Copy link
Member

rdmark commented Aug 11, 2023

Thank you! If you encounter the symlink issue again, please add a comment to #270

Now for the original zero length AD entry issue that started this issue ticket, I've merged a fix through #178 so let me go ahead and close this ticket.

If you encounter the "invalid metadata EA this is now being treated as a fatal error" errors again, please file a new ticket so that we can look into what metadata is acting up this time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants