-
-
Notifications
You must be signed in to change notification settings - Fork 741
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test_fuse: weird test failure only on freebsd #6871
Comments
Hah. It always happens in the last line of that block, no matter what it is. Yup, it is |
I don't think I've seen that for borg 2.0.0a3 recently. So, is something broken in borg, on freebsd, in llfuse? |
I also tested on 3 different Linux dists, netbsd, openbsd and macOS - that issue did not show up there. |
still broken for 2.0.0b1 and freebsd 13.1-release-p1. |
And it's still broken. Any freebsd user out there who could debug / fix this? |
I added the "bug" label to get this better visibility. It doesn't look like a bug in borg though, but rather in freebsd's FUSE implementation. |
Longtime FreeBSD user here, borg user since January, and little to none python skills! What can I do to help? |
@Jamie-Landeg-Jones thanks for helping. The "Function not implemented" is only seen on FreeBSD (not on linux, not on macOS) and even on FreeBSD, this has worked before, so I suspect a bug / regression in FreeBSD's FUSE infrastructure. Can you maybe check the changes to the fuse stuff done in freebsd, whether something looks suspect? |
Hi. What I'd like to do first is replicate the error you posted and trace it from there. Cheers |
Our docs have instructions for developers, see there for the set up: https://borgbackup.readthedocs.io/en/1.2.4/development.html (maybe rather check out the 1.2-maint branch than master, the 1.2.x stable releases are made from this branch) For the fuse issue, you do not need to use fakeroot and you could limit pytest to only run the fuse related tests (which triggers the error):
|
Thanks for that. I've replicated the issue in C, and with a sshfs mounted fuse filesystem. Of course, this isn't a fix, but it will make it easier to track down the problem. Here's what I've noticed: As you said, it works fine on FreeBSD 12, just not on 13. I'll narrow down where tomorrow. The error only occurs under our use case if you read from the file. (if you disable the read-check in your tests, the close works ok) [N.B. I didn't test writing] To highlight this, I ran your test without the read check, then from the command line, run a bit of c code that simply opens the file... then closes it.... then opens it... reads 100 bytes. then closes it. then opens it then closes it. I also did similar on a sshfs filesystem, and got the same effect, but with a different error message, leading me to believe that there is some unset / previously initialised variable leaking through. I tested mounting ro and rw - no difference. I also tried using the "cat" unix command on one test file. The "cat" showed no error message, but again, by virtue of it reading the file, any subsequent fclose test failed. So, basically, any file that has had some bytes read, produces that error on close, and continues to do so for the duration of the mount. I now have enough to look at the fusefs source, and if I can't work it out, at least I'll have enough evidence to prod someone with!
|
great that you could reproduce it outside of borg!
btw, was it freebsd that changed the kernel module name? i remember having to update the Vagrantfile.
Am 9. Apr. 2023, 05:42 +0200 schrieb Jamie Landeg Jones ***@***.***>:
… Thanks for that. I've replicated the issue in C, and with a sshfs mounted fuse filesystem.
Of course, this isn't a fix, but it will make it easier to track down the problem.
Here's what I've noticed:
As you said, it works fine on FreeBSD 12, just not on 13. I'll narrow down where tomorrow.
The error only occurs under our use case if you read from the file. (if you disable the read-check in your tests, the close works ok) [N.B. I didn't test writing]
To highlight this, I ran your test without the read check, then from the command line, run a bit of c code that simply opens the file... then closes it.... then opens it... reads 100 bytes. then closes it. then opens it then closes it.
I also did similar on a sshfs filesystem, and got the same effect, but with a different error message, leading me to believe that there is some unset / previously initialised variable leaking through.
I tested mounting ro and rw - no difference. I also tried using the "cat" unix command on one test file. The "cat" showed no error message, but again, by virtue of it reading the file, any subsequent fclose test failed.
So, basically, any file that has had some bytes read, produces that error on close, and continues to do so for the duration of the mount.
I now have enough to look at the fusefs source, and if I can't work it out, at least I'll have enough evidence to prod someone with!
***@***.***:~/borg % file-close-check /tmp/tmp5bsl1_xh/mountpoint/archive/input/file1
/tmp/tmp5bsl1_xh/mountpoint/archive/input/file1 | open: ok, close: ok | open: ok, read: 100 bytes, close: Function not implemented | open: ok, close: Function not implemented
***@***.***:~/borg %
***@***.***:~/borg % file-close-check /tmp/tmp5bsl1_xh/mountpoint/archive/input/file1
/tmp/tmp5bsl1_xh/mountpoint/archive/input/file1 | open: ok, close: Function not implemented | open: ok, read: 100 bytes, close: Function not implemented | open: ok, close: Function not implemented
***@***.***:~/borg % file-close-check /tmp/tmp5bsl1_xh/mountpoint/archive/input/file1
/tmp/tmp5bsl1_xh/mountpoint/archive/input/file1 | open: ok, close: Function not implemented | open: ok, read: 100 bytes, close: Function not implemented | open: ok, close: Function not implemented
***@***.***:~/borg %
***@***.***:~/borg %
***@***.***:~/borg % sshfs ***@***.***:/tmp b1
***@***.***:~/borg %
***@***.***:~/borg % file-close-check b1/*
b1/COPYRIGHT | open: ok, close: ok | open: ok, read: 100 bytes, close: ok | open: ok, close: ok
b1/file-close-check.c | open: ok, close: ok | open: ok, read: 100 bytes, close: Permission denied | open: ok, close: Permission denied
b1/x | open: ok, close: ok | open: ok, read: 100 bytes, close: Permission denied | open: ok, close: Permission denied
***@***.***:~/borg %
***@***.***:~/borg %
***@***.***:~/borg % file-close-check b1/*
b1/COPYRIGHT | open: ok, close: ok | open: ok, read: 100 bytes, close: ok | open: ok, close: ok
b1/file-close-check.c | open: ok, close: Permission denied | open: ok, read: 100 bytes, close: Permission denied | open: ok, close: Permission denied
b1/x | open: ok, close: Permission denied | open: ok, read: 100 bytes, close: Permission denied | open: ok, close: Permission denied
***@***.***:~/borg %
***@***.***:~/borg % l b1
total 59
4 drwxrwxrwt 1 root wheel - 256 9 Apr 04:00 ./
3 drwxr-xr-x 14 bb bb uarch 36 9 Apr 03:58 ../
8 -r--r----- 1 jamie wheel - 6,197 9 Apr 04:09 COPYRIGHT
4 -rw-r----- 1 root wheel - 1,554 9 Apr 03:39 file-close-check.c
16 -rwxr-x--- 1 root wheel - 13,550 9 Apr 03:51 x
***@***.***:~/borg %
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you authored the thread.Message ID: ***@***.***>
|
Been a bit busy today, but will have another look tomorrow. Yes, Freebsd changed it from fuse.ko to fusefs.ko in 2019, which is roughly FreeBSD 12.1 upwards. As the oldest officially supported version is 12.4, all supported versions now use fusefs. https://cgit.freebsd.org/src/commit/sys/modules/fusefs?id=123af6ec70016f5556da5972d4d63c7d175c06d3 |
Ah, so this was just a rename! Good. I had the suspicion that they might have switched to a completely new implementation (including new bugs). |
Yeah, just a rename, (I have one box that has the renamed name, but none of the bugs you discovered) However, a number of changes since then, and other sorts of new bugs! Anyway, I've tracked down the commit that caused the issue, and have reported it, and also posted a fix (it works, but I don't know enough about the fusefs subsystem to know if the fix is correct) Basically, it appears to be due to an attempt to update atime on a file with no write-access. The close fails because of this (whilst it checks for write-access before attempting to write data, it doesn't seem to do so for metadata) If you're interested, I've posted lots more information to the bug report, here: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270749 And in case there are followups there, here is my email letting people know about the bug: As for borg-tests, even when this issue is fixed, it doesn't help people running FreeBSD 13.0 and 13.1 Can you try changing the test to use '-no-atime" on the fusefs? ("-readonly" makes no difference, but if you can mount with "no-atime" I think it might.) Of course, that only helps if no-atime is acceptable to you -- I assume it is, as the fuse mounted borgbackup is only ever read-only anyway I hope this helps! Cheers, Jamie. P.s. I've found a couple of potential borg issues recently, but I've not tested them on 2.X yet, and need to narrow them down (and make sure it's not user error!) before posting them. Cheers, Jamie |
No update yet ... :-( |
Considering that this does not seem to be a borg bug, I'll keep this open as a reminder, but move it to rc1 milestone. |
The official fix has now been applied to "MAIN', and will hit all supported stable channels in a week. This means, the fix should be all release versions AFTER 13.2-RELEASE (I don't know if there is going to be a 13.3-RELEASE, but 14.0-RELEASE is imminent) Ref: https://cgit.freebsd.org/src/commit/?id=fb619c94c679e939496fe0cf94b8d2cba95e6e63 Cheers! Jamie |
@Jamie-Landeg-Jones great news, thanks! So guess it might be in the next release there (which is what I use for borg release testing and pyinstaller borg binary building): |
I'm not familiar with vagrant, but I just had a look at your link, and that version they updated a few days ago is indeed based on FreeBSD-13.2-RELEASE so yes, their next release should have it. Of course, vagrant cloud may stop at 13.2 seeing that 14.0 will be released soon, I don't know, but if they keep updating their freebsd13 branch, then yes, the next update will presumably be 13.3-release, which will have the fix. One thing, 13.2 was released on April 11th, so there may be a bit of a lag on vagrant providing the upgrade when/if 13.3 comes out. Cheers! |
borg todo: upgrade the freebsd vagrant machine to a more recent freebsd and check if the issue disappears. update:
So I will just upgrade all the vagrant VMs to 14. @Jamie-Landeg-Jones thanks for helping!
|
No problem! For reference, this fix has entered the FreeBSD 13 tree, but there hasn't been a release yet - 13.2 has been around a while. When 13.3 is out, the fix will be in there. Still, 13.X is still running for those who have problems upgrading, so if you've got 14.X running, it's best to stick with that! |
1.4-maint: fixed in #7988 |
borg 2.0.0a4, python 3.9.13, vagrant, freebsd13, pytest 7.0.1 and 6.2.5:
The text was updated successfully, but these errors were encountered: