sync_file_range: Function not implemented #645

Closed
mathieupost opened this issue Jul 12, 2016 · 23 comments
Comments

@mathieupost

When I try to run a MongoDB instance (version 3.2.7) with mongod --dbpath /path/to/database, I get the error sync_file_range: Function not implemented when using the standard WiredTiger storage engine (with the old mmapv1 engine, this error doesn't occur).

I've tried this with a dbpath that was an empty directory on an NTFS disk, and I also tried an exFAT filesystem, but it looks like sync_file_range is a Linux system call that is not yet implemented in the Linux subsystem (http://man7.org/linux/man-pages/man2/sync_file_range.2.html).

I'm running build 14379.rs1_release.160627-1607
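
A minimal probe for the failure described above, as a sketch rather than anything taken from MongoDB: the helper name probe_sync_file_range and the scratch-file path are assumptions made for illustration. It writes a few bytes to a temporary file and calls sync_file_range(2) on them, returning the errno value of whichever step failed.

```c
#define _GNU_SOURCE          /* exposes sync_file_range() in <fcntl.h> */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if sync_file_range() succeeded on a
 * scratch file, otherwise the errno value of the call that failed.
 * ENOSYS (38 on Linux) is the "Function not implemented" case. */
int probe_sync_file_range(void)
{
    char path[] = "/tmp/sfr-probe-XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return errno;
    unlink(path);                /* the file disappears once fd is closed */

    int result = 0;
    if (write(fd, "probe", 5) != 5)
        result = errno ? errno : EIO;
    else if (sync_file_range(fd, 0, 5, SYNC_FILE_RANGE_WRITE) != 0)
        result = errno;

    close(fd);
    return result;
}
```

Calling it from a small main() and printing strerror() of a non-zero result should show "Function not implemented" on an affected system.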

@therealkenc
Collaborator

TCP_KEEPIDLE and TCP_KEEPINTVL also appear to be problematic here.

If you're willing to build from source, I've hacked a tiny patch (short raw) that works around both issues. Build instructions for WSL are here.

@acezard

acezard commented Sep 18, 2016

I had the same problem as @mathieupost and installed an old version instead. That version, v2.4.9, also gives me the TCP_KEEPIDLE and TCP_KEEPINTVL errors, as well as [conn1] mincore failed: errno:38 Function not implemented. But other than that it seems to work.

@hasenbanck

This issue also arises when you use PostgreSQL under WSL.
Affected code line in PostgreSQL.

@TheJP

TheJP commented Apr 22, 2018

What is the status on this? I tried to run borg (borgbackup/borg#1961) on WSL but got stuck on this sync_file_range.

  File "src/borg/platform/linux.pyx", line 255, in borg.platform.linux.SyncFile.write
  File "src/borg/platform/linux.pyx", line 229, in borg.platform.linux._sync_file_range
OSError: [Errno 38] Function not implemented

@therealkenc
Collaborator

therealkenc commented Jun 21, 2018

@dcayme asked in a dupe:

Does anyone know if this issue will be fixed?

Given this is two years old it would be nice if sync_file_range(2) were just stubbed to do nothing. Fine, it can't be implemented right now for reasons. MongoDB probably won't pass an ACID test with this stubbed. I get that. But anyone using WSL as their MongoDB deployment environment is high, and for anyone using MongoDB in their development environment, it doesn't matter.

I'd say different if this were still 2016, and the feature had a snowball's chance of making the Fall 2018 Update. But at this point we aren't learning which surface is missing and in what packages the API is being used. We know.

@ThomasWaldmann

borgbackup 1.1.x is not usable on WSL due to sync_file_range failing.

@ThomasWaldmann

@mathieupost there's a typo in the issue title. /nitpick

@mathieupost changed the title from "sync_file_range: Funtion not implemented" to "sync_file_range: Function not implemented" on Oct 9, 2018

@mathieupost
Author

@ThomasWaldmann fixed ;)

@adrach

adrach commented Jan 4, 2019

has there been any progress? is there anything we can do to move this along? happy to contribute my time if somebody is willing to partner up with me

@therealkenc
Collaborator

happy to contribute my time if somebody is willing to partner up with me

It is a trivial config tweak posted in July of 2016. It does not require a team effort. It requires a compiler.

[that patch is why the function call might as well be stubbed in WSL btw]

@TheJP

TheJP commented Jan 4, 2019

MongoDB is not the only software that calls sync_file_range (e.g. Borg is another one as I've pointed out).
Would be cool if adrach or someone else could implement it. But I agree, if it can't be done I would prefer it to be stubbed for now.

@therealkenc
Collaborator

therealkenc commented Jan 4, 2019

MongoDB is not the only software that calls sync_file_range

🙄

Would be cool if adrach or someone else could implement it

And how would you propose he or "someone" do that.

[Allowing for the fact that if one doesn't care if the behavior is stubbed and one is willing to entertain the idea of "Cygwin for WSL", you can do a lot of things. Mask it in the glibc function here. Another one-liner that does not require a team.]

@therealkenc
Collaborator

Or for that matter this will probably work:

// stub-sfr.c
#define _GNU_SOURCE         /* See feature_test_macros(7) */
#include <fcntl.h>
int sync_file_range(int fd, off64_t offset, off64_t nbytes,
                           unsigned int flags)
{
    return 0;
}

Build:

$ sudo apt install build-essential wget
$ cd ~ && mkdir -p ~/no-sfr && cd no-sfr
$ wget -O no-sfr.c https://bit.ly/2LPMywi
$ gcc -fPIC -c -o no-sfr.o no-sfr.c
$ gcc -shared -o no-sfr.so no-sfr.o
$ export LD_PRELOAD="$PWD/no-sfr.so"
$ # ... run stuff

Left as an exercise to do setsockopt(...SO_KEEPALIVE...).

@ThomasWaldmann

ThomasWaldmann commented Jan 4, 2019

While the dummy implementation might get some stuff "working" (in the sense of "not crashing immediately", not necessarily "working with good performance"), I'd appreciate if someone would really fix the root cause, i.e. implement the missing function and emulate it as closely as possible for WSL.

It might be really important for performance (like avoiding first caching lots of stuff, eating lots of memory for it and then having a burst of I/O when the cache gets flushed).

As it is just about performance, implementing an empty stub seems better than failing with not implemented. So I'd advise doing this ASAP and implementing the real thing as time permits.

@therealkenc
Collaborator

therealkenc commented Jan 4, 2019

It might be really important for transactional behaviour sometimes

Sigh. Your post (a) states the catastrophically obvious, and (b) just decreased our chances of the syscall being stubbed by some large percent (if some percent ever existed; I had faint hopes). One imagines this is the reason the devs decided to leave this function returning ENOSYS.

ending up on-disk in right order

The syscall is for durability not atomicity or isolation.

Let me repeat: Anyone using WSL as their MongoDB deployment environment is high, and for anyone using MongoDB in their development environment, it doesn't matter. If you are worried about the state of your data in the event of a hard system falldown, and don't have a UPS and a clean shutdown protocol in place, and are using WSL as your platform of choice, then you have reached a very (let's call it) unusual set of circumstances. Which goes to where implementation of this syscall likely sits in the priority queue (I wouldn't know, but I can make educated guesses). You could, however, open a UserVoice ticket describing your specific scenario and needs.

@ThomasWaldmann

ThomasWaldmann commented Jan 5, 2019

No need to jump at me.

It seems I confused this syscall with other syscalls borgbackup is doing to make sure stuff ends up on disk in right order before a transaction is committed. I edited my previous comment to remove what was wrong.

I do not use MongoDB, nor do I use WSL. I am just trying to help borgbackup users who want to run a recent version of it on WSL to back up their data. Hopefully you do not consider them high also.

@therealkenc
Collaborator

therealkenc commented Jan 6, 2019

While the dummy implementation might get some stuff "working" (in the sense of "not crashing immediately", not necessarily "working with good performance")

No need to jump at me.

The shortness was due to the scare quote. The work-around, which is now buried, addresses the borgbackup issue for the people you are trying to help, full stop. With good performance (or rather, as good as it gets on WSL, which is not great).

I am just trying to help borgbackup users who want to run a recent version of it on WSL to back up their data.

A constructive way to do that would be to upstream a patch to borg that makes that call to sync_file_range() a soft fail. There is no reason for borg to fall down. In the alternative, if that sync is considered critical and you want to trap the fail hard on Real Linux (for reasons), guard WSL (and Android) by parsing /proc/version. This will almost certainly be necessary because whatever hope we (you and me) collectively had of it being stubbed (if there was any) likely died the moment you posted quoth: "[I'd] appreciate if someone would really fix the root cause", despite the late edit. [Although I would like to be proved wrong, natch.]
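
The /proc/version guard could look roughly like this. It is a sketch under one stated assumption: that the kernel version string on WSL contains the word "Microsoft" in some capitalisation. The function names are invented for illustration, and the Android case is not covered.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Case-insensitive substring search, written out by hand to avoid
 * depending on the GNU-only strcasestr(). Returns 1 on a match. */
static int contains_ignore_case(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);
    for (; *haystack != '\0'; haystack++) {
        size_t i = 0;
        while (i < n && haystack[i] != '\0' &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return 1;
    }
    return n == 0;
}

/* Hypothetical check on an already-read version string.
 * Assumption: WSL kernels identify themselves with "Microsoft". */
int version_string_looks_like_wsl(const char *version)
{
    return contains_ignore_case(version, "microsoft");
}

/* Reads the first line of /proc/version and applies the check above.
 * Returns 0 when the file cannot be read. */
int running_on_wsl(void)
{
    char buf[512];
    FILE *f = fopen("/proc/version", "r");
    if (f == NULL)
        return 0;
    char *line = fgets(buf, sizeof buf, f);
    fclose(f);
    return line != NULL && version_string_looks_like_wsl(line);
}
```

A caller that wants a hard failure on other systems would consult running_on_wsl() only after sync_file_range() has already failed, so the extra file read stays off the normal path.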

Hopefully you do not consider them high also

Meh. I don't judge.

@adrach

adrach commented Jan 8, 2019

sorry, did not mean to cause a storm here... I should have time this weekend to stub it out, how would we go about making a note/warning somewhere that the function is stubbed out?

@therealkenc
Collaborator

therealkenc commented Jan 9, 2019

// stub-sfr.c version 2.2
//    gcc -fPIC -c -o stub-sfr.o stub-sfr.c
//    gcc -shared -o stub-sfr.so stub-sfr.o -ldl
#define _GNU_SOURCE
#include <fcntl.h>
#include <dlfcn.h>
#include <stdio.h>

typedef int (*sync_file_range_)(int fd, off64_t offset, off64_t nbytes, unsigned int flags);
int sync_file_range(int fd, off64_t offset, off64_t nbytes, unsigned int flags)
{
    static int warning_given = 0;
    static sync_file_range_ fn = NULL;
    // Look up the real (libc) implementation once.
    if (fn == NULL) {
        fn = (sync_file_range_)dlsym(RTLD_NEXT, "sync_file_range");
    }
    // Keep trying the real call until it first fails (or cannot be found),
    // warn once, then behave as a no-op from there on.
    if (!warning_given && (fn == NULL || fn(fd, offset, nbytes, flags) != 0)) {
        fprintf(stderr, "Heads up the program you are running called sync_file_range(),\n"
            "but it failed, so we aren't calling it again. Sorry it didn't work out.\n");
        warning_given = 1;
    }
    return 0;
}

That said, such a message conveys no actionable information to the user whatsoever and thus violates some commonly (but by no means universally) accepted UX principles. [ed with bugfix]

ThomasWaldmann added a commit to ThomasWaldmann/borg that referenced this issue Feb 5, 2019
see there:

borgbackup#1961

and especially there (not implemented sync_file_range):

microsoft/WSL#645
@macdice

macdice commented Feb 22, 2019

Hello, FYI we'll probably fix this for PostgreSQL by tolerating ENOSYS (we didn't know it was already spewing warnings on WSL because nobody complained; now that we made it into a PANIC, we found out pretty quickly...). Discussion: https://www.postgresql.org/message-id/flat/CA%2BmCpegfOUph2U4ZADtQT16dfbkjjYNJL1bSTWErsazaFjQW9A%40mail.gmail.com
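
Tolerating ENOSYS could be sketched as below. This is not PostgreSQL's actual code: the wrapper name, the function-pointer parameter, and the two stand-in functions are assumptions introduced so the behaviour can be exercised without a WSL system. The first ENOSYS is remembered and later calls are skipped; any other error is still reported to the caller.

```c
#define _GNU_SOURCE          /* for off64_t and sync_file_range() */
#include <errno.h>
#include <fcntl.h>

typedef int (*range_sync_fn)(int fd, off64_t offset, off64_t nbytes,
                             unsigned int flags);

/* Hypothetical wrapper: returns 0 on success or when the call is not
 * implemented, -1 (with errno set by fn) for every other failure.
 * The "unsupported" flag is process-wide, which is enough for a sketch. */
int sync_range_tolerating_enosys(range_sync_fn fn, int fd, off64_t offset,
                                 off64_t nbytes, unsigned int flags)
{
    static int unsupported = 0;
    if (unsupported)
        return 0;                /* already known to be missing: skip */
    if (fn(fd, offset, nbytes, flags) == 0)
        return 0;
    if (errno == ENOSYS) {
        unsupported = 1;         /* remember, so we do not retry */
        return 0;
    }
    return -1;
}

/* Stand-ins for demonstration only: one mimics the ENOSYS behaviour
 * reported in this issue, the other a genuine I/O error. */
int fake_sync_enosys(int fd, off64_t offset, off64_t nbytes, unsigned int flags)
{
    (void)fd; (void)offset; (void)nbytes; (void)flags;
    errno = ENOSYS;
    return -1;
}

int fake_sync_eio(int fd, off64_t offset, off64_t nbytes, unsigned int flags)
{
    (void)fd; (void)offset; (void)nbytes; (void)flags;
    errno = EIO;
    return -1;
}
```

In real use the first argument would be sync_file_range itself, for example sync_range_tolerating_enosys(sync_file_range, fd, 0, len, SYNC_FILE_RANGE_WRITE).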

@NickMoignard

can confirm this still exists using rails & pg10.4 on an Ubuntu WSL

@macdice

macdice commented Jan 28, 2020

can confirm this still exists using rails & pg10.4 on an Ubuntu WSL

Please try a current release (eg 10.11).

@scy

scy commented Jun 3, 2020

@therealkenc I’m assuming this issue has been closed because it doesn’t exist anymore in WSL2? And that closing it also means that it won’t be fixed ever for WSL1 users?
