Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition with libuv fs poll #1082

Closed
schrodit opened this issue Jun 17, 2023 · 4 comments · Fixed by flatcar/scripts#939
Closed

Race condition with libuv fs poll #1082

schrodit opened this issue Jun 17, 2023 · 4 comments · Fixed by flatcar/scripts#939
Labels
kind/bug Something isn't working

Comments

@schrodit
Copy link

schrodit commented Jun 17, 2023

Description

We discovered a race condition when trying to poll fs events using uv_fs_poll_start.
The race condition happens when the fs poll watch is started for a newly created directory and directly afterwards a file is written to that directory. That file does most of the times not get recognized by the file watch.

If we add a sleep of approx. 1s between the watch start and file creation, the file gets recognized.

We first encountered this issue while debugging a failing test in nodejs:

(async () => {
    const root = '/tmp/debug-test';
    await fs.rm(root, {recursive: true, force: true});
    await fs.mkdir(root, {recursive: true});
    //await wait(duration({milliseconds: 500}));
    fscb.watchFile(root, {
        interval: 2000,
    }, (curr) => {
        console.log('watchFile', curr);
    });
    await fs.writeFile(path.join(root, 'test.txt'), 'foo');
    console.log('Printed file');
})()

Bug occurs on the following systems and cloudproviders:

  • Provider: Azure on Flatcar stable (kernel 5.15.111), beta (kernel 5.15.113) and alpha.
  • Provider Equinix on Flatcar stable (kernel 5.15.94)
  • Provider PlusServer (German Hoster with Openstack) on Flatcar stable (kernel 5.15.111).

The bug does not occur on:

  • Provider PlusServer on Ubuntu 20.04
  • Azure Ubuntu 20.04
  • Laptop with ArchLinux (Kernel 6.3.3)

Impact

  • We always need a sleep second before we can be sure the watch recognizes all files.
  • This bug is a blocker for us using Flatcar.

Environment and steps to reproduce

  1. Create a machine with Flatcar (tested on stable, beta and alpha)
  2. start a ubuntu container sudo docker run -it --entrypoint bash --rm ubuntu:20.04
  3. compile the following c program with gcc main.c -o main -luv
#include <stdio.h>
#include <uv.h>


void cb (uv_fs_poll_t* handle,
                           int status,
                           const uv_stat_t* prev,
                           const uv_stat_t* curr) {
  printf("Received %ld\n", curr->st_ino);
}

void writeToFile () {
 FILE *f = fopen("/tmp/debug-test/test6.txt", "w");

 fprintf(f, "foo");
 fclose(f);
}

void writeToFileCb (uv_work_t *req) {
   writeToFile();
}

void after(uv_work_t *req, int status) {
   printf("Finished\n");
}

static uv_fs_poll_t handle;
static uv_work_t req;
static char root[] = "/tmp/debug-test";

int main()
{
   setbuf(stdout, NULL);
   uv_loop_t *loop = uv_default_loop();

   uv_fs_poll_init(loop, &handle); 
   const int err = uv_fs_poll_start(&handle, cb, root, 2000);
   if (err != 0) {
      printf("Error %d", err);
      return 42;
   }
   printf("--ready\n");
   uv_queue_work(loop, &req, writeToFileCb, after);
   return uv_run(uv_default_loop(), UV_RUN_DEFAULT);
}
  1. Setup the tmp dir: mkdir /tmp/debug-test
  2. run rm /tmp/dir/* && ./main multiple times -> the poll does not notice the file.

Expected behavior

The written file should always be reported by the poll as it does on other distros.

@schrodit schrodit added the kind/bug Something isn't working label Jun 17, 2023
@jepio
Copy link
Member

jepio commented Jun 19, 2023

uv_fs_poll_start uses a very bad algorithm for polling directories - it checks mtime in the directory inode (for "portability") instead of using the correct kernel facilities for this purpose (fanotify/inotify).

The ext4 root partition on Flatcar is formatted with 128-byte inode size, which means filesystem timestamps have a granularity of "seconds" (not nanoseconds).

If you move your container storage to a different disk/filesystem, or reformat the root partition you won't hit this. Try the following butane snippet:

variant: flatcar
version: 1.0.0
storage:
  filesystems:
  - device: /dev/disk/by-partlabel/ROOT
    label: ROOT
    format: ext4
    wipe_filesystem: true

@schrodit
Copy link
Author

schrodit commented Jun 19, 2023

5. rm /tmp/dir/* && ./main

can confirm that this fixes the issue.
Thanks for helping.

@jepio is this something that is likely to be changed upstream. So that the root partition is formatted with the default size as it was before that commit: flatcar/scripts@d09aeb3

@pothos
Copy link
Member

pothos commented Jun 20, 2023

Yes, we should change that, thanks to both of you!

128-byte inodes cannot handle dates beyond 2038 and are deprecated

jepio added a commit to flatcar/scripts that referenced this issue Jun 21, 2023
Inode sizes smaller than 256:
- don't support extended metadata (nanosecond timestamp resolution)
- cannot handle dates beyond 2038
- are deprecated

Change the default from 128 to 256. There is no way to apply this change on a
mounted filesystem so this change will only apply to new deployments.

Fixes: flatcar/Flatcar#1082
Signed-off-by: Jeremi Piotrowski <[email protected]>
jepio added a commit to flatcar/scripts that referenced this issue Jun 22, 2023
Inode sizes smaller than 256:
- don't support extended metadata (nanosecond timestamp resolution)
- cannot handle dates beyond 2038
- are deprecated

Change the default from 128 to 256. There is no way to apply this change on a
mounted filesystem so this change will only apply to new deployments.

Fixes: flatcar/Flatcar#1082
Signed-off-by: Jeremi Piotrowski <[email protected]>
jepio added a commit to flatcar/scripts that referenced this issue Jun 22, 2023
Inode sizes smaller than 256:
- don't support extended metadata (nanosecond timestamp resolution)
- cannot handle dates beyond 2038
- are deprecated

Change the default from 128 to 256. There is no way to apply this change on a
mounted filesystem so this change will only apply to new deployments.

Fixes: flatcar/Flatcar#1082
Signed-off-by: Jeremi Piotrowski <[email protected]>
jepio added a commit to flatcar/scripts that referenced this issue Jun 22, 2023
Inode sizes smaller than 256:
- don't support extended metadata (nanosecond timestamp resolution)
- cannot handle dates beyond 2038
- are deprecated

Change the default from 128 to 256. There is no way to apply this change on a
mounted filesystem so this change will only apply to new deployments.

Fixes: flatcar/Flatcar#1082
Signed-off-by: Jeremi Piotrowski <[email protected]>
jepio added a commit to flatcar/scripts that referenced this issue Jun 22, 2023
Inode sizes smaller than 256:
- don't support extended metadata (nanosecond timestamp resolution)
- cannot handle dates beyond 2038
- are deprecated

Change the default from 128 to 256. There is no way to apply this change on a
mounted filesystem so this change will only apply to new deployments.

Fixes: flatcar/Flatcar#1082
Signed-off-by: Jeremi Piotrowski <[email protected]>
@jepio
Copy link
Member

jepio commented Jun 22, 2023

Fix merged and cherry-picked to all branches.

This won't apply to running systems though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants