pass real fd when restore and dump ext file #2502

Closed
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Jul 7, 2023

  1. lib/py: drop python 2 compatibility

    This patch removes code introduced for compatibility with
    Python 2 in commits:
    
      bf80fee (lib: correctly handle stdin/stdout (Python 3))
    
      b82f222 (lib: fix crit-recode fix for Python 2)
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jul 7, 2023
    460c4d2
  2. zdtm: drop python 2 compatibility

    This patch removes the code for Python 2 compatibility introduced
    with commit e65c7b5 (zdtm: Replace imp module with importlib).
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jul 7, 2023
    75d9d68
  3. cgroup: Propagate error on cgroup mount failure.

    This makes a failure to mount a cgroup hierarchy a bit less noisy:
    
    Error (criu/cgroup.c:623): cg: Unable to mount cgroup2 : Invalid argument'
    
    Instead of
    
    Error (criu/cgroup.c:623): cg: Unable to mount cgroup2 : Invalid argument'
    Error (criu/cgroup.c:715): cg: failed walking /proc/self/fd/-1/zdtmtst for empty cgroups: No such file or directory'
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    b759678
  4. files-reg: Debug "open file on overmounted mount" error.

    Log the mount and file that were the cause of failing a dump.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    ce33c49
  5. compel: Log the status word with "Task is still running" errors.

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    b42e7af
  6. sk-unix: Log both peer names when failing on an external stream unix …

    …socket.
    
    Make debugging dump failures resulting in "sk unix: Can't dump half
    of stream unix connection" errors easier.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    cf01c32
  7. soccr: Log offset when failed to restore socket's queued data.

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    13c08b8
  8. soccr: Log name of socket queue that failed to restore.

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    dc3f4b5
  9. log: Remove error logs for ignored or otherwise logged subprocess exits.

    Errors in early restore.log for status=1 from a subprocess are confusing,
    especially since they don't show which command failed. Since the result is
    either ignored or logged anyway, mark the calls as "can fail".
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    4d67f67
  10. mount: Demote fsnotify logs for ignored failures.

    Make logs about inaccessible mounts warnings, as the failures are
    normally harmless (e.g. failure to read /dev/cgroup) and don't
    make the CRIU run fail. (If it happens that the fsnotify can't
    find a file, then to debug, full CRIU logs will be necessary anyway.)
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    fb149f7
  11. irmap: Reduce error log severity to warning.

    These errors originate from the filesystem scanning in irmap.c and are mostly
    benign. Nevertheless, if they do result in a failed irmap lookup, that failed
    lookup is more interesting from an application perspective.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 7, 2023
    cf4b225
  12. kerndat: bind ipv6-socket only if ipv6 is enabled

    Fixes: checkpoint-restore#2222
    Fixes: f1c8d38 ("kerndat: check if setsockopt IPV6_FREEBIND is supported")
    Signed-off-by: Yan Evzman <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    yevzman authored and avagin committed Jul 7, 2023
    a4bb3f9

Commits on Jul 10, 2023

  1. zdtm: replace NR_fstat with NR_statx

    NR_fstat is a deprecated syscall; some modern architectures,
    such as riscv and loongarch64, no longer support it.
    It is usually replaced by NR_statx.
    
    NR_statx is supported since Linux 4.11.
    
    Signed-off-by: znley <[email protected]>
    znley authored and avagin committed Jul 10, 2023
    a96aa58
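The fstat-to-statx replacement described in this commit can be sketched as follows. This is a minimal illustration and not the zdtm change itself: the helper name `size_via_statx` is hypothetical, and the sketch assumes a C library recent enough to declare `struct statx` and `SYS_statx`.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: return the size of an open file using the raw
 * statx syscall instead of fstat. Returns -1 on failure. */
static long long size_via_statx(int fd)
{
	struct statx stx;

	/* An empty path together with AT_EMPTY_PATH makes statx operate
	 * on the file referred to by fd itself. */
	if (syscall(SYS_statx, fd, "", AT_EMPTY_PATH, STATX_SIZE, &stx) < 0)
		return -1;
	return (long long)stx.stx_size;
}
```

Calling `size_via_statx(fd)` on a descriptor for a regular file is expected to return the same size that `fstat()` would report in `st_size`.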
  2. kerndat: don't leak a socket file descriptor

    kerndat_has_ipv6_freebind creates a socket but doesn't close it.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Jul 10, 2023
    935e60d
  3. ci: add workflow to ensure self-contained commits

    Signed-off-by: Prajwal S N <[email protected]>
    snprajwal authored and avagin committed Jul 10, 2023
    2d6f04c

Commits on Jul 19, 2023

  1. include: add common header files for loongarch64

    Signed-off-by: znley <[email protected]>
    znley authored and mihalicyn committed Jul 19, 2023
    f684719
  2. compel: add loongarch64 support

    Signed-off-by: znley <[email protected]>
    znley authored and mihalicyn committed Jul 19, 2023
    52630db
  3. images: add loongarch64 core image

    Signed-off-by: znley <[email protected]>
    znley authored and mihalicyn committed Jul 19, 2023
    521383d
  4. criu: add loongarch64 support to parasite and restorer

    Signed-off-by: znley <[email protected]>
    znley authored and mihalicyn committed Jul 19, 2023
    95fbd7e
  5. zdtm: add loongarch64 support

    Signed-off-by: znley <[email protected]>
    znley authored and mihalicyn committed Jul 19, 2023
    c941282
  6. ci: add workflow for loongarch64

    Signed-off-by: znley <[email protected]>
    znley authored and mihalicyn committed Jul 19, 2023
    f70c782

Commits on Jul 22, 2023

  1. util: Implement fchown() and fchmod() wrappers.

    Add generic wrappers for fchown() and fchmod() that skip the calls if
    no changes are needed. This allows unifying the places where we can
    avoid errors when no-op requests are not permitted.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 22, 2023
    9f69484
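The idea of skipping no-op ownership and mode changes can be sketched as below. The function names `fchmod_if_needed` and `fchown_if_needed` are hypothetical and are not claimed to match the wrappers added to CRIU; the sketch only shows the check-before-call pattern the commit message describes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: change the mode only if it differs from the
 * current one, so a no-op request never reaches the kernel. */
static int fchmod_if_needed(int fd, mode_t mode)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	if ((st.st_mode & 07777) == (mode & 07777))
		return 0; /* nothing to change, nothing that could fail with EPERM */
	return fchmod(fd, mode & 07777);
}

/* Hypothetical wrapper: change the owner only if uid or gid differ. */
static int fchown_if_needed(int fd, uid_t uid, gid_t gid)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	if (st.st_uid == uid && st.st_gid == gid)
		return 0;
	return fchown(fd, uid, gid);
}
```

With this pattern, an unprivileged process that asks for the ownership a file already has gets a success result instead of an EPERM from `fchown()`.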
  2. sk-unix: Avoid restore_file_perms() EPERM error for no-op changes.

    Note: restore_file_perms() was the only call in its caller that returned
    -errno; this change removes that difference in calling convention.
    
    From: Radosław Burny <[email protected]>
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 22, 2023
    923e66b
  3. files-reg: Avoid EPERM in ghost_apply_metadata() for no-op changes.

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 22, 2023
    5f214bc
  4. cgroup: Replace restore_perms() with cr_fchperm().

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 22, 2023
    6e23125
  5. memfd: Avoid EPERM for no-op chown().

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 22, 2023
    00d061b
  6. tty: Avoid EPERM for no-op chown().

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 22, 2023
    e90fbd7
  7. restore: Avoid need for CAP_SETPCAP if not changing uids.

    When CRIU is run with the task's credentials on restore, don't set uids
    and gids. This avoids the need to modify the SECURE_NO_SETUID_FIXUP flag
    which requires CAP_SETPCAP.
    
    From: Andy Tucker <[email protected]>
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 22, 2023
    b9f360b
  8. restore: Skip setgroups() when already correct.

    Skip calling setgroups() when the list of auxiliary groups already has
    the values we want.  This allows restoring into an unprivileged user
    namespace where setgroups() is disabled.
    
    From: Ambrose Feinstein <[email protected]>
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 22, 2023
    53dd6ba
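A rough sketch of skipping setgroups() when the supplementary groups already match is shown below. The helper names are hypothetical, and comparing the lists as sorted sets is an assumption of this sketch rather than something stated in the commit message.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <grp.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

static int cmp_gid(const void *a, const void *b)
{
	gid_t x = *(const gid_t *)a, y = *(const gid_t *)b;

	return (x > y) - (x < y);
}

/* Returns 1 if the current supplementary groups equal want[],
 * 0 if they differ, -1 on error. */
static int groups_already_set(const gid_t *want, size_t n_want)
{
	int n_cur = getgroups(0, NULL);
	gid_t *cur, *tmp;
	int same;

	if (n_cur < 0)
		return -1;
	if ((size_t)n_cur != n_want)
		return 0;
	if (n_want == 0)
		return 1;

	cur = malloc(n_want * sizeof(*cur));
	tmp = malloc(n_want * sizeof(*tmp));
	if (!cur || !tmp || getgroups(n_cur, cur) != n_cur) {
		free(cur);
		free(tmp);
		return -1;
	}
	memcpy(tmp, want, n_want * sizeof(*tmp));
	/* Compare as sets: sort both copies before comparing. */
	qsort(cur, n_want, sizeof(*cur), cmp_gid);
	qsort(tmp, n_want, sizeof(*tmp), cmp_gid);
	same = memcmp(cur, tmp, n_want * sizeof(*cur)) == 0;
	free(cur);
	free(tmp);
	return same;
}

/* Hypothetical wrapper: call setgroups() only when the list would change. */
static int setgroups_if_needed(const gid_t *want, size_t n_want)
{
	int same = groups_already_set(want, n_want);

	if (same < 0)
		return -1;
	return same ? 0 : setgroups(n_want, want);
}
```

Passing the process's own current group list to `setgroups_if_needed()` never reaches `setgroups()`, so it succeeds even where that call is disabled.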

Commits on Jul 26, 2023

  1. restore: Fix capability migration requirements between different kern…

    …els.
    
    When restoring on a kernel that supports a different number of
    capabilities than the checkpointing kernel, check that the extra caps are unset.
    
    There are two directions to consider:
    
    1) dump.cap_last_cap > restore.cap_last_cap
    	- restoring might reduce the processes' capabilities if the restoring
    	  kernel doesn't support the checkpointed caps. Warn.
    
    2) dump.cap_last_cap < restore.cap_last_cap
    	- restoring will fill the extra caps with zeroes. No changes.
    
    Note: `last_cap` might change without affecting `n_words`.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Jul 26, 2023
    ff67ad8
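The check for case 1 above (the dump side knows more capabilities than the restore side) can be sketched like this. The function name and the representation of a capability set as an array of 32-bit words are assumptions made for illustration; the sketch tests individual bits because, as the note says, `last_cap` can change without `n_words` changing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if any capability numbered above
 * last_cap is set in the bitmap, 0 otherwise. */
static int has_caps_above(const uint32_t *caps, size_t n_words, unsigned last_cap)
{
	for (size_t w = 0; w < n_words; w++) {
		for (unsigned b = 0; b < 32; b++) {
			unsigned cap = (unsigned)w * 32 + b;

			if (cap > last_cap && (caps[w] & (UINT32_C(1) << b)))
				return 1;
		}
	}
	return 0;
}
```

On restore, such a helper would be called with the checkpointed bitmap and the restoring kernel's last capability number; a nonzero result is the situation in which the commit message says to warn.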

Commits on Jul 27, 2023

  1. prctl: Migrate prctl(NO_NEW_PRIVS) setting.

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe committed Jul 27, 2023
    6bad5d2
  2. prctl: test prctl(NO_NEW_PRIVS) setting

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe committed Jul 27, 2023
    d490218
  3. restore: Skip dropping BSET capability if irrelevant.

    When set, prctl(NO_NEW_PRIVS) prevents child processes from gaining
    capabilities that are not in the permitted set. In this case, the inability
    to clear a capability from BSET that is not in the permitted set is
    harmless.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe committed Jul 27, 2023
    988a5f4
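The reasoning in this commit can be expressed as a small predicate. This is a sketch under stated assumptions: both function names are hypothetical, the permitted set is modeled as a single 64-bit mask, and `cap` is assumed to be below 64. Only `prctl(PR_GET_NO_NEW_PRIVS, ...)` is a real kernel interface here.

```c
#include <stdint.h>
#include <sys/prctl.h>

/* Returns 1 if no_new_privs is set for the calling thread,
 * 0 if it is not, -1 on error. */
static int no_new_privs_is_set(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}

/* Hypothetical predicate: failing to drop `cap` from the bounding set
 * is treated as harmless when no_new_privs is set and the capability
 * is not in the permitted set. */
static int bset_drop_failure_is_harmless(int nnp, uint64_t permitted, unsigned cap)
{
	return nnp == 1 && !(permitted & (UINT64_C(1) << cap));
}
```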

Commits on Aug 1, 2023

  1. sk-inet: Extend 'TCP repair off' failure log.

    Include the file descriptor and error code in the debug message to make
    it more useful.
    
    Fixes: e7ba909 (2016-03-14 "cr-check: Inspect errno on syscall failures")
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 1, 2023
    cc500d9
  2. memfd: dump and restore permissions.

    A memfd is created by default with +x permissions set. A process can change
    this using fchmod(), which is expected to prevent using this fd for
    exec(). Migrate the permissions.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 1, 2023
    f2d9672
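Migrating a memfd's permission bits amounts to recording the mode at dump time and reapplying it on restore. The sketch below illustrates that with hypothetical helper names; it is not the code from this commit and it ignores everything else CRIU saves for a memfd.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical dump step: record the permission bits of an open fd. */
static int dump_fd_mode(int fd, mode_t *mode)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	*mode = st.st_mode & 07777;
	return 0;
}

/* Hypothetical restore step: create a fresh memfd and apply the
 * recorded mode. Returns the new fd, or -1 on failure. */
static int restore_memfd(const char *name, mode_t mode)
{
	int fd = memfd_create(name, 0);

	if (fd < 0)
		return -1;
	if (fchmod(fd, mode) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A process that removed the execute bits with `fchmod(fd, 0600)` before the dump would then see the same `0600` mode on the recreated memfd.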
  3. zdtm/memfd00: test memfd file mode

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 1, 2023
    88249fe

Commits on Aug 3, 2023

  1. apparmor: fix incorrect usage of sizeof on char ptr

    In criu/apparmor.c: write_aa_policy(), the arg path is passed as a char
    pointer. The original code used sizeof(path) to get its size,
    which is incorrect as it always returns the size of the char pointer
    (typically 8 or 4), not the actual capacity of the char array.
    
    Given that this function is only invoked with path declared as `char
    path[PATH_MAX]`, replacing sizeof(path) with PATH_MAX should correctly
    represent the maximum size of it.
    
    Fixes: 8723e3f ("check: add a feature test for apparmor_stacking")
    
    Signed-off-by: Haorong Lu <[email protected]>
    ancientmodern authored and avagin committed Aug 3, 2023
    9118601
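The pitfall fixed here is easy to reproduce in isolation. The two functions below are purely illustrative (their names are hypothetical and unrelated to write_aa_policy()); they show that sizeof on a pointer parameter yields the pointer size, while sizeof on the array in the declaring scope yields its capacity.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <limits.h>
#include <stddef.h>

/* Inside the callee, `path` is only a pointer, so sizeof gives the
 * pointer size (typically 8 or 4), not the caller's buffer size. */
static size_t size_seen_by_callee(char *path)
{
	return sizeof(path);
}

/* In the scope that declares the array, sizeof gives its capacity. */
static size_t size_seen_by_caller(void)
{
	char path[PATH_MAX];

	(void)path;
	return sizeof(path);
}
```

This is why the callee has to rely on a known constant such as PATH_MAX, or take the buffer size as an extra argument.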

Commits on Aug 4, 2023

  1. page-xfer: Pull tcp_cork,nodelay().

    Move tcp_cork() and tcp_nodelay() to the only user: page-xfer.c. While
    at it, fix error messages (as they do not refer to restoring the sockopt
    values) and demote them as they are not fatal to the page transfer.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 4, 2023
    1db922f
  2. irmap: scan user-provided paths in order

    Make the scan use the order of paths that came from the user.
    
    Fixes: 4f2e4ab ("irmap: add --irmap-scan-path option"; 2015-09-16)
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 4, 2023
    72494ed

Commits on Aug 7, 2023

  1. amdgpu_plugin: remove duplicated log prefix

    The log prefix "amdgpu_plugin:" is defined with `LOG_PREFIX` in
    `amdgpu_plugin.c`.  However, the prefix is also included in each
    log message. As a result it appears duplicated in the log messages:
    
    (00.044324) amdgpu_plugin: amdgpu_plugin: devices:1 bos:58 objects:148 priv_data:45696
    (00.045376) amdgpu_plugin: amdgpu_plugin: Thread[0x5589] started
    (00.167172) amdgpu_plugin: amdgpu_plugin: img_path = amdgpu-kfd-62.img
    (00.083739) amdgpu_plugin: amdgpu_plugin : amdgpu_plugin_dump_file() called for fd = 235
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 7, 2023
    242de4e

Commits on Aug 8, 2023

  1. scripts/apt: don't hide apt output

    It is required to investigate issues.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 8, 2023
    6fc5bc6
  2. ci/docker: install all required packages

    This change fixes the issue:
    ```
    The following packages have unmet dependencies:
     docker-ce : Depends: containerd.io (>= 1.6.4)
    E: Unable to correct problems, you have held broken packages.
    ```
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 8, 2023
    2199220

Commits on Aug 18, 2023

  1. lib/py: add VMA_AREA_MEMFD constant

    The VMA_AREA_MEMFD constant was introduced with commit
    
    29a1a88
    memfd: add memory mapping support
    
    This patch extends the status map used in CRIT and coredump with the
    value of this constant to recognize it.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 18, 2023
    e1cda9f

Commits on Aug 21, 2023

  1. loongarch64: reformat syscall_64.tbl for 8-wide tabs

    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 21, 2023
    288d6a6
  2. dump+restore: Implement membarrier() registration c/r.

    Note: Silently drops MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED as it's
    not currently detectable. This is still better than silently dropping
    all membarrier() registrations.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 21, 2023
    2f50da4
  3. zdtm: membarrier: test migration of membarrier() registration

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 21, 2023
    b5c3ccc
  4. Put a cap on the size of single preadv in restore operation.

    While each preadv() is followed by a fallocate() that removes the data
    range from image files on tmpfs, temporarily (between preadv() and
    fallocate()) the same data is in two places; this increases the memory
    overhead of restore operation by the size of a single preadv.
    Uncapped preadv() would read up to 2 GiB of data, thus we limit that to
    a smaller block size (128 MiB).
    
    Based-on-work-by: Paweł Stradomski <[email protected]>
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 21, 2023
    5fedcaa
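Capping the size of each preadv() can be sketched as a simple chunked read loop. The function name `read_capped` and the macro `MAX_PREADV_CHUNK` are hypothetical; only the 128 MiB figure comes from the commit message, and the fallocate() step that CRIU performs after each read is indicated by a comment rather than implemented.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical name for the block size mentioned in the commit (128 MiB). */
#define MAX_PREADV_CHUNK ((size_t)128 << 20)

/* Read up to len bytes at offset off, never asking for more than
 * max_chunk bytes in a single preadv(). Returns bytes read or -1. */
static ssize_t read_capped(int fd, char *buf, size_t len, off_t off, size_t max_chunk)
{
	size_t done = 0;

	while (done < len) {
		size_t want = len - done;
		struct iovec iov;
		ssize_t r;

		if (want > max_chunk)
			want = max_chunk;
		iov.iov_base = buf + done;
		iov.iov_len = want;

		r = preadv(fd, &iov, 1, off + (off_t)done);
		if (r < 0)
			return -1;
		if (r == 0)
			break; /* end of file */
		/* This is the point where the just-read range would be
		 * released from the image file, so at most one chunk
		 * exists in two places at once. */
		done += (size_t)r;
	}
	return (ssize_t)done;
}
```

A caller following the commit's choice would pass `MAX_PREADV_CHUNK` as `max_chunk`; a small value such as 4 makes the chunking easy to observe in a test.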

Commits on Aug 24, 2023

  1. github: auto-remove changes requested and awaiting reply labels

    Labels are removed when new comments are posted.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 24, 2023
    649292c
  2. loongarch64: fix syscall_64.tbl

    The 288d6a6 change broke all the syscall numbers.
    
    Reported-by: Michał Mirosław <[email protected]>
    Fixes: (288d6a6 "loongarch64: reformat syscall_64.tbl for 8-wide tabs")
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 24, 2023
    942b5fd
  3. memfd: don't set fd attributes not needed for vma mapping

    There is only one user of memfd_open() outside of memfd.c: open_filemap().
    It is restoring a file-backed mapping and doesn't need nor expect to
    update F_SETOWN nor the fd's position.  Check the inherited_fd() handling
    in the callers to simplify the code.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 24, 2023
    359b257

Commits on Aug 25, 2023

  1. ci/loongarch64: compile tests before running zdtm.py

    Otherwise tests fail by timeout.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 25, 2023
    675c5e4
  2. kerndat: Make pagemap check more robust against swapped out pages.

    Fix the test of whether the kernel exposes page frame numbers to cope with the
    possibility that the top of the stack is swapped out, which was happening
    in about 1 out of 3 million runs.  This led to a later failure when
    trying to read the PFN of the zero page, after which criu would exit with
    no error message.
    
    Original-From: Ambrose Feinstein <[email protected]>
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 25, 2023
    af31e8e
  3. compel/infect: include the relevant pid in "no-breakpoints restore" d…

    …ebug message
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 25, 2023
    38baf73
  4. proc_parse: remove trivial goto from vma_get_mapfile_user()

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 25, 2023
    ba27d27
  5. test/other: add test for action-script

    This commit introduces a test for the action-script functionality
    of CRIU to verify that the pre-dump, post-dump, pre-restore, pre-resume,
    post-restore, and post-resume hooks are executed during dump/restore.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 25, 2023
    2df6ec5

Commits on Aug 28, 2023

  1. proc_parse: Log smaps entry while dumping VMA.

    Helps debug problems with restoring custom VMAs.
    
    From: Michał Cłapiński <[email protected]>
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe committed Aug 28, 2023
    92b96a5
  2. kerndat: Make errors from clone3() check more precise.

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe committed Aug 28, 2023
    a04fac5
  3. kerndat: check_pagemap: close(fd) on error path

    Plug a fd leak when returning error from check_pagemap().
    (Cosmetic, as the process will exit soon anyway.)
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 28, 2023
    d0f88ff
  4. kerndat: check_pagemap: reword retried case explanation

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 28, 2023
    6d0e785

Commits on Aug 29, 2023

  1. memfd: return original memfd fd for execveat()

    If there is only a single RW-opened fd for a memfd, it can be
    passed to execveat() with AT_EMPTY_PATH to have its contents
    executed.  This currently works only for the original fd from
    memfd_create().  For now we ignore processes that reopen the memfd
    read-write and expect a particular executability trait of it.  (Note: for
    security purposes recent kernels have SEAL_EXEC to make memfds
    non-executable.)
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 29, 2023
    e0d13ef
  2. zdtm: test execveat(memfd)

    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Aug 29, 2023
    a652c68

Commits on Aug 31, 2023

  1. CONTRIBUTING.md: don't mention ctags

    Ctags is mentioned at the beginning of the "Edit the source code" section, which
    is really confusing: Do you need ctags to edit CRIU code? - No. It is
    just one helpful tool to browse the code, and we do not want to enforce
    it. So, what is it doing in the contribution guide? People who really need
    it should be able to find it in the Makefile or just write a one-liner of their
    own to collect tags...
    
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Aug 31, 2023
    959a32d
  2. CONTRIBUTING.md: improve coding-style related sections

    This highlights that code readability is the real goal of all the
    coding-style rules. We should not do coding-style just for coding-style,
    e.g. when clang-format suggests crazy formatting we should not follow it
    if we feel it is bad.
    
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Aug 31, 2023
    75146b0
  3. lint: don't fail workflow on indent fail

    There are multiple cases where a good human-readable code block is
    converted to an unreadable mess by clang-format, so we don't want to
    rely on clang-format completely. Also there is no way, as far as I can
    see, to make clang-format only fix what we want it to fix without
    breaking something.
    
    So let's just display hints inline where clang-format is unhappy. When
    a reviewer sees such a warning, it's a good sign that something is broken
    in the coding-style around this warning.
    
    We add a special script which parses the diff generated by indent and
    generates a warning for each hunk.
    
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Aug 31, 2023
    03541c0

Commits on Sep 4, 2023

  1. 1df618a
  2. vagrant: run tests with fedora 38

    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Sep 4, 2023
    82bfb67

Commits on Sep 14, 2023

  1. dump: use MEMBARRIER_CMD_GET_REGISTRATIONS when available

    MEMBARRIER_CMD_GET_REGISTRATIONS can tell us whether or not the process used
    MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED unlike the old probing method.
    
    Falls back to the old method when MEMBARRIER_CMD_GET_REGISTRATIONS is
    unavailable.
    
    Signed-off-by: Michal Clapinski <[email protected]>
    mclapinski authored and avagin committed Sep 14, 2023
    4b7287b
  2. zdtm: test MEMBARRIER_CMD_GLOBAL_EXPEDITED migration

    Check membarrier registration both ways:
    1. By issuing membarrier commands and checking if they succeed.
    2. By issuing MEMBARRIER_CMD_GET_REGISTRATIONS.
    
    The first way is needed for older kernels. The second way is needed to test
    MEMBARRIER_CMD_GLOBAL_EXPEDITED.
    
    Signed-off-by: Michal Clapinski <[email protected]>
    mclapinski authored and avagin committed Sep 14, 2023
    d752479
  3. criu/plugin: Add environment variable to cap size of buffers.

    The amdgpu plugin would create a memory buffer at the size
    of the largest VRAM bo (buffer object). On some systems, VRAM
    size exceeds RAM size, so the largest bo might be larger than
    the available memory.
    
    Add an environment variable KFD_MAX_BUFFER_SIZE, which caps the
    size of this buffer. By default, it is set to 0, and has no
    effect. When active, any bo larger than its value will be
    saved to/restored from file in multiple passes.
    
    Signed-off-by: David Francis <[email protected]>
    fdavid-amd authored and avagin committed Sep 14, 2023
    f043f53
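The effect of KFD_MAX_BUFFER_SIZE on the number of save/restore passes can be sketched as follows. Only the variable name and its default of 0 come from the commit message. The helper names are hypothetical, and treating the value as a plain byte count is an assumption of this sketch; the plugin may interpret it in a different unit.

```c
#include <stdint.h>
#include <stdlib.h>

/* Read the cap from the environment; 0 (or unset) means "no cap".
 * ASSUMPTION: the value is parsed here as a byte count. */
static uint64_t max_buffer_size_from_env(void)
{
	const char *s = getenv("KFD_MAX_BUFFER_SIZE");

	return s ? (uint64_t)strtoull(s, NULL, 0) : 0;
}

/* Hypothetical helper: number of passes needed to move a buffer
 * object of bo_size through a staging buffer limited to cap. */
static uint64_t passes_needed(uint64_t bo_size, uint64_t cap)
{
	if (cap == 0 || bo_size <= cap)
		return 1; /* fits in one buffer, or the cap is disabled */
	return (bo_size + cap - 1) / cap; /* round up */
}
```

Under these assumptions, a 10000-byte buffer object with a 4096-byte cap would be handled in three passes, and in a single pass when the cap is 0.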

Commits on Sep 20, 2023

  1. compel: Add support for ppc64le scv syscalls

    Power ISA 3.0 added a new syscall instruction. Kernel 5.9 added
    corresponding support.
    
    Add CRIU support to recognize the new instruction and kernel ABI changes
    to properly dump and restore threads executing in syscalls. Without this
    change threads executing in syscalls using the scv instruction will not
    be restored to re-execute the syscall, they will be restored to execute
    the following instruction and will return unexpected error codes
    (ERESTARTSYS, etc) to user code.
    
    Signed-off-by: Younes Manton <[email protected]>
    ymanton authored and avagin committed Sep 20, 2023
    4766ffa

Commits on Sep 22, 2023

  1. lib/pycriu: generate version.py

    The version of CRIU is specified in the Makefile.versions file.
    This patch generates the '__version__' value for the pycriu module.
    This value can be used by crit to implement `--version`.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Sep 22, 2023
    4ae1518
  2. crit/setup.py: use __version__ from pycriu

    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Sep 22, 2023
    b8b2fe6
  3. py/cli: add --version option

    This patch implements the '--version' for the crit tool.
    
    $ crit --version
    3.17
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Sep 22, 2023
    150eecc

Commits on Sep 25, 2023

  1. ci: stop testing ubuntu overlayfs

    They break it with each kernel rebase. More details are here:
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1857257
    
    Last time, it was fixed a few months ago and it has been broken again in
    5.15.0-1046-azure.
    
    Let's bind-mount the CRIU directory into a test container to make it
    independent of a container file system.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Sep 25, 2023
    5e37ccf

Commits on Sep 26, 2023

  1. zdtm: If ignoring kernel taint, also ignore taint changes.

    At least in Google's VM environment, the kernel taints are unrelated to CRIU
    runs.  Don't fail tests if taints change, if kernel taints are ignored.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Sep 26, 2023
    1a6c015
  2. zdtm: cgroup04: Improve error messages.

    Make the errno values reported by cgroup04 always correct and show the
    relevant parameters.
    Constify constant strings, while at it.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Sep 26, 2023
    f74140c
  3. zdtm: cgroup04: Improve skip check's robustness.

    cgroup04 test needs full control over mem and devices cgroup hierarchies.
    Make the test's .checkskip script better at detecting if the cgroups are
    available for use.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Sep 26, 2023
    66369f9
  4. zdtm: Treat ESRCH from kill() as success.

    This fixes a failure to clean up after a failed test, where CRIU didn't start properly.
    
    ```
    ===================== Run zdtm/transition/socket-tcp in h ======================
    Start test
    ./socket-tcp --pidfile=socket-tcp.pid --outfile=socket-tcp.out
    Traceback (most recent call last):
      File ".../zdtm_py.py", line 1906, in do_run_test
        cr(cr_api, t, opts)
      File ".../zdtm_py.py", line 1584, in cr
        cr_api.dump("dump")
      File ".../zdtm_py.py", line 1386, in dump
        self.__dump_process = self.__criu_act(action,
      File ".../zdtm_py.py", line 1224, in __criu_act
        raise test_fail_exc("CRIU %s" % action)
    test_fail_exc: CRIU dump
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "<embedded module '_launcher'>", line 182, in run_filename_from_loader_as_main
      File "<embedded module '_launcher'>", line 34, in _run_code_in_main
      File ".../zdtm_py.py", line 2790, in <module>
        fork_zdtm()
      File ".../zdtm_py.py", line 2782, in fork_zdtm
        do_run_test(tinfo[0], tinfo[1], tinfo[2], tinfo[3])
      File ".../zdtm_py.py", line 1922, in do_run_test
        t.kill()
      File ".../zdtm_py.py", line 509, in kill
        os.kill(int(self.__pid), sig)
    ProcessLookupError: [Errno 3] No such process
    ```
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Sep 26, 2023
    87c42d5
  5. zdtm: socket_udp_shutdown: Make the test fail instead of timing out.

    When -- after restore -- sockets can't communicate, the test times out
    while waiting on recvfrom(). Since the communication is local, send()
    works instantaneously - so mark sockets with SOCK_NONBLOCK and report
    failure if the message is not received immediately.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Sep 26, 2023
    a056519
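
The fail-fast idea from the socket_udp_shutdown commit above can be sketched in C as follows. This is an illustration rather than the test's actual code: `open_nonblocking_udp` and `expect_message_now` are hypothetical helpers, and the 256-byte buffer is an arbitrary choice.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket whose recv() never blocks.
 * SOCK_NONBLOCK is a Linux-specific socket type flag. */
int open_nonblocking_udp(void)
{
    return socket(AF_INET, SOCK_DGRAM | SOCK_NONBLOCK, 0);
}

/* Check that the expected datagram has already arrived on a non-blocking
 * socket. Returns 0 if it is there, -1 otherwise. When nothing has arrived,
 * recv() fails at once with EAGAIN/EWOULDBLOCK instead of waiting, so the
 * caller can report a failure rather than time out. */
int expect_message_now(int fd, const void *expected, size_t len)
{
    char buf[256];
    ssize_t n;

    assert(len <= sizeof(buf));
    n = recv(fd, buf, sizeof(buf), 0);
    if (n < 0)
        return -1; /* nothing received (or another error) */
    if ((size_t)n != len || memcmp(buf, expected, len) != 0)
        return -1; /* wrong size or wrong payload */
    return 0;
}
```

The helper relies on the descriptor already being non-blocking; on a blocking socket the same recv() call would wait, which is the timeout the commit wants to avoid.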

Commits on Sep 27, 2023

  1. zdtm: check userns once

    All test logs are flooded with the "userns is supported" messages...
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Sep 27, 2023
    8b5f3af

Commits on Sep 28, 2023

  1. Return page size as unsigned long

    Currently page_size() returns an unsigned int value that, after a "bitwise
    not", is promoted to unsigned long, e.g. in handle_page_fault() in uffd.c.
    Since the value is unsigned, the promotion zero-fills the upper bits, which
    results in the loss of the most significant bits of the page fault address.
    Make page_size() return unsigned long to avoid this.
    
    Signed-off-by: Vladislav Khmelevsky <[email protected]>
    yota9 authored and avagin committed Sep 28, 2023
    1e4f5fb
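
The truncation described in the page_size() commit above can be reproduced with a short C sketch. The function names and the fixed 4096-byte page size are assumptions for illustration, and the sketch assumes a 64-bit unsigned long.

```c
#include <assert.h>

static_assert(sizeof(unsigned long) == 8, "this sketch assumes a 64-bit unsigned long");

/* Stand-ins for the old and new return types; 4096 is an assumed page size. */
unsigned int page_size_uint(void)
{
    return 4096;
}

unsigned long page_size_ulong(void)
{
    return 4096;
}

/* Old behaviour: the mask is computed in 32 bits (0xfffff000) and then
 * zero-extended to 64 bits, so bits 32..63 of addr are cleared as well. */
unsigned long page_align_truncating(unsigned long addr)
{
    return addr & ~(page_size_uint() - 1);
}

/* Fixed behaviour: the mask is computed in unsigned long, so only the
 * in-page offset bits are cleared. */
unsigned long page_align(unsigned long addr)
{
    return addr & ~(page_size_ulong() - 1);
}
```

For an address such as 0x7f1234567abc, the first variant yields 0x34567000 and the second yields 0x7f1234567000.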

Commits on Sep 29, 2023

  1. vma: Add !VVAR condition to vma_entry_can_be_lazy

    Currently, most of the time we don't have problems with the VVAR segment and
    lazy restore, because when the VDSO is parked there is a munmap call that
    calls UFFDIO_UNREGISTER on the destination address.
    But we don't want to enable userfaultfd for VDSO and VVAR in the first
    place.
    yota9 authored and avagin committed Sep 29, 2023
    4f0c07f
  2. criu: change the comment about magic numbers

    Signed-off-by: Michal Clapinski <[email protected]>
    mclapinski authored and avagin committed Sep 29, 2023
    5de9040

Commits on Oct 5, 2023

  1. plugins: the UPDATE_VMA_MAP callback returns fd with the full control

    It means CRIU has to close it when it is not needed.
    
    It looks more logically correct and matches the behaviour of
    the RESTORE_EXT_FILE callback.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Oct 5, 2023
    f832d87
  2. amdgpu: don't leak fd on an error path in open_img_file

    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Oct 5, 2023
    25f685e
  3. amdgpu: print an error if the dup syscall fails

    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Oct 5, 2023
    01d559d
  4. ci: enable build with amdgpu plugin

    This patch adds the `libdrm-dev` package to the list of CRIU
    dependencies installed in CI to build CRIU with amdgpu plugin.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Oct 5, 2023
    f593257
  5. amdgpu: fix clang warnings

    amdgpu_plugin.c:930:6: error: variable 'buffer' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
            if (ret) {
                ^~~
    amdgpu_plugin.c:988:8: note: uninitialized use occurs here
            xfree(buffer);
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Oct 5, 2023
    2ff90f0
  6. memfd: don't reopen file descriptors for memory mappings

    One memfd can be shared by a few restored files. Only one of these files is
    restored with a file created with memfd_open. Others are restored by reopening
    memfd files via /proc/self/fd/.
    
    It seems unnecessary for restoring memfd memory mappings. We can always use the
    original file.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Oct 5, 2023
    f54cf19
  7. zdtm/memfd04: check execveat on memfd that has memory mappings

    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Oct 5, 2023
    c20fb83

Commits on Oct 6, 2023

  1. clang-format: disable column limit constraint

    The "ColumnLimit: 120" is not only allowing lines to be longer than 80
    characters but it also forces line wrapping at 120 characters. If total
    expression length is more than 120 characters, clang-format will try to
    wrap it as close to 120 as it can, it would not even allow to wrap at 80
    characters if we really want it. But as we all know, 80 characters is the
    Linux kernel coding style default, and since our coding style is based on
    it, it is really strange to prohibit wrapping lines at 80 characters...
    
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Oct 6, 2023
    5bf7652

Commits on Oct 7, 2023

  1. pie: Mark __export_*() functions as externally_visible

    GCC's lto source:
    > To avoid this problem the compiler must assume that it sees the
    > whole program when doing link-time optimization.  Strictly
    > speaking, the whole program is rarely visible even at link-time.
    > Standard system libraries are usually linked dynamically or not
    > provided with the link-time information.  In GCC, the whole
    > program option (@option{-fwhole-program}) asserts that every
    > function and variable defined in the current compilation
    > unit is static, except for function @code{main} (note: at
    > link time, the current unit is the union of all objects compiled
    > with LTO).  Since some functions and variables need to
    > be referenced externally, for example by another DSO or from an
    > assembler file, GCC also provides the function and variable
    > attribute @code{externally_visible} which can be used to disable
    > the effect of @option{-fwhole-program} on a specific symbol.
    
    As far as I read gcc's source, ipa_comdats() will avoid placing symbols
    that are either already in a user-defined section or have
    externally_visible attribute into new optimized gcc sections.
    
    Signed-off-by: Dmitry Safonov <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    0x7f454c46 authored and avagin committed Oct 7, 2023
    e3391ed
  2. util: allow to run criu under strace

    fork_and_ptrace_attach has to fork a child with CLONE_UNTRACED,
    so that strace doesn't trace it.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Oct 7, 2023
    24e2492

Commits on Oct 8, 2023

  1. tun: don't parse buffers that have not been filled with data

    read_ns_sys_file() can return an error, but we are trying to parse a
    buffer before checking a return code.
    
    CID 417395 (#3 of 3): String not null terminated (STRING_NULL)
    2. string_null: Passing unterminated string buf to strtol, which expects
       a null-terminated string.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Oct 8, 2023
    4e5247a
  2. apparmor: remove the redundant check

    This check is redundant as line 201 checks for this condition.
    
    Signed-off-by: Taemin Ha <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    Taemin Ha authored and avagin committed Oct 8, 2023
    3015aad
  3. arch/x86: remove the redundant check

    The is_native field is a boolean. Therefore, else if() can be
    changed to a simple else{}.
    
    Signed-off-by: Taemin Ha <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    Taemin Ha authored and avagin committed Oct 8, 2023
    9e05b65
  4. zdtm/cow00: fix typo

    The condition was meant to check fd2 instead of fd1, which is checked in
    line 24.
    
    Signed-off-by: Taemin Ha <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    Taemin Ha authored and avagin committed Oct 8, 2023
    06a3f13
  5. zdtm/thread_different_uid_gid: remove the redundant check

    Line 131 checks if (ret >= 0). Line 133 could be replaced by a simple else statement.
    
    Signed-off-by: Taemin Ha <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    Taemin Ha authored and avagin committed Oct 8, 2023
    c03c737
  6. criu/proc_parse: refactor the eventpoll parser

    Eventpollentry's fields are set only when ret == 3 or ret == 6. The
    remaining cases can be grouped together as an error.
    
    Signed-off-by: Taemin Ha <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    Taemin Ha authored and avagin committed Oct 8, 2023
    ab73a84
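
The grouping described in the eventpoll parser commit above (accept 3 or 6 parsed fields, treat everything else as an error) can be sketched in C. This is not CRIU's parser: `struct epoll_tfd_sketch` and `parse_epoll_tfd_line` are hypothetical, and the format string is an assumption modelled on the epoll lines found in /proc/<pid>/fdinfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the parsed entry. */
struct epoll_tfd_sketch {
    int tfd;
    unsigned int events;
    unsigned long long data;
    bool has_inode; /* true only when all six fields were present */
    long long pos;
    unsigned long ino;
    unsigned int sdev;
};

/* Returns 0 on success, -1 if the line matches neither layout. */
int parse_epoll_tfd_line(const char *line, struct epoll_tfd_sketch *e)
{
    int ret;

    assert(line != NULL && e != NULL);
    ret = sscanf(line, "tfd: %d events: %x data: %llx pos:%lli ino:%lx sdev:%x",
                 &e->tfd, &e->events, &e->data, &e->pos, &e->ino, &e->sdev);
    if (ret == 3) {
        /* Short layout: the three trailing fields are absent. */
        e->has_inode = false;
        e->pos = 0;
        e->ino = 0;
        e->sdev = 0;
    } else if (ret == 6) {
        /* Full layout: every field was parsed. */
        e->has_inode = true;
    } else {
        /* 0, 1, 2, 4, 5 or EOF: one shared error path. */
        return -1;
    }
    return 0;
}
```

Keeping a single error branch means a partially matched line, for example one that stops after `pos:`, is rejected instead of leaving some fields unset.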

Commits on Oct 12, 2023

  1. files-reg: don't change the file pos in get_build_id

    At this point the correct position is already restored, so reading from
    the fd results in the position being moved forward by 5 bytes.
    
    Fixes: 9191f87 ("criu/files-reg.c: add build-id validation functionality")
    Signed-off-by: Michal Clapinski <[email protected]>
    mclapinski authored and avagin committed Oct 12, 2023
    811a380
  2. zdtm/lib: add missing signal.h header

    Signed-off-by: Michal Clapinski <[email protected]>
    mclapinski authored and avagin committed Oct 12, 2023
    d9ca0c7
  3. zdtm/static: test the offset migration of ELF files

    Signed-off-by: Michal Clapinski <[email protected]>
    mclapinski authored and avagin committed Oct 12, 2023
    42c1c84
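
The problem fixed by "files-reg: don't change the file pos in get_build_id" above is that a plain read() advances the descriptor's offset. One way to inspect the start of a file without that side effect is pread(), sketched below in C. Whether the actual fix uses pread() is an assumption, and `fd_has_elf_magic` and `path_has_elf_magic` are hypothetical helpers.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the file behind fd starts with the ELF magic, 0 if it does
 * not, and -1 on a read error. pread() takes an explicit offset, so the
 * descriptor's current position is left untouched. */
int fd_has_elf_magic(int fd)
{
    unsigned char hdr[5]; /* 4 magic bytes plus the class byte */
    ssize_t n;

    assert(fd >= 0);
    n = pread(fd, hdr, sizeof(hdr), 0);
    if (n < 0)
        return -1;
    if ((size_t)n < sizeof(hdr))
        return 0; /* too short to be an ELF file */
    return memcmp(hdr, "\177ELF", 4) == 0;
}

/* Convenience wrapper for quick experiments: check a path instead of an fd. */
int path_has_elf_magic(const char *path)
{
    int fd = open(path, O_RDONLY);
    int ret;

    if (fd < 0)
        return -1;
    ret = fd_has_elf_magic(fd);
    close(fd);
    return ret;
}
```

After calling `fd_has_elf_magic()` on a descriptor positioned at, say, offset 7, `lseek(fd, 0, SEEK_CUR)` still reports 7, which is the property the restored file position depends on.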

Commits on Oct 13, 2023

  1. zdtm: cgroup_ifpriomap: Improve skip check's robustness.

    cgroup_ifpriomap test needs net_prio cgroup, which might not be
    available. Make the .checkskip script check it.
    
    Signed-off-by: Michał Mirosław <[email protected]>
    osctobe authored and avagin committed Oct 13, 2023
    711775f

Commits on Oct 17, 2023

  1. lib: use separate packages for pycriu and crit

    Newer versions of pip use an isolated virtual environment when building
    Python projects. However, when the source code of CRIT is copied into
    the isolated environment, the symlink for `../lib/py` (pycriu) becomes
    invalid. As a workaround, we used the `--no-build-isolation` option for
    `pip install`. However, this functionality has issues in some versions
    of PIP [1, 2]. To fix this problem, this patch adds separate packages
    for pycriu and crit, and each package is installed independently.
    
    [1] pypa/pip#8221
    [2] pypa/pip#8165 (comment)
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Oct 17, 2023
    df24fe8

Commits on Oct 22, 2023

  1. Makefile: introduce ARCHCFLAGS for arch specific cflags

    Do not use $(USERCFLAGS) for anything other than what the user provides.
    
    Signed-off-by: Marcus Folkesson <[email protected]>
    marcusfolkesson authored and avagin committed Oct 22, 2023
    c474816

Commits on Nov 28, 2023

  1. compel: correct the syscall number of bind on ARM64

    In the compel/arch/arm/plugins/std/syscalls/syscall.def, the syscall number of bind on ARM64 should be 200 instead of 235.
    
    Signed-off-by: Sally Kang <[email protected]>
    SallyKAN authored and avagin committed Nov 28, 2023
    d88dcef

Commits on Nov 29, 2023

  1. ci: fix rawhide netlink error

    The rawhide netlink errors are fixed with a newer kernel than the
    default 6.2 available in Fedora 38.
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber committed Nov 29, 2023
    5e56756
  2. test: check for btrfs in the current directory

    The old test was checking if '/' is btrfs but we should check if the
    current directory is btrfs.
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber committed Nov 29, 2023
    4213f16
  3. ci: switch to permissive selinux mode during test

    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber committed Nov 29, 2023
    9d3e71a

Commits on Nov 30, 2023

  1. ci: fix codespell errors

    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Nov 30, 2023
    b17a73b

Commits on Dec 6, 2023

  1. docker-test: fix condition for max tries

    Replace a recursive call with a loop.
    
    Reported-by: Andrei Vagin <[email protected]>
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Dec 6, 2023
    95975e0
  2. docker-test: downgrade docker to v24.0.7

    Checkpoint/restore with version 25.0.0-beta.1 fails
    with the following error:
    
    $ docker start --checkpoint=c1 cr
    Error response from daemon: failed to create task for container: content digest fdb1054b00a8c07f08574ce52198c5501d1f552b6a5fb46105c688c70a9acb45: not found: unknown
    
    Release notes:
    moby/moby#46816
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Dec 6, 2023
    0da1ab2

Commits on Dec 8, 2023

  1. Makefile: Use common warnings settings for loongarch64

    WARNINGS variable should be amended, not redefined.
    We still need, e.g.,  `-Wno-dangling-pointer` to build
    criu on loongarch64 with gcc13.
    
    Signed-off-by: Ivan A. Melnikov <[email protected]>
    iv-m authored and avagin committed Dec 8, 2023
    378da3b

Commits on Dec 11, 2023

  1. tty: skip ioctl(TIOCSLCKTRMIOS) if possible

    If ioctl(TIOCSLCKTRMIOS) fails with EPERM it means that a CRIU
    process lacks the CAP_SYS_ADMIN capability. But we can use
    ioctl(TIOCGLCKTRMIOS) to *read* current ->termios_locked
    value from the kernel and if it's the same as we already have
    we can skip failing ioctl(TIOCSLCKTRMIOS) safely.
    
    Adrian has recently posted [1] a very good patch to allow ioctl(TIOCSLCKTRMIOS)
    for processes that have CAP_CHECKPOINT_RESTORE (right now it requires CAP_SYS_ADMIN).
    
    [1] https://lore.kernel.org/all/[email protected]/
    
    Suggested-by: Andrei Vagin <[email protected]>
    Signed-off-by: Alexander Mikhalitsyn <[email protected]>
    mihalicyn authored and avagin committed Dec 11, 2023
    dc49eb4
  2. ci: do not use 'tail' for skip-file-rwx-check test

    Newer versions of 'tail' rely on inotify and after a restore 'tail' is
    unhappy with the state of inotify and just stops.
    
    This replaces 'tail' with a minimal shell based test (thanks Andrei).
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Dec 11, 2023
    1573064
  3. ci: fix centos-stream 9 ci errors

    The image has a version of nettle that is too old to work with gnutls.
    Just upgrade to the latest to make the error go away.
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Dec 11, 2023
    561f845

Commits on Dec 21, 2023

  1. ci: disable non-root in user namespace test in container

    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber committed Dec 21, 2023
    e7aca13

Commits on Dec 25, 2023

  1. gitignore: remove historical left-over files

    Commit [1] introduced a mechanism to auto-generate the files:
    sys-exec-tbl*.c, syscalls*.S, syscall-codes*.h, and syscall*.h.
    This commit also updated the gitignore rules to ignore auto-generated
    files. However, after commit [2], the path for these files has changed
    and the patterns specified in gitignore are no longer needed.
    
    [1] bbc2f13 (x86/build: generate syscalls-{64,32}.built-in.o)
    [2] 19fadee (compel: plugins,std -- Implement syscalls in std plugin)
    
    Reported-by: @felicitia
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Dec 25, 2023
    61224f2

Commits on Dec 30, 2023

  1. make: fix compilation on alpine

    Starting with the musl v1.2.4~69, _GNU_SOURCE doesn't set _LARGEFILE64_SOURCE.
    
    Fixes checkpoint-restore#2313
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Dec 30, 2023
    50aa6da

Commits on Jan 8, 2024

  1. irmap: hardcode some more interesting paths

    Signed-off-by: robert <[email protected]>
    rayrapetyan authored and avagin committed Jan 8, 2024
    cda1c5c

Commits on Jan 17, 2024

  1. net: fix network unlock with iptables-nft

    When iptables-nft is used as backend for iptables, the rules for
    network locking are translated into the following nft rules:
    
    ```
    $ iptables-restore-translate -f lock.txt
    add table ip filter
    add chain ip filter CRIU
    insert rule ip filter INPUT counter jump CRIU
    insert rule ip filter OUTPUT counter jump CRIU
    add rule ip filter CRIU mark 0xc114 counter accept
    add rule ip filter CRIU counter drop
    ```
    
    These rules create the following chains:
    
    ```
    table ip filter { # handle 1
    	chain CRIU { # handle 1
    		meta mark 0x0000c114 counter packets 16 bytes 890 accept # handle 6
    		counter packets 1 bytes 60 drop # handle 7
    		meta mark 0x0000c114 counter packets 0 bytes 0 accept # handle 8
    		counter packets 0 bytes 0 drop # handle 9
    	}
    
    	chain INPUT { # handle 2
    		type filter hook input priority filter; policy accept;
    		counter packets 8 bytes 445 jump CRIU # handle 3
    		counter packets 0 bytes 0 jump CRIU # handle 10
    	}
    
    	chain OUTPUT { # handle 4
    		type filter hook output priority filter; policy accept;
    		counter packets 9 bytes 505 jump CRIU # handle 5
    		counter packets 0 bytes 0 jump CRIU # handle 11
    	}
    }
    ```
    
    In order to delete the CRIU chain, we need to first delete all four
    jump targets. Otherwise, `-X CRIU` would fail with the following error:
    
    iptables-restore v1.8.10 (nf_tables):
    line 5: CHAIN_DEL failed (Resource busy): chain CRIU
    
    Reported-by: Andrei Vagin <[email protected]>
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jan 17, 2024
    0416d81
  2. test/nfconntrack: use nft or iptables-legacy

    nft does not support xtables compat expressions
    https://git.netfilter.org/nftables/commit/?id=79195a8cc9e9d9cf2d17165bf07ac4cc9d55539f
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jan 17, 2024
    e5f4d8c
  3. net: add error messages for restore of nftables

    Show appropriate error messages when restore of nftables fails.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jan 17, 2024
    8f4430d

Commits on Jan 21, 2024

  1. kerndat: check the PAGEMAP_SCAN ioctl

    PAGEMAP_SCAN is a new ioctl that allows getting page attributes in a more
    efficient way than reading pagemap files.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Jan 21, 2024
    615e45e
  2. page-cache: use the PAGEMAP_SCAN ioctl when it is available

    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Jan 21, 2024
    bfa9428
  3. pagemap-cache: add an ability to run tests without PAGEMAP_SCAN

    This change adds a new injectable fault (135) to disable PAGEMAP_SCAN and fall
    back to reading pagemap files.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Jan 21, 2024
    50190ae

Commits on Jan 23, 2024

  1. zdtm: socket-tcp-nft-nfconntrack: add a hook to the chain in nft case

    Let's use a hooked nft chain, which actually affects packets.
    
    Fixes: e5f4d8c ("test/nfconntrack: use nft or iptables-legacy")
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and rst0git committed Jan 23, 2024
    dfd7d63

Commits on Jan 25, 2024

  1. criu-log: remove unused declaration

    This patch removes a leftover declaration for log_closedir()
    which has been removed in the following commit:
    
    dc80d6f
    log: get rid of LOG_DIR_FD_OFF and opening cwd in log_init()
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jan 25, 2024
    6349473
  2. net: return bool with iptable_has_criu_jump_target

    To improve readability, this patch changes the return type of
    iptables_has_criu_jump_target() to a boolean, where 'true' indicates
    that iptables has CRIU jump target and 'false' indicates otherwise.
    
    Suggested-by: Pavel Tikhomirov <[email protected]>
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jan 25, 2024
    07a090b
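To illustrate the readability argument in "net: return bool" above, here is a minimal sketch. It is not CRIU's code: `has_jump_target()` and its rule-listing argument are hypothetical names chosen for the example. The point is only that a `bool` return lets the caller write the condition directly instead of decoding an integer convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not the CRIU function): reports whether a rule
 * listing mentions the given jump target. It returns true or false
 * rather than an integer status code.
 */
static bool has_jump_target(const char *rules, const char *target)
{
	assert(rules != NULL && target != NULL);
	return strstr(rules, target) != NULL;
}
```

A caller can then write `if (has_jump_target(rules, "-j CRIU"))` without having to remember whether 0 means "found" or "not found".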

Commits on Feb 1, 2024

  1. plugin/amdgpu: Don't print error for "No such process" during resume

    During the late stages of restore, each process being resumed gets
    an ioctl call to KFD_CRIU_OP_RESUME. If the process has no kfd
    process info, this call will fail with -ESRCH. This is normal
    behaviour, so we shouldn't print an error message for it.
    
    Signed-off-by: David Francis <[email protected]>
    fdavid-amd authored and avagin committed Feb 1, 2024
    a9cbdad
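The idea in the commit above (treating "No such process" as an expected outcome) can be sketched as follows. This is an illustration under assumptions, not the plugin's code: `filter_resume_error()` is a hypothetical name, and the result is modelled as 0 on success or a negative errno value.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical filter for the result of a resume call.
 * ret is 0 on success or a negative errno value on failure.
 */
static int filter_resume_error(int ret)
{
	assert(ret <= 0);
	if (ret == -ESRCH)
		return 0; /* the process has nothing to resume: expected */
	if (ret < 0)
		fprintf(stderr, "resume failed: %s\n", strerror(-ret));
	return ret;
}
```

Only -ESRCH is silenced; any other failure is still logged and propagated to the caller.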

Commits on Feb 5, 2024

  1. plugin/amdgpu: Also don't print 'plugin failed' in criu

    We already don't treat it as an error in the plugin itself, but after
    returning -1 from the RESUME_DEVICES_LATE hook criu prints a debug message
    about the failed plugin. Let's return 0 instead.
    
    While at it, let's replace ret with exit_code.
    
    Fixes: a9cbdad ("plugin/amdgpu: Don't print error for "No such process" during resume")
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Feb 5, 2024
    639068e
  2. amdgpu_plugin: Refactor code in preparation to support C&R for DRM devices
    
    Add a new compilation unit to host symbols and methods that will be
    needed to C&R DRM devices. Refactor code that indicates support for
    C&R and checkpoints KFD and DRM devices.
    
    Signed-off-by: Ramesh Errabolu <[email protected]>
    rerrabolu authored and avagin committed Feb 5, 2024
    81f2c41
  3. amdgpu_plugin: Refactor code used to implement Checkpoint

    Refactor code used to Checkpoint DRM devices. Code is moved
    into amdgpu_plugin_drm.c file which hosts various methods to
    checkpoint and restore a workload.
    
    Signed-off-by: Ramesh Errabolu <[email protected]>
    rerrabolu authored and avagin committed Feb 5, 2024
    9d9ae29
  4. sk-inet: Added IP_TTL socket option

    Signed-off-by: rahulk789 <[email protected]>
    rahulk789 authored and avagin committed Feb 5, 2024
    66cab1f
  5. zdtm: Added tests for IP_TTL restore

    Signed-off-by: rahulk789 <[email protected]>
    rahulk789 authored and avagin committed Feb 5, 2024
    a49d6db

Commits on Feb 6, 2024

  1. sk-inet: fix coding style in restore_ip_opts

    Commit [1] introduced coding-style breakage, let's fix it.
    
    Fixes: 66cab1f ("sk-inet: Added IP_TTL socket option") [1]
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Feb 6, 2024
    495081c

Commits on Feb 12, 2024

  1. amdgpu_plugin: fix lint errors

    $ make lint
     ...
     # Do not append \n to pr_perror, pr_pwarn or fail
     ! git --no-pager grep -E '^\s*\<(pr_perror|pr_pwarn|fail)\>.*\\n"'
     plugins/amdgpu/amdgpu_plugin.c:		pr_perror("%s(), Can't handle VMAs of input device\n", __func__);
    
     ! git --no-pager grep -En '^\s*\<pr_(err|warn|msg|info|debug)\>.*);$' | grep -v '\\n'
     plugins/amdgpu/amdgpu_plugin_drm.c:45:		pr_err("Error in getting stat for: %s", path);
     plugins/amdgpu/amdgpu_plugin_util.c:77:		pr_err("Unable to read file (read:%ld buf_len:%ld)", len_read, buf_len);
     plugins/amdgpu/amdgpu_plugin_util.c:89:		pr_err("Unable to write file (wrote:%ld buf_len:%ld)", len_write, buf_len);
     plugins/amdgpu/amdgpu_plugin_util.c:120:		pr_err("%s: Failed to open for %s", path, write ? "write" : "read");
     plugins/amdgpu/amdgpu_plugin_util.c:126:		pr_err("%s: Failed get pointer for %s", path, write ? "write" : "read");
     plugins/amdgpu/amdgpu_plugin_util.c:136:		pr_err("%s:Failed to access file size", path);
     plugins/amdgpu/amdgpu_plugin_util.c:152:		pr_err("Cannot fopen %s", file_path);
    
     make: *** [Makefile:470: lint] Error 1
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Feb 12, 2024
    6d37f9a

Commits on Feb 13, 2024

  1. kerndat: check support for PAGE_IS_SOFT_DIRTY

    The commit introducing PAGE_IS_SOFT_DIRTY has not been merged
    in kernel v6.7.x.
    
    fs/proc/task_mmu: report SOFT_DIRTY bits through the PAGEMAP_SCAN ioctl
    torvalds/linux@e6a9a2cbc13bf
    
    As a result, CRIU fails with the following error:
    
    Error (criu/pagemap-cache.c:199): pagemap-cache: PAGEMAP_SCAN: Invalid argument'
    Error (criu/pagemap-cache.c:225): pagemap-cache: Failed to fill cache for 63 (400000-402000)'
    
    This patch updates check_pagemap() in kerndat to check if PAGE_IS_SOFT_DIRTY is supported.
    Fixes: checkpoint-restore#2334
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Feb 13, 2024
    7bd786d
  2. pb2dict: fix flake8 error

    This patch fixes the following flake8 error:
    python3 -m flake8 --config=scripts/flake8.cfg lib/pycriu/images/pb2dict.py
    lib/pycriu/images/pb2dict.py:361:43: E721 do not compare types, for exact checks use `is` / `is not`, for instance checks use `isinstance()`
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Feb 13, 2024
    cac03be
  3. make: replace flake8 with ruff

    Ruff (https://github.com/astral-sh/ruff) is a Python linter
    written in Rust, designed to replace Flake8. It is significantly
    faster and actively maintained.
    
    In addition to replacing flake8 with ruff, this patch also
    creates separate makefile targets for ruff, shellcheck and
    codespell, so that they can be tested independently.
    
    RUFF_FLAGS can be used to specify options such as '--fix'.
    Example:
    	make lint
    	make ruff RUFF_FLAGS=--fix
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Feb 13, 2024
    8a22b15
  4. criu-ns: fix lint error

    This patch fixes the following lint error:
    scripts/criu-ns:219:16: E713 [*] Test for membership should be `not in`
    
    The change in this patch is auto-generated with `ruff --fix`.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Feb 13, 2024
    da7e6d3

Commits on Feb 18, 2024

  1. cgroup: Add support for restoring a thread in a correct v1 cgroup

    Currently, checkpoint/restore supports only cgroup v2 threaded
    controllers. Threads originating in cgroup v1 environments are
    restored into the main thread's cgroup. This change extends the support
    to cgroup v1.
    
    Signed-off-by: Stepan Pieshkin <[email protected]>
    StepanPieshkin authored and avagin committed Feb 18, 2024
    1120308
  2. zdtm/static: check that cgroup layout of threads is preserved

    Co-developed-by: Stepan Pieshkin <[email protected]>
    Signed-off-by: Stepan Pieshkin <[email protected]>
    Signed-off-by: Michal Clapinski <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    StepanPieshkin authored and avagin committed Feb 18, 2024
    d553fad

Commits on Feb 20, 2024

  1. compiler: add ALIGN_DOWN macro

    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    c98fefd
  2. compel: always pass user_fpregs_struct_t to compel_get_task_regs()

    All architectures create on-stack structure for floating point save area
    in compel_get_task_regs() if the caller passes NULL rather than a valid
    pointer.
    
    The only place that calls compel_get_task_regs() with NULL for the floating
    point save area is parasite_start_daemon(), and it is simpler to define
    this structure on the stack of parasite_start_daemon().
    
    The availability of floating point save data is required in
    parasite_start_daemon() to detect shadow stack presence early during
    parasite infection and will be used in later patches.
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    0dba58a
  3. compel: shstk: save CET state when CPU supports it

    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    fc683cb
  4. compel: infect: prepare parasite_service() for addition of CET support

    To support sigreturn with CET enabled parasite must rewind its stack
    before calling sigreturn so that shadow stack will be compatible with
    actual calling sequence.
    
    In addition, calling sigreturn from top level routine
    (__export_parasite_head_start) will significantly simplify the shadow
    stack manipulations required to execute sigreturn.
    
    For x86 make fini_sigreturn() return the stack pointer for the signal
    frame that will be used by sigreturn and propagate that return value up
    to __export_parasite_head_start.
    
    In non-daemon mode parasite_trap_cmd() returns a non-positive value,
    which makes it possible to distinguish daemon and non-daemon mode and to
    stop properly at int3 in non-daemon mode.
    
    Architectures other than x86 remain unchanged and will still call
    sigreturn from fini_sigreturn().
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    eee2236
  5. compel: shstk: prepare shadow stack signal frame

    When calling sigreturn with CET enabled, the kernel verifies that the
    shadow stack has proper address of sa_restorer and a "restore token".
    Normally, they are pushed to the shadow stack when signal processing is
    started.
    
    Since compel calls sigreturn directly, the shadow stack should be
    updated to match the kernel expectations for sigreturn invocation.
    
    Add parasite_setup_shstk() that sets up the shadow stack with the
    address of __export_parasite_head_start as sa_restorer and with the
    required restore token.
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    a09a0eb
  6. criu: shstk: add VMA_AREA_SHSTK flag

    The shadow stack VMAs require special care because they can only be
    created and populated using special system calls.
    
    Add VMA_AREA_SHSTK flag and set it for VMAs that are marked as "ss" in
    /proc/pid/smaps
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    dbab276
  7. criu: shstk: premap and prepopulate shadow stack VMAs

    Shadow stack VMAs cannot be mmap()ed, they must be created using
    map_shadow_stack() system call and populated using special wrss
    instruction available only when shadow stack is enabled.
    
    Premap them to reserve virtual address space and populate it to have
    their contents available for later copying after enabling shadow stack.
    
    Along with the space required by shadow stack VMAs also reserve an extra
    page that will be later used as a temporary shadow stack.
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    95896b4
  8. criu: shstk: prepare shadow stack parameters for restorer blob

    Shadow stacks must be populated using special WRSS instruction. This
    instruction is only available when shadow stack is enabled, calling it
    with disabled shadow stack causes #UD.
    
    Moreover, shadow stack VMAs cannot be mremap()ed and they must be
    created using map_shadow_stack() system call. This requires delaying the
    restore of shadow stacks to restorer blob after the CRIU mappings are
    cleared.
    
    Introduce rst_shstk_info structure to hold shadow stack parameters
    required in the restorer blob and populate this structure in
    arch_prepare_shstk() method.
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    763d07a
  9. criu: kerndat: add kdat_has_shstk()

    Detect if CRIU runs with shadow stack enabled and store the result in
    kerndat.
    
    Unlike most kerndat knobs, kdat_has_shstk() does not check for
    availability of the shadow stack in the kernel, but rather checks if
    criu runs with shadow stack enabled.
    
    This depends on hardware availability, kernel and glibc support, compiler
    options and glibc tunables, so kdat_has_shstk() must be called every
    time CRIU starts and its result cannot be cached.
    
    The result will be used by the code that controls shadow stack
    enablement in the next commit.
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    9ac6584
  10. restore: add infrastructure to enable shadow stack

    There are several gotchas when restoring a task with shadow stack:
    * depending on the compiler options, glibc version and glibc tunables
      CRIU can run with or without shadow stack.
    * shadow stack VMAs are special, they must be created using a dedicated
      map_shadow_stack() system call and can be modified only by a special
      instruction (wrss) that is only available when shadow stack is
      enabled.
    * once shadow stack is enabled, it is not writable even with wrss;
      writes to shadow stack can be only enabled with ptrace() and only when
      shadow stack is enabled in the tracee.
    * if the shadow stack is enabled during restore rather than by glibc,
      calling retq after arch_prctl() that enables the shadow stack causes
      #CP, so the function that enables shadow stack can never return.
    
    Add the infrastructure required to cope with all of those:
    
    * modify the restore code to allow trampoline (arch_shstk_trampoline)
      that will enable shadow stack and call restore_task_with_children().
    * add call to arch_shstk_unlock() right after the tasks are clone()ed;
      this will allow unlocking shadow stack features and making shadow
      stack writable.
    * add stubs for architectures that do not support shadow stacks
    * add implementation of arch_shstk_trampoline() and arch_shstk_unlock()
      for x86, but keep it disabled; it will be enabled along with the addition
      of the code that will restore shadow stack in the restorer blob
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    95c049e
  11. restorer: shstk: implement shadow stack restore

    The restore of a task with shadow stack enabled adds these steps:
    
    * switch from the default shadow stack to a temporary shadow stack
      allocated in the premapped area
    * unmap CRIU mappings; nothing changed here, but it's important that
      CRIU mappings can be removed only after switching to a temporary
      shadow stack
    * create shadow stack VMA with map_shadow_stack()
    * restore shadow stack contents with wrss
    * switch to "real" shadow stack
    * lock shadow stack features
    
    Signed-off-by: Mike Rapoport (IBM) <[email protected]>
    rppt authored and avagin committed Feb 20, 2024
    cb39c62

Commits on Mar 25, 2024

  1. ci: try to fix broken docker test

    Upgrade to 22.04 base image and use the existing version of docker.
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Mar 25, 2024
    f7b2e63
  2. mem: fix some VMAs being incorrectly mapped with PROT_WRITE

    A memory interval is a half-open interval, so the condition
    when pr->pe->vaddr == vma->e->end should not be interpreted
    as an intersection and should cause vma to be marked with VMA_NO_PROT_WRITE.
    
    Fixes: checkpoint-restore#2364
    
    Signed-off-by: Artem Trushkin <[email protected]>
    AT120 authored and avagin committed Mar 25, 2024
    06c1016
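The half-open interval rule from the "mem:" commit above can be sketched as a small overlap test. This is a generic illustration, not the CRIU code: `ranges_intersect()` is a hypothetical name, and both ranges are assumed to be non-empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Overlap test for two half-open ranges [a_start, a_end) and
 * [b_start, b_end): the start address is included, the end is not.
 */
static bool ranges_intersect(uint64_t a_start, uint64_t a_end,
			     uint64_t b_start, uint64_t b_end)
{
	assert(a_start < a_end && b_start < b_end);
	/* Strict comparisons: ranges that merely touch share no byte. */
	return a_start < b_end && b_start < a_end;
}
```

With this test, a range starting exactly at another range's end address (for example 0x2000 against [0x1000, 0x2000)) is reported as not intersecting, which is the case the commit message describes.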
  3. Add support for reset-on-fork scheduling flag

    This patch extends CRIU with support for SCHED_RESET_ON_FORK.
    When the SCHED_RESET_ON_FORK flag is set, the following rules
    apply for subsequently created children:
    
    - If the calling thread has a scheduling policy of SCHED_FIFO or
    SCHED_RR, the policy is reset to SCHED_OTHER in child processes.
    
    - If the calling process has a negative nice value, the nice value
    is reset to zero in child processes.
    
    (See 'man 7 sched')
    
    Fixes: checkpoint-restore#2359
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Mar 25, 2024
    6ee6be5
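As a rough sketch of what saving and restoring the reset-on-fork flag involves: on Linux the flag is ORed into the policy value, so it has to be separated from the policy proper and merged back later. The struct and function names below are hypothetical and this is not how CRIU stores the value; the fallback definition is only used if `<sched.h>` does not expose the constant.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc exposes SCHED_RESET_ON_FORK only with this */
#endif
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000 /* value from the Linux UAPI headers */
#endif

/* Hypothetical container for a dumped scheduling policy. */
struct sched_state {
	int policy;         /* SCHED_OTHER, SCHED_FIFO, SCHED_RR, ... */
	bool reset_on_fork; /* was SCHED_RESET_ON_FORK set? */
};

/* Split a raw policy value (flag possibly ORed in) into its parts. */
static struct sched_state split_policy(int raw)
{
	struct sched_state st;

	assert(raw >= 0);
	st.reset_on_fork = (raw & SCHED_RESET_ON_FORK) != 0;
	st.policy = raw & ~SCHED_RESET_ON_FORK;
	return st;
}

/* Rebuild the raw value that would be passed back to the kernel. */
static int join_policy(struct sched_state st)
{
	return st.policy | (st.reset_on_fork ? SCHED_RESET_ON_FORK : 0);
}
```

Keeping the flag separate from the policy number means that comparisons such as `st.policy == SCHED_FIFO` keep working whether or not the flag was set.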
  4. zdtm/sched_policy00: use reset-on-fork flag

    This patch extends the sched_policy00 test case to verify that
    the SCHED_RESET_ON_FORK flag is restored correctly.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Mar 25, 2024
    2355a2a
  5. criu: move timers dump/restore code into separate file

    Fixes: checkpoint-restore#335
    
    Signed-off-by: ccccrrrr <[email protected]>
    ccccrrrr authored and avagin committed Mar 25, 2024
    6182876

Commits on Mar 27, 2024

  1. ci: silence CircleCI warning about deprecated image

    CircleCI currently prints out the following warning:
    
       This job is using a deprecated image 'ubuntu-2004:202010-01', please update to a newer image
    
    According to https://discuss.circleci.com/t/linux-image-deprecations-and-eol-for-2024/
    the recommended image name is: "image: default"
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Mar 27, 2024
    bec56d6
  2. timer: fix wrapping alignment in function declaration

    Currently we have tabs + spaces on the wrapped line but the wrapped part
    is not aligned to the opening bracket.
    
    Fixes: bbe26d1b7 ("timer: fix allignment in function definition")
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Mar 27, 2024
    00d7cdc

Commits on Apr 9, 2024

  1. Makefile.config: fix/improve feature warnings.

    1. Tell which RPMs or DEBs are required in all cases.
    
    2. Use $(info ...) everywhere.
    
    3. Drop extra nested $(info), instead use (and document) a simpler kludge.
    
    4. Simplify and unify the language, add missing periods.
    
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin authored and avagin committed Apr 9, 2024
    9a282a5

Commits on Apr 10, 2024

  1. check: verify ino and dev of overlayfs files in /proc/pid/maps

    Check that the file device and inode shown in /proc/pid/maps match
    values returned by stat(2).
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Apr 10, 2024
    7aa8ec4
  2. ci: update base OS to ubuntu 22.04

    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Apr 10, 2024
    21c8f72
  3. ci: update actions/checkout to v4

    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Apr 10, 2024
    654fed9
  4. ci/vdso01: fix typo

    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Apr 10, 2024
    1a4c103

Commits on Apr 16, 2024

  1. mem: optimize debug logging of enqueued pages

    During restore, CRIU prints "Enqueue page-read" messages for
    each page-read request [1]. However, this message does not
    provide useful information, increases performance overhead
    during restore and the size of log file.
    
    $ ./zdtm.py run -t zdtm/static/maps06 -f h -k always
    $ grep 'Enqueue page-read' dump/zdtm/static/maps06/56/1/restore.log | wc -l
    20493
    
    This commit replaces these log messages with a single message
    that shows the number of enqueued page-read requests.
    
    $ grep 'enqueued' dump/zdtm/static/maps06/56/1/restore.log
    (00.061449)     56: nr_enqueued:   20493
    
    [1] checkpoint-restore@91388fc
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Apr 16, 2024
    18dcf15
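The logging change described above boils down to counting events and printing one summary line. A minimal sketch, with hypothetical names (`read_queue`, `enqueue_read()`, `format_summary()`) that are not taken from CRIU:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical queue state; only the counter matters for this sketch. */
struct read_queue {
	unsigned long nr_enqueued;
};

/* Queue one request. No log line is emitted per request. */
static void enqueue_read(struct read_queue *q)
{
	assert(q != NULL);
	/* ... the request itself would be queued here ... */
	q->nr_enqueued++;
}

/* Format the single summary line; returns snprintf()'s result. */
static int format_summary(const struct read_queue *q, char *buf, size_t len)
{
	assert(q != NULL && buf != NULL);
	return snprintf(buf, len, "nr_enqueued: %lu\n", q->nr_enqueued);
}
```

The cost per request drops from one formatted log write to one increment, and the log grows by a single line regardless of how many requests were queued.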

Commits on Apr 18, 2024

  1. sk-tcp: cleanup dump_tcp_conn_state error handling

    1) In dump_tcp_conn_state, if return from libsoccr_save is >=0, we check
    that sizeof(struct libsoccr_sk_data) returned from libsoccr_save is
    equal to sizeof(struct libsoccr_sk_data) we see in dump_tcp_conn_state
    (probably to check if we use the right library version). And if sizes
    are different we go to err_r, which just returns ret, which can
    theoretically be 0 (if size in library is zero) and that would lead
    dump_one_tcp to treat this as success though it is an obvious error.
    
    2) In case dump_opt or open_image fails we don't explicitly set ret
    and rely on sizeof(struct libsoccr_sk_data), previously stored in ret, being
    non-zero. I don't really like it; it makes the code too hard to read.
    
    3) We have a lot of err_* labels which do exactly the same thing, there
    is no point in having all of them, also it is better to choose the name
    of the label based on what it really does.
    
    So let's refactor error handling to avoid these inconsistencies.
    
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Apr 18, 2024
    c716c4d
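The three points above suggest a common C pattern: start with a failing exit code, use a single cleanup label, and set success only at the very end. The sketch below shows that pattern under assumptions; `fake_save()` and `dump_state()` are hypothetical stand-ins and do not reproduce dump_tcp_conn_state().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a library call that returns the size of
 * the structure it filled in, or a negative value on failure.
 */
static int fake_save(size_t reported_size)
{
	return (int)reported_size;
}

static int dump_state(size_t expected_size, size_t reported_size)
{
	int ret, exit_code = -1; /* fail unless every step succeeds */

	assert(expected_size > 0);

	ret = fake_save(reported_size);
	if (ret < 0)
		goto out;
	if ((size_t)ret != expected_size)
		goto out; /* a size mismatch is an error even when ret is 0 */

	/* ... further steps would also just "goto out" on failure ... */
	exit_code = 0;
out:
	return exit_code;
}
```

Because the return value no longer doubles as a size, a reported size of zero cannot be mistaken for success.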

Commits on May 10, 2024

  1. criu: fix a fatal failure if nft doesn't work

    On some systems, nft binary might not be installed, or some kernel
    options might be unconfigured, resulting in something like this:
    
    	sudo unshare -n nft create table inet CRIU
    	Error: Could not process rule: Operation not supported
    	create table inet CRIU
    	^^^^^^^^^^^^^^^^^^^^^^^
    
    This is similar to what kerndat_has_nftables_concat() does, and if the
    outcome is the same, it returns an error to kerndat_init(), and an error
    from kerndat_init() is considered fatal.
    
    Let's relax the check, returning mere "feature not working" instead of
    a fatal error.
    
    Signed-off-by: Kir Kolyshkin <[email protected]>
    kolyshkin authored and avagin committed May 10, 2024
    37fbcc5

Commits on May 21, 2024

  1. sk-tcp: Move TCP socket options from TcpStreamEntry to TcpOptsEntry

    Currently some of the TCP socket option information is stored in the
    TcpStreamEntry, but the information in the TcpStreamEntry is only
    restored after the TCP socket has established connection, which
    results in these TCP socket options not being restored for
    unconnected TCP sockets.
    
    In this commit move the TCP socket options from TcpStreamEntry to
    TcpOptsEntry and add dump_tcp_opts() and restore_tcp_opts() for TCP
    socket options dump and restore.
    
    Signed-off-by: Juntong Deng <[email protected]>
    juntongdeng authored and avagin committed May 21, 2024
    de14579
  2. sk-tcp: Move TCP socket options from SkOptsEntry to TcpOptsEntry

    Currently some TCP socket option information is stored in SkOptsEntry,
    which is a little confusing.
    
    SkOptsEntry should only contain socket options that are common to
    all sockets.
    
    In this commit move the TCP-specific socket options from SkOptsEntry
    to TcpOptsEntry.
    
    Signed-off-by: Juntong Deng <[email protected]>
    juntongdeng authored and avagin committed May 21, 2024
    277878b
  3. sk-tcp: Add test cases for TCP_CORK and TCP_NODELAY socket options

    Currently there are no socket option test cases for TCP_CORK and
    TCP_NODELAY; this commit adds related test cases.
    
    The socket option test cases for TCP_KEEPCNT, TCP_KEEPIDLE, and
    TCP_KEEPINTVL already exist in socket-tcp_keepalive.c, so they are
    not included in this test case.
    
    Signed-off-by: Juntong Deng <[email protected]>
    juntongdeng authored and avagin committed May 21, 2024
    516b369

Commits on May 24, 2024

  1. mount: fix unbounded write

    Replace sprintf() with snprintf() and specify maximum length of
    characters to avoid potential overflow.
    
    Reported-by: GitHub CodeQL (https://codeql.github.com/)
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed May 24, 2024
    0f3246a
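For readers unfamiliar with the difference, a bounded write with snprintf() looks roughly like this. The helper `build_path()` and its arguments are hypothetical; the sketch shows the general technique, not the code changed in mount.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: builds "<dir>/<name>" into buf.
 * Returns false if the result did not fit or formatting failed.
 */
static bool build_path(char *buf, size_t len, const char *dir, const char *name)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "%s/%s", dir, name);
	/* snprintf() returns the length it wanted to write, excluding the NUL. */
	return n >= 0 && (size_t)n < len;
}
```

Unlike sprintf(), snprintf() never writes past `len` bytes and always NUL-terminates when `len` is non-zero, so a too-long input becomes a detectable truncation instead of a buffer overflow.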

Commits on May 25, 2024

  1. test/make: remove unused target

    A fault-injection test was introduced in commit [1] and later removed in
    commit [2]. This patch removes the obsolete Makefile target.
    
    [1] b95407e
        test: check, that parasite can rollback itself (v2)
    
    [2] 2cb4532
        tests: remove zdtm.sh (v2)
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed May 25, 2024
    f6d635c
  2. ci: update check for SELinux

    The rawhide tests run in a container. Containers always have SELinux
    disabled from the inside. Somehow /sys/fs/selinux is now mounted. We
    used the existence of that directory to check if SELinux is available.
    This no longer seems to hold.
    
    Signed-off-by: Adrian Reber <[email protected]>
    Signed-off-by: Radostin Stoyanov <[email protected]>
    adrianreber authored and avagin committed May 25, 2024
    8631228
  3. criu: move sigact dump/restore code into sigact.c

    Separate sigact dump/restore code from cr-restore.c and parasite-syscall.c into sigact.c
    
    Signed-off-by: Arnav Bhatt <[email protected]>
    arnavbhatt288 authored and avagin committed May 25, 2024
    1a848fe

Commits on May 28, 2024

  1. 7de0b45
  2. net: Fix TOCTOU race condition in unix_conf_op

    The unix_conf_op function reads the size of the sysctl entry array
    twice. gcc thinks that it can lead to a time-of-check to time-of-use
    (TOCTOU) race condition if the array size changes between the two reads.
    
    Fixes checkpoint-restore#2398
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin authored and rst0git committed May 28, 2024
    b384afa
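A common way to avoid the double read described in the TOCTOU commit above is to snapshot the size into a local variable and use only that copy. The following is a generic sketch with hypothetical names (`struct entries`, `copy_entries()`), not the unix_conf_op code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container whose count could change behind our back. */
struct entries {
	size_t n;
	const int *vals;
};

/*
 * Copies the values. e->n is read exactly once, so the allocation
 * size and the copy length can never disagree.
 */
static int *copy_entries(const struct entries *e, size_t *out_n)
{
	size_t n;
	int *copy;

	assert(e != NULL && out_n != NULL);
	n = e->n; /* single read: check and use share this value */
	assert(n > 0);
	copy = calloc(n, sizeof(*copy)); /* calloc() rejects overflowing sizes */
	if (!copy)
		return NULL;
	memcpy(copy, e->vals, n * sizeof(*copy));
	*out_n = n;
	return copy;
}
```

Even when no concurrent modification is possible, reading the size once makes the invariant visible to both the reader and the compiler's static analysis.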
  3. pagemap-cache: handle short reads

    It is possible for pread() to return fewer bytes than
    requested. In such a case, we need to repeat the read operation
    with the appropriate offset.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    Signed-off-by: Radostin Stoyanov <[email protected]>
    avagin committed May 28, 2024
    fea3890
  4. zdtm: add support for LD_PRELOAD tests

    This commit adds a `--preload-libfault` option to ZDTM's run command.
    This option runs CRIU with LD_PRELOAD to intercept libc functions
    such as pread(). This method allows simulating special cases,
    for example, when a successful call to pread() transfers fewer
    bytes than requested.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed May 28, 2024
    f4a16a0

Commits on Jun 4, 2024

  1. ci: remove CentOS Stream 8 test (EOL)

    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Jun 4, 2024
    9eaab45

Commits on Jun 8, 2024

  1. zdtm: Distinguish between fail and crash of dump

    Adds an exit_signal static method to criu_cli, criu_config and criu_rpc
    used to detect a crash.
    
    Fixes: checkpoint-restore#350
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Jun 8, 2024
    f287a1a
  2. test/dump-crash: check code path when dump crashes

    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Jun 8, 2024
    ced120a

Commits on Jun 10, 2024

  1. ci: upgrade to Fedora 40 Vagrant images (38 is EOL)

    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Jun 10, 2024
    b5e2025

Commits on Jun 25, 2024

  1. make: improve check for externally managed Python

    Move PYTHON_EXTERNALLY_MANAGED and PIP_BREAK_SYSTEM_PACKAGES
    into Makefile.install to avoid code duplication. In addition, add
    PIPFLAGS variable to enable specifying pip options during installation.
    This is particularly useful for packaging, where it is common for `pip install`
    to run in an environment with pre-installed dependencies and without internet
    access. In such an environment, we need to specify the following options:
    
        --no-build-isolation --no-index --no-deps
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jun 25, 2024
    0567127
  2. readme: update link to FAQ page

    The current link opens a page with the following text:
    
        The MediaWiki FAQ can be found at:
        https://www.mediawiki.org/wiki/Special:MyLanguage/Manual:FAQ
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jun 25, 2024
    bf8c134

Commits on Jul 1, 2024

  1. criu: Restore rseq_cs state slightly earlier in the restore sequence …

    …and run the plugin finalizer later in the dump sequence
    
    Restore rseq_cs state before calling RESUME_DEVICES_LATE as the CUDA plugin will
    temporarily unfreeze a thread during the plugin hook to assist with device
    restore
    
    Run the plugin finalizer later in the dump sequence since the finalizer is used
    by the CUDA plugin to handle some process cleanup
    
    Signed-off-by: Jesus Ramos <[email protected]>
    jesus-ramos authored and avagin committed Jul 1, 2024
    fc65e46
  2. criu/plugin: Introduce new plugin hooks PAUSE_DEVICES and CHECKPOINT_…

    …DEVICES to be used during pstree collection
    
    PAUSE_DEVICES is called before a process is frozen and is used by the CUDA
    plugin to place the process in a state that's ready to be checkpointed and
    quiesce any pending work
    
    CHECKPOINT_DEVICES is called after all processes in the tree have been frozen
    and PAUSE'd and performs the actual checkpointing operation for CUDA
    applications
    
    Signed-off-by: Jesus Ramos <[email protected]>
    jesus-ramos authored and avagin committed Jul 1, 2024
    a85f488
  3. criu/plugin: Add NVIDIA CUDA plugin

    Add support for the NVIDIA cuda-checkpoint utility. This requires an
    r555 or newer driver along with the cuda-checkpoint binary.
    
    Signed-off-by: Jesus Ramos <[email protected]>
    jesus-ramos authored and avagin committed Jul 1, 2024
    c0708cb

Commits on Jul 2, 2024

  1. compel: fix build on Amazon Linux 2 due to missing PTRACE_ARCH_PRCTL

    Commit fc683cb ("compel: shstk: save CET state when CPU supports it")
    started using PTRACE_ARCH_PRCTL to query shadow stack status. While
    PTRACE_ARCH_PRCTL has existed in the kernel for a long time, it was only
    added to glibc in version 2.27. Amazon Linux 2 (AL2) has glibc 2.26,
    which does not have this definition. As a result, build on AL2 fails
    with the below error:
    
        compel/arch/x86/src/lib/infect.c: In function ‘get_task_xsave’:
        compel/arch/x86/src/lib/infect.c:276:14: error: ‘PTRACE_ARCH_PRCTL’ undeclared (first use in this function)
        276 |   if (ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long)&features, ARCH_SHSTK_STATUS)) {
            |              ^~~~~~~~~~~~~~~~~
    
    While the definition is present on the system via the kernel headers (in
    asm/ptrace-abi.h) which can be reached by including linux/ptrace.h, the
    comment in compel/include/uapi/ptrace.h says:
    
        We'd want to include both sys/ptrace.h and linux/ptrace.h, hoping
        that most definitions come from either one or another. Alas, on
        Alpine/musl both files declare struct ptrace_peeksiginfo_args, so
        there is no way they can be used together. Let's rely on libc one.
    
    Since including linux/ptrace.h is not an option, define
    PTRACE_ARCH_PRCTL if it doesn't already exist. An interesting point to
    note is that in sys/ptrace.h, PTRACE_ARCH_PRCTL is an enum value so the
    preprocessor doesn't know about it. PT_ARCH_PRCTL is the preprocessor
    symbol that matches the value of PTRACE_ARCH_PRCTL. So look for
    PT_ARCH_PRCTL to decide if PTRACE_ARCH_PRCTL is available or not.
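
    The detection described above can be sketched like this. It is an illustration under assumptions, not the exact patch: the value 30 is the x86 definition from the kernel's asm/ptrace-abi.h, and the fragment is only meaningful on x86.

```c
#include <sys/ptrace.h>

/*
 * In glibc's sys/ptrace.h, PTRACE_ARCH_PRCTL is an enum constant, so
 * "#ifndef PTRACE_ARCH_PRCTL" cannot detect it. PT_ARCH_PRCTL is the
 * matching preprocessor macro, so test for that one instead.
 */
#ifndef PT_ARCH_PRCTL
#define PTRACE_ARCH_PRCTL 30 /* x86 value from asm/ptrace-abi.h */
#endif
```

    Testing the enum name directly would always take the fallback branch, even on a libc that already provides the constant, because the preprocessor cannot see enum constants.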
    
    Another interesting point to note is that AL2 ships with GCC 7 by
    default, which does not support the -mshstk option, causing other build
    failures. Luckily, it also ships GCC 10 which does have the option.
    Using GCC 10 lets the build succeed.
    
    Fixes: fc683cb ("compel: shstk: save CET state when CPU supports it")
    Signed-off-by: Pratyush Yadav <[email protected]>
    prati0100 authored and avagin committed Jul 2, 2024
    a11e944
  2. plugins/cuda: fix crosscompilation

    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Jul 2, 2024
    0a725b8
  3. irmap: duplicate string in irmap_scan_path_add

    Duplicate the string in irmap_scan_path_add; otherwise it is freed
    before the next configuration input is parsed.
    
    [ avagin: handle errors of xstrdup ]
    
    Signed-off-by: Liu Hua <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    Liu Hua authored and avagin committed Jul 2, 2024
    fac8d64

Commits on Jul 3, 2024

  1. cgroupd: unblock SIGTERM to make stop_cgroupd actually work

    Sometimes, due to signal mask inheritance, cgroupd can inherit a blocked
    SIGTERM. This leads to cgroupd ignoring SIGTERM from stop_cgroupd(), and
    CRIU gets stuck waiting for a cgroupd that never stops.
    
    I see this happening in lxc-checkpoint, also saw this in OpenVZ jenkins
    on cgroup_inotify00 test.
    
    Signed-off-by: Pavel Tikhomirov <[email protected]>
    Snorch authored and avagin committed Jul 3, 2024
    c2f101a
  2. apparmor: get_suspend_policy must return NULL in error cases

    Before this fix, it could return MAP_FAILED which is ((void *) -1).
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin authored and rst0git committed Jul 3, 2024
    c6c83f1
  3. vdso: proxify the __vdso_clock_gettime64 function

    It was added in v5.3-rc1~211^2~4^2~10.
    
    Fixes checkpoint-restore#2390
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Jul 3, 2024
    6f92787

Commits on Jul 7, 2024

  1. scripts/build: drop centos 7 targets

    The CI tests with CentOS 7 have been disabled and removed [1,2].
    This patch removes the obsolete Makefile targets for these tests.
    
    [1] checkpoint-restore@24bc083
    [2] checkpoint-restore@f8466ca
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jul 7, 2024
    116c689

Commits on Jul 13, 2024

  1. util: use close_range when it's supported

    close_range is faster than reading /proc/self/fd and closing descriptors
    one by one.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Jul 13, 2024
    dcb577b

Commits on Jul 14, 2024

  1. zdtm: make cgroup testcases run non-parallel

    All cgroup testcases live in the same cgroup root, zdtmtst, and the
    zdtmtst.defaultroot controller, and then create a child subgroup for
    testing. This can cause problems when cgroup testcases run in parallel.
    For example, testcase A dumps the child subgroup of testcase B since it
    is in the cgroup root, but in the middle of restoring testcase A,
    testcase B completes and cleans up the subgroup directory. This causes
    an error in the restore of testcase A. This commit adds the excl flag to
    the description of every cgroup testcase so that they don't run in
    parallel.
    
    Signed-off-by: Bui Quang Minh <[email protected]>
    minhbq-99 authored and avagin committed Jul 14, 2024
    c2f9f90

Commits on Jul 17, 2024

  1. Adjust to glibc __rseq_size semantic change

    In commit 2e456ccf0c34a056e3ccafac4a0c7effef14d918 ("Linux: Make
    __rseq_size useful for feature detection (bug 31965)") glibc 2.40
    changed the meaning of __rseq_size slightly: it is now the size
    of the active/feature area (20 bytes initially), and not the size
    of the entire initially defined struct (32 bytes including padding).
    The reason for the change is that the size including padding does not
    allow detection of newly added features while previously unused
    padding is consumed.
    
    The prep_libc_rseq_info change in criu/cr-restore.c is not necessary
    on kernels which have full ptrace support for obtaining rseq
    information because the code is not used.  On older kernels, it is
    a correctness fix because with size 20 (the new value), rseq
    registration would fail.
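
    One way to picture the size adjustment is the sketch below. It is an assumption about the shape of the fix, not the patch itself: `rseq_registration_size()` is a hypothetical helper, and only the values 20 and 32 come from the message above.

```c
/* Size of the originally defined struct rseq, including padding. */
#define RSEQ_ORIG_SIZE 32u

/*
 * Translate the size reported by libc into the size to register with.
 * With glibc 2.40, the reported size may be the active area (20 bytes),
 * which is rounded up to the original struct size. A reported size of 0
 * is passed through unchanged (assumed here to mean "no rseq area").
 * rseq_registration_size() is a hypothetical helper.
 */
static unsigned int rseq_registration_size(unsigned int libc_size)
{
	if (libc_size == 0)
		return 0;
	return libc_size < RSEQ_ORIG_SIZE ? RSEQ_ORIG_SIZE : libc_size;
}
```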
    
    The two other changes are required to make rseq unregistration work
    in tests.
    
    Signed-off-by: Florian Weimer <[email protected]>
    fweimer-rh authored and avagin committed Jul 17, 2024
    5c3f621

Commits on Jul 19, 2024

  1. docs: update amdgpu-plugin man page

    This patch updates the dependencies section of the AMDGPU plugin man
    page to reflect that the plugin has been merged upstream and to fix a
    formatting issue.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jul 19, 2024
    1b3ba30
  2. plugins: set executable bit on .so files

    For historical reasons, some tools like rpm [1] or ldd [2,3]
    may expect the executable bit to be present for the correct
    identification of shared libraries. The executable bit on .so
    files is set by default by compilers (e.g., GCC). It is not
    strictly necessary but primarily a convention.
    
    [1] https://docs.fedoraproject.org/en-US/package-maintainers/CommonRpmlintIssues/#unstripped_binary_or_object
    [2] https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/ldd.bash.in;h=d6b640df;hb=HEAD#l154
    
    [3] $ sudo ldd /usr/lib/criu/*.so
    /usr/lib/criu/amdgpu_plugin.so:
    ldd: warning: you do not have execution permission for `/usr/lib/criu/amdgpu_plugin.so'
    	linux-vdso.so.1 (0x00007fd0a2a3e000)
    	libdrm.so.2 => /lib64/libdrm.so.2 (0x00007fd0a29eb000)
    	libdrm_amdgpu.so.1 => /lib64/libdrm_amdgpu.so.1 (0x00007fd0a29de000)
    	libc.so.6 => /lib64/libc.so.6 (0x00007fd0a27fc000)
    	/lib64/ld-linux-x86-64.so.2 (0x00007fd0a2a40000)
    /usr/lib/criu/cuda_plugin.so:
    ldd: warning: you do not have execution permission for `/usr/lib/criu/cuda_plugin.so'
    	linux-vdso.so.1 (0x00007f1806e13000)
    	libc.so.6 => /lib64/libc.so.6 (0x00007f1806c08000)
    	/lib64/ld-linux-x86-64.so.2 (0x00007f1806e15000)
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Jul 19, 2024
    8b04dd6

Commits on Jul 22, 2024

  1. test/zdtm: mount a new tmpfs to the zdtm root /dev

    The current file system can be mounted with nodev.
    
    Fixes checkpoint-restore#2441
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Jul 22, 2024
    93746eb

Commits on Aug 7, 2024

  1. seize: fix pause-devices plugin hook

    The plugin hook "PAUSE_DEVICES" was recently introduced in the following
    commit. This hook was intended to execute the cuda-checkpoint tool
    before the process tree is frozen. However, the run_plugins() call has
    been placed immediately *after* freeze_processes(). This causes the
    cuda-checkpoint tool to hang indefinitely during the checkpointing
    of CUDA applications running in containers, eventually leading to its
    termination by the timeout alarm.
    
    a85f488
    criu/plugin: Introduce new plugin hooks PAUSE_DEVICES and CHECKPOINT_DEVICES to be used during pstree collection
    
    This problem can be reproduced with the following example:
    
    sudo podman run -d --rm \
            --device nvidia.com/gpu=all --security-opt=label=disable \
            quay.io/radostin/cuda-counter
    
    sudo podman container checkpoint -l -e /tmp/checkpoint.tar
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 7, 2024
    7a27427
  2. plugin: enable multiple plugins for the same hook

    CRIU provides two plugins for checkpoint/restore of GPU applications:
    amdgpu and cuda. Both plugins use the `RESUME_DEVICES_LATE` hook to
    enable restore:
    
        CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, amdgpu_plugin_resume_devices_late)
        CR_PLUGIN_REGISTER_HOOK(CR_PLUGIN_HOOK__RESUME_DEVICES_LATE, cuda_plugin_resume_devices_late)
    
    However, CRIU currently does not support running more than one plugin
    for the same hook. As a result, when both plugins are installed, the
    resume function for CUDA applications is not executed. To fix this,
    we need to make sure that both `plugin_resume_devices_late()` functions
    return `-ENOTSUP` when restore is not supported.
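
    The role of `-ENOTSUP` can be illustrated with a simplified dispatcher. Everything below is hypothetical (the `resume_hook_t` type, `run_resume_hooks()`, and the two example hooks); it only sketches the idea that a hook declining a task lets the next registered hook run, and does not reproduce CRIU's plugin machinery.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook signature; CRIU's real plugin API differs. */
typedef int (*resume_hook_t)(int pid);

/* Example hooks: the first declines every task, the second handles it. */
static int declining_hook(int pid)
{
	(void)pid;
	return -ENOTSUP;
}

static int handling_hook(int pid)
{
	(void)pid;
	return 0;
}

/*
 * Call each registered hook in turn. A hook that returns -ENOTSUP declines
 * the task and the next one is tried; any other value ends the walk.
 */
static int run_resume_hooks(const resume_hook_t *hooks, size_t n, int pid)
{
	int ret = -ENOTSUP;
	size_t i;

	for (i = 0; i < n; i++) {
		ret = hooks[i](pid);
		if (ret != -ENOTSUP)
			break;
	}
	return ret;
}
```

    With this scheme, a plugin that returns 0 for a task it does not actually handle would stop the walk early, which matches the symptom described above.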
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 7, 2024
    c4ba553

Commits on Aug 9, 2024

  1. delete redundant include header files

    restorer.h is already included on line 43.
    
    Fixes: 22963d2 ("Hide asm/restorer.h from sources")
    
    Signed-off-by: liuchao173 <[email protected]>
    liuchao173 authored and avagin committed Aug 9, 2024
    b7f6b72

Commits on Aug 12, 2024

  1. ci/podman: show criu logs in case of error

    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 12, 2024
    883d442
  2. ci/podman: show mounts

    Show information about mounts available on the host filesystem.
    This is useful for debugging.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 12, 2024
    756a7aa
  3. cuda: don't leak fds to cuda-checkpoint

    Leaking open file descriptors to third-party tools can lead
    to security risks.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 12, 2024
    208f60f
  4. cuda: fix launch cuda-checkpoint

    When the cuda-checkpoint tool is not installed, execvp() is expected to
    fail and return -1. In this case, we need to call exit() to terminate
    the child process that was created earlier with fork().
    
    Since CRIU can be used with applications that do not use CUDA, even
    when the CUDA plugin is installed, this patch also updates the log
    messages to show debug and warning (instead of error) when the
    cuda-checkpoint tool is not found in $PATH.
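
    The fork/exec pattern in question can be sketched as follows. `run_tool()` is a hypothetical helper, not the plugin's code; the exit code 127 and the use of `_exit()` instead of `exit()` are choices made for this sketch.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with arguments argv and wait for it. Returns the child's
 * exit status, 127 if the program could not be executed, or -1 on error.
 * run_tool() is a hypothetical helper, not the CUDA plugin's code.
 */
static int run_tool(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		/*
		 * Only reached when execvp() failed. The child must not
		 * return into the caller's code, so terminate it here.
		 */
		_exit(127);
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    Without the explicit exit, a failed execvp() would leave two processes running the same code path after fork().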
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    rst0git authored and avagin committed Aug 12, 2024
    b83f131
  5. zdtm: add option to run tests with criu plugins

    By default, if the "CRIU_LIBS_DIR" environment variable is not set,
    CRIU will load all plugins installed in `/usr/lib/criu`. This may
    result in running the ZDTM tests with plugins for a different version
    of CRIU (e.g., installed from a package).
    
    This patch updates ZDTM to always set the "CRIU_LIBS_DIR" environment
    variable and use a local "plugins" directory. This directory contains
    copies of the plugin files built from source. In addition, this patch
    adds the `--criu-plugin` option to the `zdtm.py run` command, allowing
    tests to be run with specified CRIU plugins.
    
    Example:
    
      - Run test only with AMDGPU plugin
        ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin amdgpu
    
      - Run test only with CUDA plugin
        ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin cuda
    
      - Run test with both AMDGPU and CUDA plugins
        ./zdtm.py run -t zdtm/static/busyloop00 --criu-plugin amdgpu cuda
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 12, 2024
    0a5dfcf
  6. ci: run tests with amdgpu and cuda plugins

    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 12, 2024
    919de60
  7. timer: fix printf specifiers for __suseconds64_t

    New internal glibc types __timeval64 [1] and __suseconds64_t [2] have
    been introduced as a solution for the Y2038 problem [3]. These 64-bit
    types are used across all architectures. However, this change causes
    the following build errors when cross-compiling on ARMv7 (armhf):
    
    criu/timer.c:49:17: error: format '%ld' expects argument of type 'long int', but argument 5 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=]
       49 |         pr_info("Restored %s timer to %" PRId64 ".%ld -> %" PRId64 ".%ld\n", n,
          |                 ^~~~~~~~~~~~~~~~~~~~~~~~
       50 |                 (int64_t)val->it_value.tv_sec, val->it_value.tv_usec,
          |                                                ~~~~~~~~~~~~~~~~~~~~~
          |                                                             |
          |                                                             __suseconds64_t {aka long long int}
    
    criu/timer.c:49:17: error: format '%ld' expects argument of type 'long int', but argument 7 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=]
       49 |         pr_info("Restored %s timer to %" PRId64 ".%ld -> %" PRId64 ".%ld\n", n,
          |                 ^~~~~~~~~~~~~~~~~~~~~~~~
       50 |                 (int64_t)val->it_value.tv_sec, val->it_value.tv_usec,
       51 |                 (int64_t)val->it_interval.tv_sec, val->it_interval.tv_usec);
          |                                                   ~~~~~~~~~~~~~~~~~~~~~~~~
          |                                                                   |
          |                                                                   __suseconds64_t {aka long long int}
    
    ns.c:234:48: error: format '%ld' expects argument of type 'long int', but argument 5 has type 'time_t' {aka 'long long int'} [-Werror=format=]
      234 |         len = snprintf(buf, sizeof(buf), "%d %ld 0", clk_id, offset);
          |                                              ~~^             ~~~~~~
          |                                                |             |
          |                                                long int      time_t {aka long long int}
          |                                              %lld
    
    msg.c:58:41: error: format '%ld' expects argument of type 'long int', but argument 3 has type '__suseconds64_t' {aka 'long long int'} [-Werror=format=]
       58 |         off += sprintf(buf + off, ".%.3ld: ", tv.tv_usec / 1000);
          |                                     ~~~~^     ~~~~~~~~~~~~~~~~~
          |                                         |                |
          |                                         long int         __suseconds64_t {aka long long int}
          |                                     %.3lld
    
    ../lib/zdtmtst.h:137:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
      137 |                 test_msg("ERR: %s:%d: " format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \
          |                          ^~~~~~~~~~~~~~
    pthread_timers_h.c:72:17: note: in expansion of macro 'pr_perror'
       72 |                 pr_perror("wrong interval: %ld:%ld", itimerspec.it_interval.tv_sec, itimerspec.it_interval.tv_nsec);
          |                 ^~~~~~~~~
    
    vdso00.c:22:32: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
       22 |         test_msg("%d time: %10li\n", getpid(), tv.tv_sec);
          |                            ~~~~^               ~~~~~~~~~
          |                                |                 |
          |                                long int          __time64_t {aka long long int}
          |                            %10lli
    
    vdso00.c:29:32: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
       29 |         test_msg("%d time: %10li\n", getpid(), tv.tv_sec);
          |                            ~~~~^               ~~~~~~~~~
          |                                |                 |
          |                                long int          __time64_t {aka long long int}
          |                            %10lli
    
    vdso01.c:357:42: error: format '%li' expects argument of type 'long int', but argument 2 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
      357 |         test_msg("gettimeofday: tv_sec %li vdso_gettimeofday: tv_sec %li\n", tv1.tv_sec, tv2.tv_sec);
          |                                        ~~^                                   ~~~~~~~~~~
          |                                          |                                      |
          |                                          long int                               __time64_t {aka long long int}
          |                                        %lli
    
    vdso01.c:357:72: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
      357 |         test_msg("gettimeofday: tv_sec %li vdso_gettimeofday: tv_sec %li\n", tv1.tv_sec, tv2.tv_sec);
          |                                                                      ~~^                 ~~~~~~~~~~
          |                                                                        |                    |
          |                                                                        long int             __time64_t {aka long long int}
          |
    
    vdso01.c:328:43: error: format '%li' expects argument of type 'long int', but argument 2 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
      328 |         test_msg("clock_gettime: tv_sec %li vdso_clock_gettime: tv_sec %li\n", ts1.tv_sec, ts2.tv_sec);
          |                                         ~~^                                    ~~~~~~~~~~
          |                                           |                                       |
          |                                           long int                                __time64_t {aka long long int}
          |                                         %lli
    
    vdso01.c:328:74: error: format '%li' expects argument of type 'long int', but argument 3 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
      328 |         test_msg("clock_gettime: tv_sec %li vdso_clock_gettime: tv_sec %li\n", ts1.tv_sec, ts2.tv_sec);
          |                                                                        ~~^                 ~~~~~~~~~~
          |                                                                          |                    |
          |                                                                          long int             __time64_t {aka long long int}
          |
    
    ../lib/zdtmtst.h:144:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type 'time_t' {aka 'long long int'} [-Werror=format=]
      144 |                 test_msg("FAIL: %s:%d: " format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \
          |                          ^~~~~~~~~~~~~~~
    mtime_mmap.c:80:17: note: in expansion of macro 'fail'
       80 |                 fail("mtime %ld wasn't updated on mmapped %s file", mtime_new, filename);
          |                 ^~~~
    
    ../lib/zdtmtst.h:144:26: error: format '%ld' expects argument of type 'long int', but argument 4 has type '__time64_t' {aka 'long long int'} [-Werror=format=]
      144 |                 test_msg("FAIL: %s:%d: " format " (errno = %d (%s))\n", __FILE__, __LINE__, ##arg, errno, \
          |                          ^~~~~~~~~~~~~~~
    mtime_mmap.c:101:17: note: in expansion of macro 'fail'
      101 |                 fail("After migration, mtime changed to %ld", fst.st_mtime);
          |                 ^~~~
    
    [1] https://sourceware.org/git/?p=glibc.git;h=504c98717062cb9bcbd4b3e59e932d04331ddca5
    [2] https://sourceware.org/git/?p=glibc.git;h=3fced064f23562ec24f8312ffbc14950993969e6
    [3] https://en.wikipedia.org/wiki/Year_2038_problem
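
    One common way to avoid this class of error is to cast time fields to a type whose printf specifier is fixed. The sketch below shows that approach with a hypothetical `format_timeval()` helper; it is not necessarily the change made in this commit.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/time.h>

/*
 * Format a struct timeval as "<sec>.<usec>" without assuming the width of
 * tv_sec or tv_usec: both are cast to a type whose printf specifier is
 * known on every architecture. format_timeval() is a hypothetical helper.
 */
static int format_timeval(char *buf, size_t len, const struct timeval *tv)
{
	return snprintf(buf, len, "%" PRId64 ".%06lld",
			(int64_t)tv->tv_sec, (long long)tv->tv_usec);
}
```

    The casts make the format string independent of whether the platform defines these fields as `long` or `long long`.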
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 12, 2024
    8cf9722

Commits on Aug 15, 2024

  1. support user set remote mmap vma address

    1. The VMA address assigned automatically by the OS may conflict with a VMA in the GPU live-migration scenario;
    2. so we should let the user choose the address.
    
    Signed-off-by: haozi007 <[email protected]>
    duguhaotian authored and avagin committed Aug 15, 2024
    6d4eeb7

Commits on Aug 16, 2024

  1. test/zdtm: allow to run tests with the mocked cuda-checkpoint tool

    Here is an example of how to run one test:
    $ python test/zdtm.py run -t zdtm/static/env00 --ignore-taint --mocked-cuda-checkpoint
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 16, 2024
    6f09b49
  2. criu/plugin: don't call plugin device hooks for non-alive tasks

    Dead tasks don't hold any resources.
    
    Fixes: 2465
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 16, 2024
    ad0b196
  3. scripts/ci: run tests with the mocked cuda-checkpoint tool

    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Aug 16, 2024
    8fce2b1
  4. plugins/amdgpu: fix cross-compilation

    To enable cross-compilation, we need to use the CC definition from
    criu/scripts/nmk/scripts/tools.mk:
    
    CC := $(CROSS_COMPILE)$(HOSTCC)
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 16, 2024
    f1cb868
  5. ci: enable cross compile testing for amdgpu-plugin

    Skip cross-compilation on armv7 because, among many other errors,
    it fails with the following:
    
    	In file included from ../../include/common/lock.h:9,
    			 from ../../criu/include/files.h:9,
    			 from amdgpu_plugin.c:30:
    	../../include/common/asm/atomic.h:60:2: error: #error ARM architecture version (CONFIG_ARMV*) not set or unsupported.
    	   60 | #error ARM architecture version (CONFIG_ARMV*) not set or unsupported.
    	      |  ^~~~~
    	../../include/common/asm/atomic.h: In function 'atomic_add_return':
    	../../include/common/asm/atomic.h:81:9: error: implicit declaration of function 'smp_mb' [-Werror=implicit-function-declaration]
    	   81 |         smp_mb();
    	      |         ^~~~~~
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 16, 2024
    870025c
  6. plugins/amdgpu: use C99-standard types

    Co-developed-by: Andrei Vagin <[email protected]>
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 16, 2024
    5df8f86
  7. plugins/amdgpu: fix printf format specifiers

    Errors on aarch64:
    
    	In file included from amdgpu_plugin_drm.h:10,
    			 from amdgpu_plugin.c:33:
    	amdgpu_plugin.c: In function 'amdgpu_plugin_dump_file':
    	amdgpu_plugin_util.h:24:20: error: format '%lld' expects argument of type 'long long int', but argument 6 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
    	   24 | #define LOG_PREFIX "amdgpu_plugin: "
    	      |                    ^~~~~~~~~~~~~~~~~
    	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
    	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
    	      |                                                    ^~~~~~~~~~
    	amdgpu_plugin.c:1236:9: note: in expansion of macro 'pr_info'
    	 1236 |         pr_info("devices:%d bos:%d objects:%d priv_data:%lld\n", args.num_devices, args.num_bos, args.num_objects,
    	      |         ^~~~~~~
    	cc1: all warnings being treated as errors
    
    Errors on ppc64:
    
    	In file included from amdgpu_plugin_drm.h:10,
    			 from amdgpu_plugin.c:33:
    	amdgpu_plugin.c: In function 'amdgpu_plugin_dump_file':
    	amdgpu_plugin_util.h:24:20: error: format '%llu' expects argument of type 'long long unsigned int', but argument 6 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
    	   24 | #define LOG_PREFIX "amdgpu_plugin: "
    	      |                    ^~~~~~~~~~~~~~~~~
    	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
    	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
    	      |                                                    ^~~~~~~~~~
    	amdgpu_plugin.c:1236:9: note: in expansion of macro 'pr_info'
    	 1236 |         pr_info("devices:%u bos:%u objects:%u priv_data:%llu\n",
    	      |         ^~~~~~~
    	cc1: all warnings being treated as errors
    	In file included from amdgpu_plugin_util.c:38:
    	amdgpu_plugin_util.c: In function 'print_kfd_bo_stat':
    	amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
    	   24 | #define LOG_PREFIX "amdgpu_plugin: "
    	      |                    ^~~~~~~~~~~~~~~~~
    	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
    	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
    	      |                                                    ^~~~~~~~~~
    	amdgpu_plugin_util.c:196:17: note: in expansion of macro 'pr_info'
    	  196 |                 pr_info("%s(), %d. KFD BO Addr: %llx \n", __func__, idx, bo->addr);
    	      |                 ^~~~~~~
    	amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
    	   24 | #define LOG_PREFIX "amdgpu_plugin: "
    	      |                    ^~~~~~~~~~~~~~~~~
    	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
    	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
    	      |                                                    ^~~~~~~~~~
    	amdgpu_plugin_util.c:197:17: note: in expansion of macro 'pr_info'
    	  197 |                 pr_info("%s(), %d. KFD BO Size: %llx \n", __func__, idx, bo->size);
    	      |                 ^~~~~~~
    	amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
    	   24 | #define LOG_PREFIX "amdgpu_plugin: "
    	      |                    ^~~~~~~~~~~~~~~~~
    	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
    	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
    	      |                                                    ^~~~~~~~~~
    	amdgpu_plugin_util.c:198:17: note: in expansion of macro 'pr_info'
    	  198 |                 pr_info("%s(), %d. KFD BO Offset: %llx \n", __func__, idx, bo->offset);
    	      |                 ^~~~~~~
    	amdgpu_plugin_util.h:24:20: error: format '%llx' expects argument of type 'long long unsigned int', but argument 5 has type '__u64' {aka 'long unsigned int'} [-Werror=format=]
    	   24 | #define LOG_PREFIX "amdgpu_plugin: "
    	      |                    ^~~~~~~~~~~~~~~~~
    	../../criu/include/log.h:47:52: note: in expansion of macro 'LOG_PREFIX'
    	   47 | #define pr_info(fmt, ...) print_on_level(LOG_INFO, LOG_PREFIX fmt, ##__VA_ARGS__)
    	      |                                                    ^~~~~~~~~~
    	amdgpu_plugin_util.c:199:17: note: in expansion of macro 'pr_info'
    	  199 |                 pr_info("%s(), %d. KFD BO Restored Offset: %llx \n", __func__, idx, bo->restored_offset);
    	      |                 ^~~~~~~
    	cc1: all warnings being treated as errors
    
    Co-developed-by: Andrei Vagin <[email protected]>
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 16, 2024
    0115298
  8. crit: do not crash on aarch64 doing 'crit x ./ rss'

    Running 'crit x ./ rss' on aarch64 crashes with:
    
        File "/home/criu/crit/crit/__main__.py", line 331, in explore_rss
          while vmas[vmi]['start'] < pme:
                ~~~~^^^^^
      IndexError: list index out of range
    
    This adds an additional check to the while loop to avoid accessing
    indexes out of range.
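
    The kind of guard described above can be sketched as follows. This is a minimal illustration, not the actual crit code: the helper name `vmas_starting_before` and the sample data are assumptions, and only the `vmas[vmi]['start']` access pattern comes from the traceback.

```python
def vmas_starting_before(vmas, vmi, limit):
    """Collect VMAs from index vmi onward whose start address is below limit."""
    found = []
    # The bounds check comes first, so vmas[vmi] is never evaluated
    # once vmi has run past the end of the list.
    while vmi < len(vmas) and vmas[vmi]["start"] < limit:
        found.append(vmas[vmi])
        vmi += 1
    return found, vmi


# Hypothetical data: every VMA starts below the limit, which is the case
# that would run off the end of the list without the bounds check.
vmas = [{"start": 0x1000}, {"start": 0x2000}]
found, next_index = vmas_starting_before(vmas, 0, 0x10000)
```

    Because `and` short-circuits, the index comparison protects the list access without restructuring the loop.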
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Aug 16, 2024
    47f81cd
  9. test: better test for SELinux tools

    Previously, the check only tested whether /sys/fs/selinux is mounted.
    This extends the check to see if all necessary tools are installed.
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Aug 16, 2024
    190216a
  10. test: only run macvlan tests if macvlan devices can be created

    Some test environments (Actuated runners for example) do not support
    macvlan devices. Skip tests depending on them automatically.
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Aug 16, 2024
    bc88db2
  11. coredump: fail on unsupported architectures early

    Currently coredump only works on x86_64. Fail early on any other
    architecture.
    
    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Aug 16, 2024
    e67b428
  12. ci: run aarch64 tests native via actuated

    Signed-off-by: Adrian Reber <[email protected]>
    adrianreber authored and avagin committed Aug 16, 2024
    5ba1f84

Commits on Aug 17, 2024

  1. cuda: unlock on timeout error

    When attempting to checkpoint a container with CUDA processes,
    CRIU could fail with the following error:
    
    	Error (criu/cr-dump.c:1791): Timeout reached. Try to interrupt: 1
    	Error (cuda_plugin.c:143): cuda_plugin: Unable to read output of cuda-checkpoint: Interrupted system call
    	Error (cuda_plugin.c:384): cuda_plugin: PAUSE_DEVICES failed with
    
    In this situation, the target process is locked, but CRIU fails due to
    a timeout and exits with an error. We need to make sure that the target
    PID is unlocked in such a case.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 17, 2024
    5a74eee

Commits on Aug 18, 2024

  1. scripts/uninstall_module: fix package discovery

    The `uninstall_module.py` script is a wrapper for the `pip uninstall`
    command that enables support for specifying installation prefix
    (i.e., `--prefix`). When this functionality is used, we intentionally
    set `sys.path` to include only search paths for the specified prefix
    to avoid unintentional uninstallation of packages in system paths.
    
    Since `importlib_metadata` version 8.1.0, the `Distribution.from_name()`
    method has been modified [1] to perform additional pre-processing of
    Distribution objects [2] that requires loading distribution metadata
    and results in the following error:
    
      File "/usr/local/lib/python3.12/site-packages/importlib_metadata/__init__.py", line 422, in <lambda>
        buckets = bucket(dists, lambda dist: bool(dist.metadata))
                                                  ^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/site-packages/importlib_metadata/__init__.py", line 454, in metadata
        from . import _adapters
      File "/usr/local/lib/python3.12/site-packages/importlib_metadata/_adapters.py", line 3, in <module>
        import email.message
      File "/usr/lib64/python3.12/email/message.py", line 11, in <module>
        import quopri
      ModuleNotFoundError: No module named 'quopri'
    
    This error occurs because we have excluded system paths from the list
    of search paths (`sys.path`).
    
    However, this pre-processing is not required for our use case, as we
    only use the discovery mechanism of importlib_metadata to resolve the
    metadata directory path of the module being uninstalled.
    
    To fix this problem, this patch updates `uninstall_module` to avoid the
    `from_name()` method and use `discover(name=package_name)` directly.
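
    A minimal sketch of the approach described above. It uses the standard-library `importlib.metadata` rather than the `importlib_metadata` backport, and the helper name `find_distributions` and the fake `demo_pkg` package are illustrative assumptions, not the script's actual code.

```python
import tempfile
from importlib.metadata import Distribution
from pathlib import Path


def find_distributions(package_name, search_paths):
    """Return distributions named package_name found under search_paths only."""
    # discover() forwards its keyword arguments to DistributionFinder.Context,
    # so `path` restricts the search instead of falling back to sys.path.
    return list(Distribution.discover(name=package_name, path=search_paths))


# Demonstration with a throwaway prefix containing a fake "demo_pkg" install.
with tempfile.TemporaryDirectory() as prefix:
    dist_info = Path(prefix) / "demo_pkg-1.0.dist-info"
    dist_info.mkdir()
    (dist_info / "METADATA").write_text(
        "Metadata-Version: 2.1\nName: demo_pkg\nVersion: 1.0\n"
    )
    found = find_distributions("demo_pkg", [prefix])
    found_count = len(found)
    metadata_text = found[0].read_text("METADATA")
```

    Discovery only locates the metadata directory; nothing is parsed until a caller asks for it, which is why this route does not need modules from the excluded system paths.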
    
    [1] python/importlib_metadata@a65c29a
    [2] https://github.com/python/importlib_metadata/blob/a65c29ad/importlib_metadata/__init__.py#L391
    
    Fixes: checkpoint-restore#2468
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Aug 18, 2024
    4ca4a09

Commits on Sep 12, 2024

  1. codespell: fix typos

    This patch fixes the following typos reported by codespell:
    
    ./test/others/bers/bers.c:394: dependin ==> depending, depend in
    ./criu/kerndat.c:837: hitted ==> hit
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Sep 12, 2024
    e94c13c

Commits on Sep 14, 2024

  1. criu: Allow disabling freeze cgroups

    Some plugins (e.g., CUDA) may not function correctly when processes are
    frozen using cgroups. This change introduces a mechanism to disable the
    use of freeze cgroups during process seizing, even if explicitly
    requested via the --freeze-cgroup option.
    
    The CUDA plugin is updated to utilize this new mechanism to ensure
    compatibility.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Sep 14, 2024
    e4026fb
  2. fault: allow to check dont_use_freeze_cgroup

    Adds a new "fault" to call dont_use_freeze_cgroup.
    avagin committed Sep 14, 2024
    1190f10

Commits on Sep 15, 2024

  1. plugin/cuda: disable CUDA plugin if /dev/nvidiactl isn't present

    The presence of /dev/nvidiactl indicates that the system has a
    compatible NVIDIA GPU driver installed and that the GPU is accessible to
    the operating system.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Sep 15, 2024
    6551847

Commits on Sep 17, 2024

  1. plugins/amdgpu: Zero ib_info on initialization

    This struct was being used uninitialized, meaning it
    was filled with random garbage.
    
    Mea culpa.
    
    Signed-off-by: David Francis <[email protected]>
    fdavid-amd authored and avagin committed Sep 17, 2024
    412cdd2
  2. plugins/amdgpu - Increase maximum parameter length

    The topology parsing assumed that all parameter names were
    30 characters or fewer, but
    
    recommended_sdma_engine_id_mask
    
    is 31 characters.
    
    Make the maximum length a macro, and set it to 64.
    
    Signed-off-by: David Francis <[email protected]>
    fdavid-amd authored and avagin committed Sep 17, 2024
    e451838

Commits on Sep 19, 2024

  1. util: dump fsfd log messages

    This should help investigate errors from fsconfig, fsmount, etc.
    
    Signed-off-by: Andrei Vagin <[email protected]>
    avagin committed Sep 19, 2024
    1079a51

Commits on Sep 26, 2024

  1. amdgpu: remove exec permissions on source files

    This patch fixes the following warnings that appear
    when building an RPM package:
    
    + /usr/lib/rpm/redhat/brp-mangle-shebangs
    *** WARNING: ./usr/src/debug/criu-4.0-1.fc42.x86_64/plugins/amdgpu/amdgpu_plugin_util.c is executable but has no shebang, removing executable bit
    *** WARNING: ./usr/src/debug/criu-4.0-1.fc42.x86_64/plugins/amdgpu/amdgpu_plugin_util.h is executable but has no shebang, removing executable bit
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Sep 26, 2024
    34e7134
  2. Makefile.config: set CR_PLUGIN_DEFAULT variable

    By default, CRIU uses the path "/usr/lib/criu" to install and load
    plugins at runtime. This path is defined by the `PLUGINDIR` variable
    in Makefile.install and `CR_PLUGIN_DEFAULT` in `criu/include/plugin.h`.
    However, some distribution packages might install the CRIU plugins at
    "/usr/lib64/criu" instead. This patch updates the makefile to align
    the path defined by `CR_PLUGIN_DEFAULT` with the value of `PLUGINDIR`.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Sep 26, 2024
    55c8917

Commits on Oct 3, 2024

  1. images: Add protobuf definition for pidfd

    We only use the last pid from the list in NSpid entry (from
    /proc/<pid>/fdinfo/<pidfd>) while restoring pidfds.
    The last pid refers to the pid of the process in the most deeply nested
    pid namespace. Since CRIU does not currently support nested pid
    namespaces, this entry is the one we want.
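
    The NSpid selection described above can be sketched in a few lines. This is an illustration only: the helper name `last_nspid` and the sample fdinfo text are made up, and CRIU itself does this parsing in C.

```python
def last_nspid(fdinfo_text):
    """Return the last pid listed on the NSpid line of a pidfd's fdinfo."""
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "NSpid":
            # One pid per pid namespace; the last entry belongs to the
            # most deeply nested namespace.
            return int(value.split()[-1])
    raise ValueError("no NSpid entry found")


# Hypothetical fdinfo contents for a pidfd seen from two pid namespaces.
sample = "pos:\t0\nflags:\t02000002\nmnt_id:\t16\nino:\t4242\nPid:\t1234\nNSpid:\t1234\t7\n"
nested_pid = last_nspid(sample)
```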
    
    After Linux 6.9, inode numbers can be used to compare pidfds. pidfds
    referring to the same process will have the same inode numbers. We use
    inode numbers to restore pidfds that point to dead processes.
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Oct 3, 2024
    71b427a
  2. criu: Support C/R of pidfds

    Process file descriptors (pidfds) were introduced to provide a stable
    handle on a process. They solve the problem of pid recycling.
    
    For a detailed explanation, see https://lwn.net/Articles/801319/ and
    http://www.corsix.org/content/what-is-a-pidfd
    
    Before Linux 6.9, pidfds were implemented with anonymous inodes, so we
    detect them the same way as other fd types that use anonymous inodes,
    by calling `readlink()`.
    Linux 6.9 introduced pidfs (a file system for pidfds).
    In 6.9, `S_ISREG()` returned true for pidfds, but this changed again in
    6.10.
    (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pidfs.c?h=v6.11-rc2#n285)
    After this change, pidfs inodes have no file type in st_mode in
    userspace.
    We use `PID_FS_MAGIC` to detect pidfds on kernels >= 6.9.
    Hence, the check for pidfds occurs before the check for regular files.
    
    For pidfds that refer to dead processes, we lose the pid of the process
    as the Pid and NSpid fields in /proc/<pid>/fdinfo/<pidfd> change to -1.
    So, we create a temporary process for each unique inode and open pidfds
    that refer to this process. After all pidfds have been opened we kill
    this temporary process.
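
    The grouping step behind "a temporary process for each unique inode" can be sketched as below. The helper name `group_dead_pidfds` and the `(fd, inode)` tuples are assumptions for illustration; the real restore logic is C code and also has to create and kill the temporary processes.

```python
from collections import defaultdict


def group_dead_pidfds(entries):
    """Map each unique inode to the list of pidfds that referred to it."""
    groups = defaultdict(list)
    for fd, inode in entries:
        groups[inode].append(fd)
    return dict(groups)


# Hypothetical dump data: fds 3 and 4 pointed at the same dead process.
groups = group_dead_pidfds([(3, 100), (4, 100), (5, 200)])
temporary_processes_needed = len(groups)
```

    One temporary process per key is enough, since every pidfd in a group is reopened against the same process.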
    
    This commit does not include support for pidfds that point to a specific
    thread, i.e., pidfds opened with the `PIDFD_THREAD` flag.
    
    Fixes: checkpoint-restore#2258
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Oct 3, 2024
    d559ebb
  3. zdtm: Check pidfd fdinfo entry is consistent

    Ensures that entries in /proc/<pid>/fdinfo/<pidfd> are the same.
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Oct 3, 2024
    005a331
  4. zdtm: Check pidfd can send signal after C/R

    Ensure `pidfd_send_signal()` syscall works as expected after C/R.
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Oct 3, 2024
    4cec03a
  5. zdtm: Check pidfd can kill descendant processes

    Validate that pidfds can be used to send signals to different
    processes after C/R using the `pidfd_send_signal()` syscall.
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Oct 3, 2024
    6bf8ab1
  6. zdtm: Check dead pidfd is restored correctly

    After C/R of pidfds that point to dead processes, their inodes might
    change. But if two pidfds point to the same dead process, they should
    continue to do so after C/R.
    
    This test ensures that this happens by calling `statx()` on pidfds after
    C/R and then comparing their inode numbers.
    
    Support for comparing pidfds by using `statx()` and inode numbers was
    introduced alongside pidfs. So if `f_type` of pidfd is not equal to
    `PID_FS_MAGIC` then we skip this test.
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Oct 3, 2024
    69a6179
  7. zdtm: Check fd from pidfd_getfd is C/Red correctly

    We get the read end of a pipe using `pidfd_getfd` and check if we can
    read from it after C/R.
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Oct 3, 2024
    bb1b1dc
  8. zdtm: Check pidfd for thread is valid after C/R

    We open a pidfd to a thread using `PIDFD_THREAD` flag and after C/R
    ensure that we can send signals using it with `PIDFD_SIGNAL_THREAD`.
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    bsach64 authored and avagin committed Oct 3, 2024
    56bc739

Commits on Oct 16, 2024

  1. make/lint: use 'ruff check <path>'

    The command `ruff <path>` has been deprecated and removed:
    https://astral.sh/blog/ruff-v0.5.0#removed-deprecated-features
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Oct 16, 2024
    900f94e
  2. pycriu: fix lint errors

    This patch fixes the following errors reported by ruff:
    
    lib/pycriu/images/pb2dict.py:307:24: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
        |
    305 |     elif field.type in _basic_cast:
    306 |         cast = _basic_cast[field.type]
    307 |         if pretty and (cast == int):
        |                        ^^^^^^^^^^^ E721
    308 |             if is_hex:
    309 |                 # Fields that have (criu).hex = true option set
        |
    
    lib/pycriu/images/pb2dict.py:379:13: E721 Use `is` and `is not` for type comparisons, or `isinstance()` for isinstance checks
        |
    377 |     elif field.type in _basic_cast:
    378 |         cast = _basic_cast[field.type]
    379 |         if (cast == int) and is_string(value):
        |             ^^^^^^^^^^^ E721
    380 |             if _marked_as_dev(field):
    381 |                 return encode_dev(field, value)
        |
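
    For reference, the identity comparison that E721 asks for looks like this. It is a sketch under assumptions: the contents of `_basic_cast` and the helper `is_int_cast` are made up here, and only the `cast == int` pattern comes from the report above.

```python
# Hypothetical stand-in for the cast table; the keys are illustrative.
_basic_cast = {"int32": int, "string": str, "bool": bool}


def is_int_cast(field_type):
    """Return True when the cast registered for field_type is exactly int."""
    cast = _basic_cast.get(field_type)
    # `is` compares the type objects by identity, which is what E721 asks
    # for in place of `cast == int`.
    return cast is int


int_result = is_int_cast("int32")
bool_result = is_int_cast("bool")
```

    `isinstance()` would not fit here because `cast` holds a type rather than an instance of one.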
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Oct 16, 2024
    59afbf3

Commits on Oct 21, 2024

  1. images/inventory: add field for enabled plugins

    This patch extends the inventory image with a `plugins` field that
    contains an array of plugins which were used during checkpoint,
    for example, to save GPU state. In particular, the CUDA and AMDGPU
    plugins are added to this field only when the checkpoint contains
    GPU state. This makes it possible to disable unnecessary plugins during
    restore, show appropriate error messages if required CRIU plugins are
    missing, and migrate a process that does not use a GPU from a
    GPU-enabled system to a CPU-only environment.
    
    We use the `optional plugins_entry` for backwards compatibility. This
    entry allows us to distinguish between an *empty* and a *missing* field:
    
    - When the field is missing, it indicates that the checkpoint was
      created with a previous version of CRIU, and all plugins should be
      *enabled* during restore.
    
    - When the field is empty, it indicates that no plugins were used during
      checkpointing. Thus, all plugins can be *disabled* during restore.
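
    The two cases above can be sketched as a small decision function. This is an illustration under assumptions: the inventory is modelled as a plain dict, the plugin names are placeholders, and only the `plugins_entry` and `plugins` names come from the commit message.

```python
def plugins_to_enable(inventory, installed):
    """Pick the plugins to load on restore from a decoded inventory dict."""
    entry = inventory.get("plugins_entry")
    if entry is None:
        # Field missing: image from an older CRIU, keep every plugin enabled.
        return set(installed)
    required = set(entry.get("plugins", []))
    missing = required - set(installed)
    if missing:
        raise RuntimeError("missing required plugins: " + ", ".join(sorted(missing)))
    # Field present: enable only what the checkpoint used (possibly nothing).
    return required


installed = {"cuda", "amdgpu"}  # placeholder plugin names
old_image = plugins_to_enable({}, installed)
no_gpu_image = plugins_to_enable({"plugins_entry": {}}, installed)
cuda_image = plugins_to_enable({"plugins_entry": {"plugins": ["cuda"]}}, installed)
```

    With an empty entry the function returns an empty set even when no plugins are installed, which matches the GPU-to-CPU-only migration case.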
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Oct 21, 2024
    18f7207
  2. zdtm: add inventory test plugins

    This patch adds two test plugins to verify that CRIU plugins listed
    in the inventory image are enabled, while those that are not listed
    can be disabled.
    
    Signed-off-by: Radostin Stoyanov <[email protected]>
    rst0git authored and avagin committed Oct 21, 2024
    f5d59ec
  3. pidfd: block SIGCHLD during tmp process creation

    This patch blocks SIGCHLD during temporary process creation to prevent a
    race condition between kill() and waitpid() where sigchld_handler()
    causes `criu restore` to fail with an error.
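
    The blocking pattern can be illustrated with Python's `signal` module. This is a sketch of the general technique on a POSIX system, not CRIU's C implementation, and the function name `spawn_and_reap_helper` is an assumption.

```python
import os
import signal


def spawn_and_reap_helper():
    """Fork a short-lived helper, kill it, and reap it with SIGCHLD blocked."""
    # Block SIGCHLD so no handler can run (and reap the child) between
    # kill() and waitpid().
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        pid = os.fork()
        if pid == 0:
            # Child: sleep until the parent kills it.
            while True:
                signal.pause()
        os.kill(pid, signal.SIGKILL)
        reaped_pid, status = os.waitpid(pid, 0)
    finally:
        # Restore the signal mask that was in effect before the call.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return reaped_pid == pid and os.WIFSIGNALED(status)


helper_reaped = spawn_and_reap_helper()
```

    While the signal is blocked, the explicit `waitpid()` is the only place the child can be collected, so a SIGCHLD handler cannot reap it first.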
    
    Fixes: checkpoint-restore#2490
    
    Signed-off-by: Bhavik Sachdev <[email protected]>
    Signed-off-by: Radostin Stoyanov <[email protected]>
    bsach64 authored and avagin committed Oct 21, 2024
    dfb56ee

Commits on Oct 24, 2024

  1. pass real fd when restore and dump ext file

    Signed-off-by: haozi007 <[email protected]>
    duguhaotian committed Oct 24, 2024
    23b64b9