Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use copy_file_range() in shutil.copyfile() (server-side copy) #81340

Open
giampaolo opened this issue Jun 5, 2019 · 7 comments
Open

Use copy_file_range() in shutil.copyfile() (server-side copy) #81340

giampaolo opened this issue Jun 5, 2019 · 7 comments
Labels
3.12 bugs and security fixes performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@giampaolo
Copy link
Contributor

giampaolo commented Jun 5, 2019

BPO 37159
Nosy @facundobatista, @ncoghlan, @vstinner, @giampaolo, @encukou, @albertz, @vadmium, @desbma, @pablogsal
Files
  • patch.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-06-05.05:24:51.971>
    labels = ['library', '3.9', 'performance']
    title = 'Use copy_file_range() in shutil.copyfile() (server-side copy)'
    updated_at = <Date 2020-12-29.13:05:56.586>
    user = 'https://github.com/giampaolo'

    bugs.python.org fields:

    activity = <Date 2020-12-29.13:05:56.586>
    actor = 'Albert.Zeyer'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2019-06-05.05:24:51.971>
    creator = 'giampaolo.rodola'
    dependencies = []
    files = ['48392']
    hgrepos = []
    issue_num = 37159
    keywords = ['patch']
    message_count = 6.0
    messages = ['344671', '344679', '344680', '344691', '344693', '383996']
    nosy_count = 11.0
    nosy_names = ['facundobatista', 'ncoghlan', 'vstinner', 'giampaolo.rodola', 'StyXman', 'petr.viktorin', 'neologix', 'Albert.Zeyer', 'martin.panter', 'desbma', 'pablogsal']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue37159'
    versions = ['Python 3.9']

    Linked PRs

    @giampaolo
    Copy link
    Contributor Author

    This is a follow up of bpo-33639 (zero-copy via sendfile()) and bpo-26828 (os.copy_file_range()). On [Linux 4.5 / glib 2.27] shutil.copyfile() will use os.copy_file_range() instead of os.sendfile(). According to my benchmarks performances are the same but when dealing with NFS copy_file_range() is supposed to attempt doing a server-side copy, meaning there will be no exchange of data between client and server, making the copy operation an order of magnitude faster.

    Before proceeding unit-tests for big-file support should be added first (bpo-37096). We didn't hit the 3.8 deadline but I actually prefer to land this in 3.9 as I want to experiment with it a bit (copy_file_range() is quite new, bpo-26828 is still a WIP).

    @giampaolo giampaolo added 3.9 only security fixes stdlib Python modules in the Lib dir performance Performance or resource usage labels Jun 5, 2019
    @giampaolo giampaolo changed the title Have shutil.copyfile() use copy_file_range() Use copy_file_range() in shutil.copyfile() (server-side copy) Jun 5, 2019
    @vstinner
    Copy link
    Member

    vstinner commented Jun 5, 2019

    Oh, I already created https://bugs.python.org/issue37157

    Can we move the discussion there?

    @giampaolo
    Copy link
    Contributor Author

    bpo-37157 is for reflink / CoW copy, this one is not.

    @vstinner
    Copy link
    Member

    vstinner commented Jun 5, 2019

    bpo-37157 is for reflink / CoW copy, this one is not.

    Oh sorry, it seems like I misunderstood copy_file_range(). So it doesn't use/support CoW?

    @giampaolo
    Copy link
    Contributor Author

    Nope, it doesn't (see man page). We can simply use FICLONE (cp does the same).

    @albertz
    Copy link
    Mannequin

    albertz mannequin commented Dec 29, 2020

    According to the man page of copy_file_range (https://man7.org/linux/man-pages/man2/copy_file_range.2.html), copy_file_range also should support copy-on-write:

      copy_file_range() gives filesystems an opportunity to implement
      "copy acceleration" techniques, such as the use of reflinks
      (i.e., two or more inodes that share pointers to the same copy-
      on-write disk blocks) or server-side-copy (in the case of NFS).
    

    Is this wrong?

    However, while researching more about FICLONE vs copy_file_range, I found e.g. this: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24399

    Which suggests that there are other problems with copy_file_range?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @illia-v
    Copy link
    Contributor

    illia-v commented Apr 16, 2022

    FYI, GNU Coreutils 9.0 (released in September 2021) changed cp to:

    • use copy offload via copy_file_range where available;
    • enable CoW by default.

    https://lists.gnu.org/archive/html/info-gnu/2021-09/msg00010.html

    @iritkatriel iritkatriel added 3.12 bugs and security fixes and removed 3.9 only security fixes labels Aug 26, 2022
    barneygale added a commit that referenced this issue Jun 14, 2024
    Add a `Path.copy()` method that copies the content of one file to another.
    
    This method is similar to `shutil.copyfile()` but differs in the following ways:
    
    - Uses `fcntl.FICLONE` where available (see GH-81338)
    - Uses `os.copy_file_range` where available (see GH-81340)
    - Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.
    
    The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.
    
    Incorporates code from GH-81338 and GH-93152.
    
    Co-authored-by: Eryk Sun <[email protected]>
    mrahtz pushed a commit to mrahtz/cpython that referenced this issue Jun 30, 2024
    Add a `Path.copy()` method that copies the content of one file to another.
    
    This method is similar to `shutil.copyfile()` but differs in the following ways:
    
    - Uses `fcntl.FICLONE` where available (see pythonGH-81338)
    - Uses `os.copy_file_range` where available (see pythonGH-81340)
    - Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.
    
    The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.
    
    Incorporates code from pythonGH-81338 and pythonGH-93152.
    
    Co-authored-by: Eryk Sun <[email protected]>
    noahbkim pushed a commit to hudson-trading/cpython that referenced this issue Jul 11, 2024
    Add a `Path.copy()` method that copies the content of one file to another.
    
    This method is similar to `shutil.copyfile()` but differs in the following ways:
    
    - Uses `fcntl.FICLONE` where available (see pythonGH-81338)
    - Uses `os.copy_file_range` where available (see pythonGH-81340)
    - Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.
    
    The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.
    
    Incorporates code from pythonGH-81338 and pythonGH-93152.
    
    Co-authored-by: Eryk Sun <[email protected]>
    estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
    Add a `Path.copy()` method that copies the content of one file to another.
    
    This method is similar to `shutil.copyfile()` but differs in the following ways:
    
    - Uses `fcntl.FICLONE` where available (see pythonGH-81338)
    - Uses `os.copy_file_range` where available (see pythonGH-81340)
    - Uses `_winapi.CopyFile2` where available, even though this copies more metadata than the other implementations. This makes `WindowsPath.copy()` more similar to `shutil.copy2()`.
    
    The method is presently _less_ specified than the `shutil` functions to allow OS-specific optimizations that might copy more or less metadata.
    
    Incorporates code from pythonGH-81338 and pythonGH-93152.
    
    Co-authored-by: Eryk Sun <[email protected]>
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.12 bugs and security fixes performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants