Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support moving across filesystems in pathlib.Path, as shutil.move() does #73991

Closed
LaurentMazuel mannequin opened this issue Mar 13, 2017 · 66 comments
Closed

Support moving across filesystems in pathlib.Path, as shutil.move() does #73991

LaurentMazuel mannequin opened this issue Mar 13, 2017 · 66 comments
Labels
stdlib Python modules in the Lib dir topic-pathlib type-feature A feature request or enhancement

Comments

@LaurentMazuel
Copy link
Mannequin

LaurentMazuel mannequin commented Mar 13, 2017

BPO 29805
Nosy @brettcannon, @pfmoore, @ericvsmith, @tjguk, @zware, @eryksun, @zooba

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2017-03-13.21:03:44.694>
labels = ['3.8', 'type-feature', 'library', '3.9', '3.10']
title = 'Support moving across filesystems in pathlib.Path, as shutil.move() does'
updated_at = <Date 2021-03-15.21:26:30.922>
user = 'https://bugs.python.org/LaurentMazuel'

bugs.python.org fields:

activity = <Date 2021-03-15.21:26:30.922>
actor = 'eryksun'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2017-03-13.21:03:44.694>
creator = 'Laurent.Mazuel'
dependencies = []
files = []
hgrepos = []
issue_num = 29805
keywords = []
message_count = 6.0
messages = ['289549', '289552', '289559', '289630', '289687', '289688']
nosy_count = 8.0
nosy_names = ['brett.cannon', 'paul.moore', 'eric.smith', 'tim.golden', 'Laurent.Mazuel', 'zach.ware', 'eryksun', 'steve.dower']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue29805'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

Linked PRs

@LaurentMazuel
Copy link
Mannequin Author

LaurentMazuel mannequin commented Mar 13, 2017

Trying to use Pathlib and Path.replace on Windows if drive are different leads to an issue:

  File "D:\\myscript.py", line 184, in update
    client_generated_path.replace(destination_folder)
  File "c:\\program files (x86)\\python35-32\\Lib\\pathlib.py", line 1273, in replace
    self.\_accessor.replace(self, target)
  File "c:\\program files (x86)\\python35-32\\Lib\\pathlib.py", line 377, in wrapped
    return strfunc(str(pathobjA), str(pathobjB), \*args)
OSError: [WinError 17] The system cannot move the file to a different disk drive: 'C:\\\\MyFolder' -\> 'D:\\\\MyFolderNewName'

This is a known situation of os.rename, and workaround I found is to use shutil or to copy/delete manually in two steps (e.g. http://stackoverflow.com/questions/21116510/python-oserror-winerror-17-the-system-cannot-move-the-file-to-a-different-d)

When using Pathlib, it's not that easy to workaround using shutil (even if thanks to Brett Cannon now shutil accepts Path in Py3.6, not everybody has Py3.6). At least this should be documented with a recommendation for that situation. I love Pathlib and it's too bad my code becomes complicated when it was so simple :(

@brettcannon brettcannon added stdlib Python modules in the Lib dir and removed topic-IO labels Mar 13, 2017
@LaurentMazuel
Copy link
Mannequin Author

LaurentMazuel mannequin commented Mar 13, 2017

Just to confirm, I was able to workaround it with Py3.6:

    # client_generated_path.replace(destination_folder)
    shutil.move(client_generated_path, destination_folder)

It is reasonable to think about adding a similar feature to pathlib if it doesn't support it and just special-case the drive-to-drive scenario for Windows

@eryksun
Copy link
Contributor

eryksun commented Mar 14, 2017

Moving a file across volumes isn't atomic. Getting an exception in this case can be useful. There could be a "strict" keyword-only parameter that defaults to False. If it's true, then replace() won't try to move the file.

@ericvsmith
Copy link
Member

I agree this needs to be different from replace(), due to not being atomic. That makes it an enhancement, so I'm removing 3.5.

I'm +1 on Path supporting something like shutil.move().

@ericvsmith ericvsmith added the type-feature A feature request or enhancement label Mar 15, 2017
@brettcannon
Copy link
Member

I also support the idea of getting something like shutil.move() into pathlib.

@brettcannon
Copy link
Member

I should also mention that rename() (https://docs.python.org/3/library/pathlib.html#pathlib.Path.rename) and replace() (https://docs.python.org/3/library/pathlib.html#pathlib.Path.replace) already do exist, so it might be best to add a keyword-only flag to one of those for this use-case.

@eryksun eryksun added 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes and removed OS-windows labels Mar 15, 2021
@eryksun eryksun changed the title Pathlib.replace cannot move file to a different drive on Windows if filename different Support moving across filesystems in pathlib.Path, as shutil.move() does Mar 15, 2021
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@barneygale
Copy link
Contributor

I'm also +1 on adding pathlib.Path.move(). However I'd prefer we move the shutil implementation into pathlib, as proposed here. The implementation in shutil would become a stub that calls into pathlib, roughly:

def move(src, dst, copy_function=copy2):
    return pathlib.Path(src).move(dst, copy_function=copy2)

I'm interested in this approach because I'm hoping to add a pathlib.AbstractPath class in future. Users would be able to subclass AbstractPath, and by implementing a small handful of low-level abstract methods like stat(), open() and iterdir(), benefit from high-level methods like move() in their objects. This would work even if the source and destination are different kinds of path (e.g. a local filesystem path and a path in a tar archive).

To achieve this we'd need to make pathlib accept bytes paths, as shutil does. Personally I'm fine with this - it's not much work and perfectly reasonable.

We'd also need to move other shutil functions that move() relies on to pathlib, notably the copy*() functions, excluding copyfileobj() and friends.

On rename() / replace(): I'd like these to call through to move(atomic=True). IIRC these methods are identical on POSIX and only differ slightly on Windows. That's a wrinkle we should smooth out in pathlib IMO.

@zooba
Copy link
Member

zooba commented Jan 31, 2023

I'd prefer we move the shutil implementation into pathlib

Sounds reasonable to me, though would we want to include copy_function in the API? It seems like an opportunity to say that Path is opinionated about the "most move-like way to copy", and if you need to override that then handle it yourself or stick with shutil.

@ericvsmith
Copy link
Member

I think @barneygale 's plan is sound.

@pfmoore
Copy link
Member

pfmoore commented Feb 1, 2023

I agree with @zooba, exposing the copy_function in the pathlib API seems incompatible with the general style of pathlib. Otherwise the plan looks good.

@barneygale
Copy link
Contributor

barneygale commented Feb 1, 2023

I agree about copy_function. Perhaps we could do something like this?

    def copy(self, dst, *, follow_symlinks=True, copy_mode=True, copy_stat=False):
        ...

    def move(self, dst, *, atomic=False, **kwargs):
        if atomic:
            return os.rename(self, dst)
        self.copy(dst, **kwargs)
        self.unlink()

@zooba
Copy link
Member

zooba commented Feb 1, 2023

I don't like the atomic argument, and especially its default. Implementations should get to implement it as efficiently as possible without needing to respect that kind of argument, and callers shouldn't expect that level of control over semantics.

For example, on Windows we explicitly disallow os.rename from moving files to other drives, because it wouldn't match our documented behaviour. But a new WindowsPath.move function should be able to allow that, and take advantage of the OS optimisations (yes, they exist!) for moving a file rather than copying it across drives.

If we were to define semantics for atomic, the only way to allow this would be "atomic=True will try to move, and atomic=False won't bother trying to move", at which point why would anybody ever pass False? Especially since there's no back-compat here, and so code on older versions will have to stick with try: ...rename ... except: ... copy.

We should make these functions an improvement in semantics, not just discoverability, and ideally more appropriate for whatever platform they're being used on than the POSIX semantics inherent to posixmodule.

So I think the only option we want on move is follow_symlinks=True,1 and the method "does whatever is necessary to move the file or directory as efficiently as possible into the new location while matching the semantics of an in-place rename" (roughly, copy_mode=True, copy_stat=True)

Similarly, I think we should simplify copy down to only have follow_symlinks=True,2 and the method "does whatever is necessary to copy the file or directory as efficiently as possible into the new location while matching the semantics of a newly created file" (roughly, copy_mode=some bits, copy_stat=some bits - and this one is why the vague definition is important 😉 )

Footnotes

  1. Where follow_symlinks=False is carefully defined to mean "if self is a link, it moves the link unless that would cause the new link to fail to refer to the same target as the original link, in which case the target is copied into the new location and the original link is removed".

  2. I think the same definition as for move works here, but I haven't thought through this one as thoroughly.

@barneygale
Copy link
Contributor

barneygale commented Feb 1, 2023

My example was a bit rubbish - here's a corrected version:

    def copy(self, dst, *, follow_symlinks=True, copy_mode=True, copy_stat=True):
        ...

    def move(self, dst, *, atomic=False, **kwargs):
        try:
            return os.rename(self, dst)
        except OSError:
            if atomic:
                raise  # Move can't be achieved atomically
        self.copy(dst, **kwargs)
        self.unlink()

So move(atomic=True) would call os.rename() only and allow exceptions to propagate, whereas atomic=False will try a rename, and failing that use a copy/delete. I think there's common use cases for both.

I think I agree with you on removing copy_mode and copy_stat.

@barneygale
Copy link
Contributor

I suppose one could argue that folks should just use os.rename() if they want an atomic rename! :)

@zooba
Copy link
Member

zooba commented Feb 2, 2023

I suppose one could argue that folks should just use os.rename() if they want an atomic rename! :)

Yeah, I'll argue that one ;)

We'll have to explain that move may result in both old and new names existing simultaneously while the move takes place, depending on the operating system and file systems in use. That's a perfectly fine place to say that if you'd rather that not happen, and are willing to accept exceptions if it's unavoidable, then you should use rename.

@barneygale
Copy link
Contributor

Should we deprecate pathlib.Path.rename() and replace() when we add move()?

@barneygale
Copy link
Contributor

Do we even need a copyfile() function, if copy() will do the same?

Good point - perhaps not!

barneygale added a commit to barneygale/cpython that referenced this issue Jul 27, 2024
Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `copy()` methods (which will also
accept any type of file.)
barneygale added a commit to barneygale/cpython that referenced this issue Jul 27, 2024
Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.)

Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `delete()` methods (which will also
accept any type of file.)
barneygale added a commit that referenced this issue Aug 7, 2024
Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `copy()` methods (which will also
accept any type of file.)
barneygale added a commit that referenced this issue Aug 11, 2024
Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.)

Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `delete()` methods (which will also
accept any type of file.)

Co-authored-by: Adam Turner <[email protected]>
barneygale added a commit to barneygale/cpython that referenced this issue Aug 12, 2024
barneygale added a commit to barneygale/cpython that referenced this issue Aug 12, 2024
barneygale added a commit to barneygale/cpython that referenced this issue Aug 19, 2024
Remove the *ignore_errors* and *on_error* arguments from `Path.delete()`.
This functionality was carried over from `shutil`, but its design needs to
be re-considered in its new context. For example, we may wish to support a
*missing_ok* argument (like `Path.unlink()`), or automatically `chmod()`
and retry operations when we hit a permission error (like
`tempfile.TemporaryDirectory`), or retry operations with a backoff (like
`test.support.os_helper.rmtree()`), or utilise exception groups, etc. It's
best to leave our options open for now.
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
…n#122368)

Rename `pathlib.Path.rmtree()` to `delete()`, and add support for deleting
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `copy()` methods (which will also
accept any type of file.)
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
…n#122369)

Rename `pathlib.Path.copy()` to `_copy_file()` (i.e. make it private.)

Rename `pathlib.Path.copytree()` to `copy()`, and add support for copying
non-directories. This simplifies the interface for users, and nicely
complements the upcoming `move()` and `delete()` methods (which will also
accept any type of file.)

Co-authored-by: Adam Turner <[email protected]>
barneygale added a commit that referenced this issue Aug 25, 2024
Add a `Path.move()` method that moves a file or directory tree, and returns a new `Path` instance pointing to the target.

This method is similar to `shutil.move()`, except that it doesn't accept a *copy_function* argument, and it doesn't check whether the destination is an existing directory.
barneygale added a commit to barneygale/cpython that referenced this issue Aug 25, 2024
These two methods accept an *existing* directory path, onto which we join
the source path's base name to form the final target path.

A possible alternative implementation is to check for directories in
`copy()` and `move()` and adjust the target path, which is done in several
`shutil` functions. This behaviour is helpful in a shell context, but
less so in a stored program that explicitly specifies destinations. For
example, a user that calls `Path('foo.py').copy('bar.py')` might not
imagine that `bar.py/foo.py` would be created, but under the alternative
implementation this will happen if `bar.py` is an existing directory.
barneygale added a commit to barneygale/cpython that referenced this issue Aug 25, 2024
Per feedback from Paul Moore on pythonGH-123158, it's better to defer making
`Path.delete()` public than ship it with under-designed error handling
capabilities.

We leave a remnant `_delete()` method, which is used by `move()`. Any
functionality not needed by `move()` is deleted.
@picnixz picnixz removed the 3.14 new features, bugs and security fixes label Aug 26, 2024
barneygale added a commit that referenced this issue Aug 26, 2024
These two methods accept an *existing* directory path, onto which we join
the source path's base name to form the final target path.

A possible alternative implementation is to check for directories in
`copy()` and `move()` and adjust the target path, which is done in several
`shutil` functions. This behaviour is helpful in a shell context, but
less so in a stored program that explicitly specifies destinations. For
example, a user that calls `Path('foo.py').copy('bar.py')` might not
imagine that `bar.py/foo.py` would be created, but under the alternative
implementation this will happen if `bar.py` is an existing directory.
barneygale added a commit to barneygale/cpython that referenced this issue Aug 26, 2024
barneygale added a commit to barneygale/cpython that referenced this issue Aug 26, 2024
Remove *ignore* and *on_error* arguments from `pathlib.Path.copy[_into]()`,
because these arguments are under-designed. Specifically:

- *ignore* is appropriated from `shutil.copytree()`, but it's not clear
  how it should apply when the user copies a non-directory. We've changed
  the callback signature from the `shutil` version, but I'm not confident
  the new signature is as good as it can be.
- *on_error* is a generalisation of `shutil.copytree()`'s error handling,
  which is to accumulate exceptions and raise a single `shutil.Error` at
  the end. It's not obvious which solution is better.

Additionally, this arguments may be challenging to implement in future user
subclasses of `PathBase`, which might utilise a native recursive copying
method.
barneygale added a commit that referenced this issue Aug 26, 2024
Per feedback from Paul Moore on GH-123158, it's better to defer making
`Path.delete()` public than ship it with under-designed error handling
capabilities.

We leave a remnant `_delete()` method, which is used by `move()`. Any
functionality not needed by `move()` is deleted.
barneygale added a commit that referenced this issue Aug 26, 2024
…23337)

Remove *ignore* and *on_error* arguments from `pathlib.Path.copy[_into]()`,
because these arguments are under-designed. Specifically:

- *ignore* is appropriated from `shutil.copytree()`, but it's not clear
  how it should apply when the user copies a non-directory. We've changed
  the callback signature from the `shutil` version, but I'm not confident
  the new signature is as good as it can be.
- *on_error* is a generalisation of `shutil.copytree()`'s error handling,
  which is to accumulate exceptions and raise a single `shutil.Error` at
  the end. It's not obvious which solution is better.

Additionally, this arguments may be challenging to implement in future user
subclasses of `PathBase`, which might utilise a native recursive copying
method.
@barneygale
Copy link
Contributor

It's done! Thank you everyone who helped formulate and review this feature.

We've added new copy(), copy_into(), move() and move_into() methods, which work on both files and directories. The interfaces are refined to pathlib standards, with certain nonessential functionality (like support for filtering in copy()) omitted for now. Users may wish to propose new capabilities on the ideas forum.

The new methods expect exact targets (or target directories for the _into() variants), whereas shutil checks for existing directories and adjusts the target as necessary. The latter behaviour is helpful in an interactive shell with tab-complete, but less so in a stored program where state of the filesystem might not be known (for example, shutil.copy('a', 'b') could create either b or b/a).

The copy() and copy_into() methods use fcntl.FICLONE or os.copy_file_range where available (see GH-81338, GH-81340). This work has stalled for years in shutil for backwards-compatibility concerns, happily not present here.

Feedback most welcome! We have ~9 months until 3.14 beta 1 where the interface is more locked down, so there's still plenty of time make changes. I'll close this issue, but I can re-open it (or log new issues) if problems are identified. Thanks again, all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir topic-pathlib type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

10 participants