Pickle ignores custom getstate methods on TextIOWrapper in Python 3.12 #122559

Open
SimonSorgQC opened this issue Aug 1, 2024 · 6 comments
Labels
topic-IO type-bug An unexpected behavior, bug, or error

Comments

@SimonSorgQC

SimonSorgQC commented Aug 1, 2024

Bug report

Bug description:

I am not entirely sure whether this is unintended behaviour, but it is definitely a noticeable change between 3.11 and 3.12, and a rather unintuitive one.

import pickle
from io import BytesIO, TextIOWrapper

class EncodedFile(TextIOWrapper):
    def __getstate__(self):
        return "string"
    def __setstate__(self, state):
        pass

file = EncodedFile(BytesIO(b"string"))
pickle.dumps(file)

This works in Python 3.11 and 3.10, but fails in 3.12 with

pickle.dumps(file)
TypeError: cannot pickle 'EncodedFile' instances

CPython versions tested on:

3.10, 3.11, 3.12

Operating systems tested on:

macOS


@SimonSorgQC SimonSorgQC added the type-bug An unexpected behavior, bug, or error label Aug 1, 2024
@tomasr8
Member

tomasr8 commented Aug 1, 2024

@SimonSorgQC
Author

Interesting! While it makes sense that the original IOBase cannot be pickled, I feel like it should be possible for subclasses that override __getstate__.

@SimonSorgQC
Author

I can confirm that overriding __reduce__ and __reduce_ex__ worked for me. I would leave the issue open for now, as I do not think it is nice that __getstate__ and __setstate__ do not work, especially since the __reduce__ and __reduce_ex__ of IOBase are buried in the C code.

Please feel free to correct me if you think this has been resolved.

@serhiy-storchaka
Member

Pickling file objects was explicitly forbidden in bpo-10180 by adding __getstate__ methods that raise TypeError. These methods were removed from the C implementation in bpo-33138, because these classes were recognized as non-pickleable by default. Then these classes were made heap types in #101948 (gh-101819), and new __reduce__ and __reduce_ex__ methods that raise TypeError were added, because the condition that prevented them from being pickled was gone.

The right way is to remove __reduce__ and __reduce_ex__ methods and restore __getstate__ methods. This will make the C implementation consistent with the Python implementation.
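The reason a custom __getstate__ can be bypassed is that pickle asks the object for __reduce_ex__ first; __getstate__ is only consulted by the default object.__reduce_ex__. A minimal sketch for checking which class supplies each hook on a given interpreter (defining_class is a hypothetical helper, not part of the io or pickle modules):

```python
from io import BytesIO, TextIOWrapper


def defining_class(cls, name):
    """Return the first class in cls's MRO that defines `name` itself."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    return None


class EncodedFile(TextIOWrapper):
    def __getstate__(self):
        return "string"

    def __setstate__(self, state):
        pass


# Report where each pickling hook is resolved from for the subclass.
for hook in ("__reduce_ex__", "__reduce__", "__getstate__"):
    owner = defining_class(EncodedFile, hook)
    print(f"{hook:15} resolved from {owner.__module__}.{owner.__qualname__}")
```

On an affected 3.12 build the two reduce hooks should resolve from a class in the _io module rather than from object, which would explain why __getstate__ is never reached; the exact output depends on the interpreter version.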

@serhiy-storchaka
Member

It turned out that this issue is a bit more complex. There are more differences between the implementations. Classes are not tested separately, and existing tests do not cover all cases. I am working on this problem.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Aug 2, 2024
…dule about pickling

In the C implementation, remove __reduce__ and __reduce_ex__ methods
that always raise TypeError and restore __getstate__ methods that always
raise TypeError.

This restores fine details of the pre-3.12 behavior and unifies
both implementations.
@serhiy-storchaka
Member

I added tests as a guard against this type of regression, but in general there are no guarantees about pickleability (especially with protocols 0 and 1).
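As a rough illustration of what such a guard could check (this is not the test code added to CPython; is_picklable is a hypothetical helper, and protocols 0 and 1 are skipped on the assumption that they are the least predictable):

```python
import pickle
from io import BytesIO, TextIOWrapper


def is_picklable(obj, protocols=range(2, pickle.HIGHEST_PROTOCOL + 1)):
    """Return True if obj pickles under every listed protocol."""
    for proto in protocols:
        try:
            pickle.dumps(obj, protocol=proto)
        except (TypeError, pickle.PicklingError):
            return False
    return True


# In-memory byte buffers carry their own state hooks and pickle fine.
print(is_picklable(BytesIO(b"data")))
# A plain text wrapper is expected to be rejected with TypeError.
print(is_picklable(TextIOWrapper(BytesIO(b"data"))))
```

A check like this only covers the plain classes; subclasses that define their own state hooks would need separate cases.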

In the meantime, the workaround is to reset __reduce__ and __reduce_ex__ in your subclass to the defaults:

class EncodedFile(TextIOWrapper):
    ...
    __reduce__ = object.__reduce__
    __reduce_ex__ = object.__reduce_ex__
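A fuller sketch of the workaround with a round trip, under some assumptions: the wrapper sits on an in-memory BytesIO, and the state layout (a dict with "data" and "encoding" keys) is purely illustrative and not taken from the original report.

```python
import pickle
from io import BytesIO, TextIOWrapper


class EncodedFile(TextIOWrapper):
    # Workaround from the thread: fall back to object's default reduce hooks
    # so that pickle reaches __getstate__ / __setstate__ again.
    __reduce__ = object.__reduce__
    __reduce_ex__ = object.__reduce_ex__

    def __getstate__(self):
        # Hypothetical state: the raw bytes of an in-memory buffer plus the
        # encoding. A real file on disk would need a different approach.
        self.flush()
        return {"data": self.buffer.getvalue(), "encoding": self.encoding}

    def __setstate__(self, state):
        # pickle creates the instance via __new__ without calling __init__,
        # so the wrapper is initialised here.
        self.__init__(BytesIO(state["data"]), encoding=state["encoding"])


if __name__ == "__main__":
    original = EncodedFile(BytesIO(b"hello\n"), encoding="utf-8")
    clone = pickle.loads(pickle.dumps(original))
    print(type(clone).__name__, repr(clone.read()))
```

Both hooks are reset because the default object.__reduce_ex__ defers to __reduce__ whenever a class overrides it, so resetting only one of them would still hit the method that raises TypeError.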
