Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use better temporary files mechanism in install_unpacked_wheel #7699

Closed
sbidoul opened this issue Feb 5, 2020 · 8 comments · Fixed by #7929
Closed

Use better temporary files mechanism in install_unpacked_wheel #7699

sbidoul opened this issue Feb 5, 2020 · 8 comments · Fixed by #7929
Labels
auto-locked Outdated issues that have been locked by automation type: refactor Refactoring code

Comments

@sbidoul
Copy link
Member

sbidoul commented Feb 5, 2020

install_unpacked_wheel creates files such as INSTALLER or RECORD with a .pip extension then renames them to the intended file in order to have an atomic file creation.

Example:

# Record details of all files installed
record = os.path.join(dest_info_dir, 'RECORD')
temp_record = os.path.join(dest_info_dir, 'RECORD.pip')
with open_for_csv(record, 'r') as record_in:
with open_for_csv(temp_record, 'w+') as record_out:
reader = csv.reader(record_in)
outrows = get_csv_rows_for_installed(
reader, installed=installed, changed=changed,
generated=generated, lib_dir=lib_dir,
)
writer = csv.writer(record_out)
# Sort to simplify testing.
for row in sorted_outrows(outrows):
writer.writerow(row)
shutil.move(temp_record, record)

The goal of this issue is to use a better and safer approach.

@triage-new-issues triage-new-issues bot added the S: needs triage Issues/PRs that need to be triaged label Feb 5, 2020
@sbidoul sbidoul added the type: refactor Refactoring code label Feb 5, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label Feb 5, 2020
@sbidoul sbidoul added the good first issue A good item for first time contributors to work on label Feb 5, 2020
@rudolfovic
Copy link

adjacent_tmp_file opens files as binary while the csvwriter insists on writing strings no matter what is passed to it. A possible solution would be to add a parameter to adjacent_tmp_file so that it passes it accordingly to NamedTemporaryFile.
NamedTemporaryFile uses mode='w+b' by default and adjacent_tmp_file does not override this.
This fix passes the tests.
Mode can either be added to adjacent_tmp_file as a named parameter so that it does not have to be specified in other cases (of which there are currently two).
From what I found it is in line with the pip conventions
e.g. def decode(input, fallback_encoding, errors='replace'): in webencodings/__init__.py.
Let me know what you think.

@uranusjr
Copy link
Member

uranusjr commented Feb 6, 2020

Would it be enough to keep csv.writer happy if we wrap the file handle in an io.TextIOWrapper?

@siAyush
Copy link

siAyush commented Mar 19, 2020

Hey, @uranusjr can you please clear it more. About this "if we wrap the filehandle in an io.TextIOWrapper" .
Thanks

@uranusjr
Copy link
Member

So the problem here is that csvwriter expects a str-based file object, but on Python 3 the caller (adjacent_tmp_file) writes file in binary mode. So I was wondering whether it would be possible to wrap the file object opened by adjacent_tmp_file with an io.TextIOWrapper on Python 3 to turn it into a text file-like object, and pass that to csvwriter instead. The wrapper would perform the necessariy encoding when csvwriter writes to it, to turn the written data into bytes for the NamedTemporaryFile file object.

@pradyunsg
Copy link
Member

@uranusjr I suggest avoiding complicating this issue by discussing trying to do this in-memory (since it's a good first issue).

@siAyush
Copy link

siAyush commented Mar 20, 2020

Thank you @uranusjr for simplifying it. Let me try it.

@sbidoul sbidoul removed the good first issue A good item for first time contributors to work on label Mar 27, 2020
@McSinyx
Copy link
Contributor

McSinyx commented Mar 29, 2020

@siAyush, may I pick this up?

@siAyush
Copy link

siAyush commented Mar 29, 2020

Sure

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label May 5, 2020
@lock lock bot locked as resolved and limited conversation to collaborators May 5, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation type: refactor Refactoring code
Projects
None yet
6 participants