-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] Deno.writeFile(readableStream)
can result in "chimera" files (partially overwritten files)
#19697
Comments
I confirmed this behavior as well. I was refactoring some code dealing with Tar and Untar from the stdlib for copying files across folders. I was trying to get rid of usage of A subtle but frustrating issue, and I'm surprised it's persisted for so long. I'm at least one person that would appreciate your PR! EDIT: after writing this I discovered that |
Hi @jkscx, looks like a working PR #23330 is dealing with this issue. I think it's making I'm not sure if a non-truncating option (like Python's Edit: Sorry I just found out I could still use const f = await Deno.open('o.txt', {write: true, create: true});
await f.write(new Uint8Array([65, 66, 67]));
f.close();
const q = await Deno.open('o.txt', {write: true});
await q.write(new Uint8Array([68]));
q.close() And a new issue should be opened if I'd want to ask about |
TLDR:
Deno.writeFile
doesn't truncate the file when the data source is aReadableStream
. That produces a “chimera” (partially overwritten) file when overwriting a larger file with a smaller file. In my case it “only” resulted in an un-parsable file, but could also be a security issue similar to aCropalypse (CVE 2023-21036), where an overwritten file’s content is partially recoverable, if the resulting file was uploaded somewhere on the internet, but I’d say that the likelihood of that happening is rather low.Merge request which introduced the issue: PR 17329
I could provide a short PR with the fix and a test case for this if you are willing to accept it.
Full story:
I wrote a script which processes a list of URLs. For each of these URLs, I get its HTTP body, save it to a temp file and process the file. Because I don’t need the file I “discard” it by overwriting it with the HTTP body of the next URL, i.e. I reuse the same temp file for each URL.
While running my script, I noticed that sometimes my file processing code failed with a file-not-parsable error. That was strange to me as whenever I manually downloaded the URL for which processing failed and run my file processing code it got processed successfully.
After playing around with it a bit, I noticed that when my automatic processing failed, the file’s size didn’t match the HTTP body’s size. On closer inspection, I realized that it matched the HTTP body of the previous URL, i.e. the one which was downloaded before this one. Weird…
It seemed that instead of the new file overwriting the old file, it resulted in a file which had the same size as the old file, but which failed to parse. This lead me to believe that the overwriting somehow silently failed and resulted in a corrupt file. But why was it only failing sometimes in overwriting it and not always?
At first I thought that maybe writing the file to disk was not instantaneous (or that maybe
Deno.writeFile
returned a tad too prematurely) and then my processing code read a broken state while it was still being flushed to disk? Adding some delays didn’t help so it had to be something else…After trying out some other ideas in a test environment and reducing the problem further and further in scope, I managed to find that the condition for a faulty overwrite was that the new file has to be smaller than the old file. Eureka!
This immediately made me believe that it’s a file truncation issue and upon checking the Deno source code I confirmed that it was so. Someone forgot to set the truncate option and when it was not set it defaulted to not truncating…
Code to reproduce the issue:
Code output:
The text was updated successfully, but these errors were encountered: