Add encoding charset sniffing #736
Conversation
Force-pushed from e0bbcbb to 7d576c9

Force-pushed from d0eceeb to 4392921
What about these changes fails in Node 10? I'm not opposed to dropping support, but I don't want to without good reason.
I see, that makes sense. I looked into it and found that the problem seemed to be that the buffer read into memory for sniffing was being reused to create the response stream; commenting out that line made everything fine. Some errors said "Cannot set headers after they are sent to the client", so there may be differences in the event loop (or something like that) between v10.x and newer versions, but I'm not sure. So I think we have two options: (1) read text files twice, once for sniffing and once for the response stream, or (2) drop support for Node 10.

I think Option 1 may be better at this time, because reading text files twice doesn't matter in most cases, and it lets us ship this as a minor update. But dropping support for unmaintained versions can itself be a good reason for a major update. What do you think?
Force-pushed from 4392921 to 373edef
Rebased onto master.
Yeah, I think I'm actually leaning toward dropping support if that's the case; I'd rather drop Node 10 than introduce reading files twice. You're right that it shouldn't matter too much, but less I/O is always better.

That error message can (but doesn't always) mean the server timed out, so that once the request finally completes, it tries to send a response that has already been sent. I've seen the same in Apache; it may mean we have some tests that are too slow.
Node 10 dropped in master! Another merge/rebase should fix the tests!
Force-pushed from 373edef to 732400c
Thanks! This helped me understand what had happened. Though I rebased this branch onto master, there are still some errors on 12.x, macOS-latest. It says:

```
394 passing (31s)
4 failing

1) test/cli.test.js setting mimeTypes via cli - directly test unfinished:
   Error: test unfinished
    at Object.<anonymous> (test/cli.test.js:84:1)

2) test/cli.test.js setting mimeTypes via cli - directly test count !== plan:
   test count !== plan
   + expected - actual
   -1
   +4

3) test/cli.test.js child test left in queue: t.test --proxy requires you to specify a protocol:
   child test left in queue: t.test --proxy requires you to specify a protocol

4) test/cli.test.js test count !== plan:
   test count !== plan
   + expected - actual
   -4
   +3
```

According to the messages (eg.
Yeah, those are some non-deterministic errors we're getting at random; usually re-running the tests fixes them. I've been trying to weed those out, including by adding more plans. I've managed to make them better recently, but they still pop up. I reran the tests for this PR and they're passing now. But there's a conflict to clear up now; we probably just need to recreate the package-lock again. Once we get this merged I'll probably make the next major release!
Force-pushed from 732400c to b0cb863
Fixed the conflict, and all tests pass! I'm looking forward to the next release!!
Love it! This is a great improvement, thank you!!
Please ensure that your pull request fulfills these requirements:

- Based on the `master` branch

What is the purpose of this pull request? (bug fix, enhancement, new feature, ...)
Resolves #426
What changes did you make?
Added charset sniffing for text files.
Is there anything you'd like reviewers to focus on?
To sniff the charset, text files need to be read synchronously (`fs.readFileSync`) rather than streamed (`fs.createReadStream`). This change causes more memory usage, and if a file exceeds the maximum buffer length (about 4 GB on 64-bit architectures[1]), Node.js will throw an error. However, I think there are not many cases where a plain text file is over 4 GB, so this is acceptable for charset sniffing. What do you think about this?

UPDATE: I added file size checking in 4392921, so the above problem is fixed.
Please provide testing instructions, if applicable:
`npm run test`
Footnotes

1. https://nodejs.org/api/buffer.html#buffer_buffer_constants_max_length