High memory usage for huge files #160

Closed
lcoombe opened this issue Nov 21, 2017 · 21 comments

@lcoombe commented Nov 21, 2017

Hello,

I'm running the following command:
mlr --tsvlite filter '$Depth < 5' preARCS.bed.depth.tsv

I would expect this command to stream through the file, but it appears that the 400GB file being filtered is being read into memory?
From 'top':
0.399t 0.248t 808 R 76.3 10.1 1819:03 mlr --tsvlite filter $Depth < 5 preARCS.bed.depth.tsv

The output file is being written to OK:

Rname   Pos     Depth
1       1       0
1       2       0
1       3       0
1       4       0

The command is also going quite slowly -- it has been running for ~30h now. Any idea why the memory usage is so high?

(cc: @sjackman)

@johnkerl (Owner)

sounds like a memory leak. can you send me at least some of the file contents as a paste perhaps, for a repro?

@sjackman (Contributor)

The contents of the input are the same as the output listed above. Rname, Pos, and Depth are all integers. There are something like 20 billion rows.

@johnkerl (Owner)

@lcoombe valgrind isn't showing me anything on CentOS :(. What platform is this on?

@sjackman (Contributor) commented Nov 21, 2017

I believe (correct me if I'm wrong)

❯❯❯ uname -a
Linux hpce706 3.10.0-229.14.1.el7.x86_64 #1 SMP Tue Sep 15 15:05:51 UTC 2015 x86_64 GNU/Linux
❯❯❯ mlr --version
Miller 5.2.2

@lcoombe (Author) commented Nov 21, 2017

Yup, that's right @sjackman

@johnkerl (Owner)

got a repro; will keep digging

@johnkerl (Owner)

valgrind shows no memory leaks at exit but if I run a large enough file, I can see RSS growth in htop. Which would explain why I haven't seen this in valgrind runs. :^/

@johnkerl (Owner) commented Nov 23, 2017

Short answer: try

mlr --tsvlite filter '$Depth < 5' < preARCS.bed.depth.tsv

or

mlr --no-mmap --tsvlite filter '$Depth < 5' preARCS.bed.depth.tsv

The issue is that Miller uses mmap by default, as it's maybe 10% faster than using stdio. But that maps the whole file, and as Miller walks through it the paged-in data accumulates in resident memory rather than being released. Quite obvious in retrospect. :^/

Besides this being a great FAQ issue, the better fix would be either to (a) disable mmap if input files are over a certain size, or (b) make mmap simply not be the default ever.
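
For anyone curious why an mmap-backed reader behaves this way, here is a minimal sketch of the access pattern described above (not Miller's actual reader code; the file name and the line-counting loop are illustrative only). The whole file is mapped once, and every page touched during the scan is faulted in and stays resident until the kernel chooses to reclaim it:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    const char *path = argc > 1 ? argv[1] : "input.tsv";
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    // Map the entire file read-only, as an mmap-based reader would.
    char *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    // Scan sequentially. Each page touched here is faulted in and counted
    // against RSS; nothing releases it afterward, so resident memory grows
    // with progress through the file.
    long nlines = 0;
    for (off_t i = 0; i < st.st_size; i++) {
        if (base[i] == '\n') nlines++;
    }
    printf("%ld lines\n", nlines);

    munmap(base, st.st_size);
    close(fd);
    return 0;
}
```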

@sjackman (Contributor) commented Nov 23, 2017

Ah, cool. In that case, the high memory usage may be a red herring, and not the cause of the slowness.
@lcoombe Can you test whether < or --no-mmap is in fact faster than mmap?
@johnkerl You could try posix_madvise POSIX_MADV_SEQUENTIAL to hint that the pages may be freed after they're read.
http://man7.org/linux/man-pages/man3/posix_madvise.3.html
Note that even <foo.tsv could use mmap on the stdin file descriptor, when it's backed by a file rather than a pipe/stream.
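
For anyone wanting to try that, a minimal sketch of applying the hint to an existing mapping might look like this (base and length are assumed to come from the mmap call that maps the input file; this is not Miller's code):

```c
#include <stdio.h>
#include <sys/mman.h>

/* base/length come from the mmap() that maps the input file.
 * POSIX_MADV_SEQUENTIAL is purely advisory: it tells the kernel the
 * region will be read front-to-back, so it may read ahead aggressively
 * and drop pages behind the read pointer sooner. */
static void advise_sequential(void *base, size_t length) {
    int rc = posix_madvise(base, length, POSIX_MADV_SEQUENTIAL);
    if (rc != 0) {
        /* posix_madvise returns the error number directly, not via errno. */
        fprintf(stderr, "posix_madvise: error %d\n", rc);
    }
}
```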

@lcoombe (Author) commented Nov 23, 2017

Looks like using --no-mmap fixed the issue!

I tested this command:
mlr --no-mmap --tsvlite filter '$Depth < 5' preARCS.bed.depth.tsv
The run finished in just under 8 hours and used negligible memory. I killed my original command after 48h.

Thanks @johnkerl !

@sjackman (Contributor)

That's super-interesting to me. At least in theory that shouldn't be the behaviour. The OS should map the file to virtual memory, page it in as it's accessed, detect the sequential access pattern, and page out old pages as they're no longer needed. I can't explain this unexpected behaviour. I'd be curious to learn whether posix_madvise POSIX_MADV_SEQUENTIAL resolves it.

@johnkerl (Owner)

No change in htop usage using any madvise flags on either MacOSX or CentOS. :( Thanks for the idea though @sjackman!

@sjackman (Contributor)

Ah, well. Worth a shot.

> Besides this being a great FAQ issue, the better fix would be either to (a) disable mmap if input files are over a certain size, or (b) make mmap simply not be the default ever.

Either of these would suit me. Making --no-mmap the default and adding a --mmap option would be the easier of the two.

@johnkerl (Owner)

Ahoy, I was too impatient. With various madvise flags I saw memory usage shoot straight up just as before. But if I wait longer ... MADV_DONTNEED lets pages get reclaimed, not right away, but as soon as there starts to be page pressure, which is precisely the right situation.

@sjackman (Contributor) commented Nov 27, 2017

MADV_DONTNEED: Do not expect access in the near future. https://linux.die.net/man/2/madvise
Do you mean advising MADV_DONTNEED after each page has been processed by mlr?
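
To make the question concrete, advising after each processed chunk would look roughly like this (a sketch only; the chunk bookkeeping and names are made up, not Miller's internals). For a read-only, file-backed mapping MADV_DONTNEED is safe in the sense that dropped pages can always be faulted back in from the file:

```c
#include <stddef.h>
#include <sys/mman.h>

/* Call after the reader has finished with everything before 'done' bytes
 * into the mapping. madvise() wants page-aligned addresses, so only the
 * fully-processed pages are released; page_size comes from
 * sysconf(_SC_PAGESIZE). */
static void release_processed(char *base, size_t done, size_t page_size) {
    size_t aligned = done & ~(page_size - 1);
    if (aligned > 0) {
        madvise(base, aligned, MADV_DONTNEED);
    }
}
```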

@johnkerl (Owner)

... same with MADV_SEQUENTIAL.

These are advisory. In cat/filter/put/ etc. (streaming) contexts, pages can be after-freed. In sort/tac/ etc. (non-streaming) contexts, if pages are after-freed during ingest, then when record fields with mmap-backed pointers are later accessed in random order, the pages can be faulted back in.

@johnkerl (Owner)

... also, I'm overnarrating. I should dig a bit more before posting. I ran without madvise flags and also saw RSS dropping off after the initial ramp-up, on my Mac laptop. I need to experiment more thoroughly.

@johnkerl (Owner) commented Nov 29, 2017

OK so: I ran with the as-is code, with madvise and MADV_DONTNEED, and with madvise and MADV_SEQUENTIAL. This was on MacOSX. In all three cases the RSS shot up steadily with progress through the file, then began to come back down in the face of page pressure. But that is false comfort, because in all three cases the Miller executable was nonetheless OOM-killed. (The data file was larger than system memory + swap.)

So. This burns, really, because (a) I should have caught it sooner (it's obvious in retrospect), and (b) I put serious time a couple years ago into supporting mmapped I/O for its performance benefits. If I make mmapped I/O non-default then essentially no one will use it, and Miller will be suddenly slower (not a lot, but it will be a performance regression) as of the next release.

My thought is to use mmap below some file-size threshold and stdio above, where the threshold defaults to something like a few GB but is itself specifiable. This way we get (out of the box) faster-by-default for non-huge files, and non-OOM for huge files -- and detailed control for those who seek it out.
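
A sketch of that selection logic, with the threshold value and names as placeholders rather than a committed interface:

```c
#include <sys/stat.h>

/* Default cutoff: mmap for regular files up to a few GB, stdio above.
 * The actual value would be overridable from the command line. */
#define DEFAULT_MMAP_MAX_BYTES (4LL * 1024 * 1024 * 1024)

/* Returns 1 to read via mmap, 0 to fall back to stdio. Anything that
 * can't be stat'ed or isn't a regular file (pipes, stdin) uses stdio,
 * which always streams.
 * e.g. should_use_mmap(filename, DEFAULT_MMAP_MAX_BYTES) */
static int should_use_mmap(const char *path, long long max_bytes) {
    struct stat st;
    if (stat(path, &st) != 0) return 0;
    if (!S_ISREG(st.st_mode)) return 0;
    return (long long)st.st_size <= max_bytes;
}
```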

@sjackman (Contributor)

This behaviour is so strange. I don't understand it at all. The old pages should be dropped from resident memory, and there's no reason the OOM killer should be invoked.

Your workaround seems reasonable to me.

@johnkerl (Owner)

madvise is advisory, not mandatory, and maybe MacOSX isn't participating in this. Maybe it would work fine on other platforms. But I don't want to get platform-specific in the code ...

@johnkerl (Owner) commented Dec 5, 2017

P.S. I tried a huge file (> RAM+swap) on CentOS and the mmap failed before the madvise was even reached.

@johnkerl johnkerl changed the title Filter: high memory usage High memory usage for huge files Jan 1, 2018
@johnkerl johnkerl removed the active label Sep 2, 2019