
malloc() size limit? File read size limit? #2653

Closed
GabeAl opened this issue Nov 11, 2017 · 4 comments
Comments

GabeAl commented Nov 11, 2017

Hello,
I'm using the Fall Creators Update with the Store version of Ubuntu 16.04.3. I have 96 GB of RAM (and 40 threads) running Windows 10 Pro.

When I try to run memory-intensive applications that read large files (>4 GB) into arrays malloc()'d beforehand, the program fails without reporting that the allocation or the file read failed. It just finishes too quickly, doesn't use the expected amount of RAM, and doesn't produce correct results.

This has happened in a few programs, but one of them is BURST (GitHub.com/knights-lab/BURST) on a 50GB database file.

The same binary executes correctly on the same machine running native Ubuntu 16.04.3. The binary also runs correctly when compiled with GCC on Windows (not WSL). In all cases, the system has >95% of its memory free.

So my question is whether there is a (secret) limit in either:

  1. how much memory can be malloc'd in a single malloc call
  2. how large a file can be read directly into memory with fread()

Since malloc() and fread() ultimately go through system calls that WSL translates (brk/mmap, read), I suspect something is going on at that layer. If you are unable to confirm any such limitation, I will try to probe further with a toy program.
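For question 1, a minimal probe along these lines (my own toy, not from any of the programs above) would just double the requested size until malloc() refuses. Note that on Linux with default overcommit, huge allocations can succeed without any backing RAM, so this really probes the address-space/translation limit rather than physical memory:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t largest = 0;
    /* Start at 1 MiB and double until malloc() fails or the size
     * overflows to zero.  Free immediately so no memory is touched. */
    for (size_t sz = 1UL << 20; sz != 0; sz <<= 1) {
        void *p = malloc(sz);
        if (p == NULL)
            break;
        largest = sz;
        free(p);
    }
    printf("largest successful malloc: %zu bytes\n", largest);
    return 0;
}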

Thanks,
Gabe

therealkenc (Collaborator) commented

You generally get better service by following the issue template. But anyway, dunno how this one slipped through after a year and a half. Not a dupe AFAICT. The test case is shorter than your post.

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
  const size_t BIG = 6UL*1024*1024*1024; // 6GB
  void* mem = malloc(BIG);
  FILE* fp = fopen("big.bin", "w+");
  if (mem != NULL && fp != NULL &&
      (memset(mem, 0xff, BIG) != NULL) &&
      (fwrite(mem, 1, BIG, fp) == BIG) &&
      (fseek(fp, 0, SEEK_SET) == 0) &&
      (fread(mem, 1, BIG, fp) == BIG))
    printf("whee!\n");
  else
    printf("oops.\n");
  return 0;
}

The allocation succeeds (indeed the GHC guys allocate terabytes), but the write() (and certainly by extension a read()) fails on large buffers. Fine on Real Linux™ natch.

[... blah blah blah]
mmap(NULL, 6442455040, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe6e2a20000
brk(NULL)                               = 0xb74000
brk(0xb95000)                           = 0xb95000
open("big.bin", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0666, st_size=0, ...}) = 0
write(3, "\377\377\377[...]\377\377\377\377"..., 6442450944) = -1 ENOMEM (Cannot allocate memory)

GabeAl commented Nov 12, 2017

Cool, thanks. And indeed your test case is shorter than my spiel and reproduces the behavior on my end.
Love the "whee!" vs "oops." reporting on your "BIG" allocation, by the way.

Cheers,
Gabe

@jstarks jstarks added the bug label Nov 14, 2017
@jstarks jstarks assigned SvenGroot and unassigned SvenGroot Nov 14, 2017
jstarks (Member) commented Nov 19, 2017

In Linux, read() on giant buffers caps the count at 0x7ffff000 bytes (MAX_RW_COUNT) and proceeds with a truncated read. fread() then calls read() in a loop until the entire buffer is read.

In WSL, read returns ENOMEM instead of truncating.

We'll work on a fix.
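Until a fix lands, an application-side workaround is to issue bounded reads yourself rather than relying on libc to loop, so no single underlying read() exceeds the cap. A minimal sketch (the helper name and chunk size are mine; anything at or below 0x7ffff000 per call avoids the limit in question), with a small self-test:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read `len` bytes into `buf` via fread() calls of at most CHUNK bytes. */
static size_t read_chunked(void *buf, size_t len, FILE *fp)
{
    const size_t CHUNK = 64 * 1024 * 1024; /* 64 MiB per call */
    size_t done = 0;
    while (done < len) {
        size_t want = len - done < CHUNK ? len - done : CHUNK;
        size_t got = fread((char *)buf + done, 1, want, fp);
        if (got == 0)
            break; /* EOF or error; caller should check ferror() */
        done += got;
    }
    return done;
}

int main(void)
{
    /* Self-test with a 1 MiB scratch file. */
    const size_t N = 1024 * 1024;
    char *out = malloc(N), *in = malloc(N);
    if (!out || !in)
        return 1;
    memset(out, 0x5a, N);

    FILE *fp = fopen("scratch.bin", "w+b");
    if (!fp || fwrite(out, 1, N, fp) != N || fseek(fp, 0, SEEK_SET) != 0)
        return 1;
    size_t got = read_chunked(in, N, fp);
    printf("%s\n", (got == N && memcmp(in, out, N) == 0) ? "ok" : "mismatch");
    fclose(fp);
    remove("scratch.bin");
    free(out);
    free(in);
    return 0;
}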

jstarks commented Dec 19, 2017

This should be fixed in insider build 17063.
