Performance differences between Windows and Linux #921

Open
jmather-sesi opened this issue Aug 7, 2024 · 5 comments

Comments

@jmather-sesi

jmather-sesi commented Aug 7, 2024

I've been running our software through our benchmark suite with both mimalloc 1.8.7 and mimalloc 2.1.7 and have noticed some differences in its behaviour across OSes. On Windows, 2.1.7 pretty much always comes out on top in speed, vsize, and rss. On Linux, however, it's a bit more complicated: in some tests, I've seen 2.1.7 consume up to 75% more virtual memory than 1.8.7, while rss is generally the same between both versions.

I'm wondering if this has been seen before, and if there's anything we can do about it. Ideally we'd like to use the same allocator on both OSes, as it's less to maintain. We need to be cautious about our virtual memory size because some users monitor vsize as a way to detect swapping, and will kill the process if vsize exceeds the amount of physical memory.

@daanx
Collaborator

daanx commented Aug 12, 2024

One difference between Linux and Windows is that on Linux MIMALLOC_ARENA_EAGER_COMMIT is enabled (as Linux allows overcommit and it is "fine" to use more virtual memory as it's free). That means mimalloc will commit 1GiB at a time for each arena (on Linux, "commit" means the memory has read/write access, PROT_READ|PROT_WRITE, instead of being reserved with no access, PROT_NONE). This may be the cause of the difference you see -- can you try running mimalloc with MIMALLOC_ARENA_EAGER_COMMIT=0 and see if that reduces the vsize?
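To make the reserve/commit distinction concrete, here is a minimal Linux-only sketch. It does not use mimalloc at all: it calls `mmap` and `mprotect` from libc through `ctypes`, with a 64 MiB region standing in for a 1GiB arena, and reads `VmSize` from `/proc/self/status`. The helper names (`status_kib`, `reserve_then_commit`) are hypothetical, and the sketch assumes a 64-bit Linux system where `PROT_NONE` is 0.

```python
import ctypes
import mmap
import re

PROT_NONE = 0                # assumption: value of PROT_NONE on Linux
SIZE = 64 * 1024 * 1024      # 64 MiB, a small stand-in for a 1GiB arena

_libc = ctypes.CDLL(None, use_errno=True)
_libc.mmap.restype = ctypes.c_void_p
_libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                       ctypes.c_int, ctypes.c_int, ctypes.c_long]
_libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
_libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
_MAP_FAILED = ctypes.c_void_p(-1).value


def status_kib(field):
    """Read one 'kB' field (e.g. VmSize, VmRSS) from /proc/self/status."""
    with open("/proc/self/status") as f:
        match = re.search(rf"^{field}:\s+(\d+) kB", f.read(), re.MULTILINE)
    return int(match.group(1))


def reserve_then_commit(size=SIZE):
    """Return VmSize (KiB) before mapping, after reserving, after committing."""
    before = status_kib("VmSize")
    # "Reserve": address space only, no access rights.
    addr = _libc.mmap(None, size, PROT_NONE,
                      mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == _MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    try:
        reserved = status_kib("VmSize")
        # "Commit": grant read/write access; no page is touched yet.
        if _libc.mprotect(addr, size, mmap.PROT_READ | mmap.PROT_WRITE) != 0:
            raise OSError(ctypes.get_errno(), "mprotect failed")
        committed = status_kib("VmSize")
    finally:
        _libc.munmap(addr, size)
    return before, reserved, committed


if __name__ == "__main__":
    before, reserved, committed = reserve_then_commit()
    print(f"VmSize before: {before} KiB")
    print(f"after reserve: {reserved} KiB")
    print(f"after commit:  {committed} KiB")
```

On Linux, `VmSize` grows as soon as the region is reserved with `PROT_NONE`, and the later `mprotect` does not change it further, so a tool that reports virtual size counts reserved address space whether or not it has been committed.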

Now, having said that, it seems you also see the difference between v1.8.7 and v2.1.7, and both of those use the same arena implementation (only the handling of thread-local segments in segment.c differs). I am surprised by this and I wonder if there is something going on.

  • Can you test with MIMALLOC_ARENA_EAGER_COMMIT=0 and see what happens?
  • What does vsize measure exactly? Is it all reserved virtual memory, or only virtual memory that is committed (i.e. not PROT_NONE, though it may still be untouched)?
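One way to run such a test from a script is sketched below. `MIMALLOC_ARENA_EAGER_COMMIT`, `MIMALLOC_SHOW_STATS`, and `LD_PRELOAD` are real environment variables; the helper names, the benchmark command, and the library path in the usage comment are hypothetical placeholders, since the thread does not say how the benchmark is launched.

```python
import os
import subprocess


def mimalloc_env(eager_commit=False, show_stats=True, preload=None, base=None):
    """Build an environment for one benchmark run.

    base defaults to the current environment; preload is an optional path
    to a mimalloc shared library (only needed if mimalloc is not linked in).
    """
    env = dict(os.environ if base is None else base)
    env["MIMALLOC_ARENA_EAGER_COMMIT"] = "1" if eager_commit else "0"
    if show_stats:
        # Asks mimalloc to print its statistics when the process exits.
        env["MIMALLOC_SHOW_STATS"] = "1"
    if preload:
        env["LD_PRELOAD"] = preload
    return env


def run_benchmark(cmd, **options):
    """Run cmd (a list of arguments) with the chosen mimalloc options."""
    return subprocess.run(cmd, env=mimalloc_env(**options), check=False).returncode


# Hypothetical usage:
#   run_benchmark(["./my_benchmark"], eager_commit=False,
#                 preload="/path/to/libmimalloc.so")
```

Keeping the environment construction separate from the launch makes it easy to run the same command once per setting and compare the resulting vsize.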

Thanks!

@jmather-sesi
Author

jmather-sesi commented Aug 13, 2024

Hi Daan,

I just tried setting MIMALLOC_ARENA_EAGER_COMMIT=0 and ran the test that performs the worst (on Linux). Unfortunately it didn't seem to have any effect.

Here is the vsize and rss graph over time for multiple allocators. As you can see, 2.1.7 uses the most vsize by far:

[Image: comparison]

When exporting MIMALLOC_ARENA_EAGER_COMMIT=0 and re-running the test with just 2.1.7, the results are pretty much identical:

[Image: eager_comparison]

Regarding vsize and rss, the script that I'm using to generate these graphs (https://github.com/jeetsukumaran/Syrupy) simply tracks the output of the rss and vsz fields from ps over time. On my machine, man ps states:

       vsz         VSZ       virtual memory size of the process in KiB
                             (1024-byte units).  Device mappings are currently
                             excluded; this is subject to change.  (alias
                             vsize).
       rss         RSS       resident set size, the non-swapped physical
                             memory that a task has used (in kilobytes).
                             (alias rssize, rsz).
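For readers who want to reproduce this kind of measurement without the script linked above, a rough sketch of sampling the same two `ps` fields follows. It is not Syrupy's code; the function names and the sampling interval are assumptions made for illustration.

```python
import subprocess
import time


def parse_ps_line(line):
    """Split one headerless 'vsz rss' line from ps into two ints (KiB)."""
    vsz_kib, rss_kib = (int(field) for field in line.split())
    return vsz_kib, rss_kib


def sample(pid):
    """Ask ps for the current vsz and rss of one process."""
    result = subprocess.run(
        # An empty header after '=' suppresses the header line.
        ["ps", "-o", "vsz=", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return parse_ps_line(result.stdout)


def track(pid, interval_s=1.0, count=10):
    """Collect (elapsed seconds, vsz KiB, rss KiB) tuples at a fixed interval."""
    start = time.monotonic()
    rows = []
    for _ in range(count):
        rows.append((time.monotonic() - start, *sample(pid)))
        time.sleep(interval_s)
    return rows
```

Plotting the second and third columns of `track(pid)` against the first gives a vsize and rss curve over time.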

Here are the stats for each run:

1.8.7
--------------------------------------------------------------------------------------------------
heap stats:     peak       total       freed     current        unit       count   
  reserved:     5.0 GiB     5.0 GiB     0           5.0 GiB                          
 committed:     1.8 GiB     5.0 GiB    50.4 GiB   -45.3 GiB                          ok
     reset:    13.4 MiB
    purged:    42.4 GiB
   touched:     0           0         141.3 GiB  -141.3 GiB                          ok
  segments:   460          28.6 Ki     28.2 Ki    438                                not all freed
-abandoned:     1           1           0           1                                not all freed
   -cached:     0           0           0           0                                ok
     pages:     0           0         471.3 Ki   -471.3 Ki                           ok
-abandoned:     3           3           0           3                                not all freed
 -extended:     0      
 -noretire:     0      
    arenas:     5      
-crossover:     0      
 -rollback:     0      
     mmaps:     0      
   commits:     0      
    resets:     6      
    purges:    98.1 Ki 
   threads:    28          28           2          26                                not all freed
  searches:     0.0 avg
numa nodes:     1
   elapsed:   464.020 s
   process: user: 1999.568 s, system: 117.681 s, faults: 65, rss: 4.9 GiB, commit: 1.8 GiB

2.1.7
--------------------------------------------------------------------------------------------------
heap stats:     peak       total       freed     current        unit       count   
  reserved:    10.1 GiB    12.8 GiB     2.8 GiB    10.0 GiB                          
 committed:     2.1 GiB    12.8 GiB   145.1 GiB  -132.2 GiB                          ok
     reset:     0      
    purged:    83.0 GiB
   touched:   192.7 KiB   129.9 MiB   185.9 GiB  -185.8 GiB                          ok
  segments:    70           2.0 Ki      1.9 Ki     65                                not all freed
-abandoned:     1           1           0           1                                not all freed
   -cached:     0           0           0           0                                ok
     pages:     0           0         573.6 Ki   -573.6 Ki                           ok
-abandoned:     3           3           0           3                                not all freed
 -extended:     0      
 -noretire:     0      
    arenas:     9      
-crossover:     0      
 -rollback:     0      
     mmaps:     0      
   commits:     0      
    resets:     0      
    purges:    33.7 Ki 
   threads:    26          26           2          24                                not all freed
  searches:     0.0 avg
numa nodes:     1
   elapsed:   467.501 s
   process: user: 2004.478 s, system: 120.140 s, faults: 20, rss: 4.4 GiB, commit: 2.1 GiB

2.1.7 - eager=0
--------------------------------------------------------------------------------------------------
heap stats:     peak       total       freed     current        unit       count   
  reserved:    10.1 GiB    12.8 GiB     2.8 GiB    10.0 GiB                          
 committed:   634.5 MiB     7.1 GiB   151.2 GiB  -144.1 GiB                          ok
     reset:     0      
    purged:    85.7 GiB
   touched:   192.7 KiB   132.4 MiB   186.2 GiB  -186.1 GiB                          ok
  segments:    69           2.0 Ki      2.0 Ki     65                                not all freed
-abandoned:     1           1           0           1                                not all freed
   -cached:     0           0           0           0                                ok
     pages:     0           0         573.9 Ki   -573.9 Ki                           ok
-abandoned:     3           3           0           3                                not all freed
 -extended:     0      
 -noretire:     0      
    arenas:     9      
-crossover:     0      
 -rollback:     0      
     mmaps:     0      
   commits:     5.8 Ki 
    resets:     0      
    purges:    33.7 Ki 
   threads:    28          28           2          26                                not all freed
  searches:     0.0 avg
numa nodes:     1
   elapsed:   465.997 s
   process: user: 1998.885 s, system: 120.455 s, faults: 35, rss: 4.4 GiB, commit: 634.5 MiB
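As a reading aid, the peak `reserved` and `committed` figures from the three runs above can be compared directly. The sketch below only re-does arithmetic on numbers copied from the stats output; it is a summary of the data, not a diagnosis, and the helper names are hypothetical.

```python
UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}


def to_bytes(text):
    """Convert a size such as '634.5 MiB' to bytes."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


# Peak values copied from the "reserved" and "committed" rows above.
PEAK_RESERVED = {"1.8.7": "5.0 GiB", "2.1.7": "10.1 GiB", "2.1.7 eager=0": "10.1 GiB"}
PEAK_COMMITTED = {"1.8.7": "1.8 GiB", "2.1.7": "2.1 GiB", "2.1.7 eager=0": "634.5 MiB"}


def ratio(table, run, baseline):
    """How many times larger `run` is than `baseline` in the given table."""
    return to_bytes(table[run]) / to_bytes(table[baseline])


if __name__ == "__main__":
    print(f"reserved, 2.1.7 vs 1.8.7:          {ratio(PEAK_RESERVED, '2.1.7', '1.8.7'):.2f}x")
    print(f"reserved, eager=0 vs eager default: {ratio(PEAK_RESERVED, '2.1.7 eager=0', '2.1.7'):.2f}x")
    print(f"committed, eager=0 vs eager default: {ratio(PEAK_COMMITTED, '2.1.7 eager=0', '2.1.7'):.2f}x")
```

Peak reserved memory is roughly twice as large under 2.1.7 as under 1.8.7 and is identical with and without eager commit, while peak committed memory drops to roughly 0.3 of its value with eager commit disabled. That pattern is consistent with the vsize graphs being unchanged by the setting.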

Thanks for your assistance!

@daanx
Collaborator

daanx commented Aug 13, 2024

Very interesting.. but that does look a bit unexpected -- not sure what is causing the big vsize difference between 1.8.7 and 2.1.7. Is there any way I can reproduce this? (this may be a mimalloc bug)

@jmather-sesi
Author

jmather-sesi commented Aug 14, 2024

Hi Daan, we can definitely set you up with a way to reproduce this. May I contact you by email with instructions? I can get your email address from the git logs.

@daanx
Collaborator

daanx commented Aug 22, 2024

Yes, I would like to investigate this -- thanks! (Either daan at microsoft.com or effp.org works; put "mimalloc" in the subject if you can.) It is best if I can build and test locally, but an Ubuntu binary (with debug info) would probably also work, as I could preload mimalloc. Thanks.
