
Bisq using up too much memory on Linux even after closing #3918

Closed
FKrauss opened this issue Jan 27, 2020 · 11 comments
Labels: a:bug, is:critical (https://bisq.wiki/Critical_bug), is:priority, on:Linux

Comments

@FKrauss

FKrauss commented Jan 27, 2020

Description

Bisq consumes more and more memory until none remains and the system freezes.
Additionally, if Bisq is shut down before memory runs out, not all of it is released: about 4-7 GB is still shown as occupied, but no process appears to hold it. Logging out and back in does not fix it; a restart does.

Version

1.2.5

Steps to reproduce

Unknown; for me it is enough to simply start the application.

Expected behaviour

I expect Bisq to use maybe 3 GB of RAM and to release it after the application is closed.

Actual behaviour

More and more RAM is occupied until none is left and the system starts freezing. Sometimes Bisq gets terminated by the system; other times I have to force a reboot.

Device or machine

Ubuntu 18.04.3
Ryzen 5 2600X
G.Skill Aegis DIMM Kit 16GB

Additional info

The log contains a case where, after a reboot, I started Bisq and got lucky: the application was terminated once RAM was full (I reacted too slowly and could no longer move the mouse). About 5.5 GB of RAM was still shown as occupied after the termination.

bisq.log

@ripcurlx
Contributor

Related to #3128, #3657.

@ripcurlx added the is:priority label on Feb 18, 2020
@ripcurlx
Contributor

Related to: #3128, #3657, #3918, #3917, #3787, #3786, #3686, #3677, #3343

@ghubstan
Member

ghubstan commented Feb 19, 2020

I think this is due to one or several JavaFX bugs related to JDK-8188094.

During profiling, I noticed that the JavaFX Application Thread had been allocated 66.6 GiB:
(screenshot: JavaFX-App-Thread-Alloc-66Gib-2020-02-19 17-45-46)
This couldn't be right, as my VM has 8 GiB and my Bisq instance was started with -Xmx4g. I thought it was just a JDK Mission Control bug, but I have been able to reproduce the problem with a tiny JavaFX demo app that constantly leaks from the JavaFX Application Thread (the same demo app found in the OpenJDK bug report). The OS does not get the RAM back until I quit the IDE as well.

I have not found a solution. I'm running Ubuntu 18 and 19 (in a VM), using an NVIDIA GPU.
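
For reference (not the profiling setup used above), a similar allocation profile can be captured without an IDE by starting a flight recording on the running JVM and opening the result later in JDK Mission Control; the process name BisqAppMain is an assumption based on how Bisq is launched elsewhere in this thread:

    # record 5 minutes of JFR data from the running Bisq instance
    # (the .jfr file is written relative to the JVM's working directory)
    $ jcmd $(pgrep -f BisqAppMain) JFR.start duration=5m filename=bisq-profile.jfr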

@ghubstan
Member

ghubstan commented Mar 4, 2020

What I think is happening on my Linux machine is that the Java VM thinks it has 128 GB of RAM to work with.

My OpenJDK 10 and 11 VMs have a very large default -XX:MaxRAM setting of 128 GB. It looks like "because" of this -- I need to be careful with that word -- there are too many minor page faults, and libc malloc()s are creating a lot of extra memory segments via mmap() syscalls instead of using heap memory for allocations via brk() calls. During tracing sessions, I saw mmap() calls, but no brk() calls.
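
One way to watch this from the outside (a rough sketch, not the tracing setup used for the observations above) is to attach strace to the running Bisq process and log only the memory-mapping syscalls; the process name BisqAppMain is assumed, as elsewhere in this thread:

    # log mmap/munmap/brk activity of the running Bisq JVM (strace writes to stderr)
    $ sudo strace -f -e trace=mmap,munmap,brk -p $(pgrep -f BisqAppMain) 2>&1 | head -50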

The MaxRAM setting is described in the OpenJDK source.

OpenJDK 9:
	https://hg.openjdk.java.net/jdk9/jdk9/hotspot/file/b756e7a2ec33/src/share/vm/runtime/globals.hpp#l2030
	product_pd(uint64_t, MaxRAM, "Real memory size (in bytes) used to set maximum heap size")
	range(0, 0XFFFFFFFFFFFFFFFF)

OpenJDK 11:
	https://hg.openjdk.java.net/jdk/jdk11/file/1ddf9a99e4ad/src/hotspot/share/runtime/globals.hpp
	define_pd_global(uint64_t, MaxRAM, 1ULL*G);

You can check your own VM's default MaxRAM setting:

OpenJDK 11
$ java -XX:+PrintFlagsFinal -version | grep MaxRAM
	 uint64_t MaxRAM = 137438953472
	 
OpenJDK 10:
$ java -XX:+PrintFlagsFinal -version | grep MaxRAM
 	uint64_t MaxRAM = 137438953472

137438953472 bytes is 128 gigabytes
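
For context, on JDK 10 and 11 the default maximum heap is derived from the smaller of physical RAM and MaxRAM, scaled by MaxRAMPercentage (25% by default), so it can be useful to print the related flags together and to see how capping MaxRAM shrinks the derived heap:

    # show MaxRAM, MaxRAMPercentage and the resulting MaxHeapSize defaults
    $ java -XX:+PrintFlagsFinal -version | grep -E 'MaxRAM|MaxHeapSize'

    # with MaxRAM capped at 2 GiB, the derived MaxHeapSize drops accordingly
    $ java -XX:MaxRAM=2147483648 -XX:+PrintFlagsFinal -version | grep MaxHeapSize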

I can set MaxRAM=2GiB in JAVA_OPTS

        export JAVA_OPTS="-XX:MaxRAM=2147483648"

and Bisq's RES (see htop) stays below ~1.2 GB instead of growing to ~4.8 GB. (But I haven't let it run for days...)
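
To track that growth over a longer run without a profiler, a simple sketch is to sample the process' resident set size periodically (again assuming the process name BisqAppMain):

    # print a timestamped VmRSS sample once a minute while Bisq is running
    $ PID=$(pgrep -f BisqAppMain)
    $ while kill -0 "$PID" 2>/dev/null; do echo "$(date +%T) $(grep VmRSS /proc/$PID/status)"; sleep 60; done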


After libc malloc()s expand the extra virtual memory via mmap() syscalls, the app causes MMU violations when it tries to use the allocated memory range(s) and finds no mapping for an address or addresses.

I saw MMU errors in GC logs last night while using -XX:+ExtendedDTraceProbes plus GC logging, but only today realized there may be a connection to my VM's default MaxRAM=128GB setting. I had assumed it was due to the extra overhead from using -XX:+DebugNonSafepoints and -XX:+PreserveFramePointer, and especially -XX:+ExtendedDTraceProbes.

I believe these MMU violations in turn result in page faults, which cause consumption of extra pages of physical memory -- the RES, or resident set size, you can see in htop.

Using $ pmap -x $(pgrep -f BisqAppMain) | more ...

I find a memory mapping with an RSS of ~3.8 GB:

       0000000707200000 3994624 3984976 3984976 rw---   [ anon ]

I don't know what this anonymous mapping is, but setting the VM's MaxRAM to 2 GB in JAVA_OPTS reduces its size to ~0.5 GB.

Corresponding to this huge 3.8 GB mapping, htop shows RES ~4.8 GB and VIRT ~10.4 GB (when running Bisq with no MaxRAM option).
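
A quick way to total the resident size of all anonymous mappings from that pmap output (a sketch; with pmap -x, column 3 is the per-mapping RSS in KB):

    $ pmap -x $(pgrep -f BisqAppMain) | awk '/anon/ { kb += $3 } END { printf "anon RSS: %.1f GB\n", kb/1024/1024 }'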


Here are measurements after starting Bisq with an empty data.dir, using MaxRAM 2 GB, and letting it run for 30 minutes (not touching the GUI).

export JAVA_OPTS="-XX:MaxRAM=2147483648"
    (1.5 GB is not enough MaxRAM and the OOM Killer crashes Bisq)

RES 1118 MB, VIRT 6986 MB		(see htop)

The java process' largest memory mapping is ~0.5 GB
    (use $ pmap -x $(pgrep -f BisqAppMain) | more)

00000000e0000000  524288  524288  524288 rw---   [ anon ]

About 7-8 hours after starting Bisq with a 3 GB MaxRAM setting, I looked at

$htop -> RES: ~2 GB, VIRT: ~4 GB

Major & Minor page fault counts:
$ ps -o pid,min_flt,maj_flt,cmd ->

      PID     MINFL  MAJFL CMD
     14615 777788    235     openjdk/jdk-11/bin/java -XX:MaxRAM=3221225472 -classpath ...

I unset JAVA_OPTS, restarted Bisq (with an empty data.dir), and checked minor/major page fault counts after uptime ~2.75 hrs

$htop -> RES: ~4.7 GB, VIRT: ~10.4 GB

Major & Minor page fault counts:
$ ps -o pid,min_flt,maj_flt,cmd ->

     PID      MINFL  MAJFL CMD
    26736 1312906    14      openjdk/jdk-11/bin/java -classpath ...

The Bisq instance running with the default 128 GB MaxRAM setting accumulated almost double the number of minor page faults in less than half the time compared to the instance running with a 3 GB MaxRAM setting.
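
Normalizing those counts by uptime (and assuming faults accrue roughly linearly over a run): 1,312,906 / 2.75 h ≈ 477,000 minor faults per hour with the default 128 GB MaxRAM, versus 777,788 / ~7.5 h ≈ 104,000 per hour with MaxRAM=3 GB, i.e. very roughly a 4-5x higher fault rate.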


Attached below are two page fault flame graphs showing very large differences in the number of faults while tracing page_fault_user events for 120s, begun 30s after bisq was started with an empty data.dir.

bisq-page-faults-2g-maxram.svg 		shows 110,393 pages 
bisq-page-faults-128g-maxram.svg	shows 954,347 pages

I had to zip them up to upload them here because SVG files can't be attached to comments, and the flamegraph tool does not produce PNG files.

page-fault-flame-graphs.zip
page-fault-flame-graphs.zip.md5sum.txt

Open them in a browser, hover your mouse over the graph, and drill down to the levels representing "do_page_fault" event counts for libjvm.so and Java method code paths. To return to the original view, click "Reset Zoom" at the top left of the page.
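
For anyone who wants to produce similar graphs, one possible recipe (an assumption about tooling, not necessarily the exact setup used for the attached graphs) is perf plus Brendan Gregg's FlameGraph scripts:

    # record user-space page faults with stacks for 120 s, starting ~30 s after launching Bisq
    $ sleep 30
    $ sudo perf record -e exceptions:page_fault_user -g -p $(pgrep -f BisqAppMain) -- sleep 120
    $ sudo perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > bisq-page-faults.svg

Java frames resolve more cleanly when Bisq is started with -XX:+PreserveFramePointer (as mentioned above) and a tool such as perf-map-agent is used to export the JIT symbol map.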

@ghubstan
Member

ghubstan commented Mar 16, 2020

There is a glibc setting on Linux that can be used to define a hard limit on the number of arenas -- the thread-safe memory pools used by malloc() -- that can be created; by default, 64-bit glibc allows up to 8 arenas per core. The trade-off is more thread contention among Bisq's allocating threads in exchange for less memory usage.

export MALLOC_ARENA_MAX=4  

Using this on my Ubuntu 18 laptop reduces RES memory by ~300 MB, and VIRT by ~3.7 GB.

In addition, configuring the JVM's initial and maximum heap size reduces both RES and VIRT by another ~200 MB.

export JAVA_OPTS="-Xms820m -Xmx820m"

Used together

export MALLOC_ARENA_MAX=4		
export JAVA_OPTS="-Xms820m -Xmx820m"

RES memory is reduced by ~500 MB, and VIRT memory by ~3.9 GB.
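
To sanity-check whether an 820 MB heap leaves enough headroom, the running JVM can be asked directly (jcmd ships with the JDK); a small sketch:

    # print current heap occupancy and capacity of the running Bisq instance
    $ jcmd $(pgrep -f BisqAppMain) GC.heap_info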

I will test these on VMs next (Arch Linux, Debian, Fedora).

These settings, any one of which disables dynamic adjustment of glibc's mmap threshold, save another ~100 MB in RES and VIRT, but I haven't experimented enough to suggest anyone else try them yet.

export MALLOC_MMAP_THRESHOLD_=131072   # requests >= 128 KB are served via mmap() instead of the heap
export MALLOC_TRIM_THRESHOLD_=131072   # free memory at the top of the heap above 128 KB is returned to the OS
export MALLOC_TOP_PAD_=131072          # extra padding requested whenever the heap is extended
export MALLOC_MMAP_MAX_=65536          # maximum number of mmap()-based allocations

By the way, I am looking for leaks too...

@ghubstan
Member

On my Ubuntu 18-hosted VMs (Arch Linux kernel 5.5.8, Fedora 31, and Debian 10.3), the settings

export MALLOC_ARENA_MAX=4
export JAVA_OPTS="-Xms820m -Xmx820m"	

reduced resident memory use by ~400 MB, and virtual memory by ~1.9 GB.

I assume the overall reduced memory footprint on the VMs is because they don't have access to the GPU.

@ghubstan
Member

Tracing Linux glibc memory_arena_new calls via

sudo bpftrace -e 'usdt:/proc/<PID>/root/lib/x86_64-linux-gnu/libc-2.27.so:libc:memory_arena_new { printf("PID %d: created new arena at %p, size %d\n", pid, arg0, arg1); }'

shows Bisq malloc() calls creating 20 new arenas between startup (with an empty data.dir) and the time the BlockParser is done.

When I started Bisq after setting

export MALLOC_ARENA_MAX=4
export MALLOC_MMAP_THRESHOLD_=131072

the same trace shows Bisq does not create any new arenas during the same interval; it uses existing arenas.
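
The libc path in that probe spec is distro-specific (the one above matches Ubuntu 18.04's glibc 2.27); a quick sketch for confirming which glibc the Bisq process maps and whether it exposes the memory_* USDT probes:

    # find the libc mapped into the Bisq JVM, then list its malloc-related USDT probes
    $ grep -m1 'libc' /proc/$(pgrep -f BisqAppMain)/maps
    $ sudo bpftrace -l 'usdt:/lib/x86_64-linux-gnu/libc-2.27.so:libc:memory_*'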

@freimair
Member

freimair commented Apr 1, 2020

So, all in all, it seems that our memory issues on Linux are mainly caused by the JVM, its implementation, and its (default) configuration?

What do you suggest as an action plan? Get the malloc settings into our code? Do more tests?

@ghubstan
Member

ghubstan commented Apr 1, 2020

To sum up, I agree the biggest problem on Linux is off-heap memory growth. Your adding of the MaxRAM JVM option reduced resident memory by more than 50%.

The next experiment should be adding

export MALLOC_ARENA_MAX=4
export MALLOC_MMAP_THRESHOLD_=131072

to the bisq-desktop and bisq-daemon startup scripts; this requires no changes to the binaries.
If testers don't see any performance drag after using these settings for X days, they might try to save another ~100 MB of RAM by setting the JVM's initial and max heap size:

-Xms820m  -Xmx820m

820 MB should be sufficient, and I think Linux Bisq users would be happy to see a ~1.0 GB resident memory size after starting Bisq with an empty data dir. (When we started investigating, Bisq's resident memory size was growing to ~4.8 GB. So the problem is not solved, but this is a significant improvement.)
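
Putting the whole suggestion together, a (hypothetical) wrapper sketch for such a startup script could look roughly like this; the launcher path is an assumption and will differ per install:

    #!/usr/bin/env bash
    # hypothetical Bisq launcher wrapper: cap glibc arenas and the mmap threshold,
    # and pin the JVM heap, before starting the desktop app
    export MALLOC_ARENA_MAX=4
    export MALLOC_MMAP_THRESHOLD_=131072
    export JAVA_OPTS="-Xms820m -Xmx820m"
    exec /opt/bisq/bin/Bisq "$@"   # adjust the path to your Bisq install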

Regarding the Java heap: I'm certain JavaFX 11 has leaks, and their dev team has been fixing them in a version we cannot use yet, so we can't do much about that for now. There are probably other on-heap leaks, but I have not been able to identify them with certainty.

cd2357 added a commit to cd2357/bisq that referenced this issue May 7, 2020
Update the gradle dependency to JavaFX 14.

This brings to Bisq the latest JavaFX fixes and improvements, especially
 in the areas of UI performance, memory management and security.

JavaFX can be upgraded independently of the JDK used to build the
application, so this change is modular and does not affect other parts
of the build process.

Related / likely related to: bisq-network#350 bisq-network#2135 bisq-network#2509 bisq-network#3128 bisq-network#3307 bisq-network#3308 bisq-network#3343
bisq-network#3430 bisq-network#3657 bisq-network#3677 bisq-network#3683 bisq-network#3686 bisq-network#3786 bisq-network#3787 bisq-network#3892 bisq-network#3917 bisq-network#3918 bisq-network#3936
@chimp1984
Contributor

Thanks a lot @ghubstan for your investigations!!!!

@ripcurlx Should we add those options to the build scripts?

cd2357 added a commit to cd2357/bisq that referenced this issue Sep 16, 2020
Update the gradle dependency to JavaFX 14.

@cd2357
Contributor

cd2357 commented May 9, 2021

This should be fixed in the most recent release (v1.6.4), which brings several UI performance improvements and generally reduces system resource consumption (especially RAM). Please try it out and let us know if it's still an issue for you.

@cd2357 cd2357 closed this as completed May 9, 2021