This repository has been archived by the owner on Jul 21, 2024. It is now read-only.

2024 Beta RIO 1 Out-Of-Memory's after some deploys #39

Open
CoryNessCTR opened this issue Oct 19, 2023 · 14 comments

Comments

@CoryNessCTR
Contributor

Describe the bug
After a couple of Java project deploys on a roboRIO 1, the DS will report an out-of-memory error.

To Reproduce
Steps to reproduce the behavior:

  1. Format/power cycle roboRIO 1
  2. Create a new Timed Robot Skeleton Java Project
  3. Construct a Talon object with PWM channel 0
  4. Deploy the project to the roboRIO 1 (a terminal equivalent is sketched after the error output)
  5. Increment channel
  6. Repeat steps 4-5
  7. Eventually (in fewer than 10 repeats), get the following error:
OpenJDK Client VM warning: INFO: os::commit_memory(0xb0000000, 4194304, 0) failed; error='Not enough space' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 4194304 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /tmp/hs_err_pid7540.log
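
For reference, the deploy in step 4 can also be run from a terminal in the project folder. This is a minimal sketch assuming a standard GradleRIO project generated by the WPILib VS Code extension; on Windows, use gradlew.bat in place of ./gradlew.

# Build and deploy the robot program to the connected roboRIO
./gradlew deploy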

Expected behavior
The out-of-memory error does not occur.

Desktop (please complete the following information):

  • OS: Windows 10
  • Project Information:
WPILib Information:
Project Version: 2024.1.1-beta-1
VS Code Version: 1.83.1
WPILib Extension Version: 2024.1.1-beta-1
C++ Extension Version: 1.17.5
Java Extension Version: 1.23.0
Java Debug Extension Version: 0.52.0
Java Dependencies Extension Version: 0.23.0
Java Version: 17
Java Location: C:\Users\Public\wpilib\2024\jdk
Vendor Libraries:
   WPILib-New-Commands (1.0.0)

Additional context
I collected memory information before and after each deploy, available as a zip below:
Deploy 0 was collected immediately after power cycling the roboRIO; Deploy 5 was collected after the out-of-memory error occurred.
MemoryIssues.zip
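
For anyone who wants to collect similar data, here is a rough sketch of one way to grab a memory snapshot over SSH after each deploy. It assumes the roboRIO is reachable at roboRIO-TEAM-frc.local (substitute your team number) and uses only /proc/meminfo and ps, which are available on the RIO's Linux image; the files in the zip may have been collected differently.

# Save overall memory usage and the process list to a file after a deploy
ssh [email protected] 'cat /proc/meminfo; ps' > deploy1_memory.txt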

I've also attached the log file of the out of memory error:
hs_err_pid7540.log

I've also repeated this experiment on the 2023_v3.2 image for comparison and stopped testing after 30 consecutive deploys without issue. This appears to be a new or worsened issue in the 2024 libraries.

@rzblue transferred this issue from wpilibsuite/allwpilib on Dec 4, 2023
@EyalKeysar

EyalKeysar commented Dec 7, 2023

We have also encountered this issue, and we found a way to solve it temporarily until NI releases an update.
I want to clarify that this solution is not official.

From what we understand, the issue is caused by multiple leftover processes that each take a lot of memory.
To see which processes are currently running on the roboRIO, first connect to it over SSH (https://docs.wpilib.org/en/stable/docs/software/roborio-info/roborio-ssh.html).
Once connected, you can view the running processes with the "top" command (https://man7.org/linux/man-pages/man1/top.1.html).
You will see that a few processes take more memory than the others. These are the processes started by each deploy; ideally they should stop running when you deploy again, but they don't, which is why you get this error after a few deploys.
Our workaround is to kill these leftover processes whenever the error appears.
To find the specific processes to kill, we filter top's output with "grep":
top | grep "JRE"
This prints every process whose "top" entry contains "JRE". Note the PIDs (process IDs) of those processes.
Then kill each one with the "kill" command (https://man7.org/linux/man-pages/man1/kill.1.html); for example, if the PID is 2230:
kill -9 2230
Run this command for every PID reported by the filtered top command (top | grep "JRE").
This should solve the problem.
In this example the PID is 4962:
[screenshot of the filtered top output showing the JRE process]
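
Putting the steps above together, here is a minimal sketch of the whole workaround as run from an SSH session on the roboRIO. The -b (batch) and -n 1 flags are assumed to be supported by the RIO's top so its output can be piped cleanly, and the PID is the example value from the screenshot above; substitute the PIDs your own filtered listing reports.

# List leftover JRE processes and note their PIDs
top -b -n 1 | grep "JRE"
# Force-kill one leftover process by its PID (repeat for each PID found)
kill -9 4962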

@calcmogul
Member

calcmogul commented Dec 7, 2023

You could use this instead to force-kill all processes with JRE in their name:

pgrep JRE | xargs kill -9

The following may work for remote kill, but I haven't tested whether ssh allows embedding pipes like that.

ssh [email protected] 'pgrep JRE | xargs kill -9'

@EyalKeysar

When connected to the roboRIO over USB, the IP address to SSH to is 172.22.11.2.
(https://docs.wpilib.org/he/stable/docs/software/roborio-info/roborio-ssh.html)
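
Combining this with the pgrep one-liner above, an untested sketch of the same remote kill over the USB connection would be:

# Force-kill all leftover JRE processes over the USB link (untested)
ssh [email protected] 'pgrep JRE | xargs kill -9'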

@aaronleetw

We are also getting this issue. I'll test the remote kill ssh command.

@Crossle86

The problem in #40 is likely related, but the symptoms are not quite the same. Some observations:
Killing the JRE process just causes another one to be started in its place. From what I can see, that restart sometimes helps and sometimes doesn't.

The code download always seems to succeed; the problem appears to be in the startup of the code. Sometimes it starts fine, but most other times it starts with garbage in the riolog, an incomplete riolog, or an apparently good startup that then logs lots of errors from CAN devices. Powering off and back on works.

@aaronleetw

I'm not sure if it is related, but the "Restart Robot Code" option also does not work regardless of its state.

@sciencewhiz
Contributor

Does this still occur with the WPILib beta 4?

@Crossle86

Have not had a chance to test B4 yet. Not sure when I can do it now that xmas is here. Will try sometime next week.

@stephenjust

I'm still reproducing this on the Kickoff release

@aaronleetw

I am still reproducing this issue, albeit much less often, on the kickoff release. After three days of testing, it failed once.

@Crossle86

Our team has not seen any problems with deployment since kickoff release.

@JaiCode08

JaiCode08 commented Feb 22, 2024

Hello. This issue is still occurring for me. I'm not doing any heavy logging or heavy computation on the roboRIO. The memory leaks happened occasionally in WPILib 2024.2.1 but have gotten worse with 2024.3.1. The roboRIO is on the latest firmware.

@nkalupahana

We're also having this issue whenever we add any sort of logging to our code: https://github.com/FRC-7525/2024-Robot

@Crossle86

An update on this for our team: we stopped having the fail-to-deploy issue and things seemed normal until we started loading autos created with PathPlanner. With only a couple of autos we started getting out-of-memory errors, to the point that we bit the bullet and took a RIO 2 out of last year's robot, which solved the out-of-memory issue. I was being cheap trying to use a RIO 1 for this year's robot.
