Polymetis server fails after two hours #1379
Comments
In the libfranka error message, A common workaround is to use
Hi, thanks for responding! I actually found a way around this, and it was hidden within polymetis the whole time. If anyone else runs into this issue: instead of the launch_robot.py command above, run the script at ~/fairo/polymetis/polymetis/python/scripts/persis_server.sh. It restarts the server whenever it gets killed. With it, I was able to keep the Franka running for 14+ hours without issues!
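For readers who want the same restart-on-exit behavior from Python, here is a minimal sketch of a supervisor loop. It is not the contents of persis_server.sh, which is not shown in this thread; the function name `run_persistently` and its parameters are hypothetical.

```python
import subprocess
import sys
import time


def run_persistently(cmd, max_restarts=None, delay_s=1.0):
    """Run `cmd` and relaunch it every time it exits.

    Returns the number of launches. With max_restarts=None the loop
    only stops on Ctrl-C.
    """
    launches = 0
    try:
        while True:
            launches += 1
            # Blocks until the child process exits, cleanly or not.
            result = subprocess.run(cmd)
            print(f"command exited with code {result.returncode}",
                  file=sys.stderr)
            if max_restarts is not None and launches > max_restarts:
                break
            # Short pause so a crash loop does not spin the CPU.
            time.sleep(delay_s)
    except KeyboardInterrupt:
        pass
    return launches
```

To supervise the server from this issue, one could pass the launch command as a list, for example run_persistently(["launch_robot.py", "robot_client=franka_hardware", "robot_client.executable_cfg.robot_ip=172.16.0.2"]).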
Hello, sorry to re-open this issue, but it turns out I was wrong. I switched to a new task where the robot must pick up a cube through reinforcement learning (before, we were doing goal reaching, so there was no need for contact), and the error came up again. For more context, the error gets thrown in the following function in my library. What's strange is that we check whether the controller is running and, if it is not, start impedance control, but I'm guessing this implementation is too naive. Do you have any thoughts on how to make this more robust, so that it never tries to update the joint command without the Cartesian controller running?
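A check-then-update pattern can still fail if the controller stops between the check and the update. One possible way to harden it is to treat a failed update as a signal to restart the controller and retry. The sketch below is not the function from my library. It assumes a robot object exposing `is_running_policy()`, `start_cartesian_impedance()` and `update_desired_ee_pose(position, orientation)`, which I believe match the Polymetis `RobotInterface` but should be verified. The helper name `safe_update` is hypothetical, and catching a bare `Exception` is a simplification.

```python
def safe_update(robot, position, orientation, max_attempts=2):
    """Send a pose update, restarting the controller if it is not running.

    Returns True if an update went through, False if every attempt failed.
    """
    for attempt in range(max_attempts):
        # (Re)start the controller if the server reports none running.
        if not robot.is_running_policy():
            robot.start_cartesian_impedance()
        try:
            robot.update_desired_ee_pose(position=position,
                                         orientation=orientation)
            return True
        except Exception as err:  # assumption: narrow this to the real error type
            # The controller may have died after the check; loop and retry.
            print(f"update failed on attempt {attempt + 1}: {err}")
    return False
```

A False return tells the caller that the step was not applied, so the training loop can end the episode instead of crashing.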
To clarify, the issue you were running into happens as follows:
Theoretically, you should no longer be updating without a controller running, since you now always check whether a controller is running. Can you confirm whether you are still seeing the error message from step 3? The source of the issue, which I explained in the previous comment, cannot be removed programmatically, since it comes from the limits of the hardware itself. You can try making your policy less aggressive to prevent the issue, but I would advise wrapping another loop around your script that resets the robot, and perhaps throws away the data collected in the current episode, as a recovery mechanism so that your experiment keeps running.
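The outer recovery loop suggested here could look roughly like the sketch below. The names `collect_episodes`, `run_episode` and `reset_robot` are hypothetical callbacks supplied by the training script, not Polymetis APIs. Any exception raised during an episode is treated as a failed episode, which is a simplification.

```python
def collect_episodes(run_episode, reset_robot, num_episodes, max_failures=10):
    """Collect `num_episodes` successful episodes, discarding failed ones.

    run_episode() returns the data for one episode or raises on failure.
    reset_robot() brings the robot back to a known state.
    """
    kept = []
    failures = 0
    while len(kept) < num_episodes:
        try:
            # Data is appended only if the whole episode finishes.
            kept.append(run_episode())
        except Exception as err:  # assumption: narrow to the real error type
            failures += 1
            if failures > max_failures:
                raise RuntimeError("too many failed episodes") from err
            # Partial data from the failed episode is dropped here.
            reset_robot()
    return kept, failures
```

Capping the number of failures keeps a persistent hardware fault from turning into an endless reset loop.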
So I checked the server, and I did get the error message from step 3. I reran a training loop while printing
For anyone curious, my workaround was to add a try-except clause that catches this failure mode and then restarts the server. Now I can run the script for many hours without it crashing due to the error above (although it's not ideal).
Description
Hi all,
I'm currently running polymetis with a Franka robot and the Robotiq gripper. Everything works as intended, except when we run polymetis for longer periods of time (~2 hrs). We'd like to be able to run it continuously for many hours or even days, as we are testing reinforcement learning algorithms.
The issue seems to occur within the server / robot client, which is launched with the following command:
launch_robot.py robot_client=franka_hardware robot_client.executable_cfg.robot_ip=172.16.0.2
Everything works as intended for about two hours, but eventually the server crashes.
Current Behavior
When my training script crashes, I see the following output from the polymetis server
Expected Behavior
I'm not entirely sure what to expect, but when the experiment is running well, the server is able to start the controller and switch to the default controller. For example, all I had to do was restart my training script, and the server (with the same log as above) was able to continue working as follows:
Steps to reproduce
Not sure how to reproduce this other than maybe running a random policy for ~2hrs?