rosidl type support packages not found #819

Closed
alberthli opened this issue Jul 18, 2024 · 3 comments

@alberthli

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04 (Dockerized)
  • Installation type:
    • humble via pixi (robostack)
  • DDS implementation:
    • both Fast-RTPS and Cyclone DDS
  • Client library (if applicable):
    • both rclpy and rclcpp

This issue is cross-posted from here: prefix-dev/pixi#1635. However, since pixi is such a specific tool, I was hoping to get any additional possible insight here as well. If it's better for organization, I'm happy to close this issue and consolidate the discussion to the other thread.

The full text of the original issue is pasted here for reference.

Original Issue

Reproducible example

Tl;dr: Our issue arises when trying to boot up a ROS stack in a pixi shell. When using pixi, rosidl type support packages are not found. However, when we manually install ROS pre-compiled binaries in a Docker container without using pixi, the packages are found and our stack boots up correctly.

We have not found an easily-reproducible minimal example. The issue can be reproduced in the following debugging branch of our project repo with the commit hash 46f503cf2ba7068b2b576e5540eb3ab17cea7f10: https://github.com/Caltech-AMBER/obelisk/tree/debug-docker-rosidl.

Assuming you already have required system deps and Docker, run our setup script with the following flags in the root of the cloned repo on the debug-docker-rosidl branch:

source dev_setup.sh --skip-docker --no-cyclone-perf

All this does is set the environment variable OBELISK_ROOT to the directory where this repository has been cloned and create a .env file containing other environment variables useful for dockerizing our dev setup. We run Docker for isolation and run pixi inside of it. To build and enter the container, run

cd docker
docker compose -f docker-compose-no-gpu.yml run --build obelisk

Then, to enter the dev environment, run

pixi shell -e dev

We first build our custom messages with a pixi task, then build the rest of the ROS 2 packages:

pixi run messages-build
pixi run source-obelisk

Then, we boot up our stack using an alias installed in the Docker container:

obk-launch config_file_path=dummy_cpp.yaml device_name=onboard

This will spawn a simulation, but we see a runtime error in the terminal:

[obelisk_mujoco_robot-5] [rcutils|error_handling.c:65] an error string (message, file name, or formatted message) will be truncated
[obelisk_mujoco_robot-5] 
[obelisk_mujoco_robot-5] >>> [rcutils|error_handling.c:108] rcutils_set_error_state()
[obelisk_mujoco_robot-5] This error state is being overwritten:
[obelisk_mujoco_robot-5] 
[obelisk_mujoco_robot-5]   'Type support not from this implementation. Got:
[obelisk_mujoco_robot-5]     Handle's typesupport identifier (rosidl_typesupport_cpp) is not supported by this library, at /opt/conda/build_artifacts/ros-humble-rosidl-typesupport-cpp-0_1675687196536/work/ros-humble-rosidl-typesupport-cpp/src/work/src/type_support_dispatch.hpp:111
[obelisk_mujoco_robot-5]     Could not load library libobelisk_sensor_msgs__rosidl_typesupport_introspection_cpp.so: dlopen error: libstd_msgs__rosidl_typesupport_introspection_cpp.so: cannot open shared object file: No such file or directory, at /opt/conda/build_artifacts/ros-humble-rcutils-0_1675685175987/work/ros-humble-rcutils/src/work/src/shared_library.c:99, at /opt/conda/build_artifacts/ros-humble-rosidl-typesupport-cpp-0_1675687196536/work/ros-humble-rosidl-typesupport-cpp/s, at /opt/conda/build_artifacts/ros-humble-rmw-cyclonedds-cpp-0_1675688132292/work/ros-humble-rmw-cyclonedds-cpp/src/work/src/rmw_node.cpp:1958'
[obelisk_mujoco_robot-5] 
[obelisk_mujoco_robot-5] with this new error message:
[obelisk_mujoco_robot-5] 
[obelisk_mujoco_robot-5]   'type_support is null, at /opt/conda/build_artifacts/ros-humble-rmw-cyclonedds-cpp-0_1675688132292/work/ros-humble-rmw-cyclonedds-cpp/src/work/src/rmw_node.cpp:2277'
[obelisk_mujoco_robot-5] 
[obelisk_mujoco_robot-5] rcutils_reset_error() should be called after error handling to avoid this.
[obelisk_mujoco_robot-5] <<<
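
The dlopen line in the log above names two different libraries: the one rcutils was asked to load and the one the dynamic loader could not open. A minimal Python sketch for pulling the two apart follows. The helper name parse_dlopen_error is hypothetical, and the pattern is written only against the message format shown in this log.

```python
import re

# Matches the rcutils message format seen in the log above:
# "Could not load library <requested>: dlopen error: <missing>: cannot open shared object file"
DLOPEN_RE = re.compile(
    r"Could not load library (?P<requested>\S+): "
    r"dlopen error: (?P<missing>\S+): cannot open shared object file"
)


def parse_dlopen_error(line):
    """Return (requested, missing) library names, or None if the line does not match."""
    match = DLOPEN_RE.search(line)
    if match is None:
        return None
    return match.group("requested"), match.group("missing")


# Excerpt copied from the Cyclone DDS log in this report.
line = (
    "Could not load library libobelisk_sensor_msgs__rosidl_typesupport_introspection_cpp.so: "
    "dlopen error: libstd_msgs__rosidl_typesupport_introspection_cpp.so: "
    "cannot open shared object file: No such file or directory"
)
requested, missing = parse_dlopen_error(line)
print("requested:", requested)
print("missing:", missing)
```

In this log the requested library belongs to obelisk_sensor_msgs, while the file reported as missing is the std_msgs typesupport library that it depends on.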

Issue description

Here, we summarize a few things we tested. The tests are independent of one another.

Test 1: Using only Docker without pixi

If we instead uncomment the DEBUG section of the Dockerfile and do not install pixi:

# syntax=docker/dockerfile:1

# base image
FROM ubuntu:22.04 as base
SHELL ["/bin/bash", "-c"]

# username, uid, gid
ARG USER=user
ARG UID=1000
ARG GID=1000
ARG OBELISK_ROOT=/
ENV USER=$USER
ENV UID=$UID
ENV GID=$GID
ENV OBELISK_ROOT=$OBELISK_ROOT

# set timezone
ENV TZ=America/Los_Angeles
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone

# basic dependencies from docker_setup.sh (up until sudo and below)
RUN apt-get update && apt-get install -y \
    curl \
    build-essential \
    cmake \
    clang-tools-12 \
    nano \
    vim \
    git \
    python3-dev \
    python-is-python3 \
    python3-pip \
    python3-argcomplete \
    mesa-utils \
    x11-apps \
    libyaml-dev \
    mesa-common-dev \
    libglfw3-dev \
    locales \
    sudo && \
    rm -rf /var/lib/apt/lists/* && \
    locale-gen en_US.UTF-8

# create non-root user with sudo privileges for certain commands
RUN groupadd --gid $GID $USER && \
    useradd --uid $UID --gid $GID -m $USER -d /home/${USER} --shell /usr/bin/bash && \
    echo "${USER}:password" | chpasswd && \
    usermod -aG sudo ${USER} && \
    echo "%${USER} ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

# add user to dialout group to enable serial port access from the container
RUN sudo usermod -a -G dialout ${USER}

# DEBUG
##############################################
RUN curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key \
        -o /usr/share/keyrings/ros-archive-keyring.gpg && \
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] \
        http://packages.ros.org/ros2/ubuntu $(. /etc/os-release && echo $UBUNTU_CODENAME) main" | \
        tee /etc/apt/sources.list.d/ros2.list > /dev/null && \
    apt-get update -y
RUN apt-get install -y \
    ros-humble-ros-base \
    ros-dev-tools \
    ros-humble-rosidl-generator-cpp \
    ros-humble-rosidl-default-generators
RUN apt-get install -y \
    ros-humble-rmw-cyclonedds-cpp
RUN echo "source /opt/ros/humble/setup.bash" >> ~/.bashrc
ENV RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
##############################################

# switch to new user and workdir
USER ${UID}

# run docker setup script in Dockerfile
# COPY docker_setup.sh /tmp/docker_setup.sh
# RUN source /tmp/docker_setup.sh --skip-docker && \
#     sudo rm /tmp/docker_setup.sh

# add local user binary folder to PATH variable
ENV PATH="${PATH}:/home/${USER}/.local/bin"
ENV XDG_RUNTIME_DIR=/run/user/${UID}
WORKDIR /home/${USER}

We can successfully boot up the stack with no errors by doing the following.

# rebuild and re-enter the container
cd docker
docker compose -f docker-compose-no-gpu.yml run --build obelisk

Then, in the Docker container, delete the build and install directories under obelisk_ws and run the following commands:

source repro.sh
ros2 launch obelisk_ros obelisk_bringup.launch.py config_file_path:=dummy_cpp.yaml device_name:=onboard auto_start:=true bag:=false

This should produce the simulation of the dummy robot swinging back and forth with no error messages in the terminal.

Test 2: Not Using Cyclone DDS

We have tried not using Cyclone for our RMW by commenting out the following line in the pixi.toml:

# env = { RMW_IMPLEMENTATION="rmw_cyclonedds_cpp" }

We confirm that the rmw implementation is the default by running

echo $RMW_IMPLEMENTATION

and

ros2 doctor --report

which shows that it is rmw_fastrtps_cpp. When running the same commands as described in the section on reproducing the issue, we get a slightly different error:

[jointencoders_passthrough_estimator-4] [rcutils|error_handling.c:65] an error string (message, file name, or formatted message) will be truncated
[jointencoders_passthrough_estimator-4] [rcutils|error_handling.c:65] an error string (message, file name, or formatted message) will be truncated
[jointencoders_passthrough_estimator-4] 
[jointencoders_passthrough_estimator-4] >>> [rcutils|error_handling.c:108] rcutils_set_error_state()
[jointencoders_passthrough_estimator-4] This error state is being overwritten:
[jointencoders_passthrough_estimator-4] 
[jointencoders_passthrough_estimator-4]   'Type support not from this implementation. Got:
[jointencoders_passthrough_estimator-4]     Handle's typesupport identifier (rosidl_typesupport_cpp) is not supported by this library, at /opt/conda/build_artifacts/ros-humble-rosidl-typesupport-cpp-0_1675687196536/work/ros-humble-rosidl-typesupport-cpp/src/work/src/type_support_dispatch.hpp:111
[jointencoders_passthrough_estimator-4]     Could not load library libobelisk_sensor_msgs__rosidl_typesupport_fastrtps_cpp.so: dlopen error: libstd_msgs__rosidl_typesupport_fastrtps_cpp.so: cannot open shared object file: No such file or directory, at /opt/conda/build_artifacts/ros-humble-rcutils-0_1675685175987/work/ros-humble-rcutils/src/work/src/shared_library.c:99, at /opt/conda/build_artifacts/ros-humble-rosidl-typesupport-cpp-0_1675687196536/work/ros-humble-rosidl-typesupport-cpp/src/work/sr, at /opt/conda/build_artifacts/ros-humble-rcl-0_1675689729790/work/ros-humble-rcl/src/work/src/rcl/subscription.c:108'
[jointencoders_passthrough_estimator-4] 
[jointencoders_passthrough_estimator-4] with this new error message:
[jointencoders_passthrough_estimator-4] 
[jointencoders_passthrough_estimator-4]   'invalid allocator, at /opt/conda/build_artifacts/ros-humble-rcl-0_1675689729790/work/ros-humble-rcl/src/work/src/rcl/subscription.c:218'
[jointencoders_passthrough_estimator-4] 
[jointencoders_passthrough_estimator-4] rcutils_reset_error() should be called after error handling to avoid this.
[jointencoders_passthrough_estimator-4] <<<
[jointencoders_passthrough_estimator-4] invalid allocator, at /opt/conda/build_artifacts/ros-humble-rcl-0_1675689729790/work/ros-humble-rcl/src/work/src/rcl/subscription.c:218
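
Comparing this log with the Cyclone DDS one, the two RMW implementations request different typesupport libraries for the same message package. A small sketch of that naming pattern is given here. The mapping is read off the two logs in this report only, and the function name typesupport_library is hypothetical.

```python
# Typesupport library suffix that each RMW asked for in the two logs of this report.
# Assumption: this mapping covers only these two RMW implementations.
TYPESUPPORT_BY_RMW = {
    "rmw_cyclonedds_cpp": "rosidl_typesupport_introspection_cpp",
    "rmw_fastrtps_cpp": "rosidl_typesupport_fastrtps_cpp",
}


def typesupport_library(package, rmw):
    """Build the shared-library file name that the given RMW looked for in the logs."""
    return f"lib{package}__{TYPESUPPORT_BY_RMW[rmw]}.so"


for rmw in TYPESUPPORT_BY_RMW:
    # The dependency the loader could not open in each run was the std_msgs one.
    print(rmw, "->", typesupport_library("std_msgs", rmw))
```

Under either RMW, the file the loader fails to open is the std_msgs variant of the library, which is consistent with the failure not being specific to Cyclone DDS.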

Test 3: Various Changes to CMakeLists.txt Files

We have determined that the error is introduced by a custom ROS message we define, called MujocoImage.msg under the obelisk_sensor_msgs package located in the obelisk_ws directory. When we comment it out from the CMakeLists.txt and rebuild/rerun, the stack runs correctly without error:

rosidl_generate_interfaces(${PROJECT_NAME}
  "msg/JointEncoders.msg"
  "msg/TrueSimState.msg"
  # "msg/MujocoImage.msg"
  DEPENDENCIES
  std_msgs
  obelisk_std_msgs
)

We have tried numerous other changes noted in related issues online, such as specifying the LIBRARY_NAME (see: #441):

rosidl_generate_interfaces(${PROJECT_NAME}
  "msg/JointEncoders.msg"
  "msg/TrueSimState.msg"
  "msg/MujocoImage.msg"
  DEPENDENCIES
  std_msgs
  obelisk_std_msgs
  LIBRARY_NAME
  ${PROJECT_NAME}
)

This failed.

We have tried downgrading to Python 3.10.12 in the pixi.toml, as suggested in ros2/examples#303, without success.

We have tried installing Cyclone DDS locally (in the Docker container) while also running pixi, in case ros2/rmw_fastrtps#541 was relevant, without success.

Other Observations

We noticed that it's looking for rosidl_typesupport in /opt/conda/..., which we don't have in the container:

Handle's typesupport identifier (rosidl_typesupport_cpp) is not supported by this library, at /opt/conda/build_artifacts/ros-humble-rosidl-typesupport-cpp-0_1675687196536/work/ros-humble-rosidl-typesupport-cpp/src/work/src/type_support_dispatch.hpp:111

It also tries to look there for a .so file:

Could not load library libobelisk_sensor_msgs__rosidl_typesupport_fastrtps_cpp.so: dlopen error: libstd_msgs__rosidl_typesupport_fastrtps_cpp.so: cannot open shared object file: No such file or directory, at /opt/conda/build_artifacts/ros-humble-rcutils-0_1675685175987/work/ros-humble-rcutils/src/work/src/shared_library.c:99, at /opt/conda/build_artifacts/ros-humble-rosidl-typesupport-cpp-0_1675687196536/work/ros-humble-rosidl-typesupport-cpp/src/work/sr, at /opt/conda/build_artifacts/ros-humble-rcl-0_1675689729790/work/ros-humble-rcl/src/work/src/rcl/subscription.c:108'

However, libobelisk_sensor_msgs__rosidl_typesupport_fastrtps_cpp.so itself is present in the directory $OBELISK_ROOT/obelisk_ws/install/obelisk_sensor_msgs/lib:

# this shows a bunch of .so files
ls $OBELISK_ROOT/obelisk_ws/install/obelisk_sensor_msgs/lib

# outputs the below
libobelisk_sensor_msgs__rosidl_generator_c.so    libobelisk_sensor_msgs__rosidl_typesupport_cpp.so           libobelisk_sensor_msgs__rosidl_typesupport_introspection_c.so
libobelisk_sensor_msgs__rosidl_generator_py.so   libobelisk_sensor_msgs__rosidl_typesupport_fastrtps_c.so    libobelisk_sensor_msgs__rosidl_typesupport_introspection_cpp.so
libobelisk_sensor_msgs__rosidl_typesupport_c.so  libobelisk_sensor_msgs__rosidl_typesupport_fastrtps_cpp.so  python3.11

This suggests that we may be able to configure something slightly different such that these files are found.
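
One way to check where a given library is visible is to walk the directories on LD_LIBRARY_PATH and test for the file. The sketch below does that. The helper name dirs_containing is hypothetical, and LD_LIBRARY_PATH is only one of the places the dynamic loader searches (rpath entries and the ld.so cache are not covered), so an empty result is a hint rather than proof.

```python
import os
from pathlib import Path


def dirs_containing(library, search_path):
    """Return the directories in an os.pathsep-separated search path that contain `library`."""
    hits = []
    for entry in search_path.split(os.pathsep):
        # Skip empty entries, which appear when the variable is unset or has stray separators.
        if entry and (Path(entry) / library).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # The dependency reported as missing in the Fast-RTPS log above.
    missing = "libstd_msgs__rosidl_typesupport_fastrtps_cpp.so"
    print(dirs_containing(missing, os.environ.get("LD_LIBRARY_PATH", "")))
```

Running this inside the pixi shell and again in the plain Docker setup might show whether the std_msgs library is reachable in one environment and not the other.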

Expected behavior

Our ROS stack should boot up without issue.

@traversaro
Contributor

traversaro commented Jul 19, 2024

To clarify, I also tried to use the instructions you provided to run the installation with Docker, but executing:

source dev_setup.sh --skip-docker --no-cyclone-perf

tries to install some tools (uv) globally, so I just halted it to avoid the script messing with my environment.

It also tried to clutter my .bashrc. It would be ideal if you avoided doing that when presenting a way to reproduce your problem.

@alberthli
Author

Closing this issue in favor of activity on the other thread.

@traversaro
Contributor

My bad, that was an answer for the other thread, not sure how it ended up here.
