-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potential endianness issue with big-endian client, little endian Agent #197
Comments
Friendly ping? |
Friendly ping. @pablogs9 @Acuadros95 would you perhaps have any suggestions on a possible direction for debugging? |
Hello @gavanderhoorn, we should fit a time slot to take a look at that, but currently, we are quite busy. I'm going to add this to our internal roadmap we will take a look at some point. Probably this error is an issue at Micro XRCE-DDS Client level not serializing/copying some header/payload properly when BE is configured, let us know if you find something. If not, we will check it later. BTW if this issue is critical for your use case, feel free to ask for commercial support at [email protected] or just ask me and we can schedule a meeting with our sales team. |
Ok, I'll take that into consideration. But you can confirm that setting |
Yes, it should be enought |
@gavanderhoorn We found some issues on the Agent deserialization process with big-endian clients. Please try out this PR with the fixes eProsima/Micro-XRCE-DDS-Agent#336, will merge when your feedback is OK. |
@Acuadros95: absolutely fantastic. I had not yet found time to look into this myself -- and I also didn't expect you guys to be so quick to work on this. Purely for effort already 💯 👍 I'll test these changes and update you here -- or would you prefer I comment on the PR? |
No problem!
Comment on the PR and we will close the issue when merged. |
Closing as eProsima/Micro-XRCE-DDS-Agent#336 is merged. Reopen if other issues are found |
@Acuadros95 @pablogs9: with more time to test, I'm getting to the point of actually sending data between the Client and the Agent and I believe I'm running into another endianness issue. For context: same systems as described earlier in this issue (ie: BE Client, LE Agent, all Fast-DDS on the ROS 2 side), Humble on both sides. To rule out issues with our use of the micro-ROS API, I've ported the Registration of Client to Agent works immediately. The Click to expand
(this is when the publisher publishes the value 11 -- hence the On the ROS 2 side, Click to expand
Initially I thought this was garbage data (ie: unitialised memory, misaligned reads, etc), but actually: Click to expand
Unfortunately I didn't capture the DDS side of the traffic (ie: between the Agent and the Technically I believe this should be OK (the values do represent little-endian integers in some way). Not sure what's making Is this still a problem in our/the application code, or should payload data have been swapped automatically? I can provide a Wireshark capture, but it would basically show what I explained above. The application code is Edit: I've searched for mentions of Edit 2: could it be the endianess information is lost when the Agent forwards the DDS-XRCE message? If ROS 2 'sees' a LE flag while the payload is BE, that would result in what |
Just captured the ROS 2 (Fast-)RTPS traffic for the same test (repeated it). I can't be sure, but I believe something is not entirely correct wrt the endianness encapsulation in the messages sent by the Agent. From the capture, an RTPS packet containing a DDS-XRCE payload forwarded by the Agent:
Notice how Edit: looks like services have a similar problem. Payloads coming from an LE client are not swapped. The |
I've spent some time trying to find where the encapsulation of specific (sub)messages is set, but I must admit that without some (architecture) documentation, there are a few too many 'moving parts' to quickly & easily determine what's going on exactly. |
Hello @gavanderhoorn, I'm back from holidays and I have this as TODO for today, let me check and I will report here probably today. |
Hello @gavanderhoorn I have drafted two approaches to solve this issue. The main point is that the Micro XRCE-DDS "Generic" Typesupport always writes the Representation Header (inside the RTPS Data Payload) as Little Endian: This line makes your Wireshark packet to decode Representation Header as:
Note that this "Generic" Typesupport is a Typesupport that serializes a byte array in order to allow the Micro XRCE-DDS Agent type agnostic. Using it, the Agent just forwards the Client's messages to the DDS Dataspace, with no ser/des procedure. So we have two options here, they are independent. Option 1: Make micro-ROS RMW to use always Little EndianBranch: This branch modifies the micro-ROS RMW to prepare the writing buffers and then ensure that the buffer endianness is LE. Note that after this endianness enforcement, the Note that:
Option 2: Assume the payload to have the same endianness as the XRCE protocolBranch:
As you know, the Micro XRCE-DDS Client library can be configured to use This does not mean that the payload (that is completely up to the application user) is not in BE nor in LE. But as I have mentioned in the beginning, the Micro XRCE-DDS Agent assumes LE. This approach, adds a configuration parameter in the XRCE Session Creation procedure named In this draft approach, I assume that if the library is configured to use BE wire protocol (by means of So, when the Client sends the session's property Note that:
In general, those are the approaches I would take, what is your opinion? Let me know your thoughts and your test results, so we can choose an option and go ahead. |
Thanks @pablogs9. You've confirmed my analysis, and also the two options I saw. If I understand you correctly, approach two would basically come down to:
action 4 would inform receiving entities data is in BE format, so they would use an appropriate deserialisation strategy to get data into the endianness they'd prefer/require (action 5). Correct? If yes, then my initial reaction would be that I'd find this approach 'cleaner'. It would also seem to put most of the work (ie: swapping) on the consumer's side, which in the case of Micro-XRCE DDS seems to make sense. Consumers are most likely more powerful systems for which swapping operations might be less costly than for producers. (and minor, but traditionally all network traffic was always BE)
Are there any platforms you support today which would require UCLIENT_BIG_ENDIANNESS to be IIRC, there are systems that are BE for binaries (ie: opcodes are encoded BE) and LE for data, but again IIRC those are rather rare. Just so IIUC: with wire protocol, you mean the DDS XRCE binary representation et al. defined by the DDS XRCE standard, correct? Not a custom one? |
Btw: the Client I can easily incorporate in the application. So approach 1 is easily tested. Approach 2 also requires changes to the Agent. Would you have a workflow for building that from a different branch quickly/easily? |
That's it.
Agree with this.
I have never seen this case, but also it is true that AFAIK you are the only ones that use XRCE/micro-ROS in BE.
Yes, when
Let me know if it works.
I would just checkout the branch and follow the classic CMake procedure of As far as you provide positive feedback on the solution, I will proceed with creating and merging a PR. |
Just performed a quick test (with approach 1): I've As expected, all payloads coming from the client are now marked The
Wireshark shows two LE Client then stays 'silent' and does not send back any traffic any more. I'll see if I can make the capture available tomorrow. |
Regarding the services you mention, we have the very same problem here: the transparent forwarding made by the Agent. As you can see here, the Agent deserializes the received payload (DDS Dataspace -> XRCE path) skipping the representation header. This way, when the raw serialized data (the byte array) is forwarded to the XRCE Client, there is no endianness information. This drives me to another solution Option 3: XRCE Client incoming and outcoming payloads shall include Representation HeaderMicro XRCE-DDS Agent shall not handle the representation header here and here and shall perfect forward what XRCE Client or remote DDS Writers sends. This way the XRCE Client uses who is in charge of writing or reading the Representation Header and handling the payload correctly. This implies the following changes:
Considerations:
|
I'm sure I haven't had sufficient ☕ but would letting the Agent set the encapsulation based on the value set by the Client on the XRCE DDS traffic not be an option? The current implementation may skip the bytes, but just checking whether the Client has set That would keep the Agent transparent I'd say, as it doesn't touch the payload, it just peeks at meta-data. Of course this would mean that BE Clients might receive LE data, which would mean they'd have to swap, but there's no way around that. Similar for BE Clients sending BE data to LE Agents -- but we've discussed that earlier. I'm sure I'm overlooking something important though. Edit: and if all else fails, standardising everything on LE (or BE) and incurring the swap somewhere would also seem to solve this. Perhaps support this as a CMake option, next to |
Issue template
micro_ros_setup
(client), Docker images (Agent)2.0.0
(client), Humble (Agent)Steps to reproduce the issue
This isn't going to be an sscce, but at a high-level:
-DCONFIG_BIG_ENDIANNESS=ON
-DUCLIENT_BIG_ENDIANNESS=ON
(although super build is not used)amd64
)Expected behavior
Successful Client<->Agent connection.
Actual behavior
Agent prints:
and so on.
Client fails to connect (
rclc_support_init_with_options (..)
returns1
).Additional information
Comparing the second packet exchanged with the Agent with a working little-endian Client (different system), it looks like some of the fields are swapped (from a Wireshark capture,
be: big endian
,le: little endian
):Note how the 6th, 7th, 10th and 13th groups are mirror images.
The first packet also seems to have the last 4 bytes swapped (
0x00000002
vs0x02000000
).Perhaps I misunderstood, but I expected
CONFIG_BIG_ENDIANNESS=ON
at the Micro-CDR level to take care of byte-swapping.Is more configuration needed? Should the client be setup differently perhaps?
Edit: I've checked and Micro-CDR appears to be correctly configured/built by CMake. Relevant part of the generated
config.h
:The text was updated successfully, but these errors were encountered: