Make the handling of RTI connections more robust #146

cmnrd · 2023-01-20T10:20:33Z

There is a long-standing flaw in the RTI/federate mechanism for handling ports. The RTI tries to get a default port; if that port is unavailable, it tries the port number that is one larger, and if that fails, it tries one more, and so on. The federates go through a similar sequence, trying the default port number first and, if that fails, trying the next one.
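For illustration, the RTI-side search described above can be sketched roughly as follows. This is not the actual RTI code (which is not shown in this thread); the function name `bind_first_free`, the port number, and the attempt limit are all assumptions made for this sketch.

```python
import socket

DEFAULT_PORT = 15045  # assumed default port, used only for this sketch
MAX_ATTEMPTS = 16     # hypothetical bound on how many ports to try


def bind_first_free(start_port=DEFAULT_PORT, attempts=MAX_ATTEMPTS, host="127.0.0.1"):
    """Bind a listening TCP socket to the first free port at or above start_port."""
    for port in range(start_port, start_port + attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Port is taken (or still held by the OS): move on to the next one.
            sock.close()
            continue
        sock.listen()
        # Report the port actually bound, which may differ from start_port.
        return sock, sock.getsockname()[1]
    raise OSError(f"no free port in [{start_port}, {start_port + attempts})")
```

Note that the caller cannot know in advance which port the RTI ends up on, which is why the federates have to search as well.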

However, this really doesn't work. In particular, if you start a federate before the RTI, it skips the default port, and it takes a very long time for it to circle around to try that default port again.
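A rough sketch of the federate side makes the failure mode visible. Again, this is an assumption-laden illustration rather than the real federate code: `connect_scanning`, its parameters, and the timeout are hypothetical.

```python
import socket


def connect_scanning(start_port, attempts, host="127.0.0.1", rounds=1):
    """Try start_port, start_port + 1, ... and wrap around after `attempts` ports."""
    for _ in range(rounds):
        for offset in range(attempts):
            port = start_port + offset
            try:
                sock = socket.create_connection((host, port), timeout=0.5)
            except OSError:
                # Nothing is listening here (yet): skip to the next port.
                continue
            return sock, port
    return None, None
```

If the RTI is not yet listening when `start_port` is tried, the loop moves on and only comes back to `start_port` after a full pass over the remaining ports, each of which can cost up to one timeout.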

The problem this was trying to address is that when a program releases a port, the OS does not make the port available to other programs for some time. There is a good reason for this: the OS wants to prevent a program from grabbing a port and then receiving messages that were intended for a program that has exited. It therefore holds the port long enough that any messages that were in flight die before it releases the port.

This OS behavior was making CI fail, because CI runs many federated programs in sequence.

I think a better solution is for the RTI to just use a fixed port, perhaps optionally specified as a command-line argument (which the federates would also need to be told). Then we just have to figure out how to make CI work (wait long enough between federated tests?).
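A minimal sketch of what that could look like, assuming a hypothetical `--port` option and a hypothetical `bind_fixed` helper (neither is taken from the RTI's actual command-line interface). The optional wait-and-retry stands in for "wait long enough" in CI; it retries the same port and never falls back to a different one.

```python
import argparse
import socket
import time

DEFAULT_RTI_PORT = 15045  # assumed default port, used only for this sketch


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical RTI launcher")
    parser.add_argument("--port", type=int, default=DEFAULT_RTI_PORT,
                        help="fixed port the RTI listens on")
    return parser.parse_args(argv)


def bind_fixed(port, host="127.0.0.1", retries=0, delay=1.0):
    """Bind exactly `port`; optionally wait and retry, but never pick another port."""
    for attempt in range(retries + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen()
            return sock
        except OSError:
            sock.close()
            if attempt == retries:
                raise  # give up loudly instead of silently moving to port + 1
            time.sleep(delay)
```

With this shape, a federate only needs the same port value and can keep retrying that one address until the RTI comes up.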

Originally posted by @edwardalee in lf-lang/lingua-franca#1556 (comment)

Also see the rest of the discussion in lf-lang/lingua-franca#1556

Jakio815 · 2024-03-26T23:35:32Z

@edwardalee @cmnrd I don't think this issue is fixed. Were there further discussions about the port fixing?

edwardalee · 2024-03-27T09:00:09Z

I think the situation is improved, in that you can now reliably start several RTIs/federations on the same machine. But I think you still have to start the RTI first for the federates to find it in reasonable time. What symptoms are you seeing?

@lhstrh has proposed designing a broker that would have a fixed IP/port and would hand out RTI IP/port addresses (as they say, all problems in CS can be solved with one more level of indirection). This broker would have to be run as a daemon to be effective, like the Mosquitto broker in MQTT. I proposed that this broker could itself be an LF program. It does create the extra hassle of having to set the broker up on any machine that you would like an RTI to run on. Alternatively, I guess we could run a global broker on lf-mac.eecs.berkeley.edu. Another alternative might be to look into using broadcast packets, but this would require relying on lower-level networking APIs.
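To make the indirection concrete, here is a minimal in-memory sketch of the bookkeeping such a broker might do. Nothing like this exists yet as far as this thread says; `BrokerRegistry` and its methods are hypothetical names, and the network front end is left out entirely.

```python
class BrokerRegistry:
    """Maps a federation ID to the (host, port) of the RTI serving it."""

    def __init__(self):
        self._rtis = {}

    def register(self, federation_id, host, port):
        # Called when an RTI starts and reports where it is listening.
        self._rtis[federation_id] = (host, port)

    def lookup(self, federation_id):
        # Called on behalf of a federate; returns None if no RTI is known yet.
        return self._rtis.get(federation_id)

    def unregister(self, federation_id):
        # Called when an RTI shuts down; a missing entry is not an error.
        self._rtis.pop(federation_id, None)
```

Only the broker's own address has to be fixed; each RTI can then listen on whatever port it manages to get.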

Jakio815 · 2024-03-27T17:15:52Z

I didn't really find a problem. I was curious what happened to the discussion of incrementing the RTI's port.

The design looks interesting. Is it a future plan to implement, or is it on a branch?

edwardalee · 2024-03-27T18:01:50Z

AFAIK, nobody has started working on it. If I were to do it, I would try to make an LF program (just for fun... I don't think it really needs LF features).

lhstrh · 2024-03-28T11:45:51Z

I haven't really researched the topic, but my off-the-cuff response is we need something along the lines of the following:

- have a broker listen at a standard port for communication with federates
- let federates either specify a known broker or discover one
- discovery could be done using UDP broadcast
- this means federates should also be listening for responses from brokers that receive their inquiry
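A minimal sketch of the discovery exchange outlined in the list above, assuming a made-up inquiry/reply message pair and a made-up broker port. None of these names or values come from an existing implementation.

```python
import socket
import threading

BROKER_PORT = 15050               # hypothetical standard broker port
INQUIRY = b"LF_BROKER_DISCOVER"   # hypothetical inquiry message
REPLY = b"LF_BROKER_HERE"         # hypothetical reply message


def serve_one_inquiry(sock):
    """Broker side: answer a single inquiry on an already-bound UDP socket."""
    data, addr = sock.recvfrom(1024)
    if data == INQUIRY:
        sock.sendto(REPLY, addr)


def discover_broker(port=BROKER_PORT, address="255.255.255.255", timeout=1.0):
    """Federate side: send an inquiry and return the (host, port) of the
    first broker that replies, or None if nothing answers in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)
        try:
            sock.sendto(INQUIRY, (address, port))
            # The federate listens on the same socket for a broker's response.
            data, addr = sock.recvfrom(1024)
        except OSError:  # includes the timeout case
            return None
        return addr if data == REPLY else None
```

A broker would run `serve_one_inquiry` in a loop on a socket bound to the standard port, for example on a `threading.Thread`. Because this sketch returns the first reply it sees, it does nothing about different federates discovering different brokers.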

One question that comes to mind is: what do we do if participants of the same federation discover different brokers? I think gossip protocols are typically used to address these kinds of situations.

First and foremost, before doing anything, I would research whether there are existing implementations that are stable, popular, and well maintained. I estimate the likelihood that we need to build something like this from scratch to be near zero. This is obviously a problem that has been solved in a thousand different ways already...

cmnrd mentioned this issue Jan 20, 2023

Fix docker tests in fed-gen lf-lang/lingua-franca#1556

Merged

Jakio815 mentioned this issue Mar 26, 2024

Default ports for federates. #402

Open
