Implement automatic reconnection #287
Thank you for looking into this! 👍 Automatic reconnect is definitely on the wish list of many of our users. :) I do like the idea to have "everything in one client" as opposed to my approach with a separate low-level and high-level client. At least the single-client approach is easier to grok for our users. 👍 I looked through the code, and it looks good. I'm (as always) a little bit concerned about adding extra internal state to our client (the new reconnect task). It increases the maintenance burden. In any case, for this feature I don't think we can avoid it if we want to use a single client. 😄 Aside: There is a lot of "reset state" going on (re-creating the futures). That's not your fault (or the fault of this PR) but my own fault (using futures in the first place). 😅 I strongly suggest that we look at alternatives for all the internal futures and background tasks. I'm thinking anyio (to no one's surprise, I guess). :)
That is a concern of mine as well! Indeed, we would have to save all the subscriptions inside the client and then "resubscribe" when the connection is back online. These subscriptions are even more state to manage. :) Here is a suggestion that is a slight variation of the current approach:
This way, we get the separation of high-level and low-level and the easier maintenance that follows. It also allows us to replace the
I see the benefits of this approach! 👍 I suggest that we do both 😄 That is, to have both Again, thank you for looking into this issue and very well done on the draft implementation. 👍 Let me know what you think of my comments above and do say if you have any questions. 😄 |
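To make the "even more state to manage" concern concrete, here is a minimal sketch of what remembering subscriptions for a later resubscribe could look like. `SubscriptionRegistry` and the injected `subscribe` callable are hypothetical names for illustration, not part of aiomqtt; the sketch only assumes some coroutine function that sends a SUBSCRIBE for a topic filter and QoS.

```python
import asyncio
from typing import Awaitable, Callable


class SubscriptionRegistry:
    """Hypothetical helper: remembers subscriptions so they can be replayed."""

    def __init__(self, subscribe: Callable[[str, int], Awaitable[None]]) -> None:
        # Assumption: `subscribe` sends a SUBSCRIBE packet for (topic, qos).
        self._subscribe = subscribe
        self._topics = {}  # topic filter -> QoS

    async def subscribe(self, topic: str, qos: int = 0) -> None:
        await self._subscribe(topic, qos)
        # Only remember the subscription once the call succeeded.
        self._topics[topic] = qos

    def forget(self, topic: str) -> None:
        # Counterpart to an unsubscribe; unknown topics are ignored.
        self._topics.pop(topic, None)

    async def resubscribe_all(self) -> None:
        # Replay every remembered subscription, e.g. right after a reconnect.
        for topic, qos in list(self._topics.items()):
            await self._subscribe(topic, qos)
```

Even in this small form, the registry has to stay in sync with every subscribe and unsubscribe, which is the extra maintenance cost discussed above.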
Thanks for your thoughts on this! Your reviews are one of the main reasons this project is so fun for me 😉 You're right about the internal state getting slightly out of control. While implementing this draft I already got bitten resetting futures while they were awaited elsewhere (which raises
I'm not sure I understand this point. Do you mean using streams and broadcast for the I'm on board with the |
Thank you for saying that. 😄 I don't have that much time these days but I do try to find it anyhow to at least do these reviews. :) It's a bit easier here during Easter.
I was leaning towards doing "streams and broadcasts" to do the reconnection itself (like in my sample code). E.g., keep the current In any case, let's be pragmatic and review our options here:
So in the larger perspective (time and resources being essential) I do actually lean towards option (1). Option 1 provides value here and now at the cost of future maintenance. 😄 If you agree, I think the next step is to write out some test cases for this (to mitigate the maintenance cost). With tests in place, the pro-con calculation becomes easy since the maintenance cost goes towards zero. 😉 Again, thank you for all the time that you put into this PR (and the aiomqtt project in general). 👍 Let me know if you have any comments or questions. :)
Thanks for elaborating, I think I understand what you mean now 😊 I'll play around with streams and broadcasts to understand them a little better and see how much work the second option would bring! You convinced me that that's the better option 😄 I'll report back once I have more, could take a bit, though, as I'm busy the next few days 😋 |
I just stumbled across this, so sorry if I am late to the party. From my experience it might be better to split the client into a low level and high level because, as mentioned, it makes maintenance much easier; however, I have no strong opinion on that. While looking through the PR I am not sure I understood everything correctly: I think the usage of the client is hard because there are multiple places where the disconnect error can occur. Typically I have a Edit:
Thanks for chipping in and sorry for the late reply! 🙂
I agree that What benefits do you see from failing right away?
Good idea 👍
I noticed that, too. Currently we can only set the last will on client initialization. Maybe we can add a function to set the last will dynamically in the future. |
Maybe expose some client status through a property - e.g. I am suggesting that publish should not block when the client is disconnected because it's very unexpected. If I have a small program like this:
    while True:
        await read_sensors()
        try:
            await client.publish(...)
        except Exception:
            pass
        await write_sensor_to_local_db()
it will stop reading sensors when the client is disconnected, and it's not clear at all that it will behave like that. I think it depends on the program how publish should behave, and currently I can think of three different desired behaviors
The third one is the most generic one because with it it's easy to implement the first two behaviors and I would expect a high level client to return something like that.
I currently don't understand how a function will provide a message when gracefully disconnecting. |
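One way to keep the sensor loop above from stalling, whatever `publish` does while disconnected, is to bound the call with a timeout on the caller's side. The sketch below uses only `asyncio.wait_for` from the standard library; `publish_or_give_up` is a hypothetical helper name, and `publish` stands in for any coroutine function that may block while the client is offline.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def publish_or_give_up(
    publish: Callable[..., Awaitable[Any]],
    *args: Any,
    timeout: float = 1.0,
    **kwargs: Any,
) -> bool:
    """Return True if `publish` finished within `timeout` seconds, else False."""
    try:
        await asyncio.wait_for(publish(*args, **kwargs), timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the still-pending publish before raising.
        return False
    return True
```

In the loop above, `await publish_or_give_up(client.publish, ...)` would then return after at most `timeout` seconds, so reading sensors and writing to the local database keep running while the broker is unreachable.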
I'd like to avoid that, if possible. The whole internal connected / disconnected state is already all over the place, so I'm concerned that exposing this would lead to problems down the line.
I think I see why now. When If we think about how an asynchronous
However, I think you convinced me that a better design would be to implement only an asynchronous
Returning something (third bullet point) wouldn't work very well in the case of the One option would be to do the reconnect manually, so we could do I was in the past, and am still very against manual This turned out to be a long rambling, I hope it's understandable 😄 Do say if something's not clear, and let me know what you think! Again, thanks for chipping in, it helps quite a lot to have other perspectives on this. You make some good points, not only here, but also in the other issues and PRs that you're involved with 👍
Ah, I think I understand what you mean now 👍 #28 is related, and a very interesting discussion. There already has been a solution proposed, I'm interested to hear what you think about it. Maybe you are up to do a PR? 🙂 |
This is not very nice but not my main issue. With
I would have expected
Could you please elaborate why this would be an issue? Currently I fail to see how the
    # create obj
    msg = client.publish(...)
    msg.cancel_publish()
    # status or flags
    msg.status
    msg.is_published
    # wait for publish
    await msg.published()
    await msg.published(reconnect=True)  # Could be on a per message basis
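To make the pseudo-API above a bit more tangible, here is a minimal sketch of such a handle built on an `asyncio.Future`. Everything in it is an assumption for illustration: the class name `PendingPublish`, the `mark_published` hook that a client would call on acknowledgement, and the status strings. The `reconnect=` parameter from the pseudocode is left out.

```python
import asyncio


class PendingPublish:
    """Hypothetical handle that a non-blocking publish() could return."""

    def __init__(self) -> None:
        # Must be created while an event loop is running.
        self._done = asyncio.get_running_loop().create_future()

    @property
    def is_published(self) -> bool:
        return self._done.done() and not self._done.cancelled()

    @property
    def status(self) -> str:
        if self._done.cancelled():
            return "cancelled"
        return "published" if self._done.done() else "pending"

    def cancel_publish(self) -> bool:
        # Returns False if the message was already published or cancelled.
        return self._done.cancel()

    def mark_published(self) -> None:
        # Assumed to be called by the client once the message went out.
        if not self._done.done():
            self._done.set_result(None)

    async def published(self) -> None:
        # shield() so that cancelling one waiter does not cancel the message.
        await asyncio.shield(self._done)
```

With a handle like this, the caller decides whether to wait, poll the status, or drop the message, which is what makes this variant the most generic of the behaviors discussed.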
What would your goal be with that and can you make a pseudo example? Do I as a caller still have to catch an error and manually call reconnect? Do I have to do it both in
Thanks - that means a lot! |
I'm coming from #334 - and would just like to provide some input from my perspective, as I was asked to. Keep in mind, though, that I have a more low-level background (microcontrollers, C) and I only have one - my above referenced - use-case in mind. Assuming the That being said, I'd expect if Network issues while listening to subscribed topics appear to be a more challenging topic at first glance to me. Depending on the use-case I might definitely wanna know if I potentially missed a message - or I don't care and expect the loop iterating over messages, most likely running in its dedicated task, to keep going at best effort. What really doesn't help is how exceptions thrown in asyncio tasks are supposed to be handled, which - according to my knowledge - only works with a global asyncio exception handler. I guess it's still better than just leaving the user in the dark whether messages were potentially unhandled due to network issues, though. That doesn't even address the actual issue of how to actually implement reconnection, though. At first thought I'd be fine with both scenarios - given above assumptions hold true for both:
Hi there @frederikaalund @JonathanPlasse 😊
Frederik drafted what reconnection could look like a while ago already (thank you again, master of asyncio 🙏😄). I finally had some time to hack around with this! Some thoughts:
`reconnect=True` parameter to the client and leave all method signatures the same.
For now, I've implemented the reconnection background task and adapted the `publish` method to wait until the connection returns and the message could actually be published (both are probably still full of bugs). You can play around with it by switching a local MQTT broker on and off (e.g. with `./scripts/develop`) and by running:
I'm still thinking about how to deal with existing subscriptions and last wills. We probably have to resend them when `clean_session=True`, otherwise they will cease to exist without notice to the user after a reconnection in the background.
Happy to hear what your thoughts are on this (or anyone else's who wants to chip in) 😊
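As a rough illustration of the idea described above (not the PR's actual code), the sketch below models a background task that keeps retrying the connection and a `publish` that waits until the connection is back. `ReconnectingClient`, `connect`, `send`, and `connection_lost` are hypothetical names; the two injected callables stand in for real network calls and are assumed to raise `OSError` on failure.

```python
import asyncio
from typing import Awaitable, Callable


class ReconnectingClient:
    """Hypothetical wrapper; create it inside a running event loop."""

    def __init__(
        self,
        connect: Callable[[], Awaitable[None]],
        send: Callable[[str, bytes], Awaitable[None]],
        retry_interval: float = 1.0,
    ) -> None:
        self._connect = connect
        self._send = send
        self._retry_interval = retry_interval
        self._connected = asyncio.Event()
        self._disconnected = asyncio.Event()
        self._disconnected.set()  # start out disconnected

    def connection_lost(self) -> None:
        # Assumed to be called whenever the connection drops.
        self._connected.clear()
        self._disconnected.set()

    async def run(self) -> None:
        """Background task: reconnect whenever the connection is down."""
        while True:
            await self._disconnected.wait()
            try:
                await self._connect()
            except OSError:
                await asyncio.sleep(self._retry_interval)
                continue
            # A resubscribe or last-will hook would go here.
            self._disconnected.clear()
            self._connected.set()

    async def publish(self, topic: str, payload: bytes) -> None:
        """Wait for a connection, then send; retry if the send fails."""
        while True:
            await self._connected.wait()
            try:
                await self._send(topic, payload)
                return
            except OSError:
                self.connection_lost()
```

Note that `publish` blocks for as long as the broker is away, so a caller that needs an upper bound would still have to wrap it in a timeout.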