[paho-dev] python: implicit loop_stop on disconnect, multiple clients, watchdogs

I'm using paho.mqtt.python 2.1.0 (NetBSD 10 amd64, python 3.12).
Generally things work well, but dealing with flaky networks has been a
little challenging.  Reading the docs and code, I'm having trouble
following.  The README points at this list for discussion.

I have a python program to monitor a UPS.  It connects to one broker
(not 5) and sends a json payload (voltage/load etc.) every minute, or on
demand if the payload is interesting.  That's ingested into a bunch of
Home Assistant sensors.  There's nothing interesting MQTT-wise about
this.

I chose loop_start as it seemed the simplest.  So connect_async, then
loop_start, and things are as expected.

I have had some network flakiness, as one would expect from time to
time.  Out of general paranoia I wrote a watchdog that is cleared on
receiving a PUBACK, to validate the thing I actually care about.  In
part, that was because I had not realized there is a built-in mechanism:
when the TCP connection times out, paho.mqtt.python tears down the
socket and makes a new one.
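
A minimal sketch of such a PUBACK-cleared watchdog, using only the
standard library.  The class name, the timeout value, and the idea of
calling clear() from the client's on_publish callback (assuming QoS 1
publishes, where that callback follows the PUBACK) are illustrative
assumptions, not the actual program:

```python
import threading
import time


class PubackWatchdog:
    """Tracks time since the last PUBACK (hypothetical helper, not part of paho)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._lock = threading.Lock()
        self._last_ack = time.monotonic()

    def clear(self):
        # Intended to be called from the client's on_publish callback,
        # which presumably runs in the loop thread, hence the lock.
        with self._lock:
            self._last_ack = time.monotonic()

    def expired(self, now=None):
        # Intended to be polled from the main thread; `now` can be
        # supplied explicitly, which makes the check easy to test.
        if now is None:
            now = time.monotonic()
        with self._lock:
            return now - self._last_ack > self.timeout_s
```

The main loop would poll expired() once per publish interval and start
recovery when it returns True.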

On firing, the watchdog calls disconnect and then connect_async.  I
found that the program never recovered, and only just realized that,
according to the documentation, disconnect calls loop_stop.

Reading code, I am finding a few things hard to follow:

  I don't understand why loop_stop is called on disconnect.  To me,
  connect/disconnect is logically separate from running the event loop.
  It might be nice to make this louder, changing the 1-line disconnect
  description to
    `Use disconnect() to disconnect from the broker and stop loop processing.`
  to make it more likely people absorb this.  But perhaps it's only me.

  I can't find in the code where loop_stop is invoked on disconnect.
  Grepping for loop_stop in *.py, I see only the definition (plus a
  comment).

  It is now clear to me that loop_start is called on a client object,
  not at the higher level of the whole library.  I thus wonder:

    - Can you create multiple mqtt clients, loop_start on each, and have
      that all work?  (e.g. a program that talks to two brokers.)  The
      documentation implies yes, but it doesn't say it.

    - If you allow the client to become unreferenced and gc happens,
      does this cause the thread to cleanly exit?  Perhaps the thread
      class itself does that?  Is having a client gc'd an ok thing to
      do?  (It really seems like it should be, but it also doesn't seem
      trivial to make it work right.)

  I would expect that the callbacks happen in the loop thread, and that
  this creates a thread-safety hazard when they touch data structures
  also used in the main thread.  The docs don't mention the need for
  synchronization.

I am now calling loop_start after connect_async when the watchdog fires,
and expect things to go much better.
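
A minimal sketch of that recovery sequence, written against any object
with the paho client methods named above.  The function name is
hypothetical, and the extra loop_stop call is an assumption: it is there
only to make sure the old loop thread has finished before a new one is
started, and may well be unnecessary:

```python
def restart_connection(client, host, port=1883):
    """Drop the connection, reconnect, and restart the loop thread.

    `client` is expected to be a paho.mqtt.client.Client; this helper
    is illustrative and not part of the library.
    """
    client.disconnect()
    # Assumption: waiting for the old loop thread to exit avoids racing
    # with it; this call may be redundant after disconnect.
    client.loop_stop()
    client.connect_async(host, port)
    # Whatever loop_start returns is passed back so the caller can check it.
    return client.loop_start()
```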

Thanks,
Greg

