Eclipse Community Forums
deadlock with the java client [message #1720571] Tue, 19 January 2016 12:22
Emmanuel Touzery
Hello,

We are hitting a deadlock with the Java client (latest version of the library, 1.0.2). We use locks and wait() calls ourselves and have a number of sending threads, but the deadlock appears to be in the Paho code, so we wonder what we could do about it.

We are setting up MQTT without a persistent queue:

client = new MqttClient(
        Config.getMqttServerUri(),
        clientId,
        new MemoryPersistence());
client.setCallback(this);

Can you tell us what we could do to track down the issue, or what could be causing it?

I got these stacks with jtrace:

"Thread-206601" #283594 prio=5 os_prio=0 tid=0x00007fc4f4002800 nid=0x4588 in Object.wait() [0x00007fc557dfc000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Unknown Source)
at org.eclipse.paho.client.mqttv3.internal.Token.waitForResponse(Token.java:141)
- locked <0x000000077f3cb390> (a java.lang.Object)
at org.eclipse.paho.client.mqttv3.internal.Token.waitForCompletion(Token.java:108)
at org.eclipse.paho.client.mqttv3.MqttToken.waitForCompletion(MqttToken.java:67)
at org.eclipse.paho.client.mqttv3.MqttClient.publish(MqttClient.java:361)
at com.lecip.mqtt_db_sync.MqttSender.sendMessage(MqttSender.java:90)
- locked <0x00000006c6f05490> (a java.lang.Object)
at [our code]

The same stack appears in many other threads (literally tens of them), along with these:

"MQTT Ping: DCS_MQTT_BLS_LOC_16" #132164 prio=5 os_prio=0 tid=0x00007fc504020800 nid=0x2021 in Object.wait() [0x00007fc56c633000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Unknown Source)
- locked <0x00000006c7230dd8> (a java.util.TaskQueue)
at java.util.TimerThread.run(Unknown Source)

"MQTT Call: DCS_MQTT_BLS_LOC_16" #132163 prio=5 os_prio=0 tid=0x00007fc504147800 nid=0x2020 in Object.wait() [0x00007fc55cbca000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Unknown Source)
at org.eclipse.paho.client.mqttv3.internal.CommsCallback.run(CommsCallback.java:129)
- locked <0x00000006c7230f68> (a java.lang.Object)
at java.lang.Thread.run(Unknown Source)

"MQTT Snd: DCS_MQTT_BLS_LOC_16" #132162 prio=5 os_prio=0 tid=0x00007fc50413d000 nid=0x201f in Object.wait() [0x00007fc566dec000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Unknown Source)
at org.eclipse.paho.client.mqttv3.internal.ClientState.get(ClientState.java:643)
- locked <0x00000006c7231128> (a java.lang.Object)
at org.eclipse.paho.client.mqttv3.internal.CommsSender.run(CommsSender.java:98)
at java.lang.Thread.run(Unknown Source)

"MQTT Rec: DCS_MQTT_BLS_LOC_16" #132161 prio=5 os_prio=0 tid=0x00007fc50416e000 nid=0x201e runnable [0x00007fc566fee000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at sun.security.ssl.InputRecord.readFully(Unknown Source)
at sun.security.ssl.InputRecord.read(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
- locked <0x00000006c72313a8> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.readDataRecord(Unknown Source)
at sun.security.ssl.AppInputStream.read(Unknown Source)
- eliminated <0x00000006c72314d0> (a sun.security.ssl.AppInputStream)
at sun.security.ssl.AppInputStream.read(Unknown Source)
- locked <0x00000006c72314d0> (a sun.security.ssl.AppInputStream)
at java.io.DataInputStream.readByte(Unknown Source)
at org.eclipse.paho.client.mqttv3.internal.wire.MqttInputStream.readMqttWireMessage(MqttInputStream.java:65)
at org.eclipse.paho.client.mqttv3.internal.CommsReceiver.run(CommsReceiver.java:107)
at java.lang.Thread.run(Unknown Source)
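
One detail of the first trace is worth spelling out. The publishing thread holds two monitors: the one taken in MqttSender.sendMessage and the one inside Token.waitForResponse. Object.wait() releases only the monitor it is called on, so the outer lock stays held for as long as the response is outstanding. The following is a minimal sketch in plain Java, not Paho code; NestedMonitorSketch, senderLock and responseLock are hypothetical stand-ins for the two locked objects in the trace. It assumes other senders synchronize on the same outer object, which the trace alone does not confirm.

```java
class NestedMonitorSketch {
    /** Polls until the thread reaches the wanted state or the timeout passes. */
    static boolean awaitState(Thread t, Thread.State wanted, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (t.getState() == wanted) {
                return true;
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return t.getState() == wanted;
    }

    /** Returns true if a second sender was seen BLOCKED while the first one waits. */
    static boolean secondSenderBlocks() {
        final Object senderLock = new Object();   // hypothetical: lock taken in sendMessage
        final Object responseLock = new Object(); // hypothetical: lock inside the token
        final boolean[] answered = {false};       // guarded by responseLock

        Thread first = new Thread(() -> {
            synchronized (senderLock) {
                synchronized (responseLock) {
                    while (!answered[0]) {
                        try {
                            // Releases responseLock only; senderLock stays held.
                            responseLock.wait();
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
            }
        });
        Thread second = new Thread(() -> {
            synchronized (senderLock) {
                // A second publish would happen here.
            }
        });

        first.start();
        boolean firstWaiting = awaitState(first, Thread.State.WAITING, 2000);
        second.start();
        boolean secondBlocked = awaitState(second, Thread.State.BLOCKED, 2000);

        // Simulate the broker's answer so both threads can finish.
        synchronized (responseLock) {
            answered[0] = true;
            responseLock.notifyAll();
        }
        try {
            first.join(2000);
            second.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return firstWaiting && secondBlocked;
    }

    public static void main(String[] args) {
        System.out.println("second sender blocked: " + secondSenderBlocks());
    }
}
```

Under that assumption, a second sender would appear in a thread dump as BLOCKED on the outer lock rather than WAITING, and it would stay blocked until the first publish is answered.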
Re: deadlock with the java client [message #1720708 is a reply to message #1720571] Wed, 20 January 2016 11:57
James Sutton
Hi Emmanuel,

Could you try using the 1.0.3 version from the SNAPSHOTS repository?

Change the repository url to: https://repo.eclipse.org/content/repositories/paho-snapshots/ and library version to 1.0.3.

The library has gone through a fair bit of change recently and it might be worth trying it first.
Re: deadlock with the java client [message #1720710 is a reply to message #1720708] Wed, 20 January 2016 12:20
Emmanuel Touzery
Thank you for the answer! That said, this is a production application. How risky do you think it would be to try the 1.0.3 snapshot version?

[Updated on: Wed, 20 January 2016 12:25]


Re: deadlock with the java client [message #1721273 is a reply to message #1720710] Tue, 26 January 2016 09:15
James Sutton
Overall, I'd say the 1.0.3 snapshot is more stable than 1.0.2, as the majority of changes so far have been bug fixes, with only a few small features added (WebSocket support). It currently passes all unit tests, and we're planning to release 1.2.0 (Neon) on 01-05-2016 (https://projects.eclipse.org/projects/technology.paho/releases/1.2.0). If you can't wait until then and 1.0.3 fixes your problem, it might be worth switching to the snapshot in the meantime.

If 1.0.3 doesn't fix your problem, we might need a bit more detail in order to investigate and replicate your problem so that we can get it fixed before 1.2.0.
Re: deadlock with the java client [message #1722062 is a reply to message #1721273] Tue, 02 February 2016 14:51
Emmanuel Touzery
Thank you for the answer, and sorry for not replying sooner.
After reviewing the Paho source code and the stack trace, I am fairly sure that the problem can be prevented by calling setTimeToWait() on the MqttClient.

It seems to me that the server did not answer the request at all, so the call hung in wait(). Hopefully setTimeToWait() will prevent a repeat.

That said, we have not deployed the fix in production yet, and the issue has not recurred so far...

Thank you for your answers, and let me know if you disagree with this interpretation!

Emmanuel
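
The interpretation above, a wait() with no time limit that never returns if the broker stays silent, can be illustrated without Paho. The sketch below is plain Java under stated assumptions: BoundedWaitSketch, markComplete and waitForCompletion are hypothetical names modelling a response token, not Paho's internal Token class. A non-positive timeout waits indefinitely; a positive one gives up after that many milliseconds.

```java
class BoundedWaitSketch {
    private final Object responseLock = new Object();
    private boolean completed = false; // guarded by responseLock

    /** Called by a receiver thread when the response arrives. */
    void markComplete() {
        synchronized (responseLock) {
            completed = true;
            responseLock.notifyAll();
        }
    }

    /**
     * Waits for completion. A timeout of zero or less waits forever.
     * Returns true if completed, false on timeout or interruption.
     */
    boolean waitForCompletion(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (responseLock) {
            try {
                while (!completed) {
                    if (timeoutMillis <= 0) {
                        // Unbounded: hangs for good if nobody ever notifies.
                        responseLock.wait();
                    } else {
                        long remainingNanos = deadline - System.nanoTime();
                        if (remainingNanos <= 0) {
                            return false; // gave up, caller regains control
                        }
                        // wait(0) would mean "forever", so never pass 0.
                        responseLock.wait(Math.max(1L, remainingNanos / 1_000_000L));
                    }
                }
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        BoundedWaitSketch token = new BoundedWaitSketch();
        // Nobody calls markComplete(), as if the broker never answered.
        System.out.println("completed: " + token.waitForCompletion(200));
    }
}
```

With a bounded wait the caller regains control and can report the failure or reconnect, instead of hanging while holding whatever locks it already owns. In Paho, the corresponding setting is the setTimeToWait() call mentioned above.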

[Updated on: Tue, 02 February 2016 14:51]


Re: deadlock with the java client [message #1724188 is a reply to message #1722062] Mon, 22 February 2016 09:10
James Sutton
From your stack trace, it looks like the error occurs when you attempt to publish a message. Paho throws an error if you try to publish before you are fully connected, so I think the connection must already have been established at this point. An MqttException should have been thrown when this happened; getting the error message from it would be a lot more useful, I think.

try {
    // Publish here
} catch (MqttException me) {
    // Print everything the exception carries
    System.out.println("reason " + me.getReasonCode());
    System.out.println("msg " + me.getMessage());
    System.out.println("loc " + me.getLocalizedMessage());
    System.out.println("cause " + me.getCause());
    System.out.println("excep " + me);
    me.printStackTrace();
}