Sharing the thread with the paho mailing list.

Begin forwarded message:
Subject: Re: Paho/M2M IWG Calls
Date: March 5, 2013 2:28:01 PM GMT+01:00
Hi Marco,
both observations are problems:
1) Keepalive is much improved, but there is still a hole somewhere.
Once I have tidied up the Ant build in particular, I will get debug
info on a Mac and work out what the hole is.
2) The client should detect when the keepalive is not honoured and
cause connection lost to occur. Indeed it does in most cases, except
for the one you have found. The code that handles it is here:
    if (lastInboundActivity - lastOutboundActivity >= this.keepAlive
            || System.currentTimeMillis() - lastOutboundActivity >= this.keepAlive) {
        // Timed Out, send a ping
        if (pingOutstanding) {
            //@TRACE 619=Timed out as no activity, keepAlive={0} lastOutboundActivity={1} lastInboundActivity={2}
            log.severe(className,methodName,"619",
                    new Object[]{new Long(this.keepAlive),new Long(lastOutboundActivity),new Long(lastInboundActivity)});
            // A ping has already been sent. At this point, assume that the
            // broker has hung and the TCP layer hasn't noticed.
            throw ExceptionHelper.createMqttException(MqttException.REASON_CODE_CLIENT_TIMEOUT);
        }
It handles the case where a client only receives QoS 0 messages, but
will fail (as you have seen) in the case where the client is only
sending QoS 0 messages.
There are two bugs open around the keepalive problem; I suggest using
those to continue tracking it.
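To make the failure mode concrete, here is a minimal sketch that evaluates the quoted condition with plain numbers. The class and method names are hypothetical and are not part of the Paho API. The second method is only an assumed variant that also looks at the time since the last inbound packet; it is not necessarily the fix attached elsewhere in this thread.

```java
// Sketch only: KeepAliveCheck and its methods are hypothetical names, not Paho API.
class KeepAliveCheck {

    // The condition as quoted in the message above.
    static boolean quotedCheck(long now, long lastInbound, long lastOutbound, long keepAlive) {
        return lastInbound - lastOutbound >= keepAlive
                || now - lastOutbound >= keepAlive;
    }

    // Assumed variant: also fire when nothing has been received
    // for a full keepalive interval.
    static boolean inboundAwareCheck(long now, long lastInbound, long lastOutbound, long keepAlive) {
        return now - lastOutbound >= keepAlive
                || now - lastInbound >= keepAlive;
    }

    public static void main(String[] args) {
        long keepAlive = 3000;     // 3 s, as in the test case discussed in this thread
        long lastInbound = 0;      // nothing received from the broker since t = 0
        long lastOutbound = 9900;  // client published a QoS 0 message 100 ms ago
        long now = 10000;

        // Client only sends QoS 0: outbound activity keeps resetting the timer.
        System.out.println(quotedCheck(now, lastInbound, lastOutbound, keepAlive));       // false
        System.out.println(inboundAwareCheck(now, lastInbound, lastOutbound, keepAlive)); // true
    }
}
```

With a steady stream of outbound QoS 0 publishes, lastOutboundActivity stays close to the current time, so neither half of the quoted condition can become true and no ping is ever sent, however long the broker has been silent.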
All the best
Dave Locke
Senior Inventor, Pervasive and Advanced Messaging Technologies
locke@xxxxxxxxxx
Dave Locke/UK/IBM@ibmgb
7-246165 (int) +44 1962816165 (ext)
37274133 (mobex) +44 7764132584 (ext)
From: Nicholas O'Leary/UK/IBM
To: Marco Carrer <marco.carrer@xxxxxxxxxxxx>
Cc: andypiperuk@xxxxxxxxx, Arlen Nipper <arlen.nipper@xxxxxxxxxxxxxxx>, Cristiano De Alti <Cristiano.DeAlti@xxxxxxxxxxxx>, Dave Locke/UK/IBM@IBMGB, Scott de Deugd <dedeugd@xxxxxxxxxx>, Wes Johnson <wes.johnson@xxxxxxxxxxxx>
Date: 05/03/2013 12:58
Subject: Re: Paho/M2M IWG Calls
Hi Marco,
I should start by saying I've not been
involved with the development of the new branch of the client, so I'm still
getting up to speed with everything that has changed in it.
I run Ubuntu, so I'll give your test
case a run later on.
I'm uneasy about adding another thread
to the client - it already has three and I've had feedback since first
writing it that even that is too many!
The original Paho client certainly did
track inbound/outbound activity separately for exactly the scenario you
describe. I haven't had time to look at it properly, and won't before the
call in 40 mins, but a quick glance comparing ClientState between the master
branch and the develop branch shows the new method getTimeUntilPing, which
only considers the last outbound activity... and having just reread
your note I see you've supplied the fix for exactly that :)
Could you attach that fix to the existing bug
(https://bugs.eclipse.org/bugs/show_bug.cgi?id=397651), or consider
raising a new bug, as this is actually a regression in the new branch
rather than part of the ongoing intermittent disconnect issue.
Please also attach that test case to
the bug report; I'm keen that with the code now in git we can move this
conversation into the open over on the Paho Dev list.
Cheers,
Nick
Nicholas O'Leary
Emerging Technology Services
IBM Software Group UK
MP 137, IBM Hursley Park, Winchester, SO21 2JN
External phone: +44 (0)1962 815720
Internal phone: 37245720
e-mail: nick_oleary@xxxxxxxxxx
Andy, Dave,
thanks for the detailed update.
We have been testing the Java client for the past week
and here is our feedback.
1. I attached the test case that we used: a keepalive
of 3 sec and a variable publishing rate/QoS.
With this test, I can confirm that we noticed unexpected
disconnects on Ubuntu as well, not just on Mac OS X.
I believe that we should consider adding a separate CommsPing
thread that is responsible for scheduling Ping messages outside of the
main ClientState get loop.
This will avoid the need for complicated math on
the ping time interval.
If you agree on the approach we can help with the implementation/validation.
This is tracked in bug 397651.
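As a rough illustration of the separate-thread idea, the sketch below schedules pings from a dedicated timer thread and reports a lost connection when the previous ping was never answered. Everything here is an assumption made for illustration: PingScheduler, tick, onPingResponse and the two callbacks are hypothetical names, not existing Paho classes or methods, and the sketch pings on a fixed period regardless of other traffic, which is simpler than the current logic but sends more pings than strictly needed.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: a hypothetical stand-in for the proposed "CommsPing" thread.
class PingScheduler {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "CommsPing");
                t.setDaemon(true);
                return t;
            });
    private final AtomicBoolean pingOutstanding = new AtomicBoolean(false);
    private final Runnable sendPing;        // assumed hook: writes a PINGREQ
    private final Runnable connectionLost;  // assumed hook: notifies the application

    PingScheduler(Runnable sendPing, Runnable connectionLost) {
        this.sendPing = sendPing;
        this.connectionLost = connectionLost;
    }

    // Run tick() once per keepalive interval on the dedicated thread.
    void start(long keepAliveMillis) {
        timer.scheduleAtFixedRate(this::tick, keepAliveMillis, keepAliveMillis,
                TimeUnit.MILLISECONDS);
    }

    // One keepalive period has elapsed.
    void tick() {
        if (pingOutstanding.getAndSet(true)) {
            // The previous ping was never answered: treat the link as dead.
            connectionLost.run();
            stop();
        } else {
            sendPing.run();
        }
    }

    // To be called by the receiver when a PINGRESP arrives.
    void onPingResponse() {
        pingOutstanding.set(false);
    }

    void stop() {
        timer.shutdownNow();
    }
}
```

Because the decision is driven by the ping response rather than by socket errors, this shape would also cover the case where QoS 0 publishes keep succeeding locally after the link has gone down.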
2. We also noticed that the Paho client does not use the
PING messages to detect a link failure.
On Ubuntu, where the default socket timeout is infinite
and the socket buffer is large, repeatedly sending messages with QoS 0
will keep the client "happy" even when the network link goes
down. The client waits for the network layer to report
the connectivity issue instead of using the application-layer ping
message to detect it.
Similarly, repeatedly sending messages with QoS 1 or 2 will
first result in an error when the maximum number of in-flight messages is
reached, instead of reporting a connection lost.
We can discuss in the call today whether this should be
considered a desirable ER and tracked in a bug.
In the meantime, for your reference, please find attached
the test driver we used and a possible fix for issue #2, although we would
prefer refactoring that logic into a dedicated CommsPing class/thread as
discussed for issue #1.
Thanks,
-Marco
[attachment "ClientState.java" deleted by Nicholas O'Leary/UK/IBM]
[attachment "Test.java" deleted by Nicholas O'Leary/UK/IBM]
On Feb 12, 2013, at 11:31 AM, Marco Carrer <marco.carrer@xxxxxxxxxxxx>
wrote:
Dave, Arlen,
I finally did some testing on the
latest Paho client. Here are a few comments:
1. Bug 397651 - Skipping Ping
The problem of ping messages occasionally being sent too late
is still reproducible on Mac OS X.
Arlen, if you have a chance, please confirm whether that's
the case for you as well.
I looked at the code again and I am afraid we just cannot
rely on time interval differences when using Object.wait().
Some deeper changes are required.
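For context on why elapsed time computed around Object.wait() can be unreliable: wait(timeout) may return early (notification or spurious wakeup) or late, so the remaining time has to be recomputed from a fixed deadline on every pass. The sketch below shows that general pattern only. It is not the fix attached to this message, and DeadlineWait, awaitWorkOrDeadline and signalWork are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

// Sketch only: a generic deadline-based wait loop, not code from ClientState.
class DeadlineWait {
    private final Object lock = new Object();
    private boolean workAvailable = false;

    // Returns true if work was signalled, false once the deadline has passed.
    boolean awaitWorkOrDeadline(long timeoutMillis) throws InterruptedException {
        // Fix the deadline once, on the monotonic clock.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (!workAvailable) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // deadline reached: time to send a ping
                }
                // wait(0) would block forever, so never pass less than 1 ms.
                long remainingMillis = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
                lock.wait(remainingMillis);
            }
            workAvailable = false;
            return true;
        }
    }

    // Called by a producer when there is something to send.
    void signalWork() {
        synchronized (lock) {
            workAvailable = true;
            lock.notifyAll();
        }
    }
}
```

The loop never trusts the duration of a single wait() call; it only compares the current time against the deadline, so an early wakeup simply results in another, shorter wait.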
Please find attached a paho version with a proposed fix
in the ClientState class.
I ran the modified version overnight yesterday and I did
not encounter any disconnects so it seems to be fine.
Arlen, if you have a chance, you can try the attached
version and see if it fixes the problem for you too.
2. As I did a bit more testing, I encountered another
issue where, occasionally, the connectionLost callback is not invoked.
I tried bouncing the broker with the client connected
and, in rare cases, the callback did not fire.
I also did some testing by disabling and re-enabling the
network connection with similar results.
I do see the following TRACE output from ClientState:516
but the callback.connectionLost does not get invoked.

    //@TRACE 619=Timed out as no activity, keepAlive={0} lastOutboundActivity={1} lastInboundActivity={2}
    log.severe(className,methodName,"619", new Object[]{new Long(this.keepAlive),new Long(lastOutboundActivity),new Long(lastInboundActivity)});

I could not pinpoint exactly what happened but it seems
to be related to when the disconnect happens during the ping cycle.
Dave, maybe you have some ideas on this. I'll wait for
your comments before filing a new bug.
3. I am looking at SSL as well. With the out-of-the-box
configuration, Paho cannot connect to our SSL broker as we use SSLv3.
Dave, can you please briefly describe how we can configure
the paho client to enable the SSLv3 protocol instead of the default TLS?
4. As for the MqttMessage.setDuplicate() method, we can
use the internal methods for now.
Thanks,
-Marco
On Feb 5, 2013, at 5:27 PM, Arlen Nipper <arlen.nipper@xxxxxxxxxxxxxxx>
wrote:
Thanks Dave!
I'm on the road today but as soon as I get into the hotel I'll fire up
some of the test scenarios I've got waiting to test this. I'll let you
know how it goes.
Cheers, Arlen
Arlen Nipper
President and CTO
Cirrus Link Solutions
Cell: +1 913.406.1014
Arlen.Nipper@xxxxxxxxxxxxxxx
On 2/5/13 5:50 AM, Dave Locke wrote:
Hi Marco et al,
please find attached latest Paho Java client which contains the following
updates:
- fix for intermittent ping / keepalive
- updated OSGi manifest
- corrected method name to be Id not ID
In this version the setDuplicate method has not been externalised, primarily
because the dupe handling is performed under the covers and is not something
that an MQTT app programmer needs to be exposed to. Is the need to be
able to set it solely from within a custom persistence module (one that
implements MqttClientPersistence)? The duplicate flag for outbound messages
is currently set in the restoreState() method in the ClientState class when
restoring data from the persistent store. I would like to understand whether
a) this is not working as expected, or b) it could be enhanced to handle your
use case, i.e. is there a way to do what you are looking for internally
within the MQTT client?
Arlen, Marco, it would be much appreciated if you could test the keepalive on
the boxes that show the intermittent keepalive problem. I am on holiday
until the end of this week and travelling the first few days of next week,
but if the testing is good I will drop to Eclipse on my return.
All the best
Dave Locke
Senior Inventor, Pervasive and Advanced Messaging Technologies
locke@xxxxxxxxxx
Dave Locke/UK/IBM@ibmgb
7-246165 (int) +44 1962816165 (ext)
37274133 (mobex) +44 7764132584 (ext)
From: "Carrer, Marco" <marco.carrer@xxxxxxxxxxxx>
To: Scott de Deugd <dedeugd@xxxxxxxxxx>
Cc: Dave Locke/UK/IBM@IBMGB, "andypiperuk@xxxxxxxxx" <andypiperuk@xxxxxxxxx>, "Tomasson, Hilary" <hilary.tomasson@xxxxxxxxxxxx>
Date: 22/01/2013 08:40
Subject: Paho/M2M IWG Calls
Scott, Dave,
I have a customer visit this afternoon and I will not be
able to attend the Paho and the IWG calls.
For Paho, our pending issues are reported below.
1. Bug 397651 - Java Paho client intermittently skipping Pings
This is a showstopper for using the Java Paho library in production.
2. The OSGi bundle MANIFEST.MF should also export the package org.eclipse.paho.client.mqttv3.persist
to access the MemoryPersistence class.
3. Bug 392052. Any chance of making the MqttMessage.setDuplicate(boolean)
method public in this release?
4. I noticed that IMqttDeliveryToken.getMessageID() uses a capital "D"
in the name of the accessor.
Other APIs in Paho use camel case, like MqttReceivedMessage.getMessageId().
It is very minor, but we may want to rename it for consistency.
Please let me know if we have an ETA for the next drop.
Thanks,
-Marco
On Jan 8, 2013, at 10:50 AM, Marco Carrer <marco.carrer@xxxxxxxxxxxx>
wrote:
Dave,
to track the Java PING issue we discussed before the holidays,
I filed the following bug:
Bug 397651 - Java Paho client intermittently skipping Pings
Thanks,
-Marco
On Jan 8, 2013, at 4:43 AM, Scott de Deugd <dedeugd@xxxxxxxxxx>
wrote:
(agenda updates per suggestions from Andy)
Project updates
- current Java/C client build and commit (in public) status
- documentation status
- Eclipse plugin status (dependent on updated Java code)
- binary builds / test code out for Milestone
Schedule
- revisit schedule to align with M2M-EPP and Kepler in 6/13
- update project metadata
From: Scott de Deugd/Raleigh/IBM
To: Paho PROJECT TEAM
Date: 01/07/2013 06:36 PM
Subject: reminder: Paho Project call Tuesday
Proposed Agenda
Project updates
Running parallel updates/ patches (multiple contributors)
Next steps toward Milestone 0 (Snapshot Build - moved to 2013)
8 am US EST: Australia (1-800-85-4950 or 0-2-80318490), USA (888-426-6840
215-861-6239), UK (0800-368-0638 0-20-30596451), Italy (800-975100
0-2-00621263)
Passcode 2746478#
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU
<org.eclipse.paho.client.mqttv3_eth.jar><ClientState.java>