Re: [ecf-dev] zookeeper discovery comes and goes


A service is signaled as undiscovered for one of three reasons: it was indeed unpublished at the other endpoint (nothing new), the connection was lost, or the session expired. All of these are dealt with in NodeReader.process(WatchedEvent event), which is the closest point to the underlying ZooKeeper that gets informed when a data node is deleted or no longer reachable. So I'd say debug that short method to see whether it is a genuine service undiscovery or one masking a session/connection loss. The received event is intuitive and should be straightforward to interpret.
A word about the session mentioned above: a client should send heartbeats within the session-timeout interval (currently set to 3000) to be considered still alive.
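
To make the three cases concrete, here is a minimal, self-contained sketch of the kind of triage one could do while debugging that method. It is not the actual NodeReader code. The enums are hypothetical stand-ins whose constant names mirror ZooKeeper's Watcher.Event.KeeperState and Watcher.Event.EventType, so the sketch runs without the ZooKeeper jar. The class name, the Cause enum, and both helper methods are made up for illustration, and treating the 3000 as milliseconds is an assumption.

```java
public class UndiscoveryTriage {

    // Hypothetical stand-in; names mirror ZooKeeper's Watcher.Event.KeeperState.
    public enum KeeperState { SyncConnected, Disconnected, Expired }

    // Hypothetical stand-in; names mirror ZooKeeper's Watcher.Event.EventType.
    public enum EventType { None, NodeCreated, NodeDeleted, NodeDataChanged }

    // Illustrative labels for why a service could be reported as undiscovered.
    public enum Cause { SERVICE_UNPUBLISHED, CONNECTION_LOST, SESSION_EXPIRED, OTHER }

    // Check the session/connection state first, so that a lost session is
    // not mistaken for a genuine unpublish of the service node.
    public static Cause classify(KeeperState state, EventType type) {
        if (state == KeeperState.Expired) {
            return Cause.SESSION_EXPIRED;
        }
        if (state == KeeperState.Disconnected) {
            return Cause.CONNECTION_LOST;
        }
        if (type == EventType.NodeDeleted) {
            return Cause.SERVICE_UNPUBLISHED;
        }
        return Cause.OTHER;
    }

    // Assumption: the timeout is in milliseconds. A client that stays silent
    // (for example, a paused JVM) for longer than the timeout risks expiry.
    public static boolean pauseOutlastsSession(long pauseMillis, long sessionTimeoutMillis) {
        return pauseMillis > sessionTimeoutMillis;
    }

    public static void main(String[] args) {
        // A node deletion seen while still connected looks like a real unpublish.
        System.out.println(classify(KeeperState.SyncConnected, EventType.NodeDeleted));
        // A state-only event with an expired session points at the session instead.
        System.out.println(classify(KeeperState.Expired, EventType.None));
        // Would a 5 second stall outlast a 3000 ms session timeout?
        System.out.println(pauseOutlastsSession(5000L, 3000L));
    }
}
```

Logging the equivalent of classify(...) at the top of the real process method should show which of the three cases fires at the moment the profiler attaches.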

best regards,

From: Wim Jongman <wim.jongman@xxxxxxxxx>
To: "Eclipse Communication Framework (ECF) developer mailing list." <ecf-dev@xxxxxxxxxxx>, ahmed.aadel@xxxxxxxxxxxxxxxxxx
Date: 13-10-2010 11:02
Subject: Re: [ecf-dev] zookeeper discovery comes and goes

Hi Bryan,

AFAIK the discovery mechanism does not itself guard the connection or decide to drop it. It merely listens and propagates service registrations/deregistrations to the distribution provider.

I would say this behavior should be seen with other discovery mechanisms as well.

Ahmed, can you think of reasons why this could happen?



On Tue, Oct 12, 2010 at 4:56 PM, Bryan Hunt <bhunt@xxxxxxx> wrote:
I'm seeing a strange problem with zookeeper service discovery.  I've been experimenting with YourKit and noticed a ConcurrentModificationException as reported here:

When I went to capture a better stack trace, I noticed that when I remotely connected with YourKit, the client reported zookeeper undiscovered one of my 4 remote services.  When I disconnect and reconnect YourKit, the client reports that zookeeper discovers 4 remote services.

This problem appears repeatable.  The client and all 4 remote services are running on the same machine.  Each remote service is running in its own JVM.  Any ideas on what is going on, or suggestions for tracking down the problem?  Is there by chance some sort of very short keepalive that might be having problems if there were a network delay or thread delay when YourKit connects to the client JVM?
