Reconnecting from clone to master after CDO master server failure [message #834972] Mon, 02 April 2012 16:19
Scott Dybiec
I'm writing a sports timing application that needs to be resilient and
highly available when deployed at the race course site, even with
unreliable network connections. CDO's offline clone and failover
capabilities look like an excellent fit.

I've regenerated this RCP-based application's EMF model to use native
CDO and got it working with a direct connection to the CDO server. This CDO
stuff is very cool! Getting the application's editors to work with
Delete and to deal with InvalidObjects and StaleReferences has been a
challenge, but I think I've found workarounds.

Now I'm testing CDO's offline clone capability. The cloned
repository seems to work well with the master. A snippet from my
application's log:

10:21:32 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state SYNCING
10:21:32 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state ONLINE
[INFO] Synchronized with master.
10:21:32 DEBUG (CDOClient.java:93) - Successfully connected
10:21:32 DEBUG (CDOClient.java:56) - You are now connected to the server !

For the first of my resiliency tests, I shut down the CDO master server
and can see the clone begin to periodically retry its
connection to the master server.

I expected the clone repository to automatically reconnect and return to
ONLINE status once the master is back. However, even after
I bring the CDO server back up, the retries continue, and the clone's
synchronizer is never able to reconnect. Here's the log containing a
stack trace from one of the retries:

[INFO] Disconnected from master.
10:23:11 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state OFFLINE
[WARN] Connection attempt failed. Retrying in 10 seconds...
org.eclipse.net4j.connector.ConnectorException: Connection timeout after 10000 milliseconds
at org.eclipse.spi.net4j.Connector.waitForConnection(Connector.java:245)
at org.eclipse.spi.net4j.Connector.doBeforeOpenChannel(Connector.java:360)
at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:139)
at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:1)
at org.eclipse.net4j.signal.SignalProtocol.open(SignalProtocol.java:139)
at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.createProtocol(CDONet4jSessionImpl.java:191)
at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.openSession(CDONet4jSessionImpl.java:216)
at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.doActivate(CDONet4jSessionImpl.java:123)
at org.eclipse.net4j.util.lifecycle.Lifecycle.activate(Lifecycle.java:72)
at org.eclipse.emf.internal.cdo.session.CDOSessionConfigurationImpl.openSession(CDOSessionConfigurationImpl.java:250)
at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionConfigurationImpl.openSession(CDONet4jSessionConfigurationImpl.java:98)
at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionConfigurationImpl.openSession(CDONet4jSessionConfigurationImpl.java:1)
at org.eclipse.emf.cdo.internal.server.syncing.RepositorySynchronizer$ConnectRunnable.run(RepositorySynchronizer.java:350)
at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:26)
at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:1)
at org.eclipse.net4j.util.concurrent.QueueWorker.doWork(QueueWorker.java:81)
at org.eclipse.net4j.util.concurrent.QueueWorker.work(QueueWorker.java:72)
at org.eclipse.net4j.util.concurrent.Worker$WorkerThread.run(Worker.java:206)
[WARN] Connection attempt failed. Retrying in 10 seconds...
org.eclipse.net4j.connector.ConnectorException: Connection timeout after 10000 milliseconds
at org.eclipse.spi.net4j.Connector.waitForConnection(Connector.java:245)
at org.eclipse.spi.net4j.Connector.doBeforeOpenChannel(Connector.java:360)
at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:139)
at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:1



Is this a supported outage scenario that should work? If so, maybe you
could point me in the right direction for debugging this one.

$cott
Re: Reconnecting from clone to master after CDO master server failure [message #836396 is a reply to message #834972] Wed, 04 April 2012 12:41
Eike Stepper
Hi Scott,

I've just tested with our (simple) offline clone examples:

/org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleMaster.java
/org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleClone.java
/org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleClient.java

Everything seems to work fine. Have you tried that? Maybe you can find a problem in your code by comparing it to these examples.

More comments below...


On 02.04.2012 18:19, scott@xxxxxxxx wrote:
> I'm writing a sports timing application that needs to be resilient and high-available when deployed at the race course
> site --- even with unreliable network connections. CDO's offline clone and failover capability look like an excellent
> fit.
>
> I've regenerated this RCP-based application's EMF model to use native CDO, got it working with a direct connection to
> the CDO server. This CDO stuff is very cool!
Thanks ;-)

> Getting the application's editors to work with Delete and deal with the InvalidObject's and StaleReferences has been a
> challenge, but I think I've found workarounds.
Remote deletes (detachments) are generally not easy to deal with in applications that were not written for multi-user
access from scratch. You should experiment with CDO's InvalidationPolicies and StaleReferencePolicies, e.g.,

CDOView.Options options = view.options();
options.setStaleReferencePolicy(CDOStaleReferencePolicy.PROXY);
options.setInvalidationPolicy(CDOInvalidationPolicy.RELAXED);

Please note that I recently (CDO 4.1!) applied some smaller fixes that can have a big positive impact:

374962: Make CDOStaleReferencePolicy.PROXY robust for eAdapters() calls
https://bugs.eclipse.org/bugs/show_bug.cgi?id=374962

374965: Make detachment notifications configurable
https://bugs.eclipse.org/bugs/show_bug.cgi?id=374965

375033: Remote notifications must be ignored in CDOPostEventTransactionHandler
https://bugs.eclipse.org/bugs/show_bug.cgi?id=375033

375034: Consolidate server-side exceptions for commit conflicts
https://bugs.eclipse.org/bugs/show_bug.cgi?id=375034

>
> Now I'm testing the use of CDO's offline clone capability. The cloned respository seems to work well with the master.
> A snippet from my application's log:
>
> 10:21:32 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state SYNCING
> 10:21:32 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state ONLINE
> [INFO] Synchronized with master.
> 10:21:32 DEBUG (CDOClient.java:93) - Successfully connected
> 10:21:32 DEBUG (CDOClient.java:56) - You are now connected to the server !
>
> For the first of my resiliency tests, I shutdown the CDO master server and can see the clone begin to periodically
> retry creating its connection to the master server.
>
> After I bring the CDO server back up, I expected the clone repository to automatically reconnect and return to ONLINE
> status. However, even after I bring the CDO server back up, the retries continue, and the clone's synchronizer is
> never able to reconnect. Here's the log containing a stack trace during one of the retries:
>
> [INFO] Disconnected from master.
> 10:23:11 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state OFFLINE
> [WARN] Connection attempt failed. Retrying in 10 seconds...
> org.eclipse.net4j.connector.ConnectorException: Connection timeout after 10000 milliseconds
> at org.eclipse.spi.net4j.Connector.waitForConnection(Connector.java:245)
> at org.eclipse.spi.net4j.Connector.doBeforeOpenChannel(Connector.java:360)
> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:139)
> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:1)
> at org.eclipse.net4j.signal.SignalProtocol.open(SignalProtocol.java:139)
> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.createProtocol(CDONet4jSessionImpl.java:191)
> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.openSession(CDONet4jSessionImpl.java:216)
> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.doActivate(CDONet4jSessionImpl.java:123)
> at org.eclipse.net4j.util.lifecycle.Lifecycle.activate(Lifecycle.java:72)
> at org.eclipse.emf.internal.cdo.session.CDOSessionConfigurationImpl.openSession(CDOSessionConfigurationImpl.java:250)
> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionConfigurationImpl.openSession(CDONet4jSessionConfigurationImpl.java:98)
> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionConfigurationImpl.openSession(CDONet4jSessionConfigurationImpl.java:1)
> at org.eclipse.emf.cdo.internal.server.syncing.RepositorySynchronizer$ConnectRunnable.run(RepositorySynchronizer.java:350)
> at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:26)
> at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:1)
> at org.eclipse.net4j.util.concurrent.QueueWorker.doWork(QueueWorker.java:81)
> at org.eclipse.net4j.util.concurrent.QueueWorker.work(QueueWorker.java:72)
> at org.eclipse.net4j.util.concurrent.Worker$WorkerThread.run(Worker.java:206)
> [WARN] Connection attempt failed. Retrying in 10 seconds...
> org.eclipse.net4j.connector.ConnectorException: Connection timeout after 10000 milliseconds
> at org.eclipse.spi.net4j.Connector.waitForConnection(Connector.java:245)
> at org.eclipse.spi.net4j.Connector.doBeforeOpenChannel(Connector.java:360)
> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:139)
> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:1
Note that these log events have severity [WARN]. They do not necessarily indicate a bug somewhere, unless you're really
sure that the master should be reachable. Can you connect a normal client to the master?
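
Before attaching a full CDO client, a plain TCP check can show whether anything is listening on the master's port at all. The following is only a minimal sketch using the JDK's java.net.Socket; the class name MasterReachability and the fallback port 2036 are assumptions made for illustration, not values taken from this thread. It tests the transport layer only and says nothing about whether the repository behind the port is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of CDO or Net4j.
public class MasterReachability {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, unknown host, etc.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        // Assumption: replace 2036 with the port your master acceptor actually uses.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2036;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 10000));
    }
}
```

If this reports the port as reachable while the clone still times out, the problem is more likely above the socket level, which is where the breakpoints and tracing suggested below come in.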

>
>
>
> Is this a supported outage scenario that should work?
Yes.

> If so, maybe you could point me in the right direction for debugging this one.
I would start on the networking layer by setting breakpoints in

org.eclipse.net4j.internal.tcp.TCPAcceptor.handleRegistration(ITCPSelector, ServerSocketChannel)
org.eclipse.net4j.internal.tcp.TCPAcceptor.handleAccept(ITCPSelector, ServerSocketChannel)

Also make sure that as many tracing options as possible are enabled on your nodes. Maybe that will reveal some more exceptions...

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper


Re: Reconnecting from clone to master after CDO master server failure [message #836416 is a reply to message #836396] Wed, 04 April 2012 13:10
Eike Stepper
Here's another brand new fix regarding remote detachment:

376067: CDOFeatureDelta.UNKNOWN_VALUE is not a Notifier
https://bugs.eclipse.org/bugs/show_bug.cgi?id=376067

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper


Re: Reconnecting from clone to master after CDO master server failure [message #837311 is a reply to message #836396] Thu, 05 April 2012 14:04
Scott Dybiec
Thanks, Eike. I tried some scenarios using the Offline examples you
suggested to see where I might have gone wrong.

My test scenario is this:

0. Erase all master and clone DB files
1. Start the master server
2. Start the clone server
3. Start the offline client
4. Add an instance of my model to the CDO repository through the clone
and wait for the transaction to commit.
5. Exit (Option 0) the master server
6. Restart the master server

This scenario works fine with the one-node Company model instance
provided with the examples, but when I use my own large model instance
(2MB graph with 17K+ EObjects), step 6 causes the clone to go into a loop.

Here's how it looks on the clone server console with tracing turned off.
With tracing on, there are stack traces:

Enter a command:
0 - exit
1 - connect repository to network
2 - disconnect repository from network
3 - dump repository infos

Opened Session9 [master]
State changed to SYNCING
State changed to OFFLINE
Opened Session10 [master]
State changed to SYNCING
State changed to OFFLINE
Opened Session11 [master]
State changed to SYNCING
State changed to OFFLINE
Opened Session12 [master]
State changed to SYNCING
State changed to OFFLINE
Opened Session13 [master]
State changed to SYNCING
State changed to OFFLINE
Opened Session14 [master]
State changed to SYNCING

The debugger shows these session threads accumulating in both the master
and clone servers.
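
The console output above shows every sync attempt ending in OFFLINE instead of ONLINE. As a rough illustration of how that pattern could be counted automatically, here is a small sketch in plain Java. It does not use any CDO API; the class name SyncFlapDetector is hypothetical, and the state names are simply the strings printed on the console. In a real client the states would presumably come from a StateChangedEvent listener like the one in the attached file.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for analysing a sequence of repository state names.
public class SyncFlapDetector {

    /** Counts how often a SYNCING state is immediately followed by OFFLINE. */
    public static int countFailedSyncs(List<String> states) {
        int failed = 0;
        for (int i = 1; i < states.size(); i++) {
            if ("SYNCING".equals(states.get(i - 1)) && "OFFLINE".equals(states.get(i))) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Illustrative sequence: two failed attempts, then a successful one.
        List<String> observed = Arrays.asList(
                "SYNCING", "OFFLINE", "SYNCING", "OFFLINE", "SYNCING", "ONLINE");
        System.out.println("Failed sync attempts: " + countFailedSyncs(observed)); // prints 2
    }
}
```

A counter like this could be used to stop retrying or to raise an alert after a few consecutive failures, rather than letting sessions accumulate.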

I tried creating a larger Company instance, but that didn't
reproduce the problem.

Here are the changes I made to the example code to use my model:

1. AbstractOfflineServer: I changed only the EPackage registration line,
from CompanyPackage.eINSTANCE.getClass(); to
RegattaPackage.eINSTANCE.getRegatta();
2. OfflineExampleMaster: no changes
3. OfflineExampleClone: no changes
4. OfflineExampleClient: replaced the Company model with my model. I
attached this file in case you want to see it.

Here's a snippet from one of the loop iterations captured from the
master server console with tracing turned on:

Thread-12 [org.eclipse.spi.net4j.Channel] Handling buffer:
Buffer@636[PUTTING] --> Channel[1, SERVER, cdo]
Thread-12 [org.eclipse.internal.net4j.buffer.BufferPool] Obtained
Buffer@637[INITIAL]
Thread-12 [org.eclipse.spi.net4j.Channel] Handling buffer:
Buffer@637[PUTTING] --> Channel[1, SERVER, cdo]
[ERROR] org.h2.jdbc.JdbcSQLException: Column L_T.CDO_SOURCE not found;
SQL statement:
SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx,
l_t.cdo_tag, l_t.cdo_value_DATE, l_t.cdo_value_VARCHAR,
l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT,
l_t.cdo_value_TIMESTAMP, l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME,
l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB
FROM XMLTypeDocumentRoot_mixed_list l_t, XMLTypeDocumentRoot a_t WHERE
a_t.cdo_created BETWEEN 1 AND 1333579470612 AND
a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND
a_t.cdo_branch=l_t.cdo_branch [42122-117]
org.eclipse.net4j.db.DBException: org.h2.jdbc.JdbcSQLException: Column
L_T.CDO_SOURCE not found; SQL statement:
SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx,
l_t.cdo_tag, l_t.cdo_value_DATE, l_t.cdo_value_VARCHAR,
l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT,
l_t.cdo_value_TIMESTAMP, l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME,
l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB
FROM XMLTypeDocumentRoot_mixed_list l_t, XMLTypeDocumentRoot a_t WHERE
a_t.cdo_created BETWEEN 1 AND 1333579470612 AND
a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND
a_t.cdo_branch=l_t.cdo_branch [42122-117]
at org.eclipse.net4j.db.DBUtil.serializeTable(DBUtil.java:808)
at org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.AbstractHorizontalMappingStrategy.rawExportList(AbstractHorizontalMappingStrategy.java:198)
at org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.AbstractHorizontalMappingStrategy.rawExport(AbstractHorizontalMappingStrategy.java:179)
at org.eclipse.emf.cdo.server.internal.db.DBStoreAccessor.rawExport(DBStoreAccessor.java:1107)
at org.eclipse.emf.cdo.internal.server.Repository.replicateRaw(Repository.java:1113)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.ReplicateRepositoryRawIndication.responding(ReplicateRepositoryRawIndication.java:62)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.responding(CDOServerIndicationWithMonitoring.java:170)
at org.eclipse.net4j.signal.IndicationWithMonitoring.responding(IndicationWithMonitoring.java:90)
at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedOutput(IndicationWithResponse.java:96)
at org.eclipse.net4j.signal.Signal.doOutput(Signal.java:296)
at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:65)
at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerReadIndicationWithMonitoring.execute(CDOServerReadIndicationWithMonitoring.java:36)
at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.h2.jdbc.JdbcSQLException: Column L_T.CDO_SOURCE not
found; SQL statement:
SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx,
l_t.cdo_tag, l_t.cdo_value_DATE, l_t.cdo_value_VARCHAR,
l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT,
l_t.cdo_value_TIMESTAMP, l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME,
l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB
FROM XMLTypeDocumentRoot_mixed_list l_t, XMLTypeDocumentRoot a_t WHERE
a_t.cdo_created BETWEEN 1 AND 1333579470612 AND
a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND
a_t.cdo_branch=l_t.cdo_branch [42122-117]
at org.h2.message.Message.getSQLException(Message.java:105)
at org.h2.message.Message.getSQLException(Message.java:116)
at org.h2.message.Message.getSQLException(Message.java:75)
at org.h2.expression.ExpressionColumn.optimize(ExpressionColumn.java:128)
at org.h2.expression.Comparison.optimize(Comparison.java:137)
at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:132)
at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:131)
at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:131)
at org.h2.command.dml.Select.prepare(Select.java:723)
at org.h2.command.Parser.prepareCommand(Parser.java:235)
at org.h2.engine.Session.prepareLocal(Session.java:415)
at org.h2.engine.Session.prepareCommand(Session.java:376)
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1049)
at org.h2.jdbc.JdbcStatement.executeQuery(JdbcStatement.java:70)
at org.eclipse.net4j.db.DBUtil.serializeTable(DBUtil.java:785)
... 17 more

With tracing turned on, the clone's log shows the same 'Column
L_T.CDO_SOURCE not found' message.

What could cause the 'Column L_T.CDO_SOURCE not found' problem?
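
One way to narrow this down is to compare the columns the failing SELECT expects on the list table with the columns that table really has. The sketch below only does the comparison; it is a hypothetical helper (ColumnCheck is not part of CDO), and the "actual" column set in main is made up purely to show the mechanics. The real set would have to be read from the database, for example with JDBC's DatabaseMetaData.getColumns or by inspecting the table in the H2 console.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper: which columns does a query reference that the table lacks?
public class ColumnCheck {

    /** Collects the column names referenced as alias.column in the SQL text (upper-cased). */
    public static Set<String> referencedColumns(String sql, String alias) {
        Set<String> columns = new LinkedHashSet<>();
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(alias) + "\\.(\\w+)",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(sql);
        while (matcher.find()) {
            columns.add(matcher.group(1).toUpperCase(Locale.ROOT));
        }
        return columns;
    }

    /** Returns the referenced columns that are not in actualColumns (case-insensitive). */
    public static Set<String> missingColumns(String sql, String alias, Set<String> actualColumns) {
        Set<String> actualUpper = new LinkedHashSet<>();
        for (String column : actualColumns) {
            actualUpper.add(column.toUpperCase(Locale.ROOT));
        }
        Set<String> missing = referencedColumns(sql, alias);
        missing.removeAll(actualUpper);
        return missing;
    }

    public static void main(String[] args) {
        // Shortened form of the failing query from the log above.
        String sql = "SELECT l_t.cdo_id, l_t.cdo_idx, l_t.cdo_tag FROM XMLTypeDocumentRoot_mixed_list l_t, "
                + "XMLTypeDocumentRoot a_t WHERE a_t.cdo_id=l_t.cdo_source";
        // ASSUMPTION: invented column set for illustration only; read the real one from the database.
        Set<String> actual = new LinkedHashSet<>(Arrays.asList("CDO_ID", "CDO_IDX", "CDO_TAG"));
        System.out.println("Missing on l_t: " + missingColumns(sql, "l_t", actual));
    }
}
```

Knowing exactly which expected columns are absent from XMLTypeDocumentRoot_mixed_list would make it easier to tell whether the table was created with a different layout than the export query assumes.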

$cott


/*
 * Copyright (c) 2004 - 2011 Eike Stepper (Berlin, Germany) and others.
 * All rights reserved. This program and the accompanying materials
 * are made available under the terms of the Eclipse Public License v1.0
 * which accompanies this distribution, and is available at
 * http://www.eclipse.org/legal/epl-v10.html
 *
 * Contributors:
 * Eike Stepper - initial API and implementation
 */
package org.eclipse.emf.cdo.examples.server.offline;

import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Properties;

import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.apache.log4j.PropertyConfigurator;
import org.eclipse.emf.cdo.CDOObject;
import org.eclipse.emf.cdo.common.CDOCommonRepository;
import org.eclipse.emf.cdo.common.CDOCommonRepository.State;
import org.eclipse.emf.cdo.common.branch.CDOBranch;
import org.eclipse.emf.cdo.common.commit.CDOCommitInfo;
import org.eclipse.emf.cdo.net4j.CDONet4jUtil;
import org.eclipse.emf.cdo.net4j.CDOSession;
import org.eclipse.emf.cdo.net4j.CDOSessionConfiguration;
import org.eclipse.emf.cdo.session.CDORepositoryInfo;
import org.eclipse.emf.cdo.transaction.CDOTransaction;
import org.eclipse.emf.cdo.util.CDOUtil;
import org.eclipse.emf.cdo.util.CommitException;
import org.eclipse.emf.common.util.EList;
import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.spi.cdo.DefaultCDOMerger;
import org.eclipse.net4j.Net4jUtil;
import org.eclipse.net4j.connector.IConnector;
import org.eclipse.net4j.util.container.IManagedContainer;
import org.eclipse.net4j.util.event.IEvent;
import org.eclipse.net4j.util.event.IListener;
import org.eclipse.net4j.util.lifecycle.LifecycleUtil;

import com.humanfactor.rw.model.ecore.regatta.Regatta;
import com.humanfactor.rw.model.ecore.regatta.RegattaFactory;
import com.humanfactor.rw.model.ecore.regatta.util.RegattaResourceFactoryImpl;
import com.humanfactor.utils.exception.ExceptionUtil;

/**
 * The following console parameters are allowed: <br>
 * -automerge enables automatic merging of the offline changes into the master repository
 *
 * @author Eike Stepper
 * @author Martin Fluegge
 * @since 4.0
 */
public class OfflineClient {

    private static Logger logger = LogManager.getLogger(OfflineClient.class);

    public static final int PORT = 2037;

    private static CDOTransaction cloneTx;
    private static CDOTransaction masterTx;
    private static int versionCount = 1;
    private static boolean autoMerging;

    private static IManagedContainer cloneContainer = OfflineUtil.createContainer();
    private static IManagedContainer masterContainer = OfflineUtil.createContainer();

    private static void addObject(CDOTransaction tx) {
        try {
            Regatta regatta = RegattaFactory.eINSTANCE.createRegatta();
            tx.getOrCreateResource("/r1").getContents().add(regatta);

            logger.debug("Committing an object to " + tx.getBranch().getPathName());
            CDOCommitInfo commitInfo = tx.commit();
            CDOBranch branch = commitInfo.getBranch();
            System.out.println("Committed an object to " + branch.getPathName());
            tx.setBranch(branch);
        } catch (CommitException x) {
            throw new RuntimeException(x);
        }
    }

    private static void addRegatta(CDOTransaction tx) {
        try {
            String fileName = "regattas/Stotesbury-Cup-Regatta-2010-05-14.rml";
            Regatta regatta = loadRegatta(fileName);
            tx.getOrCreateResource("/regattas/Stotesbury-Cup-Regatta-2010-05-14-v" + versionCount++ + ".rml")
                    .getContents().add(regatta);

            System.out.println("Committing a regatta to " + tx.getBranch().getPathName());
            CDOCommitInfo commitInfo = tx.commit();
            CDOBranch branch = commitInfo.getBranch();
            System.out.println("Committed a regatta to " + branch.getPathName());
            tx.setBranch(branch);
        } catch (CommitException x) {
            throw new RuntimeException(x);
        }
    }

    private static void lockObject(CDOTransaction tx) {
        EList<EObject> contents = tx.getOrCreateResource("/r1").getContents();
        int size = contents.size();
        if (size < 1) {
            System.out.println("There are no objects; can't lock anything.");
            // Without this return, contents.get(size - 1) below would fail on an empty list.
            return;
        }

        System.out.println("Locking last object");
        CDOObject lastObject = CDOUtil.getCDOObject(contents.get(size - 1));
        lastObject.cdoWriteLock().lock();
        System.out.println("Locked last object");
    }

    private static void unlockObject(CDOTransaction tx) {
        EList<EObject> contents = tx.getOrCreateResource("/r1").getContents();
        int size = contents.size();
        if (size < 1) {
            System.out.println("There are no objects; can't unlock anything.");
            // Without this return, contents.get(size - 1) below would fail on an empty list.
            return;
        }

        System.out.println("Unlocking last object");
        CDOObject lastObject = CDOUtil.getCDOObject(contents.get(size - 1));
        lastObject.cdoWriteLock().unlock();
        System.out.println("Unlocked last object");
    }

    private static void createBranch(CDOTransaction tx) {
        CDOBranch subBranch = tx.getBranch().createBranch("sub.1");
        tx.setBranch(subBranch);
    }

    private static boolean isAutoMerge(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-automerge")) {
                return true;
            }
        }

        return false;
    }

    private static void createSessionListener(final CDOSession session, final boolean autoMerging) {
        session.addListener(new IListener() {
            private boolean wasOffline;

            public void notifyEvent(IEvent event) {
                if (event instanceof CDOCommonRepository.StateChangedEvent) {
                    CDOCommonRepository.StateChangedEvent e = (CDOCommonRepository.StateChangedEvent) event;
                    State newState = e.getNewState();
                    System.out.println("State changed to " + newState);
                    if (autoMerging) {
                        merge(session, newState);
                    }
                }
            }

            private void merge(final CDOSession session, State newState) {
                if (newState == State.ONLINE && wasOffline) {
                    try {
                        CDOTransaction newTransaction = session.openTransaction(session.getBranchManager()
                                .getMainBranch());

                        // CDOBranch mainBranch = session.getBranchManager().getMainBranch();
                        newTransaction.merge(getCloneTransaction().getBranch().getHead(),
                                new DefaultCDOMerger.PerFeature.ManyValued());

                        newTransaction.commit();
                        getCloneTransaction().close();
                        cloneTx = newTransaction;
                    } catch (CommitException ex) {
                        ex.printStackTrace();
                    } finally {
                        wasOffline = false;
                    }
                } else if (newState == State.OFFLINE) {
                    wasOffline = true;
                }
            }
        });
    }

    private static CDOTransaction createCloneTransaction() {
        System.out.println("Connecting to clone...");
        IConnector connector = Net4jUtil.getConnector(cloneContainer, AbstractOfflineServer.TRANSPORT_TYPE,
                "localhost:" + PORT);

        CDOSessionConfiguration configuration = CDONet4jUtil.createSessionConfiguration();
        configuration.setConnector(connector);
        configuration.setRepositoryName(OfflineCloneServer.NAME);

        CDOSession session = configuration.openSession();
        CDORepositoryInfo repositoryInfo = session.getRepositoryInfo();
        System.out.println("Connected to " + repositoryInfo.getName());

        CDOTransaction tx = session.openTransaction();
        tx.enableDurableLocking(true);
        // Only the clone session listens for repository state changes.
        createSessionListener(session, autoMerging);
        return tx;
    }

    private static CDOTransaction createMasterTransaction() {
        System.out.println("Connecting directly to master...");
        IConnector connector = Net4jUtil.getConnector(masterContainer, AbstractOfflineServer.TRANSPORT_TYPE,
                "localhost:" + OfflineMasterServer.PORT);
        CDOSessionConfiguration configuration = CDONet4jUtil.createSessionConfiguration();
        configuration.setConnector(connector);
        configuration.setRepositoryName(OfflineMasterServer.NAME);

        CDOSession session = configuration.openSession();
        CDORepositoryInfo repositoryInfo = session.getRepositoryInfo();
        System.out.println("Connected to " + repositoryInfo.getName());

        CDOTransaction tx = session.openTransaction();
        tx.enableDurableLocking(true);
        return tx;
    }

    // Lazily opens the master transaction on first use.
    public static CDOTransaction getMasterTransaction() {
        if (masterTx == null) {
            masterTx = createMasterTransaction();
        }
        return masterTx;
    }

    // Lazily opens the clone transaction on first use.
    public static CDOTransaction getCloneTransaction() {
        if (cloneTx == null) {
            cloneTx = createCloneTransaction();
        }
        return cloneTx;
    }

    public static void main(String[] args) throws Exception {
        // Configure log4j programmatically so the example needs no external properties file.
        Properties props = new Properties();
        props.setProperty("log4j.rootLogger", "DEBUG, stdout");
        props.setProperty("log4j.appender.stdout", "org.apache.log4j.ConsoleAppender");
        props.setProperty("log4j.appender.stdout.layout", "org.apache.log4j.PatternLayout");
        props.setProperty("log4j.appender.stdout.layout.ConversionPattern", "%-4r [%t] %-5p %c %x - %m%n");
        props.setProperty("log4j.logger.com.humanfactor.rw", "DEBUG");
        props.setProperty("log4j.logger.org.apache", "ERROR");

        PropertyConfigurator.configure(props);

        autoMerging = isAutoMerge(args);

        // Create the reader once; a new BufferedReader per iteration can swallow buffered input.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

        for (;;) {
            System.out.println();
            System.out.println("Enter a command:");
            System.out.println("0 - exit");
            System.out.println("1 - add an object to the repository");
            System.out.println("2 - lock the last object in the repository");
            System.out.println("3 - unlock the last object in the repository");
            System.out.println("4 - create a branch");
            System.out.println("5 - add a large regatta directly to the master repository");
            System.out.println("6 - add a large regatta directly to the clone repository");

            String command = in.readLine();
            // readLine() returns null at end of input; treat that like "exit".
            if (command == null || "0".equals(command)) {
                break;
            }

            if ("1".equals(command)) {
                addObject(getCloneTransaction());
            } else if ("2".equals(command)) {
                lockObject(getCloneTransaction());
            } else if ("3".equals(command)) {
                unlockObject(getCloneTransaction());
            } else if ("4".equals(command)) {
                createBranch(getCloneTransaction());
            } else if ("5".equals(command)) {
                addRegatta(getMasterTransaction());
            } else if ("6".equals(command)) {
                addRegatta(getCloneTransaction());
            }
        }

        // Close whichever sessions were opened, then shut down both containers.
        if (cloneTx != null) {
            cloneTx.getSession().close();
        }
        LifecycleUtil.deactivate(cloneContainer);

        if (masterTx != null) {
            masterTx.getSession().close();
        }
        LifecycleUtil.deactivate(masterContainer);
    }

    public static Regatta loadRegatta(String regattaFileName) {
        Regatta regatta;
        // Register the regatta resource factory for both supported file extensions.
        Resource.Factory.Registry.INSTANCE.getExtensionToFactoryMap().put("rml", new RegattaResourceFactoryImpl());
        Resource.Factory.Registry.INSTANCE.getExtensionToFactoryMap().put("rmz", new RegattaResourceFactoryImpl());

        URI fileURI = URI.createFileURI(new File(regattaFileName).getAbsolutePath());
        RegattaResourceFactoryImpl regattaResourceFactory = new RegattaResourceFactoryImpl();
        Resource regattaResource = regattaResourceFactory.createResource(fileURI);

        try {
            regattaResource.load(null);
            if (regattaResource.getContents().size() == 1) {
                regatta = (Regatta) regattaResource.getContents().get(0);
            } else if (regattaResource.getContents().isEmpty()) {
                logger.error("No regattas in file '" + regattaFileName + "'");
                regatta = null;
            } else {
                // More than one root object: report it and fall back to the first one.
                logger.error("File '" + regattaFileName + "' contains multiple ("
                        + regattaResource.getContents().size() + ") root objects; using the first");
                regatta = (Regatta) regattaResource.getContents().get(0);
            }
        } catch (IOException e) {
            logger.error(ExceptionUtil.getStackTrace(e), e);
            regatta = null;
        }
        return regatta;
    }
}
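
The session listener in the client above merges only when the clone repository comes back ONLINE after having been OFFLINE. A minimal standalone sketch of that state tracking follows. The MergeTrigger class and the local State enum are hypothetical stand-ins, not CDO types, so the logic can be run without a repository:

```java
import java.util.ArrayList;
import java.util.List;

public class MergeTriggerSketch {
    /** Local stand-in for the repository states printed in the logs; not the CDO enum. */
    enum State { OFFLINE, SYNCING, ONLINE }

    /** Hypothetical helper mirroring the wasOffline flag of the listener. */
    static class MergeTrigger {
        private boolean wasOffline;

        /** Returns true when the new state should trigger a merge. */
        boolean onStateChanged(State newState) {
            if (newState == State.ONLINE && wasOffline) {
                wasOffline = false; // the listener resets the flag after a merge attempt
                return true;
            }
            if (newState == State.OFFLINE) {
                wasOffline = true;
            }
            return false;
        }
    }

    /** Feeds a sequence of states through a fresh trigger and records each decision. */
    static List<Boolean> replay(State... states) {
        MergeTrigger trigger = new MergeTrigger();
        List<Boolean> decisions = new ArrayList<>();
        for (State s : states) {
            decisions.add(trigger.onStateChanged(s));
        }
        return decisions;
    }

    public static void main(String[] args) {
        // Initial sync, then an outage, then recovery: only the last ONLINE triggers a merge.
        System.out.println(replay(State.SYNCING, State.ONLINE, State.OFFLINE, State.SYNCING, State.ONLINE));
    }
}
```

In a SYNCING/OFFLINE loop that never reaches ONLINE, this trigger never fires, so no merge would be attempted.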
Re: Reconnecting from clone to master after CDO master server failure [message #837421 is a reply to message #837311] Thu, 05 April 2012 16:37
Eike Stepper
Hi Scott,

Looking at the failing SQL statement, it seems that you're using feature maps in your model. Feature maps are among the
features with the most complex implementation, and we know of some problems that we just don't have the resources to
investigate or fix adequately. I'm sure nobody has used them together with offline replication so far. Nevertheless, I'd
like to discuss the issue with Stefan Winkler, our DBStore expert. Perhaps the solution is as easy as replacing the
respective String constants in CDODBSchema.java as follows:

/**
* Field names of featuremap tables
*/
public static final String FEATUREMAP_REVISION_ID = LIST_REVISION_ID;
public static final String FEATUREMAP_VERSION = LIST_REVISION_VERSION;
public static final String FEATUREMAP_VERSION_ADDED = LIST_REVISION_VERSION_ADDED;
public static final String FEATUREMAP_VERSION_REMOVED = LIST_REVISION_VERSION_REMOVED;
public static final String FEATUREMAP_BRANCH = LIST_REVISION_BRANCH;
public static final String FEATUREMAP_IDX = LIST_IDX;
public static final String FEATUREMAP_TAG = LIST_FEATURE;
public static final String FEATUREMAP_VALUE = LIST_VALUE;

Can you please test this with your model and report here?

And please submit a bugzilla with your problem description and this formatted SQL:

SELECT
l_t.cdo_id,
l_t.cdo_branch,
l_t.cdo_version,
l_t.cdo_idx,
l_t.cdo_tag,
l_t.cdo_value_DATE,
l_t.cdo_value_VARCHAR,
l_t.cdo_value_BOOLEAN,
l_t.cdo_value_BIGINT,
l_t.cdo_value_SMALLINT,
l_t.cdo_value_TIMESTAMP,
l_t.cdo_value_DOUBLE,
l_t.cdo_value_TIME,
l_t.cdo_value_FLOAT,
l_t.cdo_value_BLOB,
l_t.cdo_value_CHAR,
l_t.cdo_value_INTEGER,
l_t.cdo_value_LONGVARCHAR,
l_t.cdo_value_CLOB
FROM
XMLTypeDocumentRoot_mixed_list l_t,
XMLTypeDocumentRoot a_t
WHERE
a_t.cdo_created BETWEEN 1 AND 1333579470612 AND
a_t.cdo_id=l_t.cdo_source AND
a_t.cdo_version=l_t.cdo_version AND
a_t.cdo_branch=l_t.cdo_branch
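
One reading of the statement above, offered as an inference from the SQL rather than a confirmed diagnosis: the select list names l_t.cdo_id, which suggests the feature map table calls its owner column cdo_id, while the join condition uses l_t.cdo_source. The sketch below reproduces that mismatch without CDO or a database. The class, its methods and the column set are hypothetical; only the column names are taken from the SQL.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class FeatureMapColumnSketch {
    // Column name used in the join condition of the failing statement.
    static final String LIST_REVISION_ID = "cdo_source";

    // Columns of the feature map table as they appear in the select list (value columns omitted).
    static final Set<String> FEATUREMAP_COLUMNS = new LinkedHashSet<>(
            Arrays.asList("cdo_id", "cdo_branch", "cdo_version", "cdo_idx", "cdo_tag"));

    /** Builds the join condition for the given owner column of the list table. */
    static String joinCondition(String ownerColumn) {
        return "a_t.cdo_id=l_t." + ownerColumn;
    }

    /** Returns true if the column exists in the given table. */
    static boolean columnExists(Set<String> tableColumns, String column) {
        return tableColumns.contains(column);
    }

    public static void main(String[] args) {
        System.out.println(joinCondition(LIST_REVISION_ID));
        System.out.println("column exists: " + columnExists(FEATUREMAP_COLUMNS, LIST_REVISION_ID));

        // Assumption: with the constants aliased, a newly created table would use the list name.
        Set<String> aliased = new LinkedHashSet<>(FEATUREMAP_COLUMNS);
        aliased.remove("cdo_id");
        aliased.add(LIST_REVISION_ID);
        System.out.println("column exists after aliasing: " + columnExists(aliased, LIST_REVISION_ID));
    }
}
```

If this reading is right, a database created before the change would presumably still carry the old column name, so it seems safest to test with fresh DB files.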

Sorry for the inconvenience!

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper



On 05.04.2012 16:04, scott@xxxxxxxx wrote:
> Thanks, Eike. I tried some scenarios using the Offline examples you suggested to see where I might have gone wrong.
>
> My test scenario is this:
>
> 0. Erase all master and clone DB files
> 1. Start the master server
> 2. Start the clone server
> 3. Start the offline client
> 4. Add an instance of my model to the CDO repository through the clone and wait for the transaction to commit.
> 5. Exit (Option 0) the master server
> 6. Restart the master server
>
> This scenario works fine with the one-node Company model instance provided with the examples, but when I use my own
> large model instance (2MB graph with 17K+ EObjects), step 6 causes the clone to go into a loop.
>
> Here's how it looks on the clone server console with tracing turned off. With tracing on, there are stack traces:
>
> Enter a command:
> 0 - exit
> 1 - connect repository to network
> 2 - disconnect repository from network
> 3 - dump repository infos
>
> Opened Session9 [master]
> State changed to SYNCING
> State changed to OFFLINE
> Opened Session10 [master]
> State changed to SYNCING
> State changed to OFFLINE
> Opened Session11 [master]
> State changed to SYNCING
> State changed to OFFLINE
> Opened Session12 [master]
> State changed to SYNCING
> State changed to OFFLINE
> Opened Session13 [master]
> State changed to SYNCING
> State changed to OFFLINE
> Opened Session14 [master]
> State changed to SYNCING
>
> The debugger shows these session threads accumulating in both the master and clone servers.
>
> I tried creating a larger Company instance, but that didn't replicate the problem.
>
> Here are the changes I made to the example code to use my model:
>
> 1. AbstractOfflineServer: I changed only the EPackage registration line, from CompanyPackage.eINSTANCE.getClass();
> to RegattaPackage.eINSTANCE.getRegatta();
> 2. OfflineExampleMaster: no changes
> 3. OfflineExampleClone: no changes
> 4. OfflineExampleClient: replaced the Company model with my model. I attached this file in case you want to see it.
>
> Here's a snippet from one of the loop iterations captured from the master server console with tracing turned on:
>
> Thread-12 [org.eclipse.spi.net4j.Channel] Handling buffer: Buffer@636[PUTTING] --> Channel[1, SERVER, cdo]
> Thread-12 [org.eclipse.internal.net4j.buffer.BufferPool] Obtained Buffer@637[INITIAL]
> Thread-12 [org.eclipse.spi.net4j.Channel] Handling buffer: Buffer@637[PUTTING] --> Channel[1, SERVER, cdo]
> [ERROR] org.h2.jdbc.JdbcSQLException: Column L_T.CDO_SOURCE not found; SQL statement:
> SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx, l_t.cdo_tag, l_t.cdo_value_DATE,
> l_t.cdo_value_VARCHAR, l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT, l_t.cdo_value_TIMESTAMP,
> l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME, l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
> l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB FROM XMLTypeDocumentRoot_mixed_list l_t,
> XMLTypeDocumentRoot a_t WHERE a_t.cdo_created BETWEEN 1 AND 1333579470612 AND a_t.cdo_id=l_t.cdo_source AND
> a_t.cdo_version=l_t.cdo_version AND a_t.cdo_branch=l_t.cdo_branch [42122-117]
> org.eclipse.net4j.db.DBException: org.h2.jdbc.JdbcSQLException: Column L_T.CDO_SOURCE not found; SQL statement:
> SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx, l_t.cdo_tag, l_t.cdo_value_DATE,
> l_t.cdo_value_VARCHAR, l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT, l_t.cdo_value_TIMESTAMP,
> l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME, l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
> l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB FROM XMLTypeDocumentRoot_mixed_list l_t,
> XMLTypeDocumentRoot a_t WHERE a_t.cdo_created BETWEEN 1 AND 1333579470612 AND a_t.cdo_id=l_t.cdo_source AND
> a_t.cdo_version=l_t.cdo_version AND a_t.cdo_branch=l_t.cdo_branch [42122-117]
> at org.eclipse.net4j.db.DBUtil.serializeTable(DBUtil.java:808)
> at org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.AbstractHorizontalMappingStrategy.rawExportList(AbstractHorizontalMappingStrategy.java:198)
> at org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.AbstractHorizontalMappingStrategy.rawExport(AbstractHorizontalMappingStrategy.java:179)
> at org.eclipse.emf.cdo.server.internal.db.DBStoreAccessor.rawExport(DBStoreAccessor.java:1107)
> at org.eclipse.emf.cdo.internal.server.Repository.replicateRaw(Repository.java:1113)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.ReplicateRepositoryRawIndication.responding(ReplicateRepositoryRawIndication.java:62)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.responding(CDOServerIndicationWithMonitoring.java:170)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.responding(IndicationWithMonitoring.java:90)
> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedOutput(IndicationWithResponse.java:96)
> at org.eclipse.net4j.signal.Signal.doOutput(Signal.java:296)
> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:65)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerReadIndicationWithMonitoring.execute(CDOServerReadIndicationWithMonitoring.java:36)
> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
> Caused by: org.h2.jdbc.JdbcSQLException: Column L_T.CDO_SOURCE not found; SQL statement:
> SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx, l_t.cdo_tag, l_t.cdo_value_DATE,
> l_t.cdo_value_VARCHAR, l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT, l_t.cdo_value_TIMESTAMP,
> l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME, l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
> l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB FROM XMLTypeDocumentRoot_mixed_list l_t,
> XMLTypeDocumentRoot a_t WHERE a_t.cdo_created BETWEEN 1 AND 1333579470612 AND a_t.cdo_id=l_t.cdo_source AND
> a_t.cdo_version=l_t.cdo_version AND a_t.cdo_branch=l_t.cdo_branch [42122-117]
> at org.h2.message.Message.getSQLException(Message.java:105)
> at org.h2.message.Message.getSQLException(Message.java:116)
> at org.h2.message.Message.getSQLException(Message.java:75)
> at org.h2.expression.ExpressionColumn.optimize(ExpressionColumn.java:128)
> at org.h2.expression.Comparison.optimize(Comparison.java:137)
> at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:132)
> at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:131)
> at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:131)
> at org.h2.command.dml.Select.prepare(Select.java:723)
> at org.h2.command.Parser.prepareCommand(Parser.java:235)
> at org.h2.engine.Session.prepareLocal(Session.java:415)
> at org.h2.engine.Session.prepareCommand(Session.java:376)
> at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1049)
> at org.h2.jdbc.JdbcStatement.executeQuery(JdbcStatement.java:70)
> at org.eclipse.net4j.db.DBUtil.serializeTable(DBUtil.java:785)
> ... 17 more
>
> With tracing turned on, the clone's log shows the same 'Column L_T.CDO_SOURCE not found' message.
>
> What could cause the 'Column L_T.CDO_SOURCE not found' problem?
>
> $cott
>
> On 4/4/2012 8:41 AM, Eike Stepper wrote:
>> Hi Scott,
>>
>> I've just tested with our (simple) offline clone examples:
>>
>> /org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleMaster.java
>>
>> /org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleClone.java
>>
>> /org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleClient.java
>>
>>
>> All seems to work fine. Have you tried that? Maybe you can find a problem in
>> your code by comparing it to these examples.
>>
>> More comments below...
>>
>>
>> On 02.04.2012 18:19, scott@xxxxxxxx wrote:
>>> I'm writing a sports timing application that needs to be resilient and
>>> highly available when deployed at the race course site --- even with
>>> unreliable network connections. CDO's offline clone and failover
>>> capabilities look like an excellent fit.
>>>
>>> I've regenerated this RCP-based application's EMF model to use native
>>> CDO, got it working with a direct connection to the CDO server. This
>>> CDO stuff is very cool!
>> Thanks ;-)
>>
>>> Getting the application's editors to work with Delete and deal with
>>> the InvalidObject's and StaleReferences has been a challenge, but I
>>> think I've found workarounds.
>> Remote deletes (detachments) are generally not easy to deal with in
>> applications that are not written for multi user access from scratch.
>> You should play with CDO's InvalidationPolicies and
>> StaleReferencePolicies, e.g.,
>>
>> CDOView.Options options = view.options();
>> options.setStaleReferencePolicy(CDOStaleReferencePolicy.PROXY);
>> options.setInvalidationPolicy(CDOInvalidationPolicy.RELAXED);
>>
>> Please note that I recently (CDO 4.1!) applied some smaller fixes that
>> can have a big positive impact:
>>
>> 374962: Make CDOStaleReferencePolicy.PROXY robust for eAdapters() calls
>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=374962
>>
>> 374965: Make detachment notifications configurable
>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=374965
>>
>> 375033: Remote notifications must be ignored in
>> CDOPostEventTransactionHandler
>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=375033
>>
>> 375034: Consolidate server-side exceptions for commit conflicts
>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=375034
>>
>>>
>>> Now I'm testing the use of CDO's offline clone capability. The cloned
>>> repository seems to work well with the master. A snippet from my
>>> application's log:
>>>
>>> 10:21:32 DEBUG (CDORepositoriesManager.java:199) - Cloned repository
>>> changed to state SYNCING
>>> 10:21:32 DEBUG (CDORepositoriesManager.java:199) - Cloned repository
>>> changed to state ONLINE
>>> [INFO] Synchronized with master.
>>> 10:21:32 DEBUG (CDOClient.java:93) - Successfully connected
>>> 10:21:32 DEBUG (CDOClient.java:56) - You are now connected to the
>>> server !
>>>
>>> For the first of my resiliency tests, I shut down the CDO master server
>>> and can see the clone begin to periodically retry creating its
>>> connection to the master server.
>>>
>>> After I bring the CDO server back up, I expected the clone repository
>>> to automatically reconnect and return to ONLINE status. However, even
>>> after I bring the CDO server back up, the retries continue, and the
>>> clone's synchronizer is never able to reconnect. Here's the log
>>> containing a stack trace during one of the retries:
>>>
>>> [INFO] Disconnected from master.
>>> 10:23:11 DEBUG (CDORepositoriesManager.java:199) - Cloned repository
>>> changed to state OFFLINE
>>> [WARN] Connection attempt failed. Retrying in 10 seconds...
>>> org.eclipse.net4j.connector.ConnectorException: Connection timeout
>>> after 10000 milliseconds
>>> at org.eclipse.spi.net4j.Connector.waitForConnection(Connector.java:245)
>>> at org.eclipse.spi.net4j.Connector.doBeforeOpenChannel(Connector.java:360)
>>> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:139)
>>> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:1)
>>> at org.eclipse.net4j.signal.SignalProtocol.open(SignalProtocol.java:139)
>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.createProtocol(CDONet4jSessionImpl.java:191)
>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.openSession(CDONet4jSessionImpl.java:216)
>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.doActivate(CDONet4jSessionImpl.java:123)
>>> at org.eclipse.net4j.util.lifecycle.Lifecycle.activate(Lifecycle.java:72)
>>> at org.eclipse.emf.internal.cdo.session.CDOSessionConfigurationImpl.openSession(CDOSessionConfigurationImpl.java:250)
>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionConfigurationImpl.openSession(CDONet4jSessionConfigurationImpl.java:98)
>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionConfigurationImpl.openSession(CDONet4jSessionConfigurationImpl.java:1)
>>> at org.eclipse.emf.cdo.internal.server.syncing.RepositorySynchronizer$ConnectRunnable.run(RepositorySynchronizer.java:350)
>>> at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:26)
>>> at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:1)
>>> at org.eclipse.net4j.util.concurrent.QueueWorker.doWork(QueueWorker.java:81)
>>> at org.eclipse.net4j.util.concurrent.QueueWorker.work(QueueWorker.java:72)
>>> at org.eclipse.net4j.util.concurrent.Worker$WorkerThread.run(Worker.java:206)
>>>
>>> [WARN] Connection attempt failed. Retrying in 10 seconds...
>>> org.eclipse.net4j.connector.ConnectorException: Connection timeout
>>> after 10000 milliseconds
>>> at org.eclipse.spi.net4j.Connector.waitForConnection(Connector.java:245)
>>> at org.eclipse.spi.net4j.Connector.doBeforeOpenChannel(Connector.java:360)
>>> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:139)
>>> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:1
>>>
>> Note that these log events have severity [WARN]. They do not necessarily
>> indicate a bug somewhere, unless you're really sure that the master
>> should be reachable. Can you connect a normal client to the master?
>>
>>>
>>>
>>>
>>> Is this a supported outage scenario that should work?
>> Yes.
>>
>>> If so, maybe you could point me in the right direction for debugging
>>> this one.
>> I would start on the networking layer by setting breakpoints in
>>
>> org.eclipse.net4j.internal.tcp.TCPAcceptor.handleRegistration(ITCPSelector,
>> ServerSocketChannel)
>> org.eclipse.net4j.internal.tcp.TCPAcceptor.handleAccept(ITCPSelector,
>> ServerSocketChannel)
>>
>> Also make sure that your nodes have enabled as many tracing options as
>> possible. Maybe that shows some more exceptions...
>>
>> Cheers
>> /Eike
>>
>> ----
>> http://www.esc-net.de
>> http://thegordian.blogspot.com
>> http://twitter.com/eikestepper
>>
>>
>


Re: Reconnecting from clone to master after CDO master server failure [message #838051 is a reply to message #837421] Fri, 06 April 2012 13:48
Scott Dybiec
I've opened a Bugzilla. The patch you provided below solved the
problem. Details are in the bug record:

https://bugs.eclipse.org/bugs/show_bug.cgi?id=376252

$cott

On 4/5/2012 12:37 PM, Eike Stepper wrote:
> Hi Scott,
>
> Looking at the failing SQL statement, it seems that you're using feature
> maps in your model. Feature maps are among the features with the most
> complex implementation, and we know of some problems that we just don't
> have the resources to investigate or fix adequately. I'm sure nobody has
> used them together with offline replication so far. Nevertheless, I'd
> like to discuss the issue with Stefan Winkler, our DBStore expert.
> Perhaps the solution is as easy as replacing the respective String
> constants in CDODBSchema.java as follows:
>
> /**
> * Field names of featuremap tables
> */
> public static final String FEATUREMAP_REVISION_ID = LIST_REVISION_ID;
> public static final String FEATUREMAP_VERSION = LIST_REVISION_VERSION;
> public static final String FEATUREMAP_VERSION_ADDED = LIST_REVISION_VERSION_ADDED;
> public static final String FEATUREMAP_VERSION_REMOVED = LIST_REVISION_VERSION_REMOVED;
> public static final String FEATUREMAP_BRANCH = LIST_REVISION_BRANCH;
> public static final String FEATUREMAP_IDX = LIST_IDX;
> public static final String FEATUREMAP_TAG = LIST_FEATURE;
> public static final String FEATUREMAP_VALUE = LIST_VALUE;
>
> Can you please test this with your model and report here?
>
> And please submit a bugzilla with your problem description and this
> formatted SQL:
>
> SELECT
> l_t.cdo_id,
> l_t.cdo_branch,
> l_t.cdo_version,
> l_t.cdo_idx,
> l_t.cdo_tag,
> l_t.cdo_value_DATE,
> l_t.cdo_value_VARCHAR,
> l_t.cdo_value_BOOLEAN,
> l_t.cdo_value_BIGINT,
> l_t.cdo_value_SMALLINT,
> l_t.cdo_value_TIMESTAMP,
> l_t.cdo_value_DOUBLE,
> l_t.cdo_value_TIME,
> l_t.cdo_value_FLOAT,
> l_t.cdo_value_BLOB,
> l_t.cdo_value_CHAR,
> l_t.cdo_value_INTEGER,
> l_t.cdo_value_LONGVARCHAR,
> l_t.cdo_value_CLOB
> FROM
> XMLTypeDocumentRoot_mixed_list l_t,
> XMLTypeDocumentRoot a_t
> WHERE
> a_t.cdo_created BETWEEN 1 AND 1333579470612 AND
> a_t.cdo_id=l_t.cdo_source AND
> a_t.cdo_version=l_t.cdo_version AND
> a_t.cdo_branch=l_t.cdo_branch
>
> Sorry for the inconvenience!
>
> Cheers
> /Eike
>
> ----
> http://www.esc-net.de
> http://thegordian.blogspot.com
> http://twitter.com/eikestepper
>
>
>
> On 05.04.2012 16:04, scott@xxxxxxxx wrote:
>> Thanks, Eike. I tried some scenarios using the Offline examples you
>> suggested to see where I might have gone wrong.
>>
>> My test scenario is this:
>>
>> 0. Erase all master and clone DB files
>> 1. Start the master server
>> 2. Start the clone server
>> 3. Start the offline client
>> 4. Add an instance of my model to the CDO repository through the clone
>> and wait for the transaction to commit.
>> 5. Exit (Option 0) the master server
>> 6. Restart the master server
>>
>> This scenario works fine with the one-node Company model instance
>> provided with the examples, but when I use my own large model instance
>> (2MB graph with 17K+ EObjects), step 6 causes the clone to go into a
>> loop.
>>
>> Here's how it looks on the clone server console with tracing turned
>> off. With tracing on, there are stack traces:
>>
>> Enter a command:
>> 0 - exit
>> 1 - connect repository to network
>> 2 - disconnect repository from network
>> 3 - dump repository infos
>>
>> Opened Session9 [master]
>> State changed to SYNCING
>> State changed to OFFLINE
>> Opened Session10 [master]
>> State changed to SYNCING
>> State changed to OFFLINE
>> Opened Session11 [master]
>> State changed to SYNCING
>> State changed to OFFLINE
>> Opened Session12 [master]
>> State changed to SYNCING
>> State changed to OFFLINE
>> Opened Session13 [master]
>> State changed to SYNCING
>> State changed to OFFLINE
>> Opened Session14 [master]
>> State changed to SYNCING
>>
>> The debugger shows these session threads accumulating in both the
>> master and clone servers.
>>
>> I tried creating a larger Company instance, but that didn't
>> replicate the problem.
>>
>> Here are the changes I made to the example code to use my model:
>>
>> 1. AbstractOfflineServer: I changed only the EPackage registration
>> line, from CompanyPackage.eINSTANCE.getClass(); to
>> RegattaPackage.eINSTANCE.getRegatta();
>> 2. OfflineExampleMaster: no changes
>> 3. OfflineExampleClone: no changes
>> 4. OfflineExampleClient: replaced the Company model with my model. I
>> attached this file in case you want to see it.
>>
>> Here's a snippet from one of the loop iterations captured from the
>> master server console with tracing turned on:
>>
>> Thread-12 [org.eclipse.spi.net4j.Channel] Handling buffer:
>> Buffer@636[PUTTING] --> Channel[1, SERVER, cdo]
>> Thread-12 [org.eclipse.internal.net4j.buffer.BufferPool] Obtained
>> Buffer@637[INITIAL]
>> Thread-12 [org.eclipse.spi.net4j.Channel] Handling buffer:
>> Buffer@637[PUTTING] --> Channel[1, SERVER, cdo]
>> [ERROR] org.h2.jdbc.JdbcSQLException: Column L_T.CDO_SOURCE not found;
>> SQL statement:
>> SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx,
>> l_t.cdo_tag, l_t.cdo_value_DATE, l_t.cdo_value_VARCHAR,
>> l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT,
>> l_t.cdo_value_TIMESTAMP, l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME,
>> l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
>> l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB
>> FROM XMLTypeDocumentRoot_mixed_list l_t, XMLTypeDocumentRoot a_t WHERE
>> a_t.cdo_created BETWEEN 1 AND 1333579470612 AND
>> a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND
>> a_t.cdo_branch=l_t.cdo_branch [42122-117]
>> org.eclipse.net4j.db.DBException: org.h2.jdbc.JdbcSQLException: Column
>> L_T.CDO_SOURCE not found; SQL statement:
>> SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx,
>> l_t.cdo_tag, l_t.cdo_value_DATE, l_t.cdo_value_VARCHAR,
>> l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT,
>> l_t.cdo_value_TIMESTAMP, l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME,
>> l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
>> l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB
>> FROM XMLTypeDocumentRoot_mixed_list l_t, XMLTypeDocumentRoot a_t WHERE
>> a_t.cdo_created BETWEEN 1 AND 1333579470612 AND
>> a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND
>> a_t.cdo_branch=l_t.cdo_branch [42122-117]
>> at org.eclipse.net4j.db.DBUtil.serializeTable(DBUtil.java:808)
>> at org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.AbstractHorizontalMappingStrategy.rawExportList(AbstractHorizontalMappingStrategy.java:198)
>> at org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.AbstractHorizontalMappingStrategy.rawExport(AbstractHorizontalMappingStrategy.java:179)
>> at org.eclipse.emf.cdo.server.internal.db.DBStoreAccessor.rawExport(DBStoreAccessor.java:1107)
>> at org.eclipse.emf.cdo.internal.server.Repository.replicateRaw(Repository.java:1113)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.ReplicateRepositoryRawIndication.responding(ReplicateRepositoryRawIndication.java:62)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.responding(CDOServerIndicationWithMonitoring.java:170)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.responding(IndicationWithMonitoring.java:90)
>> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedOutput(IndicationWithResponse.java:96)
>> at org.eclipse.net4j.signal.Signal.doOutput(Signal.java:296)
>> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:65)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerReadIndicationWithMonitoring.execute(CDOServerReadIndicationWithMonitoring.java:36)
>> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
>> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>> at java.lang.Thread.run(Unknown Source)
>> Caused by: org.h2.jdbc.JdbcSQLException: Column L_T.CDO_SOURCE not
>> found; SQL statement:
>> SELECT l_t.cdo_id, l_t.cdo_branch, l_t.cdo_version, l_t.cdo_idx,
>> l_t.cdo_tag, l_t.cdo_value_DATE, l_t.cdo_value_VARCHAR,
>> l_t.cdo_value_BOOLEAN, l_t.cdo_value_BIGINT, l_t.cdo_value_SMALLINT,
>> l_t.cdo_value_TIMESTAMP, l_t.cdo_value_DOUBLE, l_t.cdo_value_TIME,
>> l_t.cdo_value_FLOAT, l_t.cdo_value_BLOB, l_t.cdo_value_CHAR,
>> l_t.cdo_value_INTEGER, l_t.cdo_value_LONGVARCHAR, l_t.cdo_value_CLOB
>> FROM XMLTypeDocumentRoot_mixed_list l_t, XMLTypeDocumentRoot a_t WHERE
>> a_t.cdo_created BETWEEN 1 AND 1333579470612 AND
>> a_t.cdo_id=l_t.cdo_source AND a_t.cdo_version=l_t.cdo_version AND
>> a_t.cdo_branch=l_t.cdo_branch [42122-117]
>> at org.h2.message.Message.getSQLException(Message.java:105)
>> at org.h2.message.Message.getSQLException(Message.java:116)
>> at org.h2.message.Message.getSQLException(Message.java:75)
>> at org.h2.expression.ExpressionColumn.optimize(ExpressionColumn.java:128)
>> at org.h2.expression.Comparison.optimize(Comparison.java:137)
>> at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:132)
>> at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:131)
>> at org.h2.expression.ConditionAndOr.optimize(ConditionAndOr.java:131)
>> at org.h2.command.dml.Select.prepare(Select.java:723)
>> at org.h2.command.Parser.prepareCommand(Parser.java:235)
>> at org.h2.engine.Session.prepareLocal(Session.java:415)
>> at org.h2.engine.Session.prepareCommand(Session.java:376)
>> at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1049)
>> at org.h2.jdbc.JdbcStatement.executeQuery(JdbcStatement.java:70)
>> at org.eclipse.net4j.db.DBUtil.serializeTable(DBUtil.java:785)
>> ... 17 more
>>
>> With tracing turned on, the clone's log shows the same 'Column
>> L_T.CDO_SOURCE not found' message.
>>
>> What could cause the 'Column L_T.CDO_SOURCE not found' problem?
>>
>> $cott
>>
>> On 4/4/2012 8:41 AM, Eike Stepper wrote:
>>> Hi Scott,
>>>
>>> I've just tested with our (simple) offline clone examples:
>>>
>>> /org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleMaster.java
>>> /org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleClone.java
>>> /org.eclipse.emf.cdo.examples/src/org/eclipse/emf/cdo/examples/server/offline/OfflineExampleClient.java
>>>
>>> All seems to work fine. Have you tried that? Maybe you'll find a problem
>>> in your code by comparing it to these examples.
>>>
>>> More comments below...
>>>
>>>
>>> Am 02.04.2012 18:19, schrieb scott@xxxxxxxx:
>>>> I'm writing a sports timing application that needs to be resilient and
>>>> highly available when deployed at the race course site --- even with
>>>> unreliable network connections. CDO's offline clone and failover
>>>> capability look like an excellent fit.
>>>>
>>>> I've regenerated this RCP-based application's EMF model to use native
>>>> CDO, got it working with a direct connection to the CDO server. This
>>>> CDO stuff is very cool!
>>> Thanks ;-)
>>>
>>>> Getting the application's editors to work with Delete and deal with
>>>> the InvalidObject's and StaleReferences has been a challenge, but I
>>>> think I've found workarounds.
>>> Remote deletes (detachments) are generally not easy to deal with in
>>> applications that are not written for multi user access from scratch.
>>> You should play with CDO's InvalidationPolicies and
>>> StaleReferencePolicies, e.g.,
>>>
>>> CDOView.Options options = view.options();
>>> options.setStaleReferencePolicy(CDOStaleReferencePolicy.PROXY);
>>> options.setInvalidationPolicy(CDOInvalidationPolicy.RELAXED);
>>>
>>> Please note that I recently (CDO 4.1!) applied some smaller fixes that
>>> can have a big positive impact:
>>>
>>> 374962: Make CDOStaleReferencePolicy.PROXY robust for eAdapters() calls
>>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=374962
>>>
>>> 374965: Make detachment notifications configurable
>>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=374965
>>>
>>> 375033: Remote notifications must be ignored in
>>> CDOPostEventTransactionHandler
>>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=375033
>>>
>>> 375034: Consolidate server-side exceptions for commit conflicts
>>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=375034
>>>
>>>>
>>>> Now I'm testing the use of CDO's offline clone capability. The cloned
>>>> repository seems to work well with the master. A snippet from my
>>>> application's log:
>>>>
>>>> 10:21:32 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state SYNCING
>>>> 10:21:32 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state ONLINE
>>>> [INFO] Synchronized with master.
>>>> 10:21:32 DEBUG (CDOClient.java:93) - Successfully connected
>>>> 10:21:32 DEBUG (CDOClient.java:56) - You are now connected to the server !
>>>>
>>>> For the first of my resiliency tests, I shut down the CDO master server
>>>> and can see the clone begin to periodically retry creating its
>>>> connection to the master server.
>>>>
>>>> After I bring the CDO server back up, I expected the clone repository
>>>> to automatically reconnect and return to ONLINE status. However, even
>>>> after I bring the CDO server back up, the retries continue, and the
>>>> clone's synchronizer is never able to reconnect. Here's the log
>>>> containing a stack trace during one of the retries:
>>>>
>>>> [INFO] Disconnected from master.
>>>> 10:23:11 DEBUG (CDORepositoriesManager.java:199) - Cloned repository changed to state OFFLINE
>>>> [WARN] Connection attempt failed. Retrying in 10 seconds...
>>>> org.eclipse.net4j.connector.ConnectorException: Connection timeout after 10000 milliseconds
>>>> at org.eclipse.spi.net4j.Connector.waitForConnection(Connector.java:245)
>>>> at org.eclipse.spi.net4j.Connector.doBeforeOpenChannel(Connector.java:360)
>>>> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:139)
>>>> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:1)
>>>> at org.eclipse.net4j.signal.SignalProtocol.open(SignalProtocol.java:139)
>>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.createProtocol(CDONet4jSessionImpl.java:191)
>>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.openSession(CDONet4jSessionImpl.java:216)
>>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionImpl.doActivate(CDONet4jSessionImpl.java:123)
>>>> at org.eclipse.net4j.util.lifecycle.Lifecycle.activate(Lifecycle.java:72)
>>>> at org.eclipse.emf.internal.cdo.session.CDOSessionConfigurationImpl.openSession(CDOSessionConfigurationImpl.java:250)
>>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionConfigurationImpl.openSession(CDONet4jSessionConfigurationImpl.java:98)
>>>> at org.eclipse.emf.cdo.internal.net4j.CDONet4jSessionConfigurationImpl.openSession(CDONet4jSessionConfigurationImpl.java:1)
>>>> at org.eclipse.emf.cdo.internal.server.syncing.RepositorySynchronizer$ConnectRunnable.run(RepositorySynchronizer.java:350)
>>>> at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:26)
>>>> at org.eclipse.net4j.util.concurrent.QueueRunner.work(QueueRunner.java:1)
>>>> at org.eclipse.net4j.util.concurrent.QueueWorker.doWork(QueueWorker.java:81)
>>>> at org.eclipse.net4j.util.concurrent.QueueWorker.work(QueueWorker.java:72)
>>>> at org.eclipse.net4j.util.concurrent.Worker$WorkerThread.run(Worker.java:206)
>>>>
>>>> [WARN] Connection attempt failed. Retrying in 10 seconds...
>>>> org.eclipse.net4j.connector.ConnectorException: Connection timeout after 10000 milliseconds
>>>> at org.eclipse.spi.net4j.Connector.waitForConnection(Connector.java:245)
>>>> at org.eclipse.spi.net4j.Connector.doBeforeOpenChannel(Connector.java:360)
>>>> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:139)
>>>> at org.eclipse.spi.net4j.ChannelMultiplexer.openChannel(ChannelMultiplexer.java:1
>>>>
>>> Note that these log events have severity [WARN]. They do not necessarily
>>> indicate a bug somewhere, unless you're really sure that the master
>>> should be reachable. Can you connect a normal client to the master?
>>>
>>>>
>>>>
>>>>
>>>> Is this a supported outage scenario that should work?
>>> Yes.
>>>
>>>> If so, maybe you could point me in the right direction for debugging
>>>> this one.
>>> I would start on the networking layer by setting breakpoints in
>>>
>>> org.eclipse.net4j.internal.tcp.TCPAcceptor.handleRegistration(ITCPSelector, ServerSocketChannel)
>>> org.eclipse.net4j.internal.tcp.TCPAcceptor.handleAccept(ITCPSelector, ServerSocketChannel)
>>>
>>> Also make sure that your nodes have enabled as many tracing options as
>>> possible. Maybe that shows some more exceptions...
>>>
>>> Cheers
>>> /Eike
>>>
>>> ----
>>> http://www.esc-net.de
>>> http://thegordian.blogspot.com
>>> http://twitter.com/eikestepper
>>>
>>>
>>
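Before stepping through TCPAcceptor in a debugger, the question of whether the master is reachable at all after its restart can be narrowed down with a plain TCP probe run from the clone's machine. What follows is only a rough sketch and not part of CDO or Net4j: the class name MasterProbe, the host "localhost", and the port 2036 (assumed here to be the Net4j TCP port the master listens on) are placeholders to adjust, and the 10 second timeout and delay simply mirror the values in the log above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not a CDO/Net4j class: checks plain TCP reachability only.
final class MasterProbe {

    /** Returns true if a TCP connection to host:port is established within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolved host: treat all as "not reachable".
            return false;
        }
    }

    /** Probes up to maxAttempts times, sleeping delayMillis between failed attempts. */
    static boolean waitUntilReachable(String host, int port, int timeoutMillis,
                                      int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isReachable(host, port, timeoutMillis)) {
                return true;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delayMillis);
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults; pass the real master host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2036;

        boolean up = waitUntilReachable(host, port, 10_000, 3, 10_000L);
        System.out.println(host + ":" + port + (up ? " accepts" : " does not accept")
                + " TCP connections");
    }
}
```

A successful probe only shows that something is listening on that port again. It says nothing about the state of the repository behind it, so a clone that still fails to synchronize after the probe succeeds points at the session or repository layer rather than the network.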