[CDO] lock(long timeout) not timeout-ing... [message #1694764] |
Thu, 07 May 2015 15:43 |
Alexandre Borgoltz Messages: 31 Registered: July 2009 Location: France |
Member |
Hey everybody,
I am facing a strange problem with CDO write locking. I reproduced it with both CDO 4.3 and CDO 4.4.
I have several concurrent threads that simply try to add objects to a list.
The case
For example (Model1), say a Company has many Categories.
Each thread iterates N times.
At each iteration, each thread
- creates a new session and a new transaction in that session.
CDOSession session = openSession();
CDOTransaction transaction = session.openTransaction();
- retrieves the company
Company company = transaction.getObject(...);
CDOObject cdoObject = CDOUtil.getCDOObject(company);
- write-locks the company (note the LOCK_TIMEOUT parameter)
CDOLock lock = cdoObject.cdoWriteLock();
for (int retry = 0; retry < RETRIES; ++retry)
{
try
{
lock.lock(LOCK_TIMEOUT);
break;
}
catch (TimeoutException ex)
{
if (retry == RETRIES - 1)
{
return;
}
}
}
- creates a new Category
Category category = getModel1Factory().createCategory();
- adds the category to the company's list of categories
company.getCategories().add(category);
- commits
- Both session and transaction are closed at the end of the iteration.
finally
{
transaction.close();
transaction.getSession().close();
}
Note 1:
AutoReleaseLocks is enabled.
Note 2:
This test case is heavily inspired by the "LockingSequenceTest" from CDO's test suite. The only noticeable differences are that
- my test creates a new session (and, obviously, a new transaction) at each iteration
- instead of modifying a simple EAttribute, I add an item to an EReference
The problem we encounter is that after a short moment, no thread seems able to acquire the lock: they all seem to wait for the lock forever! It does not happen when modifying the EAttribute, only when adding to an EReference!
Note 3: .lock() method does not respect the timeout (?)
After some time analyzing, I found out that the problem shows up when one thread gets stuck in the call to .lock(long timeout): even though I set the timeout parameter to, say, 1000L, the method never returns...
More precisely, the lock() method ends up calling CDOViewImpl.lockObjects(...), which sends a "lock" message to the server. The result of the message contains a timestamp. lockObjects then calls waitForUpdate(theTimestamp), with no timeout defined.
I guess it is not a normal behaviour that the awaited update never comes in (maybe something went wrong on the server side and/or the timestamp is wrong?). And I think that it is the real problem.
But the fact is that lock(N) should actually timeout after N milliseconds anyway. I guess using waitForUpdate(long timestamp, long timeout) would help... of course it's not that simple because it would mean that either:
- lock returns "successfully" but within an out-of-date transaction
- lock throws a TimeoutException even though the write lock has actually been acquired.
In brief, I guess we'd have to replace the call
waitForUpdate(requiredTimestamp);
in CDOViewImpl.lockObjects(...) with something like
if (!waitForUpdate(requiredTimestamp, timeout))
{
// unlock
// throw a TimeoutException?
}
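To make the idea above more concrete, here is a minimal, self-contained sketch in plain Java. It does not use CDO at all: UpdateTracker, ServerLock and TimedLockSketch are hypothetical stand-ins I made up for the view, the server-side lock and lockObjects(...). It only illustrates the "wait with a deadline, then release the lock and throw" pattern, not how CDOViewImpl is actually implemented.

```java
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for a view that remembers the last update time it received. */
class UpdateTracker {
    private long lastUpdateTime;

    /** Called when an update arrives; wakes up any waiting thread. */
    synchronized void setLastUpdateTime(long timestamp) {
        if (timestamp > lastUpdateTime) {
            lastUpdateTime = timestamp;
            notifyAll();
        }
    }

    /** Waits until the view has reached requiredTimestamp; returns false on timeout. */
    synchronized boolean waitForUpdate(long requiredTimestamp, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (lastUpdateTime < requiredTimestamp) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // deadline passed, the update never came in
            }
            // Never pass 0 to wait(), which would mean "wait forever".
            wait(Math.max(1L, remainingNanos / 1_000_000L));
        }
        return true;
    }
}

class TimedLockSketch {
    /** Hypothetical server-side lock: lock() returns the timestamp the client must reach. */
    interface ServerLock {
        long lock();
        void unlock();
    }

    /** Acquires the lock, then waits for the update; gives the lock back if the wait fails. */
    static void lockObjects(ServerLock server, UpdateTracker view, long timeoutMillis)
            throws TimeoutException, InterruptedException {
        long requiredTimestamp = server.lock();
        boolean upToDate = false;
        try {
            upToDate = view.waitForUpdate(requiredTimestamp, timeoutMillis);
        } finally {
            if (!upToDate) {
                server.unlock(); // also runs if the waiting thread is interrupted
            }
        }
        if (!upToDate) {
            throw new TimeoutException("No update for timestamp " + requiredTimestamp
                    + " within " + timeoutMillis + " ms");
        }
    }
}
```

Releasing the lock before throwing avoids the second case listed above (a TimeoutException while the write lock is still held), at the price of an extra unlock round trip to the server.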
Note 4:
In the unit test I am attaching to this message, I have added a mechanism to detect hanging .lock() calls: I use a custom Thread that I start just before the call to .lock() and that I stop right after .lock() has returned. In the meantime, the thread measures time and logs when the theoretical maximum duration has been exceeded:
TimerThread timerThread = new TimerThread(LOCK_TIMEOUT);
try
{
timerThread.start();
lock.lock(LOCK_TIMEOUT);
break;
}
catch (TimeoutException ex)
{
// ...
}
finally
{
timerThread.done();
}
The corresponding log makes it obvious that at some point one call blocks forever, and then all the other threads exhaust their lock retries until they die...
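For readers who do not open the attachment, a watchdog like the TimerThread used above could look roughly like the following. This is only a sketch of the idea in plain Java, not necessarily the exact class from the attached test; the hasExceeded() accessor and the log text are assumptions added for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Sketch of a watchdog that logs if done() is not called within maxMillis. */
class TimerThread extends Thread {
    private final long maxMillis;
    private final CountDownLatch finished = new CountDownLatch(1);
    private volatile boolean exceeded;

    TimerThread(long maxMillis) {
        this.maxMillis = maxMillis;
        setDaemon(true); // never keep the JVM alive just for the watchdog
    }

    @Override
    public void run() {
        try {
            // await() returns false if the latch was not released in time.
            if (!finished.await(maxMillis, TimeUnit.MILLISECONDS)) {
                exceeded = true;
                System.err.println("lock() has not returned after " + maxMillis + " ms");
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    /** Called by the worker right after lock() returned or threw. */
    void done() {
        finished.countDown();
    }

    /** Hypothetical accessor: true once the maximum duration has been exceeded. */
    boolean hasExceeded() {
        return exceeded;
    }
}
```

Using a CountDownLatch means done() may safely be called before the watchdog has even started running; in that case await() returns immediately and nothing is logged.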
Note 5:
The problem does not appear with a small number of threads/additions, and the threshold depends on the machine. On a slow computer, I could reproduce it with 5 threads and 2 additions per thread. On a faster one, it takes 10 threads...
Conclusion
In the end, I don't know why waitForUpdate never returns in this very specific case. Am I doing something wrong?
Congratulations on being brave enough to read my whole message, and thank you very much in advance for your help!
--
Alexandre