Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[platform-dev] 4.26 JobManager implementation changes: Possible deadlock in IJobChangeListener

Attention: The JobManager implementation was changed! 
The old JobMangager implementation notified IJobChangeListener about various
IJobChangeEvent without strict order in various threads. For example
scheduled() was called in a thread that did call schedule() on a job while
done() was normally called within a worker thread. In case of a repeated
schedule() both notifications could run in parallel. That was a race
condition. The listener could not deterministically know if the job was
still scheduled or already done. As a consequence a join() could have
returned too early or the Progress View did not show jobs that did still
run. And probably many other things went wrong totally unnoticed. Even
multiple notifications could have happened in parallel in various worker
threads, as there is no guarantee that the next execution is done in the
same worker.
To fix this indeterministic behavior all events for the same Job instance
will now be sent one after the other. It is still not defined in which
thread job events will be sent, but the new implementation will not call any
IJobChangeListener until all pending events for the same job instance are
processed by all registered IJobChangeListener. The new implementation will
also wait with any job state change until all calls to IJobChangeListener
for that job returned.
The IJobChangeListener javadoc clearly specifies "whether the state change
occurs before, during, or after listeners are notified is unspecified." -
and the implementation changed within that broad specification.
Unfortunately some IJobChangeListener around rely on the old implementation.
They may now deadlock. For example the following snippet can now deadlock:
 job.schedule(); // may result in done();
 synchronized (lock){
  boolean schedule = ...; // lock needed for reasons
  if (schedule ) job.schedule(); // schedule() will notify scheduled()
  // - which may not return before previous Notifications return!
 }

 public void done(IJobChangeEvent event) {
  // can not enter while other Thread holds lock:
  synchronized (lock) {// possible deadlock
  }
 }
Instead use:
 job.schedule();
 boolean schedule;
 synchronized (lock) {
  schedule=...; // lock needed for reasons
 }
 if (schedule) job.schedule();

 public void done(IJobChangeEvent event) {
  synchronized (lock) {// OK
  }
 }
The same problem may occur on all IJobChangeListener methods and all methods
that result in IJobChangeEvent being sent: Job.schedule(), cancel(),
wakeUp(), sleep() and JobManager shutdown. Make sure all IJobChangeListener
implementations do not block and return promptly.
Due to the extreme risk the new implementation tries to identify
non-conforming IJobChangeListener and do a fall back: when an
IJobChangeEvent is not handled within 3 seconds a timeout error is logged
with stacktraces of the competing threads and the JobManager will no longer
wait - until JVM is restarted. Further calls to IJobChangeListener may occur
in non deterministic order and JobManager.join(family) can return too soon.
It is however possible, that there may also be some deadlocks in clients
code due to the changed JobManager implementation that may be undetected by
JobManager.
Also note that it is undefined in which thread IJobChangeListener methods
are called. Clients may have relied on the old implementation which called
done() in the UI (SWT) thread for UIJob - but that is not the case anymore -
it may happen in a background thread.

***It is recommended to check existing IJobChangeListener implementations
for possible regressions if updating to the new target platform.***

Jörg Kubitz



Back to the top