Copyright © 2006 Kenneth Ölwing
 Eclipse Corner Article

 

How to Correctly and Uniformly Use Progress Monitors

Summary
Handling a progress monitor instance is deceptively simple. It seems straightforward, but it is easy to make a mistake when using one. Depending on numerous factors, such as the underlying implementation, how it is displayed, whether it is set to use a fixed number of work items or ‘unknown’, whether it is used through a SubProgressMonitor wrapper and so on, the result can range from completely ok through mildly confusing to outright silly.

In this article I hope I can lay down a few ground rules that will help anyone use progress monitors in a way that will work with the explicit and implicit contract of IProgressMonitor. Also, understanding the usage side makes it easier to understand how to implement a monitor.

By Kenneth Ölwing, BEA JRPG
January 18, 2006


Using a progress monitor - what's up with that?

It all really comes down to a few, not too complex, rules. A common theme is 'know what you know - but only that'. This means that you shouldn't assume you know things you really don't know, and this includes the common mistake of only considering progress monitors you have seen, i.e. typically the graphical ones in the IDE. Another thing to watch out for: you commonly design a number of tasks that call each other using sub progress monitors, and while doing that you make assumptions based on your knowledge that they will be called in this manner. Never forget that your separate subtasks may some day be called from not-yet-written routines. It's then vitally important that your subtasks act in an exactly 'neutral' manner, i.e. with no 'implicit assumptions' about what happened before or what will happen after.

One of the motivations for this article came when I tried my hand at implementing a progress monitor intended for headless/console use, and realised that code using it could make it look really wacky when the monitor was wrongly used; these were issues that were not as readily apparent with a graphical monitor. Also, code (including my own) frequently abuses the explicit and implicit contract that the IProgressMonitor interface states (the implicit parts admittedly being my interpretation of reasonable behavior), and this makes for dicey decisions for a monitor implementor: should it complain (and how) when it gets conflicting orders? If not, how should it then behave to make for a reasonable and intuitive user experience?

The protocol of IProgressMonitor

Generally, all interaction with a progress monitor is through the interface IProgressMonitor and this interface defines the protocol behavior expected. It does leave some things up in the air though; for example, the description states some things that should be true, but the methods have no throws clause that helps enforce some invariants. I have chosen to interpret the descriptions ‘hard’, even to the point of saying it’s valid to throw an (unchecked) exception if a described rule is violated (this is somewhat controversial of course - if you implement a monitor doing this you should probably provide a way to turn off 'strictness'). Hopefully we could eventually see a new interface that deprecates the old methods and provides new ones that better reflect the contract. The discussion below is based on the assumption that the reader is familiar with the general API; review it in the Eclipse help.

The first important consideration is the realization that a monitor (contract wise) can be in basically four states. Any given implementation may or may not track those state changes and may or may not do anything about them, which is part of the reason that misbehaving users of a monitor sometimes get away with it. Only one of these states is readily testable through the interface, however (whether the monitor is canceled); the other states simply follow from correct use of the interface.

Essentially, the state changes are governed by the methods beginTask(), done() and setCanceled(), plus the implicit initial state of a new instance. Note that for the purposes discussed here the perceived ‘changes in state’ occurring as a result of calling worked() are not relevant. A separate discussion below details how to deal with worked() calls.

NB: The states described here are not any ‘officialese’ that can be found as constants or anything like that; they are named only so that they can be used in the discussion.
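As an illustration of what 'tracking state' could mean for an implementor, here is a minimal sketch of a strict monitor. Everything in it is an assumption made for the example: MiniMonitor is a cut-down stand-in for IProgressMonitor so the code compiles without Eclipse, StrictMonitor is a hypothetical class, FINISHED is an assumed state name next to PRISTINE and IN_USE, and cancellation is kept as a plain flag rather than a state of its own.

```java
// Minimal stand-in for the state-related part of IProgressMonitor, so the
// sketch compiles without Eclipse on the classpath (an assumption).
interface MiniMonitor {
    void beginTask(String name, int totalWork);
    void done();
    void setCanceled(boolean value);
    boolean isCanceled();
}

// Hypothetical strict monitor. FINISHED is an assumed state name, and
// cancellation is kept as a plain flag for brevity.
class StrictMonitor implements MiniMonitor {
    enum State { PRISTINE, IN_USE, FINISHED }

    private State state = State.PRISTINE;
    private boolean canceled;

    @Override
    public void beginTask(String name, int totalWork) {
        // The contract allows beginTask() only once per instance.
        if (state != State.PRISTINE) {
            throw new IllegalStateException("beginTask() called in state " + state);
        }
        state = State.IN_USE;
    }

    @Override
    public void done() {
        // Lenient on purpose: done() may arrive even if beginTask() failed.
        state = State.FINISHED;
    }

    @Override
    public void setCanceled(boolean value) {
        canceled = value;
    }

    @Override
    public boolean isCanceled() {
        return canceled;
    }

    State state() {
        return state;
    }
}

class StrictMonitorDemo {
    public static void main(String[] args) {
        StrictMonitor monitor = new StrictMonitor();
        monitor.beginTask("first", 3);
        try {
            monitor.beginTask("second", 3); // violates the contract
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        } finally {
            monitor.done();
        }
    }
}
```

A real implementation would probably want a switch to turn this strictness off, since much existing code calls beginTask() more than once.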


Now, one contract pattern described above is that if beginTask() is ever called, done() MUST be called. This is achieved by always following this code pattern (all code is simplified):
    monitor = …; // somehow get a new progress monitor which is in a pristine state
    // figure some things out such as number of items to process etc…
    try
    {
        monitor.beginTask(…);
        // do stuff and call worked() for each item worked on, and check for cancellation
    }
    finally
    {
        monitor.done();
    }
The important thing here is to ensure that done() is always called (by virtue of being in the finally clause) but (normally) only if beginTask() has been successfully called (by virtue of being the first thing called in the try clause). There is a small loophole that could cause done() to be called without the monitor actually transitioning from PRISTINE to IN_USE. With this pattern, the loophole can only occur if a particular beginTask() implementation throws an unchecked exception (the interface itself declares no throws clause) before it internally makes a note of the state change (if the specific implementation even tracks state in this manner and/or is too loose in its treatment of the interface contract).

Tip: Arguably, you should always strive to call beginTask()/done(). The reason is that you, in principle, never know when you are being called as a subtask. If you don't 'complete' the monitor, the parent can end up with an incorrect count for its own task. The full rationale is covered below, in the section "Ensure to always complete your monitor!".

Delegating use of a progress monitor to subtasks

Above, for the IN_USE state, I mentioned that it’s very easy to get things wrong; beginTask() should never be called more than once. This frequently happens in code that doesn’t correctly understand the implications of the contract. Specifically, such code passes on the same instance it has been given to subtasks, and those subtasks, not aware that the caller has already begun following the contract, also try to follow the contract in the expected manner – i.e. they start by doing a beginTask().

Thus, passing on a monitor instance is almost always wrong unless the code knows exactly what the implications are. So the rule becomes: In the general case, a piece of code that has received a progress monitor from a caller should always assume that the instance it is given is its own and thus completely follow the beginTask()/done() protocol. If it has subtasks that also need a progress monitor, they should be given their own monitor instances through further use of the SubProgressMonitor implementation, which wraps the ‘top-level’ monitor and correctly passes on worked() calls etc. (more on this below).

Sample code to illustrate this:
    monitor = …; // somehow get a new progress monitor which is in a pristine state
    // figure some things out such as number of items to process etc…
    try
    {
        monitor.beginTask(…);
        // do stuff and call worked() for each item processed, and check for cancellation

        // farm out a piece of the work that is logically done by ‘me’ to something else
        someThing.doWork(new SubProgressMonitor(monitor,…));
        // farm out another piece of the work that is logically done by ‘me’ to something else
        anotherThing.doWork(new SubProgressMonitor(monitor,…));
    }
    finally
    {
        monitor.done();
    }
Note that each doWork() call gets a new instance of a SubProgressMonitor; such instances cannot and should not be reused, for all the protocol reasons already discussed.

The only time a single instance of a monitor passed to, or retrieved by, a certain piece of code can be reused in multiple places (e.g. typically methods called by the original receiver) is when the code in such methods is so intimately coupled that the methods in effect constitute a single try/finally block. Also, for this to work each method must know exactly who does the beginTask()/done() calls, and also (don’t forget this) how many work items it represents of the total reported to beginTask() so that it can make the correct reports. Personally, I believe this is generally more trouble than it’s worth – always follow the regular pattern of one receiver, one unique monitor instead, and the code as a whole is more maintainable.

Managing the item count

This section is about how to do the initial beginTask() call and report the amount of total work expected, and then ideally report exactly that many items to the monitor. It is ok to end up not reporting all items in one particular case: when the job is aborted (due to cancellation by user, an exception thrown and so on) – this is normal and expected behavior and we will wind up in the finally clause where done() is called.

It is however sloppy technique to ‘just pick a number’ for the total and then call worked(), reporting a number and hoping that the total is never exceeded. A wrong guess in either direction can cause very erratic behavior of the absolute top level and user visible progress bar (it is for a human we’re doing this after all). If the total is too big compared to the actual items reported, a progress bar will move slowly, perhaps not at all due to scaling, and then suddenly (at the done() call) jump directly to completed. If the total is too small, the bar will quickly reach ‘100%’ or very close to it and then stay there ‘forever’.
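The arithmetic behind both failure modes can be sketched in a few lines. This is an illustration only: percentShown() is a hypothetical helper, and the assumption that a display computes an integer percentage and clamps it at 100 is mine, not something any particular monitor is known to do.

```java
class GuessedTotalDemo {
    // Percentage a simple bar might show: integer arithmetic, clamped at 100.
    // Both the formula and the clamping are assumptions for illustration.
    static int percentShown(int totalWork, int ticksReported) {
        long percent = (long) ticksReported * 100 / totalWork;
        return (int) Math.min(100, percent);
    }

    public static void main(String[] args) {
        // Total guessed far too big: all 50 real items are done, yet the bar
        // has not visibly moved and will jump to completed only at done().
        System.out.println("guess 10000, 50 done: " + percentShown(10_000, 50) + "%");

        // Total guessed too small: the bar is full after 10 of 500 real items
        // and then stays there for the remaining 490.
        System.out.println("guess 10, 10 done:    " + percentShown(10, 10) + "%");
        System.out.println("guess 10, 500 done:   " + percentShown(10, 500) + "%");
    }
}
```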

So, first and foremost: do not guess the number of work items. It’s a simple binary answer: either you know exactly how many things will be processed…or you don’t know. It IS ok to not know! If you don't know, just report IProgressMonitor.UNKNOWN as the total number, call worked() to your heart's content, and a clever progress monitor implementation will still do something useful with it. Note that each (sub)task can and should make its own decision on what it knows or not. If all are following the protocol it will ensure proper behavior at the outer, human visible end. A heads up though: never call the SubProgressMonitor(parentMonitor, subticks) constructor using IProgressMonitor.UNKNOWN for subticks - this is wrong! More on this later.

How to call beginTask() and worked()

There are typically two basic patterns where you know how many items you want to process: either you are going to call several different methods to achieve the full result, or you are going to call one method for each instance in a collection of some sort. Either way you know the total item count to process (the number of methods or the size of the collection). Variations of this are obviously combinations of these basic patterns so just multiply and sum it all up.

There is sometimes a benefit in scaling your total a bit. So, instead of reporting ‘3’ as the total (and doing worked(1) for each item) you may consider scaling with, say, 1000, and reporting ‘3000’ instead (and doing worked(1000) for each item). The benefit comes up when you are farming out work to subtasks through a SubProgressMonitor; since they may internally have a very different total, especially one that is much bigger than your total, you give them (and the monitor instance) some ‘room’ to more smoothly consume and display the allotment you’ve given them (how to mix worked() and SubProgressMonitor work is explained further below). Consider that you say ‘my total is 3’ and you then give a subtask ‘1’ of these to consume. If the subtask now reports several thousand worked() calls, and assuming that the actual human visible progress bar also has the room, the internal protocol between a SubProgressMonitor and its wrapped monitor will scale better and give smoother movement if you instead had given it ‘1000’ out of ‘3000’. Or not - the point is really that you don't know what monitor implementation will be active; all you can do is give some information. How it's then displayed in reality is a matter of how nifty the progress monitor implementation is.
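To make the ‘room’ argument concrete, here is a small sketch that counts how often a parent's tick count would change while a subtask reports its own work. It assumes a deliberately naive wrapper that only forwards whole parent ticks; that model and the visibleSteps() helper are assumptions for illustration, and a real wrapper may well forward fractional work and behave more smoothly.

```java
class ScalingDemo {
    // Counts how many times the parent's integer tick count changes while a
    // subtask reports subTotal units against an allotment of parent ticks.
    // Assumes a naive wrapper that forwards whole parent ticks only.
    static int visibleSteps(int allottedParentTicks, int subTotal) {
        int steps = 0;
        long forwarded = 0;
        for (int done = 1; done <= subTotal; done++) {
            // Whole parent ticks earned after 'done' units of subtask work.
            long earned = (long) done * allottedParentTicks / subTotal;
            if (earned > forwarded) {
                forwarded = earned;
                steps++;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // Unscaled: the subtask gets 1 of 3 ticks and reports 5000 units.
        System.out.println("allotment 1:    " + visibleSteps(1, 5000) + " update(s)");
        // Scaled by 1000: the subtask gets 1000 of 3000 ticks.
        System.out.println("allotment 1000: " + visibleSteps(1000, 5000) + " update(s)");
    }
}
```

Under this model the unscaled allotment produces a single jump at the very end of the subtask, while the scaled one produces a thousand small steps.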

A sample of simple calls:
    monitor = …; // somehow get a new progress monitor which is in a pristine state
    int total = 3; // hardcoded and known
    try
    {
        monitor.beginTask("", total); // beginTask() takes a task name and the total

        // item 1
        this.doPart1();
        monitor.worked(1);

        // item 2
        this.doPart2();
        monitor.worked(1);

        // item 3
        this.doPart3();
        monitor.worked(1);
    }
    finally
    {
        monitor.done();
    }
No reason to scale and no collection to dynamically compute.

A more elaborate sample:
    monitor = …; // somehow get a new progress monitor which is in a pristine state
    int total = thingyList.size() * 3 + 2; // three steps per thingy, plus one before and one after
    try
    {
        monitor.beginTask("", total);

        // item 1
        this.doBeforeAllThingies();
        monitor.worked(1);

        // items 2 to total-1
        for (Thingy t : thingyList)
        {
            t.doThisFirst();
            monitor.worked(1);
            t.thenDoThat();
            monitor.worked(1);
            t.lastlyDoThis();
            monitor.worked(1);
        }

        // final item
        this.doAfterAllThingies();
        monitor.worked(1);
    }
    finally
    {
        monitor.done();
    }

Mixing straightforward calls with subtasks

I was initially confused by how to report progress when I farmed out work to subtasks. I experienced ‘reporting too much work’ since I didn’t understand when to call and when to not call worked(). Once I caught on, the rule is very simple however: calling a subtask with a SubProgressMonitor is basically an implicit call to worked() with the amount allotted to the subtask. So instead of this:
    monitor = …; // somehow get a new progress monitor which is in a pristine state
    int scale = 1000;
    int total = 3; // hardcoded and known
    try
    {
        monitor.beginTask("", total * scale);

        // item 1
        this.doPart1();
        monitor.worked(1 * scale);

        // item 2
        this.doPart2(new SubProgressMonitor(monitor, 1 * scale)); // allot 1 item
        monitor.worked(1 * scale); // WRONG! Not needed, already managed by the SubProgressMonitor

        // item 3
        this.doPart3();
        monitor.worked(1 * scale);
    }
    finally
    {
        monitor.done();
    }
You should just leave out the second call to worked().
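The over-reporting is easy to demonstrate with stand-ins. In the sketch below, CountingMonitor and SubSketch are hypothetical, heavily simplified replacements for a real monitor and for SubProgressMonitor (assumed here to credit its whole allotment to the parent when the subtask calls done()), so that the example runs without Eclipse.

```java
// Hypothetical parent monitor that just adds up what it is told.
class CountingMonitor {
    int total;
    int worked;

    void beginTask(String name, int totalWork) { total = totalWork; }
    void worked(int ticks) { worked += ticks; }
    void done() { }
}

// Simplified stand-in for a sub monitor wrapper (an assumption): it ignores
// the subtask's own tick counts and credits the whole allotment to the
// parent once the subtask completes.
class SubSketch {
    private final CountingMonitor parent;
    private final int allotted;
    private boolean completed;

    SubSketch(CountingMonitor parent, int allotted) {
        this.parent = parent;
        this.allotted = allotted;
    }

    void beginTask(String name, int totalWork) { }
    void worked(int ticks) { }

    void done() {
        if (!completed) {
            completed = true;
            parent.worked(allotted);
        }
    }
}

class MixingDemo {
    // Runs the three-part task and returns how many ticks the parent saw.
    static int run(boolean extraWorkedCall) {
        int scale = 1000;
        CountingMonitor monitor = new CountingMonitor();
        try {
            monitor.beginTask("three parts", 3 * scale);

            monitor.worked(1 * scale);                      // item 1

            SubSketch sub = new SubSketch(monitor, 1 * scale);
            try {                                           // item 2, as a subtask
                sub.beginTask("part 2", 42);
            } finally {
                sub.done();
            }
            if (extraWorkedCall) {
                monitor.worked(1 * scale);                  // the mistake
            }

            monitor.worked(1 * scale);                      // item 3
        } finally {
            monitor.done();
        }
        return monitor.worked;
    }

    public static void main(String[] args) {
        System.out.println("correct: " + run(false) + " of 3000");
        System.out.println("wrong:   " + run(true) + " of 3000");
    }
}
```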

Tip: Never pass IProgressMonitor.UNKNOWN (or any other negative value) when creating a SubProgressMonitor wrapper!

A situation I ran into just the other day was when doing an IProgressMonitor.UNKNOWN number of things. I needed to call a subtask, and hence I set up to call it using a SubProgressMonitor(parent, subticks), but I realized that I had never considered how the sub monitor should be created - how many subticks it should be given - in the unknown case. I figured it should be ok to pass IProgressMonitor.UNKNOWN there also. However, when later trying my code I saw to my horror that my progress bar went backwards! Not the effect I figured on...

As it turns out, this is because the implementation (as of Eclipse 3.2M3) blindly uses the incoming ticks as a scaling factor. However, it goes haywire when it receives a negative value (and IProgressMonitor.UNKNOWN happens to have a value of -1). It does computations with it, and it ends up calling worked() with negative values, which my monitor tried to process...that code is now fixed to be more resilient in such cases. I've filed bug #119018 to request that SubProgressMonitor handle it better and/or document that negative values are a bad idea for the constructor call.

In any case, passing IProgressMonitor.UNKNOWN to the constructor is incorrect. If you have called beginTask() using IProgressMonitor.UNKNOWN, you can safely pass any reasonable tick value to a SubProgressMonitor; it will give the correct result.
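Why a negative tick count sends a bar backwards can be seen from the scaling arithmetic alone. The sketch below is a simplified model of proportional forwarding, not the actual SubProgressMonitor source; forwarded() is a hypothetical helper, and only the value -1 for UNKNOWN is taken from the real interface.

```java
class NegativeTicksDemo {
    // Same value as IProgressMonitor.UNKNOWN.
    static final int UNKNOWN = -1;

    // Parent work corresponding to 'work' units of a subtask whose own total
    // is subTotal, when the subtask was allotted parentTicks of the parent.
    // A simplified model of proportional scaling (an assumption).
    static double forwarded(int parentTicks, int subTotal, int work) {
        double scale = (double) parentTicks / subTotal;
        return scale * work;
    }

    public static void main(String[] args) {
        // Sensible allotment: half of the subtask is half of the 1000 ticks.
        System.out.println(forwarded(1000, 10, 5));
        // UNKNOWN as allotment: the scale is negative, so every report from
        // the subtask moves the parent backwards.
        System.out.println(forwarded(UNKNOWN, 10, 5));
    }
}
```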

Ensure to always complete your monitor!

Consider the concept described in the previous section: the important thing here is that basically, you say that you have three distinct and logical things to do, and then you tick them off - but one of the ticks is actually farmed out to a subtask through a SubProgressMonitor. You don't really know how many distinct and logical things the subtask has to do, nor should you care. The mechanics of using a SubProgressMonitor make the advancement of one of your ticks happen in the correct way. So, the end expectation is that once you reach the end of your three things, the monitor you have has actually fulfilled the count you intended - the internal state of it should reflect this: "the user said three things should happen and my work count is now indeed '3'".

But, as I recently found out, this can fail. Specifically, I blindly invoked IProject.build() on a project which had no builders configured. To this method I sent in a SubProgressMonitor and allotted it one 'tick' of mine. But, as it turned out, internally it never used the monitor it got, presumably because there was no work to perform - not very unreasonable in a sense. However, this did have the effect that one of my ticks never got, well, 'tocked' :-). I could solve this specific problem by simply checking whether there were any builders configured, and if there were none, advancing the tick with worked(1) instead. But that requires me, the caller, to make assumptions about the internal workings of the subtask, which is never good.

This is not a huge problem of course. But I think it would make sense to always act the same. The code behind IProject.build() could just call beginTask("", countOfBuilders) regardless of whether countOfBuilders is 0, iterate over the empty array or whatever, and then call done(). This would correctly advance my tick.
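The difference between a subtask that skips the monitor when idle and one that always completes it can be sketched as follows. TickCounter and SubTicks are hypothetical, simplified stand-ins (the sub monitor is assumed to credit its allotment to the parent on done()), and buildLazily()/buildAlways() are made-up names; none of this is the real IProject.build() code.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical parent monitor that just adds up reported ticks.
class TickCounter {
    int worked;

    void beginTask(String name, int totalWork) { }
    void worked(int ticks) { worked += ticks; }
    void done() { }
}

// Simplified sub monitor (an assumption): credits its allotment to the
// parent when the subtask calls done(), and only then.
class SubTicks {
    private final TickCounter parent;
    private final int allotted;
    private boolean completed;

    SubTicks(TickCounter parent, int allotted) {
        this.parent = parent;
        this.allotted = allotted;
    }

    void beginTask(String name, int totalWork) { }
    void worked(int ticks) { }

    void done() {
        if (!completed) {
            completed = true;
            parent.worked(allotted);
        }
    }
}

class CompleteDemo {
    // Subtask that never touches the monitor when there is nothing to do.
    static void buildLazily(List<Runnable> builders, SubTicks monitor) {
        if (builders.isEmpty()) {
            return;
        }
        buildAlways(builders, monitor);
    }

    // Subtask that follows the protocol even for zero builders.
    static void buildAlways(List<Runnable> builders, SubTicks monitor) {
        try {
            monitor.beginTask("", builders.size());
            for (Runnable builder : builders) {
                builder.run();
                monitor.worked(1);
            }
        } finally {
            monitor.done();
        }
    }

    // The caller has three things to do; the second is farmed out.
    static int parentTicksAfter(boolean lazySubtask) {
        TickCounter parent = new TickCounter();
        List<Runnable> noBuilders = Collections.emptyList();
        try {
            parent.beginTask("three things", 3);
            parent.worked(1);
            SubTicks sub = new SubTicks(parent, 1);
            if (lazySubtask) {
                buildLazily(noBuilders, sub);
            } else {
                buildAlways(noBuilders, sub);
            }
            parent.worked(1);
        } finally {
            parent.done();
        }
        return parent.worked;
    }

    public static void main(String[] args) {
        System.out.println("lazy subtask:     " + parentTicksAfter(true) + " of 3");
        System.out.println("complete subtask: " + parentTicksAfter(false) + " of 3");
    }
}
```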

Cancellation

The sample code above does not show cancellation checks. However, it is obviously recommended that users of a progress monitor actively check for cancellation so that they can break out of the operation in a timely manner. The more (potentially) long-running the operation, the more important this is of course. And remember: you don't know whether the operation is running in a context that allows it to be canceled or not - so you just have to code defensively. A sample of how it could look:
    monitor = …; // somehow get a new progress monitor which is in a pristine state
    try
    {
        monitor.beginTask("", thingyList.size());

        for (Thingy t : thingyList)
        {
            // check before each item so a cancel request is honored promptly
            if (monitor.isCanceled())
                throw new OperationCanceledException();
            t.doSomething();
            monitor.worked(1);
        }
    }
    finally
    {
        monitor.done();
    }

The NullProgressMonitor

A common pattern is to allow callers to skip sending a monitor, i.e. sending ‘null’. A simple and convenient way to deal with such calls is this:
    public void doIt(IProgressMonitor monitor)
    {
        // ensure there is a monitor of some sort
        if (monitor == null)
            monitor = new NullProgressMonitor();

        try
        {
            monitor.beginTask("", thingyList.size());

            for (Thingy t : thingyList)
            {
                if (monitor.isCanceled())
                    throw new OperationCanceledException();
                t.doSomething();
                monitor.worked(1);
            }
        }
        finally
        {
            monitor.done();
        }
    }

Conclusion

I believe that by diligently following these rules and patterns, you will never have a problem in using the progress monitor mechanism. Obviously, it requires implementations to follow the contract as well. But remember, if you mistreat the protocol you will sooner or later end up talking to a progress monitor implementation that is stern and will simply throw an exception or give strange visual effects if you call its beginTask() one time too many. That behavior is essentially valid if the IProgressMonitor interface description is to be believed – and you will get blamed by your customer…