Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[jdt-dev] Is this the right way to fix [extract method] bug?

Hi,

I currently use Eclipse as a Java IDE at work and have run into a
problem with the extract method refactoring not always giving the
result I would expect.

More specifically I ran into a variant of:
    https://bugs.eclipse.org/bugs/show_bug.cgi?id=285554
where extracting a statement from inside a nested loop do not properly
account for side effects, thereby causing changed functionality.

I had a discussion with a couple of coworkers at work when I hit this
bug, and they all said that they preferred to use IntelliJ for
refactoring work, as they had experienced the Eclipse refactoring
tools as too unreliable and that the IntelliJ refactorings works much
better.
I am however not that keen on switching to a proprietary solution, so
I have decided to try to figure out how to fix this bug as a hobby
project. This is my first venture into eclipse plugin programming. If
it becomes a success I might try to take on some more bug-fixing in
the refactoring code.

Currently I have gotten to the point where I think I know why it
fails. But before I venture further into trying to fix the bug, I
would like to hear if someone experienced with the refactoring code
can verify that I am on the right track.

The problem is exemplified as to extract the innermost statement
'list.add(a++);' in the following method:

    void foo(){
        ArrayList<Integer> list = new ArrayList<Integer>();
         for (Integer var: list) {
            int a = 0;
            for (int count = 0; count < 10; count++) {
                 list.add(a++);
             }
        }
    System.out.println(list.toString());
}

If there is only one loop the extracted method will (correctly) return
the modified value of the 'a' parameter and assign it to 'a' in the
inner loop.
In the above example however, the extracted method returns void and
the side effect (incrementing 'a') is lost.

I have narrowed the differences between the working and not working
case down to the result of the InputFlowAnalyzer as called from
ExtractMethodAnalyzer.computeOutput().
In the nested loop case, where it fails, it does not properly account
for the read of the local variable in the inner loop, which leads to
the (wrong) conclusion it is not necessary to return the modified
value from the extracted method.

So the problem is that the InputFlowAnalyzer class does not pick up
all variable uses from nested loops.

This in turn seems to be due to the way the analysis of loops is
separated out into a separate AST visitor called
LoopReentranceVisitor.
When InputFlowAnalyzer visits the first (outermost) loop it creates an
instance of the LoopReentranceVisitor:

    public boolean visit(ForStatement node) {
        createLoopReentranceVisitor(node);
        return super.visit(node);
    }

    private void createLoopReentranceVisitor(ASTNode node) {
        if (fLoopReentranceVisitor == null && fDoLoopReentrance &&
fSelection.coveredBy(node))
            fLoopReentranceVisitor= new
LoopReentranceVisitor(fFlowContext, fSelection, node);
    }

When visiting the nested loop the above method does nothing as we
already have a fLoopReentranceVisitor.
When we are finished traversing the nested loop we also do nothing in
endVisit() as this is not the outermost loop
(fLoopReentranceVisitor.getLoopNode() != node):

    public void endVisit(ForStatement node) {
        super.endVisit(node);
        handleLoopReentrance(node);
    }

    private void handleLoopReentrance(ASTNode node) {
        if (fLoopReentranceVisitor == null ||
fLoopReentranceVisitor.getLoopNode() != node)
            return;

        fLoopReentranceVisitor.process(node);
        GenericSequentialFlowInfo info= createSequential();
        info.merge(getFlowInfo(node), fFlowContext);
        info.merge(fLoopReentranceVisitor.getFlowInfo(node), fFlowContext);
        setFlowInfo(node, info);
    }

When we then finish the traversal of the outermost loop the
handleLoopReentrance() function will make the fLoopReentranceVisitor
traverse the loop too. This visitor now traverses the loops, handling
both the innermost and the outermost loops and attaches the flow info
results to the respective loop nodes.
But there is nothing that propagates/merges this result from the
nested loop node upwards in LoopReentranceVisitor, so that part of the
analysis is lost. Only the analysis of the outermost loop is merged
with the result of InputFlowAnalyzer's traversal of the sub-tree (in
handleLoopReentrance above) and is propagated upwards by
InputFlowAnalyzer when finishing visiting the parent nodes.

I tried (as an experiment to verify my understanding) to remove the
"fLoopReentranceVisitor.getLoopNode() != node" part of the
conditioning in handleLoopReentrance() so that the
fLoopReentranceVisitor got called for both the inner and the outer
loops and the info also merged while InputFlowAnalyzer is visiting the
nested loop. The result was that InputFlowAnalyzer propagated the
results upwards and this did in fact fix the bug, although it is not
the right solution, as it causes us to traverse the tree too many
times.

The problem is with all the loop constructs (for, enhanced-for, do and
while) as they are all handled the same way.

The main problem with the current solution as I see it, is that we
traverse the outermost loop twice with two different visitors, which
prevents the first visitor from picking up the results from the second
visitors handling of sub-nodes (nested loops) as the first visitor has
already finished the traversal of these.

I think the best solution is to move the analysis of the loops into
InputFlowAnalyzer and drop the separate LoopReentranceVisitor so the
outermost loop is only traversed once. Then we simply need to set a
reentry flag, while traversing the outer loop body, so we know we are
in a nested situation when handling inner loop(s).

This also makes the code easier to understand, as all node types will
be handled at the same level, instead of some being handled in a
separate visitor class. And it removes the complexities of having to
understand how the two traversals work together.

Do you agree in my analysis and do you agree in my proposed way to
solve the problem, or would you rather have it addressed in a
different way?

Best regards,
Dan N. Christensen


Back to the top