Home » Modeling » TMF (Xtext) » Performance issue in QueuedBuildData
Performance issue in QueuedBuildData [message #1739605] |
Wed, 03 August 2016 04:54  |
Eclipse User |
|
|
|
Hi there,
I have some performance issue in QueuedBuildData.
What am I trying to do?
I have more than 500k very simple resources in the workspace, balanced into 256 projects and I am trying to run a full clean build. (Please do not ask why I am doing that. I just wanted to try out whether Xtext can handle such a number of resources or not. It can. Indeed I had to customize the WorkspaceProjectsStateHelperand the DerivedStateAwareResourceDescriptionManager but it works fine. Pretty neat, isn't it? At least it is cool for me, I was not aware of that capability. Just as a side note, the full clean build time was around 40 minutes in my case.)
Where do I have exactly the issue?
In the QueuedBuildData#queueURI(URI) method. If I understand correctly, neither the uris, nor the values of the projectNameToChangedResource can contain duplicate URIs. The same rule applies on the corresponding containers for the copies. Currently, linked lists are being used for both the uris and the values of the projectNameToChangedResource. Unfortunately, it is pretty expensive to check the contains all the time (with such a number of resources) to avoid URI duplication. I did some profiling and the contains time is roughly the 20% of my overall build time. I was thinking about to replace it with a linked hash set. Actually, I have already done that, but I had to basically copy paste the whole class, replace the linked lists with linked hash sets and bind the QueuedBuildData to my custom implementation. Based on my experiences, it worked fine.
My question is the following:
Is there a more elegant way to customize the QueuedBuildData? If yes, can someone please advise how to achive that? If no, but you believe, it would make sense to provide some sort of mechanism to make it customizable, I would be happy to start working on a patch.
Thanks in advance for any kind of feedbacks!
Cheers,
Akos
|
|
| |
Re: Performance issue in QueuedBuildData [message #1739618 is a reply to message #1739611] |
Wed, 03 August 2016 06:49   |
Eclipse User |
|
|
|
Dear Karsten,
first off, thank you for your quick reply.
Secondly, I have to admit, I am extremely confused now...
"A linked hashset does not guarantee access order..."
But it guarantees iteration order, right? Or am I wrong? And we need only iteration order, and removing the first element and appending to the end of the collection, right? As I said, I am confused. Please reference the Javadoc of the LinkedHashSet.
" This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set. (An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.) "
Also, you can reference my extremely simple example from the main method.
public static void main(String[] args) {
LinkedHashSet<String> linkedHashSet = new LinkedHashSet<String>();
HashSet<String> hashSet = new HashSet<String>();
List<Integer> toInsert = Arrays.asList(10, 11, 12, 88, 89, 90, 99);
System.out.println("Item insertion order:\n" + Arrays.toString(toInsert.toArray()) + "\n");
for (int i : toInsert) {
linkedHashSet.add(Integer.toString(i));
hashSet.add(Integer.toString(i));
}
System.out.println(LinkedHashSet.class.getSimpleName() + " iteration order:\n"
+ Arrays.toString(linkedHashSet.toArray()) + "\n");
System.out.println(
HashSet.class.getSimpleName() + " iteration order:\n" + Arrays.toString(hashSet.toArray()) + "\n");
System.out.println("____________________________________________________________________");
System.out.println("Reverse the order of added items and re-add them again. (In reverse order.)");
Collections.reverse(toInsert);
System.out.println("Item insertion order:\n" + Arrays.toString(toInsert.toArray()) + "\n");
for (int i : toInsert) {
linkedHashSet.add(Integer.toString(i));
hashSet.add(Integer.toString(i));
}
System.out.println(LinkedHashSet.class.getSimpleName() + " iteration order:\n"
+ Arrays.toString(linkedHashSet.toArray()) + "\n");
System.out.println(HashSet.class.getSimpleName() + " iteration order:\n" + Arrays.toString(hashSet.toArray()));
}
Furthemore, I believe using a UniqueEList does not change anything. The contains method is still checked at the base class's org.eclipse.emf.common.util.BasicEList.contains(Object) method. Which is again, an iteration (O(n)).
I was rather thinking about such a solution:
import java.util.AbstractQueue;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
import org.junit.Assert;
import org.junit.Test;
public class LinkedHashQueueTest extends Assert {
@Test
public void checkIterationOrder() {
final Queue<String> queue = new LinkedList<String>();
final LinkedHashSet<String> linkedHashSet = new LinkedHashSet<String>();
final List<Integer> toInsert = Arrays.asList(10, 11, 12, 88, 89, 90, 99);
for (final int i : toInsert) {
linkedHashSet.add(Integer.toString(i));
queue.add(Integer.toString(i));
}
final Queue<String> hashQueue = new LinkedHashQueue<String>(linkedHashSet);
final ArrayList<String> queuePollOrder = new ArrayList<String>();
final ArrayList<String> hashQueuePollOrder = new ArrayList<String>();
while (!queue.isEmpty()) {
queuePollOrder.add(queue.poll());
}
while (!hashQueue.isEmpty()) {
hashQueuePollOrder.add(hashQueue.poll());
}
assertTrue(queue.isEmpty());
assertTrue(hashQueue.isEmpty());
final String[] expecteds = new String[] { "10", "11", "12", "88", "89", "90", "99" };
assertArrayEquals(expecteds, queuePollOrder.toArray());
assertArrayEquals(expecteds, hashQueuePollOrder.toArray());
assertArrayEquals(queuePollOrder.toArray(), hashQueuePollOrder.toArray());
}
public class LinkedHashQueue<E> extends AbstractQueue<E> {
private final LinkedHashSet<E> delegate;
public LinkedHashQueue(final LinkedHashSet<E> delegate) {
this.delegate = delegate;
}
@Override
public boolean offer(final E e) {
return delegate.add(e);
}
@Override
public E poll() {
final Iterator<E> itr = delegate.iterator();
if (itr.hasNext()) {
final E next = itr.next();
itr.remove();
return next;
}
return null;
}
@Override
public E peek() {
final Iterator<E> itr = delegate.iterator();
if (itr.hasNext()) {
return itr.next();
}
return null;
}
@Override
public Iterator<E> iterator() {
return delegate.iterator();
}
@Override
public int size() {
return delegate.size();
}
/**
* This is overridden due to performance issues. To avoid iteration and
* check containment with a simple hash.
*
* <p>
* {@inheritDoc}
*/
@Override
public boolean contains(final Object o) {
return delegate.contains(o);
}
}
}
I know, I have attached a bit too much code to this thread, but I believe this thread definitely requires upfront clarification.
As always any kind of feedback is welcome.
Cheers,
Akos
|
|
| | | | | |
Goto Forum:
Current Time: Wed Jul 23 08:24:35 EDT 2025
Powered by FUDForum. Page generated in 0.17054 seconds
|