Eclipse Community Forums
Mylyn performance on network drives? [message #15845] Wed, 18 July 2007 19:53
Kim Sullivan
I've been having performance problems using Mylyn with my workspace
placed on a network drive (I assume that those problems come from having
a remote workspace, I might be mistaken).

First, there was an issue with tooltips (I had them accidentally set to
wait for 0ms before showing, and Mylyn was trying to load too many
tooltips at once - over the network, which caused some lag). Setting a
more reasonable tooltip delay time fixed this
(https://bugs.eclipse.org/bugs/show_bug.cgi?id=196109)

But Eclipse was still quite unresponsive - I even deleted all the
queries to bugs.eclipse.org so the system wouldn't spend time
synchronizing them, without success (and it was still spending
considerable time "synchronizing 0 queries").

The weird thing was that I kept getting change notifications for
tasks/bugs that I had already deleted from the task list (or so I presumed).

So I used Total Commander to look into the offline folder of the Mylyn
metadata - and lo and behold, even getting a directory listing for all
of the 3300 stored bugs took a few seconds. No wonder Eclipse was so
sluggish (I assume it was going through the list of 3300 offline bugs
every time and looking for changes... even though I had none of those
bugs or queries in my task list). I even had to increase the VM size,
because I was getting heap space errors from Mylyn (last time I checked,
Eclipse had about 500MB in memory and about 700MB virtual memory total).

So I manually deleted all of the 3k files in the offline directory. Now
Mylyn still spends some time "synchronizing" but at least Eclipse stays
responsive.

My question is - can something be done about this, maybe limiting the
amount of files stored offline? Or is it just an issue of me having my
workspace on a network drive? Is there something I can do to diagnose
the problem further?

Regards,
Kim Sullivan
Re: Mylyn performance on network drives? [message #15886 is a reply to message #15845] Wed, 18 July 2007 21:32
Eclipse User
Originally posted by: alex_blewitt.yahoo.com

Having anything on a network drive is a bad idea generally.

That being said, Windows has serious performance issues when dealing with more than 500 files in the same directory. This should be raised as a bug at https://bugs.eclipse.org with the name of the folder that you found, because it needs to be fixed. It shouldn't be generating that much data in one directory.

Alex.
Re: Mylyn performance on network drives? [message #17399 is a reply to message #15886] Thu, 19 July 2007 19:22
Eclipse User
Originally posted by: beatmik.acm.org

Alex,

Note that it is Mylyn generating all these files and storing them in
.metadata/.mylyn/offline/<repository-url>/. They are offline copies of
bug reports in the Task List. We don't currently foresee any problem
with having thousands of files in these directories because we are not
aware of Windows having trouble accessing large directories when you
access by name instead of trying to do a directory listing (e.g.
Windows/System32 has thousands of files). But if you know of any
indications or evidence to the contrary please let us know.

Mik

Alex Blewitt wrote:
> Having anything on a network drive is a bad idea generally.
>
> That being said, Windows has serious performance issues when dealing with more than 500 files in the same directory. This should be raised as a bug at https://bugs.eclipse.org with the name of the folder that you found, because it needs to be fixed. It shouldn't be generating that much data in one directory.
>
> Alex.
Re: Mylyn performance on network drives? [message #17408 is a reply to message #15845] Thu, 19 July 2007 19:32
Eclipse User
Originally posted by: beatmik.acm.org

Hi Kim,

As with other things that are kept in Eclipse's .metadata folder, we
expect fast I/O access to the offline cache (what Alex was
alluding to). But one of the key reasons that we broke the offline
cache into separate files, instead of maintaining one large file, is to
reduce the need for file I/O, since the files only need to be accessed
as you use them or as the bug reports get changed on disk. The only
large update to the cache happens when you first create a query, since
all of the data for the corresponding tasks will be asynchronously
retrieved at that point.

Sounds like you've got three options:

1) Move your workspace to a local disk.

2) Move Mylyn's task data folder to a local disk (via Preferences ->
Task List -> Advanced)

3) Help us figure out how to better support a non-local cache by filing
a bug report. For example, we could try to remove any unneeded
synchronous access to the cache.

Mik

Kim Sullivan wrote:
> I've been having performance problems using Mylyn with my workspace
> placed on a network drive (I assume that those problems come from having
> a remote workspace, I might be mistaken).
>
> First, there was an issue with tooltips (I had them accidentally set to
> wait for 0ms before showing, and Mylyn was trying to load too many
> tooltips at once - over the network, which caused some lag). Setting a
> more reasonable tooltip delay time fixed this
> (https://bugs.eclipse.org/bugs/show_bug.cgi?id=196109)
>
> But Eclipse was still quite unresponsive - I even deleted all the
> queries to bugs.eclipse.org so the system wouldn't spend time
> synchronizing them, without success (and it was still spending
> considerable time "synchronizing 0 queries").
>
> The weird thing was that I kept getting change notifications for
> tasks/bugs that I had already deleted from the task list (or so I
> presumed).
>
> So I used Total Commander to look into the offline folder of the Mylyn
> metadata - and lo and behold, even getting a directory listing for all
> of the 3300 stored bugs took a few seconds. No wonder Eclipse was so
> sluggish (I assume it was going through the list of 3300 offline bugs
> every time and looking for changes... even though I had none of those
> bugs or queries in my task list). I even had to increase the VM size,
> because I was getting heap space errors from Mylyn (last time I checked,
> eclipse had about 500MB in memory and about 700MB virtual memory total).
>
> So I manually deleted all of the 3k files in the offline directory. Now
> Mylyn still spends some time "synchronizing" but at least Eclipse stays
> responsive.
>
> My question is - can something be done about this, maybe limiting the
> amount of files stored offline? Or is it just an issue of me having my
> workspace on a network drive? Is there something I can do to diagnose
> the problem further?
>
> Regards,
> Kim Sullivan
Re: Mylyn performance on network drives? [message #17422 is a reply to message #17399] Thu, 19 July 2007 23:07
Eclipse User
Originally posted by: alex_blewitt.yahoo.com

I know that it can cause problems, having worked through these recently :-) It's also fairly easy to test; create a directory with 5000 files, then open it up in Windows Explorer. Witness the amount of time it takes to do a directory traversal and generate the data. Any file system operation that hits (or attempts to hit) any file in that directory will be blocked for seconds as the file system responds.

Indeed, it's trivial to have a Java class that will demo this; simply 'touch' thousands of files, then print out the time stamp, followed by File.exists, followed by the timestamp.

Alex.
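
A minimal sketch of the kind of test described above, not code from the original poster. The class name ExistsTiming, the method timeExists and the "bug-N.zip" file names are hypothetical. It touches a number of empty files, then times a single File.exists() call on one of them, addressed by name. Note that the operating system may have cached the directory after the files were created, so a fresh process or a remote drive may give different numbers.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class; its name and the "bug-N.zip" naming are illustrative.
class ExistsTiming {

    // Ensures `count` empty files exist in `dir`, then returns the time in
    // nanoseconds taken by one File.exists() call on a file addressed by name.
    static long timeExists(Path dir, int count) throws IOException {
        for (int i = 0; i < count; i++) {
            Path p = dir.resolve("bug-" + i + ".zip");
            if (Files.notExists(p)) {
                Files.createFile(p); // the 'touch' step
            }
        }
        File target = dir.resolve("bug-" + (count / 2) + ".zip").toFile();
        long start = System.nanoTime();
        boolean found = target.exists();
        long elapsed = System.nanoTime() - start;
        if (!found) {
            throw new IllegalStateException("expected file is missing: " + target);
        }
        return elapsed;
    }

    public static void main(String[] args) throws IOException {
        // Pass a directory on the drive under test; defaults to a temp directory.
        Path dir = args.length > 0
                ? Files.createDirectories(Paths.get(args[0]))
                : Files.createTempDirectory("exists-timing");
        long nanos = timeExists(dir, 5000);
        System.out.println("File.exists() took " + (nanos / 1000) + " microseconds");
    }
}
```

Running it once against a local directory and once against a directory on the network drive would show whether access by name is actually slow in a large directory.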
Re: Mylyn performance on network drives? [message #17427 is a reply to message #17422] Thu, 19 July 2007 23:11
Eclipse User
Originally posted by: alex_blewitt.yahoo.com

NB it's also fairly easy to work around; simply create several subdirectories, e.g. a-z or 1-9, to partition the file space in some way. BTW this also helps performance on Linux systems, though it's less noticeable because Linux file systems are generally not as bad as Windows ones are.

Alex.
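
One way the partitioning idea could look, as a sketch only: the class ShardedPath, the method shardedPath and the choice of hashing the file name into numbered buckets are assumptions for illustration, not anything Mylyn does. Each file lands in <root>/<bucket>/<fileName>, so no single directory holds all the files.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; bucket naming and hashing scheme are illustrative.
class ShardedPath {

    // Maps a file name to <root>/<bucket>/<fileName>, where the bucket is one
    // of `buckets` numbered subdirectories chosen from the name's hash code.
    static Path shardedPath(Path root, String fileName, int buckets) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(fileName.hashCode(), buckets);
        return root.resolve(Integer.toString(bucket)).resolve(fileName);
    }

    public static void main(String[] args) {
        Path root = Paths.get("offline");
        System.out.println(shardedPath(root, "196109.zip", 16));
        System.out.println(shardedPath(root, "186425.zip", 16));
    }
}
```

Because the bucket is derived from the name alone, a file can still be opened directly by path without listing any directory.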
Re: Mylyn performance on network drives? [message #17450 is a reply to message #17422] Fri, 20 July 2007 14:40
Eclipse User
Originally posted by: eclipse-news.mark-kirchner.de

Alex Blewitt wrote:
> create a directory with 5000 files, then open it up in Windows
> Explorer. Witness the amount of time it takes to do a directory
> traversal and generate the data. Any file system operation that hits
> (or attempts to hit) any file in that directory will be blocked for
> seconds as the file system responds.

Well, no, not necessarily. If you are trying to access the file /by its
name/ there should be absolutely no need for the OS / the filesystem to
generate a directory listing.

On the other hand, if you open up the directory in Windows Explorer it
obviously has to generate such a listing which takes time. So these two
cases are not comparable. And BTW: Create a sufficiently large directory
on Linux and you'll get the same problems, too.

Regards,
Mark
Re: Mylyn performance on network drives? [message #17459 is a reply to message #17408] Sat, 21 July 2007 10:45
Kim Sullivan
Hi Mik,

Mik Kersten wrote:
> As with other things that are kept in Eclipse's .metadata folder, we
> expect fast I/O access to the offline cache (what Alex was
> alluding to). But one of the key reasons that we broke the offline
> cache into separate files, instead of maintaining one large file, is to
> reduce the need for file I/O, since the files only need to be accessed
> as you use them or as the bug reports get changed on disk. The only
> large update to the cache happens when you first create a query, since
> all of the data for the corresponding tasks will be asynchronously
> retrieved at that point.

Thanks for the info - especially the fact that Mylyn doesn't actually
enumerate the offline folder's contents. I'll move my workspace to my
local disk - I'm not really sure that removing synchronous disk access
will help much, even if Eclipse stays responsive, Mylyn still has to
either load the data from/to disk via the network (slow), or cache it in
memory (increased memory usage and/or swapping to disk).

I noticed one thing though - after moving the workspace, Mylyn seems to
have stopped trying to synchronize 0 queries (and downloading new
offline tasks, even though I deleted all queries). I'm not sure if this
is just a coincidence (no bugs appeared in the last two days, or none of
the bugs that had previously been in the cache have been changed), a
result of quicker access to disk (synchronization is nearly
instantaneous), the fact that I changed the eclipse.org task
repository to offline and then back online, manually deleting tasks from
disk or something completely different.

I'm new to Mylyn, so I have no idea how it is supposed to work, and if I
should try to reproduce the former behavior somehow for a bug report. Is
Mylyn supposed to keep all offline tasks in sync with the repository,
regardless of the fact that they're deleted from the task list?

Kim
Re: Mylyn performance on network drives? [message #17469 is a reply to message #17459] Sat, 21 July 2007 11:38
Kim Sullivan
Kim Sullivan wrote:
> Hi Mik,
>
> Mik Kersten wrote:
>> As with other things that are kept in Eclipse's .metadata folder, we
>> expect fast I/O access to the offline cache (what Alex was
>> alluding to). But one of the key reasons that we broke the offline
>> cache into separate files, instead of maintaining one large file, is
>> to reduce the need for file I/O, since the files only need to be
>> accessed as you use them or as the bug reports get changed on disk.
>> The only large update to the cache happens when you first create a
>> query, since all of the data for the corresponding tasks will be
>> asynchronously retrieved at that point.
>
> Thanks for the info - especially the fact that Mylyn doesn't actually
> enumerate the offline folder's contents. I'll move my workspace to my
> local disk - I'm not really sure that removing synchronous disk access
> will help much, even if Eclipse stays responsive, Mylyn still has to
> either load the data from/to disk via the network (slow), or cache it in
> memory (increased memory usage and/or swapping to disk).

Ok, I stand corrected - I enabled "focus on workweek", and with over
36000 tasks in the "archive" category, Eclipse slowed to a crawl and
became quite unresponsive (with large CPU usage spikes), even when
having everything on a local disk. I'll go file a bug.

Kim
Re: Mylyn performance on network drives? [message #19238 is a reply to message #17450] Thu, 26 July 2007 01:18
Eclipse User
Originally posted by: beatmik.acm.org

Mark Kirchner wrote:
> Well, no, not necessarily. If you are trying to access the file /by its
> name/ there should be absolutely no need for the OS / the filesystem to
> generate a directory listing.

Yup, this is what we are relying on, i.e. the fact that accessing files
by path does not invoke the overhead of listing the directory. For this
reason we currently have a policy of never listing the directory and
have some rudimentary caching in place to help ensure that the
performance is transparent when the data is on the local filesystem. We
could do considerably fancier read/write caching but have not yet
noticed a need for this.

Mik
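
To illustrate the approach described above (access by path only, plus a small cache), here is a sketch under stated assumptions. It is not Mylyn's implementation: the class TaskDataCache, the "<taskId>.zip" naming and the LRU eviction via LinkedHashMap are all hypothetical. The point is that a lookup touches exactly one path and never lists the directory. The class is not thread-safe.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical cache; file naming and eviction policy are illustrative.
class TaskDataCache {
    private final Path dir;
    private final Map<String, byte[]> cache;
    int diskReads = 0; // counts cache misses, for demonstration only

    TaskDataCache(Path dir, final int maxEntries) {
        this.dir = dir;
        // accessOrder = true turns the map into a least-recently-used list.
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the bytes for one task, reading <dir>/<taskId>.zip by path on a
    // cache miss. The directory itself is never enumerated.
    byte[] read(String taskId) throws IOException {
        byte[] data = cache.get(taskId);
        if (data == null) {
            data = Files.readAllBytes(dir.resolve(taskId + ".zip"));
            diskReads++;
            cache.put(taskId, data);
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("offline");
        Files.write(dir.resolve("196109.zip"), new byte[] {1, 2, 3});
        TaskDataCache cache = new TaskDataCache(dir, 100);
        cache.read("196109");
        cache.read("196109"); // served from memory
        System.out.println("disk reads: " + cache.diskReads);
    }
}
```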
Re: Mylyn performance on network drives? [message #19256 is a reply to message #17469] Thu, 26 July 2007 01:32
Eclipse User
Originally posted by: beatmik.acm.org

Kim Sullivan wrote:
> Ok, I stand corrected - I enabled "focus on workweek", and with over
> 36000 tasks in the "archive" category, Eclipse slowed to a crawl and
> became quite unresponsive (with large CPU usage spikes), even when
> having everything on a local disk. I'll go file a bug.

Kim: the Eclipse Platform's FilteredTree mechanism, which we use for the
"Find:" functionality, has some scalability problems when dealing with
this number of nodes:

186425: [FilteredTree] FilteredTree does not scale to arbitrary numbers
of nodes
https://bugs.eclipse.org/bugs/show_bug.cgi?id=186425

While it is possible for us to work around these, you should currently
consider the Task List to scale transparently to around 10K nodes (mine
is a bit under that), not 50K. What has likely happened is that you
created a series of very broad queries that brought a ton of tasks into
your Task List. I just added a FAQ entry on cleaning up your Task List.
Please post if that doesn't help, otherwise we can discuss the
performance issue further on the bug.

http://wiki.eclipse.org/Mylyn_FAQ#The_Archive_category_contains_many_irrelevant_tasks.2C_how_do_I_clean_it_up.3F

Mik
Re: Mylyn performance on network drives? [message #19894 is a reply to message #19256] Fri, 27 July 2007 10:04
Kim Sullivan
Mik Kersten wrote:
> Kim Sullivan wrote:
>> Ok, I stand corrected - I enabled "focus on workweek", and with over
>> 36000 tasks in the "archive" category, Eclipse slowed to a crawl and
>
> While it is possible for us to work around these, you should currently
> consider the Task List to scale transparently to around 10K nodes (mine
> is a bit under that), not 50K.

Oops, the 36000 tasks was a typo, it's only about 3600 tasks (the bug
report has the correct number of zeroes). The demo of the bug is with
about 3100 tasks.

And getting those isn't such a problem, I basically followed the Mylyn
task list cheat sheet (but I selected P1,P2 and P3 priorities instead of
P1 and P2), which comes down to about 3k tasks.

Kim
Re: Mylyn performance on network drives? [message #20898 is a reply to message #19894] Sat, 28 July 2007 01:51
Eclipse User
Originally posted by: beatmik.acm.org

Kim Sullivan wrote:
> Oops, the 36000 tasks was a typo, it's only about 3600 tasks (the bug
> report has the correct number of zeroes). The demo of the bug is with
> about 3100 tasks.

Ah. 3600 is not a problem in general, but as we have been discussing on
the bug, having 3600 incomings can be.

> And getting those isn't such a problem, I basically followed the Mylyn
> task list cheat sheet (but I selected P1,P2 and P3 priorities instead of
> P1 and P2), which comes down to about 3k tasks.

Please file a bug against the cheat sheet, describing what got you to
make a query this broad. We should make sure that the cheat sheet does
not encourage users to make queries of this sort. But at least we now
know that someone uses the cheat sheet ;)

Mik
Re: Mylyn performance on network drives? [message #20911 is a reply to message #20898] Sat, 28 July 2007 10:23
Eclipse User
Originally posted by: alex_blewitt.yahoo.com

I noted that all the bugs in Mylyn are compressed with '.zip', which is the wrong compression method to be using; they should be written out with GZIPOutputStream instead (if at all; the compression should be something that's user editable). My Eclipse runtime causes my machine to run much hotter than before, and I'm wondering if the amount of compression that's being done is causing the CPU load to go higher than it should be.

I was going to do some investigating, and report a bug on the subject and/or patch.

Alex.
Re: Mylyn performance on network drives? [message #25947 is a reply to message #20911] Thu, 06 September 2007 21:39
Eclipse User
Originally posted by: beatmik.acm.org

Alex Blewitt wrote:
> I noted that all the bugs in Mylyn are compressed with '.zip', which is the wrong compression method to be using; they should be written out with GZIPOutputStream instead (if at all; the compression should be something that's user editable). My Eclipse runtime causes my machine to run much hotter than before, and I'm wondering if the amount of compression that's being done is causing the CPU load to go higher than it should be.
>
> I was going to do some investigating, and report a bug on the subject and/or patch.

At the cost of a bit of additional CPU load, the zipping reduces disk
I/O by an order of magnitude, making for much faster file read/write
times. On an IDE/ATA drive this is likely to correspond to less CPU
usage overall, since disk I/O uses the CPU. Zipping things in this way
is not something we came up with; we are following the lead of file
formats like the MS and Open Office ZIP XML formats. However, I wouldn't be
surprised if there is room for improvement in how we're zipping the
files, so could you please file a bug on using GZIPOutputStream instead,
listing any benefits? I've heard this mentioned before and we should
investigate.

Mik
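
For anyone who wants to compare the two formats before filing that bug, here is a rough sketch using only java.util.zip. The class name CompressionCompare, the entry name "data.xml" and the sample XML are made up for illustration; real task data will compress differently. Both streams use DEFLATE at the default level, so for a single entry the difference should mostly come from the ZIP container's per-entry headers and central directory rather than from the compression itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical comparison class; sample data and names are illustrative.
class CompressionCompare {

    // Writes `data` as a single entry of an in-memory ZIP archive.
    static byte[] zip(byte[] data, String entryName) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(data);
            zos.closeEntry();
        }
        return out.toByteArray();
    }

    // Writes `data` as an in-memory GZIP stream.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gos = new GZIPOutputStream(out)) {
            gos.write(data);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Made-up, highly repetitive XML standing in for an offline bug report.
        StringBuilder sb = new StringBuilder("<bug>");
        for (int i = 0; i < 500; i++) {
            sb.append("<comment>some text</comment>");
        }
        sb.append("</bug>");
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("raw:  " + data.length + " bytes");
        System.out.println("zip:  " + zip(data, "data.xml").length + " bytes");
        System.out.println("gzip: " + gzip(data).length + " bytes");
    }
}
```

This only measures output size. CPU time for reading and writing would need separate timing on real data.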
Re: Mylyn performance on network drives? [message #573310 is a reply to message #15845] Wed, 18 July 2007 21:32 Go to previous message
Alex Blewitt is currently offline Alex BlewittFriend
Messages: 946
Registered: July 2009
Senior Member
Having anything on a network drive is a bad idea generally.

That being said, Windows has serious performance issues when dealing with more than 500 files in the same directory. This should be raised as a bug at https://bugs.eclipse.org with the name of the folder that you found, because it needs to be fixed. It shouldn't be generating that much data in one directory.

Alex.
Re: Mylyn performance on network drives? [message #573705 is a reply to message #15886] Thu, 19 July 2007 19:22 Go to previous message
Mik Kersten is currently offline Mik KerstenFriend
Messages: 287
Registered: July 2009
Senior Member
Alex,

Note that it is Mylyn generating all these files and storing them in
..metadata/.mylyn/offline/<repository-url>/. They are offline copies of
bug reports in the Task List. We don't currently foresee any problem
with having thousands of files in these directories because we are not
aware of Windows having trouble accessing large directories when you
access by name instead of trying to do a directory listing (e.g.
Windows/System32 has thousands of files). But if you know of any
indications or evidence to the contrary please let us know.

Mik

Alex Blewitt wrote:
> Having anything on a network drive is a bad idea generally.
>
> That being said, Windows has serious performance issues when dealing with more than 500 files in the same directory. This should be raised as a bug at https://bugs.eclipse.org with the name of the folder that you found, because it needs to be fixed. It shouldn't be generating that much data in one directory.
>
> Alex.
Re: Mylyn performance on network drives? [message #573830 is a reply to message #15845] Thu, 19 July 2007 19:32 Go to previous message
Mik Kersten is currently offline Mik KerstenFriend
Messages: 287
Registered: July 2009
Senior Member
Hi Kim,

As with other things that are kept in Eclipse's .metadata folder the
offline cache we expect fast I/O access to that cache (what Alex was
alluding to). But one of the key reasons that we broke the offline
cache into separate files, instead of maintaining one large file, is to
reduce the need for file I/O, since the files only need to be accessed
as you use them or as the bug reports get changed on disk. The only
large update to the cache happens when you first create a query, since
all of the data for the corresponding tasks will be asynchronously
retrieved at that point.

Sounds like you've got three options:

1) Move your workspace to a local disk.

2) Move Mylyn's task data folder to a local disk (via Preferences ->
Task List -> Advanced)

3) Help us figure out how to better support a non-local cache be filing
a bug report. For example, we could try to remove any unneeded
synchronous access to the cache.

Mik

Kim Sullivan wrote:
> I've been having performance problems using Mylyn with my workspace
> placed on a network drive (I assume that those problems come from having
> a remote workspace, I might be mistaken).
>
> First, there was an issue with tooltips (I had them accidentally set to
> wait for 0ms before showing, and Mylyn was trying to load too many
> tooltips at once - over the network, which caused some lag). Setting a
> more reasonable tooltip delay time fixed this
> (https://bugs.eclipse.org/bugs/show_bug.cgi?id=196109)
>
> But Eclipse was still quite unresponsive - I even deleted all the
> queries to bugs.eclipse.org so the system wouldn't spent time
> synchronizing them, without success (and it was still spending
> considerable time "synchronizing 0 queries").
>
> The weird thing was that I kept getting change notifications for
> tasks/bugs that I had already deleted from the task list (or so I
> presumed).
>
> So I used Total Commander to look into the offline folder of mylin
> metadata - and lo and behold, even getting a directory listing for all
> of the 3300 stored bugs took a few seconds. No wonder Eclipse was so
> sluggish (I assume it was going through the list of 3300 offline bugs
> every time and looking for changes... even though I had none of those
> bugs or queries in my task list). I even had to increase the VM size,
> because I was getting heap space errors from Mylyn (last time I checked,
> eclipse had about 500MB in memory and about 700MB virtual memory total).
>
> So I manually deleted all of the 3k files in the offline directory. Now
> Mylyn still spends some time "synchronizing" but at least Eclipse stays
> responsive.
>
> My question is - can something be done about this, maybe limiting the
> amount of files stored offline? Or is it just an issue of me having my
> workspace on a network drive? Is there something I can do to diagnose
> the problem further?
>
> Regards,
> Kim Sullivan
Re: Mylyn performance on network drives? [message #573963 is a reply to message #17399] Thu, 19 July 2007 23:07 Go to previous message
Alex Blewitt is currently offline Alex BlewittFriend
Messages: 946
Registered: July 2009
Senior Member
I know that it can cause problems, having worked through these recently :-) It's also fairly easy to test; create a directory with 5000 files, then open it up in Windows Explorer. Witness the amount of time it takes to do a directory traversal and generate the data. Any file system operation that hits (or attemtps to hit) any file in that directory will be blocked for seconds as the file system responds.

Indeed, it's trivial to have a Java class that will demo this; simply 'touch' thousands of files, then print out the time stamp, followed by File.exists, followed by the timestamp.

Alex.
Re: Mylyn performance on network drives? [message #574000 is a reply to message #17422] Thu, 19 July 2007 23:11 Go to previous message
Alex Blewitt is currently offline Alex BlewittFriend
Messages: 946
Registered: July 2009
Senior Member
NB it's also fairly easy to work around; simply create several subdirectories e.g. a-z or 1-9 to partition the file space in some way. BTW this also helps performance on Linux systems, though it's less noticable because Linux filing systems are generally not as bad as Windows ones are.

Alex.
Re: Mylyn performance on network drives? [message #574191 is a reply to message #17422] Fri, 20 July 2007 14:40 Go to previous message
Mark Kirchner is currently offline Mark KirchnerFriend
Messages: 8
Registered: July 2009
Junior Member
Alex Blewitt schrieb:
> create a directory with 5000 files, then open it up in Windows
> Explorer. Witness the amount of time it takes to do a directory
> traversal and generate the data. Any file system operation that hits
> (or attemtps to hit) any file in that directory will be blocked for
> seconds as the file system responds.

Well, no, not necessaryly. If you are trying to access the file /by its
name/ there should be absolutly no need for the OS / the filesystem to
generate a directory listing.

On the other hand, if you open up the directory in Windows Explorer it
obviously has to generate such a listing which takes time. So these two
cases are not comparable. And BTW: Create a sufficiently large directory
on Linux and you'll get the same problems, too.

Regards,
Mark
Re: Mylyn performance on network drives? [message #574224 is a reply to message #17408] Sat, 21 July 2007 10:45 Go to previous message
Kim Sullivan is currently offline Kim SullivanFriend
Messages: 44
Registered: July 2009
Member
Hi Mik,

Mik Kersten wrote:
> As with other things that are kept in Eclipse's .metadata folder the
> offline cache we expect fast I/O access to that cache (what Alex was
> alluding to). But one of the key reasons that we broke the offline
> cache into separate files, instead of maintaining one large file, is to
> reduce the need for file I/O, since the files only need to be accessed
> as you use them or as the bug reports get changed on disk. The only
> large update to the cache happens when you first create a query, since
> all of the data for the corresponding tasks will be asynchronously
> retrieved at that point.

Thanks for the info - especially the fact that Mylyn doesn't actually
enumerate the offline folder's contents. I'll move my workspace to my
local disk - I'm not really sure that removing synchronous disk access
will help much, even if Eclipse stays responsive, Mylyn still has to
either load the data from/to disk via the network (slow), or cache it in
memory (increased memory usage and/or swapping to disk).

I noticed one thing though - after moving the workspace, Mylyn seems to
have stopped trying to synchronize 0 queries (and downloading new
offline tasks, even though I deleted all queries). I'm not sure if this
is just a coincidence (no bugs appeared in the last two days, or none of
the bugs that had previously been in the cache have been changed), a
result of quicker access to disk (synchronization is nearly
instantenious), the fact that I changed the eclipse.org task repository
repository to offline and then back online, manually deleting tasks from
disk or something completely different.

I'm new to Mylyn, so I have no idea how it is supposed to work, and if I
should try to reproduce the former behavior somehow for a bug report. Is
Mylyn supposed to keep all offline tasks in sync with the repository,
regardless of the fact that they're deleted from the task list?

Kim
Re: Mylyn performance on network drives? [message #574263 is a reply to message #17459] Sat, 21 July 2007 11:38 Go to previous message
Kim Sullivan is currently offline Kim SullivanFriend
Messages: 44
Registered: July 2009
Member
Kim Sullivan wrote:
> Hi Mik,
>
> Mik Kersten wrote:
>> As with other things that are kept in Eclipse's .metadata folder the
>> offline cache we expect fast I/O access to that cache (what Alex was
>> alluding to). But one of the key reasons that we broke the offline
>> cache into separate files, instead of maintaining one large file, is
>> to reduce the need for file I/O, since the files only need to be
>> accessed as you use them or as the bug reports get changed on disk.
>> The only large update to the cache happens when you first create a
>> query, since all of the data for the corresponding tasks will be
>> asynchronously retrieved at that point.
>
> Thanks for the info - especially the fact that Mylyn doesn't actually
> enumerate the offline folder's contents. I'll move my workspace to my
> local disk - I'm not really sure that removing synchronous disk access
> will help much, even if Eclipse stays responsive, Mylyn still has to
> either load the data from/to disk via the network (slow), or cache it in
> memory (increased memory usage and/or swapping to disk).

Ok, I stand corrected - I enabled "focus on workweek", an with over
36000 tasks in the "archive" category, eclipse slowed to a crawl and
became quite unresponsive (with large CPU usage spikes), even when
having everything on a local disk. I'll go file a bug.

Kim
Re: Mylyn performance on network drives? [message #575310 is a reply to message #17450] Thu, 26 July 2007 01:18 Go to previous message
Mik Kersten is currently offline Mik KerstenFriend
Messages: 287
Registered: July 2009
Senior Member
Mark Kirchner wrote:
> Well, no, not necessaryly. If you are trying to access the file /by its
> name/ there should be absolutly no need for the OS / the filesystem to
> generate a directory listing.

Yup, this is what we are relying on, i.e. the fact that accessing files
by path does not invoke the overhead of listing the directory. For this
reason we currently have a policy of never listing the directory and
have some rudimentary caching in place to help ensure that the
performance is transparent when the data is on the local filesystem. We
could do considerably fancier read/write caching but have not yet
noticed an need for this.

Mik
Re: Mylyn performance on network drives? [message #575365 is a reply to message #17469] Thu, 26 July 2007 01:32 Go to previous message
Mik Kersten is currently offline Mik KerstenFriend
Messages: 287
Registered: July 2009
Senior Member
Kim Sullivan wrote:
> Ok, I stand corrected - I enabled "focus on workweek", an with over
> 36000 tasks in the "archive" category, eclipse slowed to a crawl and
> became quite unresponsive (with large CPU usage spikes), even when
> having everything on a local disk. I'll go file a bug.

Kim: the Eclipse Platform's FilteredTree mechanism, which we use for the
"Find:" functionality, has some scalability problems when dealing with
this number of nodes:

186425: [FilteredTree] FilteredTree does not scale to arbitrary numbers
of nodes
https://bugs.eclipse.org/bugs/show_bug.cgi?id=186425

While it is possible for us to work around these, you should currently
consider the Task List to scale transparently to around 10K nodes (mine
is a bit under that), not 50K. What has likely happened is that you
created a series of very broad queries that brought a ton of tasks into
your Task List. I just added a FAQ entry on cleaning up your Task List.
Please post if that doesn't help, otherwise we can discuss the
performance issue further on the bug.

http://wiki.eclipse.org/Mylyn_FAQ#The_Archive_category_contains_many_irrelevant_tasks.2C_how_do_I_clean_it_up.3F

Mik
Re: Mylyn performance on network drives? [message #575897 is a reply to message #19256] Fri, 27 July 2007 10:04
Kim Sullivan
Messages: 44
Registered: July 2009
Member
Mik Kersten wrote:
> Kim Sullivan wrote:
>> Ok, I stand corrected - I enabled "focus on workweek", and with over
>> 36000 tasks in the "archive" category, eclipse slowed to a crawl and
>
> While it is possible for us to work around these, you should currently
> consider the Task List to scale transparently to around 10K nodes (mine
> is a bit under that), not 50K.

Ooops, the 36000 tasks was a typo, it's only about 3600 tasks (the bug
report has the correct number of zeroes). The demo of the bug is with
about 3100 tasks.

And getting those isn't such a problem, I basically followed the Mylyn
task list cheat sheet (but I selected P1,P2 and P3 priorities instead of
P1 and P2), which comes down to about 3k tasks.

Kim
Re: Mylyn performance on network drives? [message #575988 is a reply to message #19894] Sat, 28 July 2007 01:51
Mik Kersten
Messages: 287
Registered: July 2009
Senior Member
Kim Sullivan wrote:
> Ooops, the 36000 tasks was a typo, it's only about 3600 tasks (the bug
> report has the correct number of zeroes). The demo of the bug is with
> about 3100 tasks.

Ah. 3600 is not a problem in general, but as we have been discussing on
the bug, having 3600 incomings can be.

> And getting those isn't such a problem, I basically followed the Mylyn
> task list cheat sheet (but I selected P1,P2 and P3 priorities instead of
> P1 and P2), which comes down to about 3k tasks.

Please file a bug against the cheat sheet, describing what got you to
make a query this broad. We should make sure that the cheat sheet does
not encourage users to make queries of this sort. But at least we now
know that someone uses the cheat sheet ;)

Mik
Re: Mylyn performance on network drives? [message #576112 is a reply to message #20898] Sat, 28 July 2007 10:23
Alex Blewitt
Messages: 946
Registered: July 2009
Senior Member
I noted that all the bugs in Mylyn are compressed with '.zip', which is the wrong compression method to be using; they should be written out with GZipOutputStream instead (if at all; the compression should be something that's user editable). My Eclipse runtime causes my machine to run much hotter than before, and I'm wondering if the amount of compression that's being done is causing the CPU load to go higher than it should be.

I was going to do some investigating, and report a bug on the subject and/or patch.

Alex.
Re: Mylyn performance on network drives? [message #579829 is a reply to message #20911] Thu, 06 September 2007 21:39
Mik Kersten
Messages: 287
Registered: July 2009
Senior Member
Alex Blewitt wrote:
> I noted that all the bugs in Mylyn are compressed with '.zip', which is the wrong compression method to be using; they should be written out with GZipOutputStream instead (if at all; the compression should be something that's user editable). My Eclipse runtime causes my machine to run much hotter than before, and I'm wondering if the amount of compression that's being done is causing the CPU load to go higher than it should be.
>
> I was going to do some investigating, and report a bug on the subject and/or patch.

At the cost of a bit of additional CPU load, the zipping reduces disk
I/O by an order of magnitude, making for much faster file read/write
times. On an IDE/ATA drive this is likely to correspond to less CPU
usage overall, since disk I/O uses the CPU. Zipping things in this way
is not something we came up with; we are following the lead of file formats
like the MS and Open Office ZIP XML formats. However, I wouldn't be
surprised if there is room for improvement in how we're zipping the
files, so could you please file a bug on using GZipOutputStream instead,
listing any benefits? I've heard this mentioned before and we should
investigate.
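
For anyone who wants to compare the two approaches before filing the bug, here is a small Java sketch that compresses the same bytes once as a single-entry ZIP and once as a gzip stream, using the standard java.util.zip classes (note that the JDK class is spelled GZIPOutputStream). The class name CompressionSketch, the entry name "task.xml" and the sample XML are made up for the example and are not taken from Mylyn. Both formats use DEFLATE underneath, so the sketch only shows the difference in container overhead; it does not measure CPU time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical comparison of ZIP and gzip containers for one blob of task data. */
public class CompressionSketch {

    /** Wraps the data in a ZIP archive containing exactly one entry. */
    public static byte[] zip(byte[] data, String entryName) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buf)) {
            out.putNextEntry(new ZipEntry(entryName));
            out.write(data);
            out.closeEntry();
        }
        return buf.toByteArray();
    }

    /** Compresses the data as a single gzip stream (no archive directory). */
    public static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(data);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Made-up, repetitive XML standing in for offline task data.
        StringBuilder xml = new StringBuilder("<tasks>");
        for (int i = 0; i < 1000; i++) {
            xml.append("<task id=\"").append(i).append("\" priority=\"P3\"/>");
        }
        xml.append("</tasks>");
        byte[] raw = xml.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("raw bytes:  " + raw.length);
        System.out.println("zip bytes:  " + zip(raw, "task.xml").length);
        System.out.println("gzip bytes: " + gzip(raw).length);
    }
}
```

Since the compression step is the same DEFLATE algorithm in both cases, any CPU difference would presumably come from the ZIP entry bookkeeping rather than from the compression itself, so a bug report would be more convincing with actual timings attached.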

Mik