Re: [cdt-dev] remote indexing benchmarks

> My gut and experience tells me that, while it may not work well for your customers,
> I think we can make it usable by the general remote project community.
I totally agree on this point. Actually, a major group of people who need remote development work on supercomputers that run simulations or other scientific applications. Such programs are typically not as large as Firefox. Another major group of users comes from the embedded device world, which I am not very familiar with, but presumably the scale of those programs is also limited.

Best regards,
Tianchao Li

Doug Schaefer wrote:
Thanks, Chris, this is good information. I'll have to take a more detailed
look later but here are some quick thoughts.

First of all, I have to disagree that SMB is representative. SMB is probably
the slowest file sharing system I know. We've never used it for compiles in
the years I was a Windows developer for this very reason. At the same time
we had used NFS for years on our Unix boxes and, at the time in the mid 90s,
NFS wasn't much slower than local disks, certainly not 4 times slower. NFS
is pretty optimized for this where Samba is really meant to share files, not
as a replacement for local disks as NFS is. I think it is important that we
get similar timings in an NFS environment before coming to conclusions.

My idea is to have a local agent that is optimized to fulfill EFS requests
as quickly as it can. SMB sucks and NFS is hard to get on Windows and having
an EFS optimized agent will probably make us even faster than NFS. Until we
can get that prototyped I'm not going to give up on EFS. My gut and
experience tells me that, while it may not work well for your customers, I
think we can make it usable by the general remote project community.

Unfortunately, we're all tied up getting 4.0 out the door. So this will have
to wait until after. One thing I do recommend is that you guys put your
stuff on a branch. That way, it is easier to study. Also, I think you really
need to try this in a real customer environment. 280MB index files are huge,
especially if every user gets one and you have 500 users on a machine.
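
To put the storage concern in numbers, here is a rough sketch of the arithmetic. It assumes every user keeps a private index of the same 280 MB size and uses decimal gigabytes; the function and variable names are made up for illustration.

```python
# Figures taken from the message: index size per user and users per machine.
INDEX_SIZE_MB = 280
USERS_PER_MACHINE = 500


def total_index_storage_mb(index_size_mb, users):
    """Total disk space if every user keeps a private copy of the index."""
    return index_size_mb * users


total_mb = total_index_storage_mb(INDEX_SIZE_MB, USERS_PER_MACHINE)
print(f"{total_mb} MB, about {total_mb / 1000:.0f} GB")
```

Under those assumptions, the indexes alone would take on the order of 140 GB on a single machine.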

Doug Schaefer, QNX Software Systems
Eclipse CDT Project Lead,

-----Original Message-----
From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On
Behalf Of Chris Recoskie
Sent: Friday, March 23, 2007 11:58 AM
To: cdt-dev@xxxxxxxxxxx
Subject: [cdt-dev] remote indexing benchmarks

Hello everyone,

As per some of the requests we received, I ran some benchmarks comparing
our remote indexing prototype to a few other scenarios.

Since we can't build Firefox on the zOS server that we're using, we used a
set of test files that we generated ourselves.  There are 3,000 files each
containing 500 class definitions (15,000 class definitions).  The index
ends up being about 280 MB or so.  We'll attach this test suite to the
appropriate bugzillas soon.

Where I reference paging below it is in reference to the paging scheme
implemented for 3.1.2, which I ported over to our remote prototype that is
based on CDT 4.0.

Local Windows PC:
1565 seconds or 26.1 minutes, with paging.

SMB share on zOS machine:
7429 seconds or 123.82 minutes, with paging.

Remote zOS machine with remote prototype (with paging):
1730.0 seconds or 28.83 minutes.  This doesn't include the 15-20 seconds it
takes to send over the list of files to index (due to the asynchronous
nature of the prototype it's hard to directly measure the sum time of the
entire operation).  So, call it 29 minutes.  Essentially this is roughly
the same as locally on the PC (which you'd expect).
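
For anyone who wants to compare the three scenarios directly, here is a small sketch that works out the ratios from the figures reported above. The remote prototype's time is derived from its 28.83-minute figure, and the names are illustrative only.

```python
# Timings as reported in the message (all with paging).
local_s = 1565.0        # local Windows PC
smb_s = 7429.0          # SMB share on the zOS machine
remote_s = 28.83 * 60   # remote prototype, derived from the 28.83-minute figure


def slowdown(candidate_s, baseline_s):
    """Return how many times longer the candidate took than the baseline."""
    return candidate_s / baseline_s


smb_vs_local = slowdown(smb_s, local_s)
remote_vs_local = slowdown(remote_s, local_s)
smb_vs_remote = slowdown(smb_s, remote_s)

print(f"SMB vs local PC:           {smb_vs_local:.2f}x")
print(f"Remote prototype vs local: {remote_vs_local:.2f}x")
print(f"SMB vs remote prototype:   {smb_vs_remote:.2f}x")
```

On these numbers, SMB comes out somewhat more than four times slower than either alternative, while the remote prototype lands within about 10% of the local run.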

So, already indexing over SMB takes four times as long.  Part of the
slowdown is due to codepage conversions of the text files that are
automatically taken care of by Samba (the codepage used by the zOS machine
is not the same as the PC so unless you translate the files you get
gibberish).  It should be noted that this remote zOS machine we used is
still not very indicative of our typical use case, as usually our users
are using machines in other cities or even on other continents, and the
machine I'm using is located in the same lab as me with a 100 megabit
connection.

I think the SMB numbers reflect what will be typical for a standard EFS
implementation as really the operations involved are more or less the
same... translate the file to the required codepage and send over the
bytes.  Our users are not going to accept indexing times measured in hours
so it seems to us that indexing locally with remote files is not really
going to work.

Also, something to note is that it takes a good hour or so to copy the
files to/from the zOS machine via SMB.  Given that codepage conversion has
to happen, even if you want to get smart about compressing all the files
and doing a bulk transfer, you have to worry about converting them all
either before you send them, or after you receive them.  Both are messy
because you have to store the translated files so your project size on
disk doubles, not to mention this costs CPU cycles and time to translate the
files.  With our remote scheme, the files only ever have to be downloaded
and translated when you actually want to edit one of them.
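
To illustrate the codepage issue described above, here is a minimal sketch of what translation involves. The message does not say which codepage the zOS machine uses; `cp037` is simply one EBCDIC codec that ships with Python and is assumed here for illustration, as is the sample source text.

```python
# Hypothetical sample of a source file as the PC would want to see it.
source_text = "class Foo { int x; };\n"

# Assumption: the host stores text in an EBCDIC codepage (cp037 here).
HOST_CODEPAGE = "cp037"

# Bytes as they would sit on the host.
host_bytes = source_text.encode(HOST_CODEPAGE)

# Reading the raw bytes without translation yields gibberish on the PC.
untranslated = host_bytes.decode("latin-1")

# Translating with the right codepage recovers the original text.
translated = host_bytes.decode(HOST_CODEPAGE)

# Keeping both the host bytes and a translated copy doubles the storage.
pc_bytes = translated.encode("latin-1")
stored_total = len(host_bytes) + len(pc_bytes)

print(repr(untranslated))
print(repr(translated))
print(f"{len(host_bytes)} bytes original, {stored_total} bytes with translated copy")
```

The translation itself is cheap per file, but it has to touch every byte of every file, which is where the CPU cost and the doubled project size come from.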

I am open to discussion on the topic, but like we have been saying I don't
think the alternative scenarios like EFS that are being suggested will fit
our needs.


Chris Recoskie
Team Lead, IBM CDT Team
IBM Toronto

cdt-dev mailing list