RE: Project Model Improvements Re: [cdt-dev] CDT Summit Report
No, the real problem is that we're storing way too much data.
There is a lot of duplication in there now, and I mean a lot. We should only
be storing data that the user has changed from the defaults. And we need to look
at how the scanner discovery information is stored, which also pollutes this
file.
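To illustrate the idea of storing only what differs from the defaults, here is a minimal sketch. The class name `OverrideSettings` and the string key/value model are assumptions made for the example and are not part of CDT.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical settings holder: only values that differ from the defaults are kept for saving. */
final class OverrideSettings {
    private final Map<String, String> defaults;
    private final Map<String, String> overrides = new LinkedHashMap<>();

    OverrideSettings(Map<String, String> defaults) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.defaults = new LinkedHashMap<>(defaults);
    }

    /** Records a non-null value; setting a key back to its default drops the stored override. */
    void set(String key, String value) {
        if (value.equals(defaults.get(key))) {
            overrides.remove(key);
        } else {
            overrides.put(key, value);
        }
    }

    /** Effective value: the override if present, otherwise the default (null if neither exists). */
    String get(String key) {
        String v = overrides.get(key);
        return v != null ? v : defaults.get(key);
    }

    /** The only data that would need to be written to the project file. */
    Map<String, String> toPersist() {
        return Collections.unmodifiableMap(new LinkedHashMap<>(overrides));
    }
}
```

With this shape, a project whose user never touched a setting persists nothing for it, and the file grows only with actual user changes.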
The problem isn't XML. There are efficient ways of loading
XML, e.g. SAX. We also need to make sure the data structures we create from it
are efficient.
Doug.
I agree that a proper relational database is a better solution for storing
and retrieving data than an XML file. However there may be a problem with
going this route. Users should be able to check their entire project into
version control, including the .project and .cproject files. (There are
problems with sharing .cproject, but ideally it should work.) Now, if we use a
relational db instead, I'd assume the db would be stored in some kind of
binary file, and that may be a problem for some version control
systems.
So in the end we may be stuck using a text-based format. I
don't think XML is inherently evil; it's just the way we are handling the XML
that is clearly flawed. I'm not very experienced in this area but there may be
better approaches to processing XML, like XPath queries or something. Another
solution might be to write a SAX parser that directly builds the project
description AST, without loading the entire DOM into memory.
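A rough sketch of the SAX approach, using the standard `javax.xml.parsers` and `org.xml.sax` APIs. The element names `configuration` and `option`, their attributes, and the `Configuration` class are hypothetical stand-ins, not the real .cproject schema or the CDT project description classes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class ProjectSaxLoader {
    /** Hypothetical model node: one build configuration and its options. */
    static final class Configuration {
        final String name;
        final Map<String, String> options = new LinkedHashMap<>();

        Configuration(String name) {
            this.name = name;
        }
    }

    /** Streams the XML and builds model objects as elements arrive; no DOM tree is created. */
    static List<Configuration> load(InputStream in)
            throws ParserConfigurationException, SAXException, IOException {
        final List<Configuration> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Configuration current; // configuration element currently open, if any

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("configuration".equals(qName)) {
                    current = new Configuration(attrs.getValue("name"));
                    result.add(current);
                } else if ("option".equals(qName) && current != null) {
                    current.options.put(attrs.getValue("id"), attrs.getValue("value"));
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("configuration".equals(qName)) {
                    current = null;
                }
            }
        };
        // The default factory is not namespace-aware, so qName is the plain element name.
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return result;
    }
}
```

Memory use here is driven by the model objects that are kept, not by the size of the XML tree, which is the point of skipping the DOM.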
There are
ways of dealing with the problem of a crash during serialization. Whenever a
change needs to be saved we could first rename the .cproject file, then write
out a new version of the .cproject file, then delete the old one after. That
way if power is cut during serialization the old file is still
there.
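The rename, write, delete sequence described above could look roughly like this with `java.nio.file`. The class name `SafeProjectWriter`, the `.bak` suffix, and the recovery step in `load` are assumptions added for the sketch; the message only describes the save side.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SafeProjectWriter {
    /** Backup name is an assumption; the original message does not specify one. */
    private static Path backupOf(Path file) {
        return file.resolveSibling(file.getFileName() + ".bak");
    }

    static void save(Path file, String contents) throws IOException {
        Path backup = backupOf(file);
        // Step 1: move the current file aside so it survives a failed write.
        // If a backup already exists, an earlier save was interrupted and the
        // backup is the last good version, so it is left untouched.
        if (Files.exists(file) && !Files.exists(backup)) {
            Files.move(file, backup);
        }
        // Step 2: write the new version under the real name.
        Files.write(file, contents.getBytes(StandardCharsets.UTF_8));
        // Step 3: only now discard the old version.
        Files.deleteIfExists(backup);
    }

    /** Recovery on load: a leftover backup means the last save did not finish. */
    static String load(Path file) throws IOException {
        Path backup = backupOf(file);
        if (Files.exists(backup)) {
            Files.move(backup, file, StandardCopyOption.REPLACE_EXISTING);
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }
}
```

A common variant writes the new content to a temporary file first and then renames it over the real one, which avoids any window where the .cproject file is missing; that is a different technique from the one described in the message.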
Mike Kucera
Software Developer
IBM Eclipse CDT Team
mkucera@xxxxxxxxxx
"James Blackburn" ---10/01/2008 07:00:11 AM---Hi Doug,
From: "James Blackburn" <jamesblackburn@xxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Date: 10/01/2008 07:00 AM
Subject: Project Model Improvements Re: [cdt-dev] CDT Summit Report
Hi Doug,
> Not yet. We didn't get very much time on build as we did on the other
> topics. I think the summary of it is to redo the Project model to simplify
> it and to break the dependency on managed build that was introduced with it
> in 4.0.
Is there any indication on what's planned, who might be doing this
work, and in what time frame?
Time's come full circle again for me and I'm once again focusing on my
users with larger projects. The problem with the current
implementation is twofold:
1) For an XML file >3M CDT exceeds a 512M HEAP and performance is
really bad (BUG238421)
2) The current XML model is not threadsafe (get the indexer going,
open settings page and hit apply rapidly... BUG239627).
I think this all boils down to choice of XML as the data structure.
It's verbose, inherently not threadsafe and, for all its verbosity,
it's still not human readable. The tree duplication is expensive in
terms of time and memory, and the end result is that changes made to
the project description from different threads can easily be lost
(BUG248962).
And all this before we consider what happens if a power cut or crash
happens during serialization.
It's my feeling that for something as important as the project model
we actually need a data store with ACID properties with reasonable
performance -- a very lightweight db would seem to be an ideal
solution.
I've got some time in my schedule to start work on this, but am keen
not to tread on anyone else's toes if they're intending on working on
the project model. My first aim would be to port the existing project
model to use SQLite as its db backend, and any changes to the actual
structure of the model could be made in parallel or after.
If this all sounds like a really bad idea then someone please say!
Otherwise it's the only solution I can see that would ensure we have a
scalable project description with ACID properties.
Cheers,
James
_______________________________________________
cdt-dev mailing list
cdt-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/cdt-dev