RE: [stellation-res] Database Package Audit.
On Sat, 2002-08-31 at 20:35, Jonathan Gossage wrote:
>
>
> > >-----Original Message-----
> > >From: stellation-res-admin@xxxxxxxxxxxxxxx
> > >[mailto:stellation-res-admin@xxxxxxxxxxxxxxx]On Behalf Of Mark C.
> > >Chu-Carroll
> > >Sent: August 31, 2002 7:57 AM
> > >To: stellation-res@xxxxxxxxxxxxxxx
> > >Subject: RE: [stellation-res] Database Package Audit.
> > >
> > >
> > >On Sat, 2002-08-31 at 15:10, Jonathan Gossage wrote:
> > >>I will be sending a separate document discussing extensions to the
> > >>database support needed by MySQL as well as some suggestions for using
> > >>database batch facilities and other performance enhancing methods that
> > >>may be database specific.
> > >
> > >Batching is what I'm trying to do next. There's preliminary support
> > >for it in the new database code, and I'm going to be modifying the code
> > >to use batch where appropriate. We never did it before because postgres
> > >doesn't really support it. Now we're adding support for quite a few
> > >databases, and postgres is the only one that doesn't do JDBC batch.
> > >
>
> You might want to consider a couple of scenarios here. The first would be
> "manual" batching where the batching decisions are made in the code that
> generates the SQL statements.
This is what the code currently in the database component is implemented
to do.
> The second would be to push both batching and
> SQL statement generation lower down into database specific code and simply
> pass in the collection of values that are to be inserted or updated along
> with a template for the SQL statement. The routine would then take care of
> batching and generation of the SQL "under the hood". This approach is
> particularly attractive for MySQL since it has a special optimization for
> INSERT which is considerably more efficient than simply batching INSERT
> statements. MySQL allows INSERT statements with multiple VALUE sets, thus
> allowing multiple rows to be inserted in a single INSERT statement. This is
> the recommended optimization for bulk insertions in a table in MySQL.
I think that's a good idea, but it's also a lot more work than I really
have time to put into this part of the system. It means building a whole
abstraction layer around the execution of SQL statements. I'm not
opposed to that in principle, but as I said, I really don't have time to
do it myself.
My work schedule has me committed to a first cut at import of
fine-grained artifacts by the end of September, and a proof-of-concept
level implementation of all of the fine-grained functionality by the end
of November. To be able to do that, I can't afford to spend too much
more time on this database abstraction work. It's important stuff to
make the system work well, but it's a sideline to the research project,
which is really what I'm paid for.
If you'd like to try this, I'd love to see it in the system. I'd be glad
to help as much as I can (which will mostly be limited to things like
participating in design discussions).
One thing I could do would be to add something to the database
layer specifically for the kind of values batching that you can do in
MySQL. We could add another set of batch commands:
    void startValuesBatch(String basicCommand);
    void addValueSetToBatch(String valueset);
    void executeValuesBatch();
Then, in databases like MySQL it could be done with a multiple values
statement. (I think DB2 also supports that.) In databases without, it
would end up being just normal batch execution.
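
To make the idea concrete, here is a minimal sketch of how those three calls
might behave. It is not the actual Stellation database code: the class name
ValuesBatch, the multiRowSupported flag, and the buildStatements() helper are
all hypothetical, and it assumes basicCommand is the part of the INSERT before
the VALUES keyword. It only builds the SQL text; a real executeValuesBatch()
would hand the resulting statements to JDBC.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposed values-batch commands. */
class ValuesBatch {
    // Assumption: true for databases that accept multiple VALUES sets
    // in one INSERT (e.g. MySQL), false otherwise.
    private final boolean multiRowSupported;
    private String basicCommand;
    private final List<String> valueSets = new ArrayList<>();

    ValuesBatch(boolean multiRowSupported) {
        this.multiRowSupported = multiRowSupported;
    }

    /** Begins a new batch, e.g. "INSERT INTO t (a, b)". */
    void startValuesBatch(String basicCommand) {
        this.basicCommand = basicCommand;
        valueSets.clear();
    }

    /** Adds one parenthesized value set, e.g. "(1, 'x')". */
    void addValueSetToBatch(String valueSet) {
        if (basicCommand == null) {
            throw new IllegalStateException("no batch started");
        }
        valueSets.add(valueSet);
    }

    /** Returns the SQL that executeValuesBatch() would send. */
    List<String> buildStatements() {
        List<String> statements = new ArrayList<>();
        if (valueSets.isEmpty()) {
            return statements;
        }
        if (multiRowSupported) {
            // One INSERT carrying every value set.
            statements.add(basicCommand + " VALUES " + String.join(", ", valueSets));
        } else {
            // Fallback: one INSERT per value set, to be run as a normal batch.
            for (String valueSet : valueSets) {
                statements.add(basicCommand + " VALUES " + valueSet);
            }
        }
        return statements;
    }
}
```

In the fallback case the statements could be passed to
java.sql.Statement.addBatch() and run with executeBatch(). Since the value
sets are plain strings here, whoever builds them would be responsible for
quoting and escaping the values.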
-Mark
--
Mark Craig Chu-Carroll, IBM T.J. Watson Research Center
*** The Stellation project: Advanced SCM for Collaboration
*** http://www.eclipse.org/stellation
*** Work Email: mcc@xxxxxxxxxxxxxx ------- Personal Email: markcc@xxxxxxxxxxx