Eclipse Community Forums: EMF » [CDO] Transfering large data

[CDO] Transfering large data [message #650422]

Mon, 24 January 2011 19:08

Marco Lehmann-Mörz

Hi all,

we have a simple model. It contains a tree of nodes, and each node may have leaves.
Each leaf may have a huge table of data (worst case today: 60000 rows and 96 columns). The tables may differ in size in each dimension, which is driven by the user.
What would be the best approach to model the table? Model it down to rows and columns, or treat the whole table as a BLOB/CLOB?

With CDO 4 we could use the new BLOB/CLOB support, but at the moment we are stuck on the 3.x branch.

Any hints are welcome!

Thanks in advance,
Marco

[Updated on: Mon, 24 January 2011 20:16]

Re: [CDO] Transfering large data [message #650430 is a reply to message #650422]

Mon, 24 January 2011 20:20

Marco Lehmann-Mörz

Hi all again,

how is the serialisation of objects organised? The tables are quite big but sometimes sparse. The sample table is 8 MB as raw CSV but only 300 KB when compressed, so it would make sense to compress that particular data type during transfer from/to the server.

Marco

Re: [CDO] Transfering large data [message #650464 is a reply to message #650422]

Tue, 25 January 2011 06:06

Eike Stepper

Hi Marco,

Whether you formally model something or just write code for it, the software solution depends not only on the structure of the data (dimensions, dense/sparse, ...) but also on the common usage patterns (read/write ratio, ...). I can hardly give good advice without knowing all your application requirements, and that would be a consulting effort ;-)

The new LOB support in CDO 4.0 differs from normal data types (e.g. String, byte[]) in these aspects:

1) Internally, LOBs have their own ID
2) Within an EObject they consume only fixed space (ID+size)
3) Only ID+size are eagerly loaded
4) They are immutable; changing their content creates new LOBs
5) Their ID is a digest of their content
6) They are shared among EObjects; each possible content is stored at most once
7) When they are changed, only their new ID+size are sent to other clients with passiveUpdates=true
8) Their content is lazily loaded, independently of their containing EObjects
9) Their content, once loaded, is cached on the local disk, configurable through CDOSession.setLobCache()
10) Their content is accessed through streams, similar to IFile.getContents()/setContents()
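
Points 4 to 6 of the list above amount to content-addressed storage: the ID is derived from the bytes, so identical content gets the same ID and is kept only once, and "changing" content simply produces a new entry. The following is a minimal sketch of that idea using only the JDK. The class name ContentAddressedStore and the choice of SHA-256 are assumptions made for illustration; this is not CDO's LOB implementation or API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of digest-keyed, immutable content storage. */
final class ContentAddressedStore {
    // Maps the hex digest of a content to a private copy of that content.
    private final Map<String, byte[]> contents = new HashMap<>();

    /** Stores the content unless it is already present and returns its ID. */
    String put(byte[] content) {
        String id = digest(content);
        contents.putIfAbsent(id, content.clone());
        return id;
    }

    /** Returns a copy of the content for the given ID, or null if unknown. */
    byte[] get(String id) {
        byte[] content = contents.get(id);
        return content == null ? null : content.clone();
    }

    /** Number of distinct contents currently stored. */
    int size() {
        return contents.size();
    }

    /** Hex-encoded SHA-256 digest (the algorithm is an assumption of this sketch). */
    static String digest(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ContentAddressedStore store = new ContentAddressedStore();
        String first = store.put("1;2;3".getBytes(StandardCharsets.UTF_8));
        String same = store.put("1;2;3".getBytes(StandardCharsets.UTF_8));
        String changed = store.put("1;2;4".getBytes(StandardCharsets.UTF_8));
        System.out.println(first.equals(same));    // true: same content, same ID
        System.out.println(first.equals(changed)); // false: new content, new ID
        System.out.println(store.size());          // 2: duplicate stored only once
    }
}
```

Because an object would only hold the ID and size, two leaves referencing the same table content would share one stored copy in such a scheme.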

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper

Re: [CDO] Transfering large data [message #650465 is a reply to message #650430]

Tue, 25 January 2011 06:16

Eike Stepper

On 24.01.2011 21:20, Marco Lehmann-Mörz wrote:
> Hi all again,
>
> how is the serialisation of objects organised?.
That's quite a complex mechanism. Generally, only new objects are serialized as whole CDORevisions; changes to existing objects are serialized as CDORevisionDeltas, and deletions as CDOIDs/versions (plus EClass for certain optional CDO features). The values of most of the Ecore "built-in" data types are serialized in binary form; the values of all other data types are transferred as Strings (as returned from their XyzFactoryImpl.convertAbcToString() method).
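
To make the difference between sending a whole revision and sending a delta more concrete, here is a rough sketch in plain Java. It holds an object's feature values in a map and computes only the changed features. The class and method names (RevisionDeltaSketch, delta, apply) are hypothetical, both states are assumed to have the same set of features, and the code does not reproduce CDORevision, CDORevisionDelta or CDO's wire format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical illustration of full-state versus delta transfer. */
final class RevisionDeltaSketch {

    /** Returns only the features whose values differ between the two states. */
    static Map<String, Object> delta(Map<String, Object> oldState, Map<String, Object> newState) {
        Map<String, Object> changes = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : newState.entrySet()) {
            String feature = entry.getKey();
            boolean isNew = !oldState.containsKey(feature);
            if (isNew || !Objects.equals(oldState.get(feature), entry.getValue())) {
                changes.put(feature, entry.getValue());
            }
        }
        return changes;
    }

    /** Rebuilds the new state on the receiving side from the old state and a delta. */
    static Map<String, Object> apply(Map<String, Object> oldState, Map<String, Object> changes) {
        Map<String, Object> result = new LinkedHashMap<>(oldState);
        result.putAll(changes);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> version1 = new LinkedHashMap<>();
        version1.put("name", "leaf-1");
        version1.put("rows", 60000);
        version1.put("columns", 96);

        Map<String, Object> version2 = new LinkedHashMap<>(version1);
        version2.put("rows", 60001);

        // A new object would be sent as its whole state (version1);
        // a later change only needs the differing features.
        Map<String, Object> changes = delta(version1, version2);
        System.out.println(changes); // prints {rows=60001}
        System.out.println(apply(version1, changes).equals(version2)); // prints true
    }
}
```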

> The tables are quite big, but sometimes sparse. The sample table is 8MB as raw CSV, but only 300KB when compressed. So it makes sense to compress that special data type during transfer from/to server.
If you encode the structure externally and put the result into a LOB-like data type (String, byte[], ...), you'd lose the ability to lazily load parts of it or to commit deltas of specific parts of it.
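
For reference, the external encoding discussed here could look roughly like the sketch below: a sparse table is written as CSV, compressed with GZIP from the JDK, and the resulting byte[] would then be stored in a single attribute. The class name TableCompressionSketch and the generated sample table are assumptions for illustration; nothing here is CDO API, and the trade-off named above applies, since the whole byte[] would be loaded and committed as one value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical illustration of compressing a sparse CSV table into a byte[]. */
final class TableCompressionSketch {

    /** GZIP-compresses the UTF-8 bytes of the given CSV text. */
    static byte[] compress(String csv) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(csv.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reverses compress() and returns the original CSV text. */
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Builds a made-up sparse table: most cells are empty, a few hold numbers. */
    static String sparseCsv(int rows, int columns) {
        StringBuilder csv = new StringBuilder();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < columns; c++) {
                if (c > 0) {
                    csv.append(',');
                }
                if ((r + c) % 50 == 0) {
                    csv.append(r * c);
                }
            }
            csv.append('\n');
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        String csv = sparseCsv(1000, 96);
        byte[] compressed = compress(csv);
        System.out.println("raw bytes:        " + csv.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("round trip ok:    " + decompress(compressed).equals(csv));
    }
}
```

How much is saved depends entirely on the data; the ratio for this generated table says nothing about real tables.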

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper
