non containment references to large data sets [message #425805] Tue, 09 December 2008 00:22
Andrew H
Take an example model like

<eClassifiers xsi:type="ecore:EClass" name="Transaction">
<eStructuralFeatures xsi:type="ecore:EReference" name="buyer"
eType="#//Person"/>
<eStructuralFeatures xsi:type="ecore:EReference" name="seller"
eType="#//Person"/>
</eClassifiers>

where the buyer and seller references are non containment references to
Person and each Person comes from a large database of people.

There are really two types of data here: transactional data; and static
data (i.e. Person).

The data set for Person is too large to hold in memory. Instead we
maintain a cache of loaded Persons.
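
One way to picture such a cache is a bounded least-recently-used map. The sketch below is only an illustration in plain Java: the class name `PersonCache`, the loader function and the capacity are all assumptions, not anything described in this thread, and it relies solely on the standard `LinkedHashMap` access-order mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical bounded cache: keeps at most maxEntries loaded objects and
// evicts the least recently used one when a new object is loaded.
class PersonCache<K, V> {
    private final Function<K, V> loader;
    private final LinkedHashMap<K, V> entries;

    PersonCache(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached object, loading it (e.g. from a database) on a miss.
    V get(K id) {
        V cached = entries.get(id);
        if (cached == null) {
            cached = loader.apply(id);
            entries.put(id, cached);
        }
        return cached;
    }

    // containsKey does not count as an access, so it leaves the order alone.
    boolean isCached(K id) {
        return entries.containsKey(id);
    }
}

class PersonCacheDemo {
    public static void main(String[] args) {
        PersonCache<String, String> cache =
                new PersonCache<>(2, id -> "Person " + id);
        cache.get("p1");
        cache.get("p2");
        cache.get("p1"); // p1 becomes the most recently used entry
        cache.get("p3"); // exceeds the limit, so p2 is evicted
        System.out.println(cache.isCached("p2")); // prints false
    }
}
```

A cache like this lives outside any resource set, so how its contents relate to EMF resources is still the open question asked below.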

When we serialise Transactions out into resources the Person objects must
be contained within another resource in that resource set.

The problem is that we process a lot of transactions and we don't really
want to hold onto them in a resource once we have finished processing them,
as this will just fill up memory.

We can simply toss the resource set at that point but that would toss the
cached Person objects.

Is there some way to chain up resource sets so that we can keep all the
cached Person objects in one resource set then create new resource sets
for each transaction and point that resource set to the shared Person
resource set?

Any advice on how best to manage data like this where we are primarily
focussed on processing transactional data which is associated with static
data from large data sets?
Re: non containment references to large data sets [message #425807 is a reply to message #425805] Tue, 09 December 2008 02:48
Ed Merks
Andrew,

Comments below.


Andrew H wrote:
> Is there some way to chain up resource sets so that we can keep all
> the cached Person objects in one resource set then create new resource
> sets for each transaction and point that resource set to the shared
> Person resource set?
There's a delegatedGetResource method you could specialize to look up a
resource in another resource set.
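
The delegation idea can be sketched roughly as follows. To stay runnable without the EMF jars this is a plain-Java model, and `SimpleResourceSet`, `ChainedResourceSet` and the string URIs are hypothetical stand-ins. In EMF itself the hook to override would be the protected `delegatedGetResource(URI, boolean)` method of `ResourceSetImpl`, with the override asking the shared resource set before falling back to `super`.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for a resource set: maps a URI string to a resource.
class SimpleResourceSet {
    private final Map<String, Object> resources = new HashMap<>();

    void add(String uri, Object resource) {
        resources.put(uri, resource);
    }

    // Look locally first, then fall back to the overridable hook.
    Object getResource(String uri) {
        Object local = resources.get(uri);
        return local != null ? local : delegatedGetResource(uri);
    }

    // Hook for subclasses; the base set knows of no other place to look.
    protected Object delegatedGetResource(String uri) {
        return null;
    }

    int size() {
        return resources.size();
    }
}

// Short-lived set for one transaction that falls back to a long-lived
// shared set holding the cached static data.
class ChainedResourceSet extends SimpleResourceSet {
    private final SimpleResourceSet shared;

    ChainedResourceSet(SimpleResourceSet shared) {
        this.shared = shared;
    }

    @Override
    protected Object delegatedGetResource(String uri) {
        return shared.getResource(uri);
    }
}

class ChainedLookupDemo {
    public static void main(String[] args) {
        SimpleResourceSet persons = new SimpleResourceSet();
        persons.add("persons/42", "Person#42");

        ChainedResourceSet perTransaction = new ChainedResourceSet(persons);
        perTransaction.add("tx/1", "Transaction#1");

        System.out.println(perTransaction.getResource("persons/42")); // prints Person#42
        System.out.println(persons.getResource("tx/1"));              // prints null
    }
}
```

Because the per-transaction set only borrows from the shared one, it can be discarded after each transaction without touching the cached Person data.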
>
> Any advice on how best to manage data like this where we are primarily
> focussed on processing transactional data which is associated with
> static data from large data sets?
CDO's probably quite good for this.


Ed Merks
Professional Support: https://www.macromodeling.com/
Re: non containment references to large data sets [message #425810 is a reply to message #425805] Tue, 09 December 2008 08:47
Eike Stepper
Andrew,

Maybe my recent blog post
http://thegordian.blogspot.com/2008/11/how-scalable-are-my-models.html
is of interest to you.

Cheers
/Eike

----
http://thegordian.blogspot.com



Re: non containment references to large data sets [message #425842 is a reply to message #425805] Tue, 09 December 2008 21:15
Jason Henriksen
> We can simply toss the resource set at that point but that would toss
> the cached Person objects.

Out of curiosity, why does that happen? We use Hibernate's cache and it
seems to hold on to the right stuff. We have one Resource per thread
that we fill with stuff, serialize, and then clear out at the end of
each request. That seems to work for us.
(Or is it not working optimally and we haven't discovered it yet? Hmm....)


Jason

Re: non containment references to large data sets [message #425844 is a reply to message #425807] Wed, 10 December 2008 02:58
Andrew H
Ed Merks wrote:

> Andrew,

> Comments below.


> Andrew H wrote:
>> Is there some way to chain up resource sets so that we can keep all
>> the cached Person objects in one resource set then create new resource
>> sets for each transaction and point that resource set to the shared
>> Person resource set?
> There's a delegatedGetResource method you could specialize to look up a
> resource in another resource set.

Thanks this may do the trick for now.

>>
>> Any advice on how best to manage data like this where we are primarily
>> focussed on processing transactional data which is associated with
>> static data from large data sets?
> CDO's probably quite good for this.

I suspect you may be right. I'll have to take a closer look at it to see
how well it fits what we are doing.
Re: non containment references to large data sets [message #425845 is a reply to message #425810] Wed, 10 December 2008 03:02
Andrew H
Thanks Eike
That was very helpful. It has convinced me that I need to take a closer look
at CDO. It seems to have many features that will be very helpful to us,
such as resolving, loading, caching and garbage collecting data. This is just
what we need for our static data. It may even be helpful for some of our
lower level persistence related services for our transactional data.

I'll need to find some time to look more closely at it.

cheers

Andrew

Re: non containment references to large data sets [message #425846 is a reply to message #425842] Wed, 10 December 2008 03:26
Andrew H
To be honest, exactly how we will manage this is not yet determined. We
are migrating our application to EMF in a phased approach. Persistence
will be the next phase. However, in this phase I need to deal with the
fact that the transaction needs to link to the static data in a non
containment way which is causing me issues when I serialise the
transaction (mainly for debugging).

Hibernate caching is what I am expecting to use to cache the static data,
but I wasn't sure how that plays with Resources.

How are you using the two together? And how does that cope if the
same cached object is needed on two threads at the same time? What
resource would that object live in? Or do you have a copy per thread?

cheers

Andrew

Re: non containment references to large data sets [message #425849 is a reply to message #425845] Wed, 10 December 2008 07:52
Eike Stepper
Andrew,

Please take the time you need. If you're missing some interesting
functionality, we'd be glad to discuss that with you ;-)

Cheers
/Eike

----
http://thegordian.blogspot.com



Re: non containment references to large data sets [message #425850 is a reply to message #425846] Wed, 10 December 2008 07:55
Eike Stepper
Andrew,

Just one more side note. With CDO you also have the advantage that you
can switch the back-end persistence technology without impacting the
application. We're even working on a back-end agnostic query language
handler...

Cheers
/Eike

----
http://thegordian.blogspot.com



Re: non containment references to large data sets [message #425882 is a reply to message #425846] Wed, 10 December 2008 22:38
Jason Henriksen
We use Teneo. Teneo is excellent because it takes your EMF objects and
persists them into a database via Hibernate. It works really well.

It's an Eclipse project, but you can find out more here:
http://elver.org/hibernate/index.html

Also, I'm wondering if you're running into a containment problem.
Is your Cache also an EMF object? If so, you might be having a problem
with EMF's containment modeling.

Jason
Re: non containment references to large data sets [message #425884 is a reply to message #425882] Thu, 11 December 2008 04:44
Andrew H
I evaluated Teneo about a month ago and was impressed with it. We plan to
use it when we do our new persistence in the next phase of the project.

However, one thing that I am yet to resolve is how to handle our static
data. Our transactional data will refer to the static data but when it
serialises we will serialise only references to static data. This means
any parts of our system that manipulate these transactions will need a
mechanism to resolve the static data again.

The reason we don't want to serialise the static data with the
transactional data is that the full object graph for the static data might
be huge. For example, take a simple model where a Department has a list of
Persons, which is a containment relationship. If we associate the
Department with a Transaction then you can reach all the Persons (e.g.
more than 100,000) from the Department. So if this whole tree was serialised
we'd send the whole database across the wire.
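
A rough illustration of "serialise only references, resolve them again later", written in plain Java rather than EMF. Everything here is an assumption made for the sketch: the class names (`StaticPerson`, `Trade`, `ReferenceOnlySerializer`), the `key=value;` wire format, and the use of a plain `Map` as the store that resolves static data.

```java
import java.util.HashMap;
import java.util.Map;

// Static data: lives in its own store and is never written with a transaction.
class StaticPerson {
    final String id;
    final String name;

    StaticPerson(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Transactional data: points at static data without containing it.
class Trade {
    final String id;
    final StaticPerson buyer;
    final StaticPerson seller;

    Trade(String id, StaticPerson buyer, StaticPerson seller) {
        this.id = id;
        this.buyer = buyer;
        this.seller = seller;
    }
}

class ReferenceOnlySerializer {
    // Writes only the ids of the referenced persons, never their contents.
    static String write(Trade trade) {
        return "tx=" + trade.id + ";buyer=" + trade.buyer.id + ";seller=" + trade.seller.id;
    }

    // Rebuilds the trade and resolves each id against the static data store.
    // An id that the store does not know resolves to null.
    static Trade read(String text, Map<String, StaticPerson> staticData) {
        Map<String, String> fields = new HashMap<>();
        for (String part : text.split(";")) {
            String[] keyValue = part.split("=", 2);
            fields.put(keyValue[0], keyValue[1]);
        }
        return new Trade(fields.get("tx"),
                staticData.get(fields.get("buyer")),
                staticData.get(fields.get("seller")));
    }
}

class ReferenceOnlyDemo {
    public static void main(String[] args) {
        Map<String, StaticPerson> store = new HashMap<>();
        store.put("p1", new StaticPerson("p1", "Alice"));
        store.put("p2", new StaticPerson("p2", "Bob"));

        String wire = ReferenceOnlySerializer.write(
                new Trade("t1", store.get("p1"), store.get("p2")));
        System.out.println(wire); // prints tx=t1;buyer=p1;seller=p2

        Trade restored = ReferenceOnlySerializer.read(wire, store);
        System.out.println(restored.buyer.name); // prints Alice
    }
}
```

The serialised form stays the same size however many Persons hang off a Department, at the cost that every consumer needs access to something that can resolve the ids.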

To avoid this we plan to use the Resource mechanism in EMF. Another
approach might be to use Hibernate's proxy mechanism here.

All that is for the next phase, but I ran into an issue that I need to
resolve now. Because the relationships to the static data are non
containment (for the above reasons), when I go to serialise a
transaction (e.g. to print it out) it fails because the static data is not
in a resource.

I was hoping that for now I could do something simple that aligns with
what we will do for the next phase which is why I asked this question.

But I think it's too big an issue for now, so I am going with a kludgy
workaround.

Re: non containment references to large data sets [message #425885 is a reply to message #425850] Thu, 11 December 2008 04:47
Andrew H
Thanks Eike

Our plans were to use Teneo for persistence but I may review that given
that CDO may help with some of our static data issues.

Can CDO and Teneo be used together?

It will be a couple of months before we hit the persistence phase of our
project so I will defer the eval till then.

cheers

Andrew

Re: non containment references to large data sets [message #425887 is a reply to message #425885] Thu, 11 December 2008 07:40
Thomas Schindl
Andrew H wrote:
> Thanks Eike
>
> Our plans were to use Teneo for persistence but I may review that given
> that SDO may help with some of our static data issues.
> Can CDO and Teneo be used together?
>

Yes, Teneo is one of the persistence stores currently supported by CDO.

Tom

--
B e s t S o l u t i o n . at
------------------------------------------------------------ --------
Tom Schindl JFace-Committer
------------------------------------------------------------ --------
Re: non containment references to large data sets [message #425889 is a reply to message #425885] Thu, 11 December 2008 08:11
Eike Stepper
Andrew,

Comments below...



Andrew H wrote:
> Thanks Eike
>
> Our plans were to use Teneo for persistence but I may review that
> given that SDO may help with some of our static data issues.
I bet it's just a typo but CDO is *not* SDO!

> Can CDO and Teneo be used together?
Yes, as Tom pointed out, we have a Teneo-based Hibernate backend
integration. Martin Taal and I developed it.

>
> It will be a couple of months before we hit the persistence phase of
> our project so I will defer the eval till then.
I'll be here ;-)

Cheers
/Eike

----
http://thegordian.blogspot.com


Re: non containment references to large data sets [message #425890 is a reply to message #425884] Thu, 11 December 2008 08:23
Eike Stepper
Andrew,

Comments below...



Andrew H wrote:
> I evaluated Teneo about a month ago and was impressed with it. We plan
> to use it when we do our new persistence in the next phase of the
> project.
>
> However, one thing that I am yet to resolve is how to handle our
> static data. Our transactional data will refer to the static data but
> when it serialises we will serialise only references to static data.
> This means any parts of our system that manipulate these transactions
> will need a mechanism to resolve the static data again.
>
> The reason we don't want to serialise the static data with the
> transactional data is that the full object graph for the static data
> might be huge.
This was one of the main reasons to create CDO: Load and save
containment references with the same characteristics as cross
references. In CDO there's internally no difference between them. In
other words, you can load objects without their children (and that's the
default) and you can even load children without their parent (through
cross references, queries or by ID).

This enables you at model-design-time to focus on your business aspects
and create containment references where they are logically desired
instead of being influenced already by later persistence/scalability
decisions.

Cheers
/Eike

----
http://thegordian.blogspot.com


Re: non containment references to large data sets [message #425916 is a reply to message #425884] Thu, 11 December 2008 19:35
Jason Henriksen
> However, one thing that I am yet to resolve is how to handle our static
> data. Our transactional data will refer to the static data but when it
> serialises we will serialise only references to static data. This means
> any parts of our system that manipulate these transactions will need a
> mechanism to resolve the static data again.

I've run across similar issues. What we did was to set up a resolver
chain. We can make a reference to an object that does not appear in our
xml document and then make the resolver go load it.

So in pseudo-code I might send something like this across the wire:

<someData>
<someStaticObject>
StaticObjectCache.xml:objectId=12345
</someStaticObject>
</someData>

The client receiving this XML would try to resolve the reference in the
'someStaticObject' tag by looking in the file StaticObjectCache.xml and
pulling out the object with objectId set to 12345.

We use a resolver chain that uses URLs of the form "file://", "sql://"
or "http://". Based on the protocol prefix it knows where to load the
data from.
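
For illustration, a resolver chain along these lines could be sketched in plain Java as below. This is only a minimal sketch, not the code described in this post: the names ResolverChain, Resolver and MapResolver are hypothetical, the reference strings are made up, and an in-memory map stands in for the real file, SQL or HTTP loading.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of a scheme-based resolver chain (illustrative names only). */
final class ResolverChain {

    /** One link in the chain: claims references by URI scheme and loads the object. */
    interface Resolver {
        boolean handles(URI reference);
        Optional<Object> resolve(URI reference);
    }

    /** Stand-in for a real "file://" or "sql://" loader, backed by an in-memory map. */
    static final class MapResolver implements Resolver {
        private final String scheme;
        private final Map<String, Object> objectsByReference = new HashMap<>();

        MapResolver(String scheme) {
            this.scheme = scheme;
        }

        void put(String reference, Object value) {
            objectsByReference.put(reference, value);
        }

        @Override
        public boolean handles(URI reference) {
            return scheme.equalsIgnoreCase(reference.getScheme());
        }

        @Override
        public Optional<Object> resolve(URI reference) {
            return Optional.ofNullable(objectsByReference.get(reference.toString()));
        }
    }

    private final List<Resolver> resolvers = new ArrayList<>();

    ResolverChain add(Resolver resolver) {
        resolvers.add(resolver);
        return this;
    }

    /** Asks each resolver in order; the first one that handles the scheme and finds the object wins. */
    Optional<Object> resolve(String reference) {
        URI uri = URI.create(reference);
        for (Resolver resolver : resolvers) {
            if (resolver.handles(uri)) {
                Optional<Object> found = resolver.resolve(uri);
                if (found.isPresent()) {
                    return found;
                }
            }
        }
        return Optional.empty();
    }
}
```

A reference that no resolver claims simply comes back empty, so the caller decides whether an unresolved reference is an error or can stay unresolved.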

You could pre-install the static data with the client, and then just let
the resolver handle loading it from the local file system.

I haven't played with that code in a while, but if this sounds useful to
you I can try to dig up some examples to share.

Jason Henriksen
Re: non containment references to large data sets [message #425922 is a reply to message #425889] Thu, 11 December 2008 22:52
Andrew H
Eike Stepper wrote:
>>
>> Our plans were to use Teneo for persistence but I may review that
>> given that SDO may help with some of our static data issues.
> I bet it's just a typo but CDO is *not* SDO!

Yeah I keep mixing the two up.

>> Can CDO and Teneo be used together?
> Yes, as Tom pointed out, we have a Teneo-based Hibernate backend
> integration. Martin Taal and I developed it.

Great. Should make it easier for us to use CDO

Re: non containment references to large data sets [message #425923 is a reply to message #425890] Thu, 11 December 2008 23:04
Andrew H
Hi Eike

See inline

Eike Stepper wrote:

>>
>> The reason we don't want to serialise the static data with the
>> transactional data is that the full object graph for the static data
>> might be huge.
> This was one of the main reasons to create CDO: Load and save
> containment references with the same characteristics as cross
> references. In CDO there's internally no difference between them. In
> other words, you can load objects without their children (and that's the
> default) and you can even load children without their parent (through
> cross references, queries or by ID).

> This enables you at model-design-time to focus on your business aspects
> and create containment references where they are logically desired
> instead of being influenced already by later persistence/scalability
> decisions.

On the surface it certainly sounds like CDO will be very useful to us.
Particularly for our static data. For our transactional data we tend to
send the whole transaction around all the time so a simple EMF / Teneo
solution would probably work well. But anything that processes our
transactions would need to resolve the associated static data and it seems
like CDO would do this nicely.

Just out of curiosity, if it turned out that CDO was a good fit for our
static data but not for our transactional data, can that work? i.e. our
transactional models would not be CDO models (just normal EMF) but our
static data ones would be.

I'm not saying this is necessarily likely but would be interested to know
if it's an all-or-nothing proposition.

Also, I noticed there was an audit view capability. That might be very
useful to us to be able to see what the static data looked like when the
transaction was created. Does this work only off timestamps or is there a
way to associate a transaction with a particular version of our static
data?

Re: non containment references to large data sets [message #425924 is a reply to message #425887] Thu, 11 December 2008 23:05
Andrew H
Thanks Tom. Good to know

Re: non containment references to large data sets [message #425925 is a reply to message #425916] Thu, 11 December 2008 23:42
Andrew H
jason henriksen wrote:


>> However, one thing that I am yet to resolve is how to handle our static
>> data. Our transactional data will refer to the static data but when it
>> serialises we will serialise only references to static data. This means
>> any parts of our system that manipulate these transactions will need a
>> mechanism to resolve the static data again.

> I've run across similar issues. What we did was to set up a resolver
> chain. We can make a reference to an object that does not appear in our
> xml document and then make the resolver go load it.

> So in pseudo-code I might send something like this across the wire:

> <someData>
> <someStaticObject>
> StaticObjectCache.xml:objectId=12345
> </someStaticObject>
> </someData>

> The client receiving this XML would try to resolve the reference in the
> 'someStaticObject' tag by looking in the file StaticObjectCache.xml and
> pulling out the object with objectId set to 12345.

> We use a resolver chain that uses URLs of the form "file://", "sql://"
> or "http://". Based on the protocol prefix it knows where to load the
> data from.

Right, so you have your own custom Resource and ResourceSet
implementations to do this?

> You could pre-install the static data with the client, and then just let
> the resolver handle loading it from the local file system.

In our case the static data is too big to pre-install on the clients. We'd
need a way (e.g. a custom resource) to handle the loading from the
database and caching locally. This cache would need to automatically
refresh stale data and use weak references so the cache doesn't get too
big.
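
As a rough sketch of the kind of cache described here, assuming a simple maximum age as the staleness rule: the class name StaticDataCache, the loader function and the injected clock are all hypothetical, and nothing in this sketch is EMF or CDO API.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a self-refreshing cache that holds its values weakly. */
final class StaticDataCache<K, V> {

    /** A cached value plus the time it was loaded; the value itself is only weakly held. */
    private static final class Entry<V> {
        final WeakReference<V> value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = new WeakReference<>(value);
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // assumed: e.g. a database lookup by ID
    private final long maxAgeMillis;       // entries older than this count as stale
    private final LongSupplier clock;      // injectable so staleness can be tested

    StaticDataCache(Function<K, V> loader, long maxAgeMillis, LongSupplier clock) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached value if it is still reachable and fresh; otherwise reloads it. */
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry != null) {
            V cached = entry.value.get();
            if (cached != null && now - entry.loadedAtMillis <= maxAgeMillis) {
                return cached;
            }
        }
        // Missing, garbage collected, or stale: load again and replace the entry.
        V fresh = loader.apply(key);
        entries.put(key, new Entry<>(fresh, now));
        return fresh;
    }
}
```

Two limits of this sketch: concurrent callers may each trigger a reload of the same key, and an entry whose value has been collected stays in the map until that key is requested again. A ReferenceQueue could be used to purge such entries.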

I suspect we might need a custom implementation of the EList returned by
resource.getContents() too.

That is probably all stuff CDO already does, I guess.

> I haven't played with that code in a while, but if this sounds useful to
> you I can try to dig up some examples to share.

Thanks for sharing your experiences. It helps a lot.

cheers

Andrew
Re: non containment references to large data sets [message #425927 is a reply to message #425923] Fri, 12 December 2008 06:49
Eike Stepper
Andrew,

Comments below...

Andrew H wrote:
> Hi Eike
>
> [...] On the surface it certainly sounds like CDO will be very useful
> to us. Particularly for our static data. For our transactional data we
> tend to send the whole transaction around all the time so a simple EMF
> / Teneo solution would probably work well. But anything that processes
> our transactions would need to resolve the associated static data and
> it seems like CDO would do this nicely.
>
> Just out of curiosity, if it turned out that CDO was a good fit for
> our static data but not for our transactional, can that work? i.e. our
> transactional models would not be CDO models (just normal EMF) but our
> static data ones would be.
>
> I'm not saying this is necessarily likely but would be interested to
> know if its an all or nothing proposition.
In CDO 2.0 you can mix different types of Resources in a single
ResourceSet and have cross references between them. It's possible to use
multiple CDOResources (possibly from/to multiple model repositories)
together with XMLResources, TeneoResources and others.

CDO generally supports these "external" references, although some of our
back-ends are not yet up to date with this framework feature. We're
planning to address this by 2.0 GA.

That said, in your case it looks as if you want to reference the static
data (possibly CDO-managed) from the transactional data (possibly
Teneo-managed). Wouldn't that require Teneo to be able to handle these
Teneo-external references?

Of course it's your own decision and I know that Teneo is generally a
good choice, but if you already decide to use CDO for some data, why not
use it for all data? You could still choose different CDO back-end types
for different resources...

>
> Also, I noticed there was an audit view capability. That might be very
> useful to us to be able to see what the static data looked like when
> the transaction was created. Does this work only off timestamps or is
> there a way to associate a transaction with a particular version of
> our static data?
Auditing is currently only supported by our proprietary JDBC back-end
adapter (although we discussed support in the other adapters as well). I
guess adding this feature to our Teneo/Hibernate adapter will be a
particular challenge ;-)

Regarding timestamps and versions in historical data, CDO works like
this: Each committed transaction creates a set of revisions for the new
and changed objects. The ID of such a transaction (== revision set) is
the timestamp of the commit operation. Each revision carries its own
version number together with the timestamps of two transactions: the one
that created the revision and the one that created the following
revision (valid from -> until).

You could for any EObject determine the validity range (e.g. the
creation time) and then open an audit view to look at the whole object
graph as it was exactly at that time.
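
The validity-range idea described above can be sketched in plain Java. This is only an illustration of the concept, not CDO's API: the class `RevisionHistory`, its methods, the `UNREVISED` sentinel and the half-open interval (valid from, inclusive, until, exclusive) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical model of one object's revision history; not CDO's real API. */
final class RevisionHistory {

    /** Sentinel meaning "no later revision exists yet" (an assumption of this sketch). */
    static final long UNREVISED = 0L;

    /** One revision: a version number plus the commit timestamps bounding its validity. */
    static final class Revision {
        final int version;
        final long created;  // timestamp of the commit that created this revision
        final long revised;  // timestamp of the commit that superseded it, or UNREVISED
        final String state;  // stand-in for the object's attribute values

        Revision(int version, long created, long revised, String state) {
            this.version = version;
            this.created = created;
            this.revised = revised;
            this.state = state;
        }

        /** True if this revision was the current one at the given time. */
        boolean isValidAt(long timestamp) {
            return created <= timestamp && (revised == UNREVISED || timestamp < revised);
        }
    }

    private final List<Revision> revisions = new ArrayList<>();

    /** Records a commit: closes the current revision and opens the next version. */
    void commit(long timestamp, String newState) {
        if (!revisions.isEmpty()) {
            int lastIndex = revisions.size() - 1;
            Revision last = revisions.get(lastIndex);
            revisions.set(lastIndex,
                    new Revision(last.version, last.created, timestamp, last.state));
        }
        revisions.add(new Revision(revisions.size() + 1, timestamp, UNREVISED, newState));
    }

    /** Looks up the revision that was valid at the given time, if any. */
    Optional<Revision> revisionAt(long timestamp) {
        return revisions.stream().filter(r -> r.isValidAt(timestamp)).findFirst();
    }

    public static void main(String[] args) {
        RevisionHistory person = new RevisionHistory();
        person.commit(100L, "address=Sydney");
        person.commit(200L, "address=Melbourne");
        // A transaction created at time 150 sees the first revision of the person.
        person.revisionAt(150L).ifPresent(r -> System.out.println(r.version + " " + r.state));
    }
}
```

Looking up the static data with the creation timestamp of a transaction then yields the revision that was current when the transaction was created.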

Cheers
/Eike

----
http://thegordian.blogspot.com


Re: non containment references to large data sets [message #425969 is a reply to message #425927] Sun, 14 December 2008 19:16
Martin Taal
Hi Andrew,
For your info, Teneo does not currently support, out of the box, external references to objects
persisted by other persistence engines. You can make a reference transient (using an annotation) to
prevent it from being persisted in the database and then use a separate string field to persist
the URI. But this can be improved, I think.
So, triggered by this discussion (and because the same thing has come up before), I have started to
implement a special annotation, @URI, which makes it possible to mark an EReference as an external
reference. This should work for Teneo standalone and is a good basis for adding the same
support to Teneo embedded in CDO. A first implementation should be in the next build (sometime
this week).
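
The workaround mentioned above (a transient reference plus a persisted string holding the URI) could look roughly like the following plain Java sketch. It does not use Teneo or EMF; the classes `Person` and `Transaction`, the `getBuyer(resolver)` method and the example URI are hypothetical names chosen only to show the shape of the idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: persist only a URI string and resolve the reference on demand. */
final class ExternalReferenceSketch {

    /** Stand-in for a static data object that lives in another store. */
    static final class Person {
        final String uri;
        final String name;

        Person(String uri, String name) {
            this.uri = uri;
            this.name = name;
        }
    }

    static final class Transaction {
        // Persisted: only the URI of the referenced Person is stored with the transaction.
        private final String buyerUri;
        // Transient: the resolved object is never written to the database.
        private transient Person buyer;

        Transaction(String buyerUri) {
            this.buyerUri = buyerUri;
        }

        String getBuyerUri() {
            return buyerUri;
        }

        /** Resolves the buyer through the supplied resolver the first time it is needed. */
        Person getBuyer(Function<String, Person> resolver) {
            if (buyer == null) {
                buyer = resolver.apply(buyerUri);
            }
            return buyer;
        }
    }

    public static void main(String[] args) {
        // The map plays the role of the other persistence engine.
        Map<String, Person> staticStore = new HashMap<>();
        staticStore.put("people://42", new Person("people://42", "Alice"));

        Transaction tx = new Transaction("people://42");
        System.out.println(tx.getBuyer(staticStore::get).name);
    }
}
```

The drawback of this approach is that the database only sees a string, so it cannot enforce the reference or join across it.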

gr. Martin

--

With Regards, Martin Taal

Springsite/Elver.org
Office: Hardwareweg 4, 3821 BV Amersfoort
Postal: Nassaulaan 7, 3941 EC Doorn
The Netherlands
Cell: +31 (0)6 288 48 943
Tel: +31 (0)84 420 2397
Fax: +31 (0)84 225 9307
Mail: mtaal@springsite.com - mtaal@elver.org
Web: www.springsite.com - www.elver.org
Re: non containment references to large data sets [message #426126 is a reply to message #425969] Wed, 17 December 2008 02:55
Andrew H
Thanks Martin

So if we used CDO with Teneo for static data and straight Teneo for
transactional data, this is not going to be a seamless experience? I.e.
it's not going to be as clean as if we went with straight Teneo for both
or CDO with Teneo for both?

cheers
Andrew

Re: non containment references to large data sets [message #426127 is a reply to message #425927] Wed, 17 December 2008 02:56
Andrew H
Thanks Eike

See inline

Eike Stepper wrote:

> Andreww,

> Comments below...

> Andrew H wrote:
>> Hi Eike
>>
>> [...] On the surface it certainly sounds like CDO will be very useful
>> to us. Particularly for our static data. For our transactional data we
>> tend to send the whole transaction around all the time so a simple EMF
>> / Teneo solution would probably work well. But anything that processes
>> our transactions would need to resolve the associated static data and
>> it seems like CDO would do this nicely.
>>
>> Just out of curiosity, if it turned out that CDO was a good fit for
>> our static data but not for our transactional, can that work? i.e. our
>> transactional models would not be CDO models (just normal EMF) but our
>> static data ones would be.
>>
>> I'm not saying this is necessarily likely but would be interested to
>> know if its an all or nothing proposition.
> In CDO 2.0 you can mix different types of Resources in a single
> ResourceSet and have cross references between them. It's possible to use
> multiple CDOResources (possibly from/to multiple model repositories)
> together wit XMLResources, TeneoResource and others.

> CDO generally supports these "external" references while some of our
> back-end are not yet uptodate with this framework feature. We're
> planning to work on this issue until 2.0 GA.

> That said, in your case it looks as if you want to reference the static
> data (possibly CDO-managed) from the transactional data (possibly
> Teneo-managed). Wouldn't that require Teneo to be able to handle these
> Teneo-external references?

> Of course it's your own decision and I know that Teneo is generally a
> good choice, but if you already decide to use CDO for some data, why not
> use it for all data. You could still choose different CDO back-end types
> for different resources...

I agree that it would be preferable to have it all in one place with one
persistence solution. This will certainly be what we will aim for.

When this project kicked off I did an evaluation of several areas of EMF.
This was so that we were comfortable we would have an acceptable answer
for all the things we need to do with our models (GUI, business rules,
transformation, persistence, serialisation to / from XML, use in JEE & Web
services etc).

We are currently in the first phase which is introducing the model itself
and transforming it for integration with various systems. Persistence
isn't till the next phase, but unfortunately I have had to anticipate some
aspects of persistence (next phase) now. These relate to static data.

My evaluation for persistence was done for persisting our transactional
data with Teneo, because Teneo is based on Hibernate, which I know pretty
well and which is a proven industry approach. This was enough to give
persistence the tick and to draw up a high-level plan. Teneo was an
excellent fit for the transactional data.

Unfortunately, I didn't do a separate evaluation for our static data. I
assumed simply that the same approach would fit well. However, our static
data has some differing requirements to our transactional data.

- Transactions tend to be self-contained: we hold the entire
transaction in memory and send it around the system in its entirety.
Static data tends to be large and interconnected, and we only ever want a
small part of the model in memory.

- Static data needs to be fetched on demand so it can be reattached to
the transactional data as needed. In a way, the Resource for static data
needs to provide a virtual window on the persisted static data, so that
it can resolve any URIs for these objects without needing to hold the
entire database in memory.
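
A minimal sketch of such a "virtual window", assuming a bounded least-recently-used cache in front of a loader function. The class name `StaticDataWindow`, the `resolve` method and the loader are hypothetical; a real solution would sit behind an EMF Resource and load from the database.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical bounded window onto a large persisted data set, keyed by URI. */
final class StaticDataWindow<V> {

    private final Function<String, V> loader; // fetches an object from the backing store
    private final Map<String, V> cache;       // holds at most maxEntries objects
    private int loads;                        // how often the backing store was hit

    StaticDataWindow(int maxEntries, Function<String, V> loader) {
        this.loader = loader;
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the object for the URI, loading it on demand if it is not cached. */
    V resolve(String uri) {
        V value = cache.get(uri);
        if (value == null) {
            value = loader.apply(uri);
            loads++;
            cache.put(uri, value);
        }
        return value;
    }

    int cachedCount() {
        return cache.size();
    }

    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        StaticDataWindow<String> people =
                new StaticDataWindow<>(2, uri -> "Person loaded from " + uri);
        people.resolve("people://1");
        people.resolve("people://2");
        people.resolve("people://3"); // evicts people://1, the least recently used
        System.out.println(people.cachedCount() + " cached after " + people.loadCount() + " loads");
    }
}
```

Only the window lives in memory, so transactions can be discarded freely while recently used static data stays available.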

We can certainly build support for these requirements on top of Teneo but
I expect that there could be quite a lot of work involved. Work that seems
to already be done by CDO.

So I'm kind of stuck now, where I have an answer for transactional data
persistence that I feel I understand well and am very comfortable with,
but only a potential answer for static data that I don't understand at all
yet.

The only solution is for me to evaluate CDO, but I don't have time right
now. Thanks to your feedback, at least I know a little more now: whilst
it is possible to have a CDO solution for static data and a Teneo one for
transactional data, it is likely to be less than ideal, so the most
likely outcome is to choose one over the other.

thanks for your help

>>
>> Also, I noticed there was an audit view capability. That might be very
>> useful to us to be able to see what the static data looked like when
>> the transaction was created. Does this work only off timestamps or is
>> there a way to associate a transaction with a particular version of
>> our static data?
> Auditing is currently only supported by our proprietary JDBC back-end
> adapter (although we discussed support in the other adapters as well). I
> guess adding this feature to our Teneo/Hibernate adapter will be a
> particular challenge ;-)

OK good to know.

> Regarding timestamps and versions in historical data, CDO works like
> this: Each committed transaction creates a set of revisions for the new
> and changed objects. The ID of such a transaction (== revision set) is
> the timestamp of the commit operation. Each revision carries its own
> version number together with the timestamps of two transactions, the one
> that created the revision and the one that created the following
> revisions (valid from -> until).

> You could for any EObject determine the validity range (e.g. the
> creation time) and then open an audit view to look at the whole object
> graph as it was exactly at that time.

Right so for a revision of one of our Transactions (sorry to overload the
term here but I'm referring to our transactional data) I could determine
the timestamp for its creation and use that to unambiguously determine the
correct revision for the static data that it is associated with?

cheers

Andrew
Re: non containment references to large data sets [message #426142 is a reply to message #426126] Wed, 17 December 2008 09:07
Martin Taal
Hi Andrew,
It depends on your requirements. The latest Teneo build (30 minutes ago) has a new feature called
External (see below) which may be of help in combining Teneo and CDO. Here are also some other
thoughts to help with your questions. Let me know.

There are two ways to use Teneo: standalone (the standard way) and embedded within CDO.

Teneo standalone is very good in J2EE/web server/web service environments, as it allows you to work
with generated model objects on the server so that you can write your custom server-side logic using
those model objects. Teneo with EMF has great value in a web service environment: the XML/XMI
transformation of EMF combined with persistence through Teneo/Hibernate is a really good fit there.
Other nice Teneo features:
- works with advanced models (Ecore itself, for example)
- supports most XSD constructs (like choices, unions, lists, extensions, substitution groups) and all
Ecore constructs (feature maps, for example)
- allows you to really fine-tune the relational model to your needs using annotations
- can operate on an existing relational schema (using annotations to match up the model and the
relational schema)
- very unintrusive: Teneo works with your model without changing the genmodel or changing the way
you generate your code
- supports advanced querying through Hibernate

Teneo embedded within CDO: here the mapping functionality of Teneo is used, so you have the advantage
of being able to really influence what the relational schema looks like (using annotations).
Unfortunately, the Teneo embedded in CDO functionality is not mature enough at the moment (because of
lack of time on my part). So I would wait a bit longer for this solution.

Then I added a new feature in Teneo which should allow you to work with CDOResources from Teneo
standalone:

The latest maintenance build of Teneo now has support for external references to objects that are
not persisted by Teneo. See here for more info:
http://www.elver.org/hibernate/hibernate_relations.html#external

This new feature also allows you to specify your own Hibernate UserType when persisting or loading
references. This allows you to completely customize how the external reference is stored in the
database (if it is stored there at all) and also how it is resolved when loading the owner from the
db.

Let me know if this new annotation helps a bit, and if not, what can be changed to make it more
useful.

So with the above I think you can combine static data in CDO with transactional data in Teneo. Or
static data in a custom garbage collecting resource with transactional data in Teneo.

Btw, as I said I think Teneo with EMF has great value in a webservice environment. So if you use
Teneo in that area I am very interested to hear about your experience there. I am also interested in
contributions in that area, so if there are opportunities let me know.

gr. Martin

Re: non containment references to large data sets [message #426152 is a reply to message #426127] Wed, 17 December 2008 11:15
Eike Stepper
Andrew,

Comments below...




Andrew H wrote:
> [...] I agree that it would be preferable to have it all in one place
> with one persistence solution. This will certainly be what we will aim
> for.
>
> When this project kicked off I did an evaluation of several areas of
> EMF. This was so that we were comfortable we would have an acceptable
> answer for all the things we need to do with our models (GUI, business
> rules, transformation, persistence, serialisation to / from XML, use
> in JEE & Web services etc).
>
> We are currently in the first phase which is introducing the model
> itself and transforming it for integration with various systems.
> Persistence isn't till the next phase, but unfortunately I have had to
> anticipate some aspects of persistence (next phase) now. These relate
> to static data.
>
> My evaluation for persistence was done for persisting our
> transactional data with Teneo because Teneo is based on hibernate
> which I know pretty well is a proven industry approach. This was
> enough to give the tick to persistence and for a high level plan.
> Teneo was an excellent fit for the transactional data.
>
> Unfortunately, I didn't do a separate evaluation for our static data.
> I assumed simply that the same approach would fit well. However, our
> static data has some differing requirements to our transactional data.
>
> - transactions tend to be self contained where we hold the entire
> transaction in memory and send it around the system in its entirety.
> Static data tends to be large and interconnected and we only ever want
> a small part of the model in memory.
>
> - Static data needs to be able to be fetched on demand so it can be
> reattached to the transactional data as needed. In a way the Resource
> for static data needs to provide a virtual window on the the persisted
> static data, so that it can resolve any URI's for these objects
> without needing to hold the entire database in memory.
>
> We can certainly build support for these requirements on top of Teneo
> but I expect that there could be quite a lot of work involved. Work
> that seems to already be done by CDO.
>
> So I'm kind of stuck now, where I have an answer for transactional
> data persistence that I feel I understand well and am very comfortable
> with, but only a potential answer for static data that I don't
> understand at all yet.
>
> The only solution is for me to evaluate CDO, but I don't have time
> right now. Thanks to your feedback, at least I know a little more now.
> That whilst there can be some possibilities of having a CDO solution
> for static data and a teneo one for transactional data it is likely to
> be less than ideal so the most likely outcome is to choose one over
> the other.
I understand all the arguments in favour of Hibernate (including all the
ones that Martin gave) and that Teneo is a good fit between EMF and
Hibernate. That's why we have developed a Teneo-based HibernateStore for
CDO. I'd see it as an additional layer between EMF and Teneo, with all
the possible advantages and drawbacks that additional layers might impose.

If you don't need the additional advantages (or can't live with the
possible disadvantages) for only parts of your data, you should, at least
in the mid term, be fine with Teneo's new support for external references.
Since support for these is now in both CDO and Teneo, I really hope that
we will soon integrate this into our HibernateStore (CDO/Teneo
integration) as well.

> [...] Right so for a revision of one of our Transactions (sorry to
> overload the term here but I'm referring to our transactional data) I
> could determine the timestamp for its creation and use that to
> unambiguously determine the correct revision for the static data that
> it is associated with?
Yes, this happens automatically in CDO.

Cheers
/Eike

----
http://thegordian.blogspot.com


Re: non containment references to large data sets [message #426170 is a reply to message #426142] Wed, 17 December 2008 23:31
Andrew H
Hi Martin

Let me start by saying that I was very impressed with Teneo when I
evaluated it a month or two ago. What I was able to achieve in the short
time I spent with it was quite extraordinary. And the responsiveness of
you and all the EMF teams gives newcomers like me a lot of confidence in
jumping on the EMF bandwagon. So a big thanks to all.

Most of what we do fits very well within the standalone mode you outline
below. This is why I evaluated only Teneo and not CDO, as at the time I
thought it was a simple question.

JEE, Web Services, business services with business logic, plenty of
different searches across our data etc. are all major pieces.

I really do want the static data and transactional data stored together in
the database. We will undoubtedly want to do searches that span the two.
For example we may want to find all the transactions that a given person
is involved in. This could simply be searching on the person and joining
on to Transaction.

When we capture the Transaction in the GUI I want to wire it into the real
static data objects as this will make writing business rules that span
them easy. But when we send the transaction for processing (business
process, persistence etc.) I want to send only proxies for the static data
associated with it. When this reaches its destination it will need to be
reunited with its static data. And when persisted I'd like it attached
properly to it (not external). Similarly when we pull the transaction out
later.

So this is the real complication with the standalone Teneo approach: how
to manage this way of working with static data, so that it is associated
properly with the transactions, serialised as proxies, reunited with the
static data wherever it needs to be, persisted with the static data, etc.

It's really this aspect that seems to fit well with (my limited
understanding of) CDO.

Do you have any thoughts on how to satisfy this requirement in a
standalone Teneo approach?

One thing that may help in this is that the transactions themselves are
predominantly containment relationships. But all the relationships to
static data are not.

Incidentally, Web Services is still an unknown at this stage. The story
for EMF + JEE Web Services is not good thanks to JAXB being welded into
that stack. When I have time I will explore the Spring Web Service path as
Spring tends to be much more customisable. Their new app server may be
interesting for us.
When I queried on the list about this a while ago someone mentioned they
were looking into this but I never heard back.

regards

Andrew

Martin Taal wrote:

> Hi Andrew,
> It depends on your requirements. The latest Teneo build (30 minutes ago) has
> a new feature called External (see below) which may be of help in combining
> Teneo and CDO. Here are also some other thoughts to help with your
> questions. Let me know.

> There are two ways to use Teneo: standalone (the standard way) or embedded
> within CDO.

> Teneo standalone is very good in J2EE/webserver/webservice environments as
> it allows you to work with generated model objects on the server, so that
> you can write your custom server-side logic using those model objects. Teneo
> with EMF has great value in a webservice environment: the XML/XMI
> transformation of EMF combined with persisting using Teneo/Hibernate is a
> really good fit there.
> Other nice Teneo features:
> - works with advanced models (Ecore itself, for example)
> - supports most XSD constructs (like choices, unions, lists, extensions,
>   substitution groups) and all Ecore constructs (feature maps, for example)
> - allows you to really fine-tune the relational model to your needs using
>   annotations
> - can operate on an existing relational schema (using annotations to match
>   up the model and the relational schema)
> - very unobtrusive: Teneo works with your model without changing the
>   genmodel or changing the way you generate your code
> - supports advanced querying through Hibernate

> Teneo embedded within CDO: here the mapping functionality of Teneo is used,
> so you have the advantage of being able to really influence what the
> relational schema looks like (using annotations).
> Unfortunately the Teneo embedded in CDO functionality is not mature enough
> at the moment (because of lack of time on my part). So I would wait a bit
> longer for this solution.

> Then I added a new feature in Teneo which should allow you to work with
> CDOResources from Teneo standalone:

> The latest maintenance build of Teneo now has support for external
> references to non-Teneo-persisted objects. See here for more info:
> http://www.elver.org/hibernate/hibernate_relations.html#external

> This new feature also allows you to specify your own Hibernate UserType when
> persisting or loading references. This allows you to completely customize
> how the external reference is stored in the database (if it is stored there
> at all) and also how it is resolved when loading the owner from the db.

> Let me know if this new annotation helps a bit, and if not, what can be
> changed to make it more useful.

> So with the above I think you can combine static data in CDO with
> transactional data in Teneo. Or static data in a custom garbage-collecting
> resource with transactional data in Teneo.

> Btw, as I said, I think Teneo with EMF has great value in a webservice
> environment. So if you use Teneo in that area I am very interested to hear
> about your experience there. I am also interested in contributions in that
> area, so if there are opportunities let me know.

> gr. Martin

> Andrew H wrote:
>> Thanks Martin
>>
>> So if we used CDO w Teneo for static data and straight Teneo for
>> transactional data this is not going to be a seamless experience, i.e.
>> it's not going to be as clean as if we went straight Teneo for both or
>> CDO w Teneo for both?
>>
>> cheers
>> Andrew
>>
>> Martin Taal wrote:
>>
>>> Hi Andrew,
>>> For your info, currently Teneo does not support, out of the box, external
>>> references to objects persisted by other persistence engines. You can
>>> make references transient (using an annotation) to prevent them from
>>> being persisted in the database and then use some other string field to
>>> persist the URI. But this can be improved I think.
>>> So triggered by this discussion (and that the same thing has come up
>>> before) I have started to implement a special annotation @URI which makes
>>> it possible to denote an EReference as an external reference. This should
>>> work for Teneo standalone and is a good basis for adding the same support
>>> to Teneo embedded in CDO. A first implementation should be in the next
>>> build (sometime this week).
>>
>>> gr. Martin
>>
>>> Eike Stepper wrote:
>>>> Andrew,
>>>>
>>>> Comments below...
>>>>
>>>> Andrew H wrote:
>>>>> Hi Eike
>>>>>
>>>>> [...] On the surface it certainly sounds like CDO will be very
>>>>> useful to us. Particularly for our static data. For our
>>>>> transactional data we tend to send the whole transaction around all
>>>>> the time so a simple EMF / Teneo solution would probably work well.
>>>>> But anything that processes our transactions would need to resolve
>>>>> the associated static data and it seems like CDO would do this nicely.
>>>>>
>>>>> Just out of curiosity, if it turned out that CDO was a good fit for
>>>>> our static data but not for our transactional, can that work? i.e.
>>>>> our transactional models would not be CDO models (just normal EMF)
>>>>> but our static data ones would be.
>>>>>
>>>>> I'm not saying this is necessarily likely but would be interested to
>>>>> know if its an all or nothing proposition.
>>>> In CDO 2.0 you can mix different types of Resources in a single
>>>> ResourceSet and have cross references between them. It's possible to
>>>> use multiple CDOResources (possibly from/to multiple model
>>>> repositories) together with XMLResources, TeneoResource and others.
>>>>
>>>> CDO generally supports these "external" references, while some of our
>>>> back-ends are not yet up to date with this framework feature. We're
>>>> planning to work on this issue before 2.0 GA.
>>>>
>>>> That said, in your case it looks as if you want to reference the
>>>> static data (possibly CDO-managed) from the transactional data
>>>> (possibly Teneo-managed). Wouldn't that require Teneo to be able to
>>>> handle these Teneo-external references?
>>>>
>>>> Of course it's your own decision and I know that Teneo is generally a
>>>> good choice, but if you have already decided to use CDO for some data,
>>>> why not use it for all data? You could still choose different CDO
>>>> back-end types for different resources...
>>>>
>>>>>
>>>>> Also, I noticed there was an audit view capability. That might be
>>>>> very useful to us to be able to see what the static data looked like
>>>>> when the transaction was created. Does this work only off timestamps
>>>>> or is there a way to associate a transaction with a particular
>>>>> version of our static data?
>>>> Auditing is currently only supported by our proprietary JDBC back-end
>>>> adapter (although we discussed support in the other adapters as
>>>> well). I guess adding this feature to our Teneo/Hibernate adapter
>>>> will be a particular challenge ;-)
>>>>
>>>> Regarding timestamps and versions in historical data, CDO works like
>>>> this: Each committed transaction creates a set of revisions for the
>>>> new and changed objects. The ID of such a transaction (== revision
>>>> set) is the timestamp of the commit operation. Each revision carries
>>>> its own version number together with the timestamps of two
>>>> transactions, the one that created the revision and the one that
>>>> created the following revisions (valid from -> until).
>>>>
>>>> You could for any EObject determine the validity range (e.g. the
>>>> creation time) and then open an audit view to look at the whole
>>>> object graph as it was exactly at that time.
>>>>
>>>> Cheers
>>>> /Eike
>>>>
>>>> ----
>>>> http://thegordian.blogspot.com
Re: non containment references to large data sets [message #426174 is a reply to message #426152] Thu, 18 December 2008 02:34
Andrew H
Thanks Eike

I'll take a close look at CDO in the new year and get a feel for just what
those pros and cons are.

Thanks again for your help

regards

Andrew

Eike Stepper wrote:

> Andrew,

> Comments below...




> Andrew H wrote:
>> [...] I agree that it would be preferable to have it all in one place
>> with one persistence solution. This will certainly be what we will aim
>> for.
>>
>> When this project kicked off I did an evaluation of several areas of
>> EMF. This was so that we were comfortable we would have an acceptable
>> answer for all the things we need to do with our models (GUI, business
>> rules, transformation, persistence, serialisation to / from XML, use
>> in JEE & Web services etc).
>>
>> We are currently in the first phase which is introducing the model
>> itself and transforming it for integration with various systems.
>> Persistence isn't till the next phase, but unfortunately I have had to
>> anticipate some aspects of persistence (next phase) now. These relate
>> to static data.
>>
>> My evaluation for persistence was done for persisting our
>> transactional data with Teneo because Teneo is based on Hibernate,
>> which I know pretty well and which is a proven industry approach. This
>> was enough to give the tick to persistence and for a high level plan.
>> Teneo was an excellent fit for the transactional data.
>>
>> Unfortunately, I didn't do a separate evaluation for our static data.
>> I assumed simply that the same approach would fit well. However, our
>> static data has some differing requirements to our transactional data.
>>
>> - transactions tend to be self contained where we hold the entire
>> transaction in memory and send it around the system in its entirety.
>> Static data tends to be large and interconnected and we only ever want
>> a small part of the model in memory.
>>
>> - Static data needs to be able to be fetched on demand so it can be
>> reattached to the transactional data as needed. In a way the Resource
>> for static data needs to provide a virtual window on the persisted
>> static data, so that it can resolve any URIs for these objects
>> without needing to hold the entire database in memory.
>>
>> We can certainly build support for these requirements on top of Teneo
>> but I expect that there could be quite a lot of work involved. Work
>> that seems to already be done by CDO.
>>
>> So I'm kind of stuck now, where I have an answer for transactional
>> data persistence that I feel I understand well and am very comfortable
>> with, but only a potential answer for static data that I don't
>> understand at all yet.
>>
>> The only solution is for me to evaluate CDO, but I don't have time
>> right now. Thanks to your feedback, at least I know a little more now:
>> whilst there are some possibilities for having a CDO solution for
>> static data and a Teneo one for transactional data, it is likely to
>> be less than ideal, so the most likely outcome is to choose one over
>> the other.
> I understand all the arguments for Hibernate (including all the ones
> that Martin gave) and that Teneo is a good fit between EMF and Hibernate.
> That's why we have developed a Teneo-based HibernateStore for CDO. I'd
> see it as an additional layer between EMF and Teneo with all the
> possible advantages and drawbacks that additional layers might impose.

> If you don't need the additional advantages (or can't live with possible
> disadvantages) for only parts of your data you should, at least in the
> mid term, be fine with Teneo's new support for external references.
> Since support for these is now in CDO and Teneo I really hope that we
> will soon integrate this into our HibernateStore (CDO/Teneo integration)
> as well.

>> [...] Right so for a revision of one of our Transactions (sorry to
>> overload the term here but I'm referring to our transactional data) I
>> could determine the timestamp for its creation and use that to
>> unambiguously determine the correct revision for the static data that
>> it is associated with?
> Yes, this happens automatically in CDO.

> Cheers
> /Eike

> ----
> http://thegordian.blogspot.com
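
The revision model described in the quoted text (each revision is valid from the commit that created it until the commit that created its successor) can be illustrated with a small lookup. This is only a sketch in plain Java and does not use the CDO API: the class and field names are hypothetical, and treating the range as half-open (valid from inclusive, valid until exclusive) is an assumption rather than something stated in the thread.

```java
import java.util.ArrayList;
import java.util.List;

final class RevisionLookupSketch {

    // Marker for "still current": no later commit has replaced this revision yet.
    static final long UNBOUNDED = Long.MAX_VALUE;

    // One historical state of an object (all names here are hypothetical).
    static final class Revision {
        final int version;
        final long validFrom;   // timestamp of the commit that created this revision
        final long validUntil;  // timestamp of the commit that created its successor
        final String state;

        Revision(int version, long validFrom, long validUntil, String state) {
            this.version = version;
            this.validFrom = validFrom;
            this.validUntil = validUntil;
            this.state = state;
        }
    }

    // Returns the revision whose range [validFrom, validUntil) contains the
    // timestamp, or null if the object did not exist yet at that time.
    static Revision revisionAt(List<Revision> history, long timestamp) {
        for (Revision r : history) {
            if (r.validFrom <= timestamp && timestamp < r.validUntil) {
                return r;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Revision> personHistory = new ArrayList<>();
        personHistory.add(new Revision(1, 100L, 200L, "address: Old Street"));
        personHistory.add(new Revision(2, 200L, UNBOUNDED, "address: New Street"));

        long transactionCreatedAt = 150L; // commit timestamp of our transaction
        Revision seen = revisionAt(personHistory, transactionCreatedAt);
        System.out.println("version " + seen.version + ", " + seen.state);
        // prints "version 1, address: Old Street"
    }
}
```

Given the creation timestamp of a transaction, the same lookup applied to every referenced static object yields the state of the static data as it was at that time, which is the effect an audit view is described as providing.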
Re: non containment references to large data sets [message #426177 is a reply to message #426170] Thu, 18 December 2008 05:36
Martin Taal
Hi Andrew,
As Teneo also supports EMF Resources, you can read transactional data in one HibernateResource and
static data in another HibernateResource (both resources sharing the same Hibernate Session). Then,
when serializing the transactional resource to XML/XMI (to send the data across), the references to
the static data are handled as references (so not contained in the XML/XMI). Then, when deserializing,
the static data is again read from the db. The Hibernate (EMF) resources support standard Hibernate
lazy loading behavior (using cglib proxies and lazy loaded lists).
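
The idea of sending only references and resolving them again on the receiving side can be sketched in plain Java. This is not Teneo or EMF code: the `Person` and `Transaction` classes, the `person:` wire format and the lookup function are all hypothetical stand-ins, and the cache is an unbounded map (a real cache for a large data set would need eviction).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ProxyReferenceSketch {

    // Static data: too large to hold in memory as a whole (hypothetical type).
    static final class Person {
        final String id;
        final String name;

        Person(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Transactional data with two non-containment references to Person.
    static final class Transaction {
        final String id;
        final Person buyer;
        final Person seller;

        Transaction(String id, Person buyer, Person seller) {
            this.id = id;
            this.buyer = buyer;
            this.seller = seller;
        }
    }

    static final String PREFIX = "person:";

    // Writes only the identifiers of the referenced Persons, never their content.
    static String serialise(Transaction t) {
        return t.id + ";" + PREFIX + t.buyer.id + ";" + PREFIX + t.seller.id;
    }

    // Rebuilds the transaction; each reference is resolved through the cache
    // first and through the (hypothetical) database lookup only on a miss.
    static Transaction deserialise(String wire, Function<String, Person> lookup,
                                   Map<String, Person> cache) {
        String[] parts = wire.split(";");
        Person buyer = cache.computeIfAbsent(parts[1].substring(PREFIX.length()), lookup);
        Person seller = cache.computeIfAbsent(parts[2].substring(PREFIX.length()), lookup);
        return new Transaction(parts[0], buyer, seller);
    }

    public static void main(String[] args) {
        Map<String, Person> database = new HashMap<>();
        database.put("p1", new Person("p1", "Alice"));
        database.put("p2", new Person("p2", "Bob"));

        int[] lookups = {0}; // counts how often the "database" is actually hit
        Function<String, Person> lookup = id -> {
            lookups[0]++;
            return database.get(id);
        };
        Map<String, Person> cache = new HashMap<>();

        String wire = serialise(new Transaction("tx-1", database.get("p1"), database.get("p2")));
        System.out.println(wire); // prints "tx-1;person:p1;person:p2"

        deserialise(wire, lookup, cache);
        deserialise(wire, lookup, cache); // second pass is served from the cache
        System.out.println("database lookups: " + lookups[0]); // prints "database lookups: 2"
    }
}
```

The transaction objects can be discarded after processing while the cache of resolved static objects lives on independently, which is the separation the original question asks for.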

One thing you have to take into account is that Teneo is more a server-oriented solution (although
there are people who also use it on the client side) than an Eclipse RCP-oriented solution. CDO has much
better support for Eclipse RCP and also for synchronising the RCP client with a server running the
CDO server software (and CDO has many other nice features).
So which fits best also depends on where most of the business logic runs.

gr. Martin

Andrew H wrote:
> Hi Martin
>
> Let me start by saying that I was very impressed with Teneo when I
> evaluated it a month or two ago. What I was able to achieve in the short
> time I spent with it was quite extraordinary. And the responsiveness of
> yourself and all the EMF teams gives newcomers like myself a lot of
> confidence in jumping on the EMF bandwagon. So a big thanks to all.
>
> Most of what we do fits very well within the standalone mode you outline
> below.
> This is why I evaluated Teneo only and not CDO as at the time I thought
> it was a simple question.
>
> JEE, Web Services, business services with business logic, plenty of
> different searches across our data etc. are all major pieces.
>
> I really do want the static data and transactional data stored together
> in the database. We will undoubtedly want to do searches that span the
> two. For example we may want to find all the transactions that a given
> person is involved in. This could simply be searching on the person and
> joining on to Transaction.
>
> When we capture the Transaction in the GUI I want to wire it into the
> real static data objects as this will make writing business rules that
> span them easy. But when we send the transaction for processing
> (business process, persistence etc.) I want to send only proxies for the
> static data associated with it. When this reaches its destination it
> will need to be reunited with its static data. And when persisted I'd
> like it attached properly to it (not external). Similarly when we pull
> the transaction out later.
>
> So this is the real complication with the standalone Teneo approach: how
> to manage this way of working with static data, so that it is associated
> properly with the transactions, serialised as proxies, reunited with
> the static data wherever it needs to be, persisted with the static data,
> etc.
>
> It's really this aspect that seems to fit well with (my limited
> understanding of) CDO.
>
> Do you have any thoughts on how to satisfy this requirement in a
> standalone Teneo approach?
>
> One thing that may help in this is that the transactions themselves are
> predominantly containment relationships. But all the relationships to
> static data are not.
>
> Incidentally, Web Services is still an unknown at this stage. The story
> for EMF + JEE Web Services is not good thanks to JAXB being welded into
> that stack. When I have time I will explore the Spring Web Service path
> as Spring tends to be much more customisable. Their new app server may
> be interesting for us.
> When I queried on the list about this a while ago someone mentioned they
> were looking into this but I never heard back.
>
> regards
>
> Andrew


--

With Regards, Martin Taal

Springsite/Elver.org
Office: Hardwareweg 4, 3821 BV Amersfoort
Postal: Nassaulaan 7, 3941 EC Doorn
The Netherlands
Cell: +31 (0)6 288 48 943
Tel: +31 (0)84 420 2397
Fax: +31 (0)84 225 9307
Mail: mtaal@springsite.com - mtaal@elver.org
Web: www.springsite.com - www.elver.org
Re: non containment references to large data sets [message #426197 is a reply to message #426177] Thu, 18 December 2008 23:05
Andrew H
So it may be possible to build a client-side HibernateResource for static
data that works off a Hibernate cache. I'll look into this in the new year.

Thanks again for your help.

Martin Taal wrote:

> Hi Andrew,
> As Teneo also supports EMFResources you can read transactional data in one
HibernateResource and
> static data in another HibernateResource (both resources sharing the same
Hibernate Session). Then
> when serializing the transactional resource to xml/xmi (to send the data
accross) the references to
> the static data are handled as references (so not contained in the xml/xmi).
Then when deserializing
> the static data is again read from the db. The hibernate (EMF) resources
support standard hibernate
> lazy loading behavior (using cglib proxies and lazy loaded lists).

> One thing you have to take into account is that Teneo is more a server
oriented solution (although
> there are people who use it at client side also) than an Eclipse
RCP-oriented solution. CDO has much
> better support for Eclipse RCP and also for synchronising the rcp client
with a server running the
> CDO server software (and CDO has many other nice features).
> So which fits best also depends on where most of the business logic runs.

> gr. Martin

> Andrew H wrote:
>> Hi Martin
>>
>> Let me start by saying that I was very impressed with Teneo when I
>> evaluated it a month or two ago. What I was able to achieve in the short
>> time I spent with it was quite extraordinary. And the responsiveness of
>> yourself and all the EMF teams gives newcomers like myself a lot of
>> confidence in jumping on the EMF band wagon. So a big thanks to all.
>>
>> Most of what we do fits very well within the standalone mode you outline
>> below.
>> This is why I evaluated Teneo only and not CDO as at the time I thought
>> it was a simple question.
>>
>> JEE, Web Services, business services with business logic, plenty of
>> different searches across our data etc. are all major pieces.
>>
>> I really do want the static data and transactional data stored together
>> in the database. We will undoubtedly want to do searches that span the
>> two. For example we may want to find all the transactions that a given
>> person is involved in. This could simply be searching on the person and
>> joining on to Transaction.
>>
>> When we capture the Transaction in the GUI I want to wire it into the
>> real static data objects as this will make writing business rules that
>> span them easy. But when we send the transaction for processing
>> (business process, persistence etc.) I want to send only proxies for the
>> static data associated with it. When this reaches its destination it
>> will need to be reunited with its static data. And when persisted I'd
>> like it attached properly to it (not external). Similarly when we pull
>> the transaction out later.
>>
>> So this is the real complication on the standalone Teneo approach. How
>> to manage this way of working with static data. So that it is associated
>> properly with the transactions, serialised as proxies, and reunited with
>> the static data wherever it needs to be, persisted with the static data
>> etc.
>>
>> Its really this aspect that seems to fit well with (my limited
>> understanding of) CDO.
>>
>> Do you have any thoughts on how to satisfy this requirement in a
>> standalone Teneo approach?
>>
>> One thing that may help in this is that the transactions themselves are
>> predominantly containment relationships. But all the relationships to
>> static data are not.
>>
>> Incidentally, Web Services is still an unknown at this stage. The story
>> for EMF + JEE Web Services is not good thanks to JAXB being welded into
>> that stack. When I have time I will explore the Spring Web Service path
>> as Spring tends to be much more customisable. Their new app server may
>> be interesting for us.
>> When I queried on the list about this a while ago someone mentioned they
>> were looking into this but I never heard back.
>>
>> regards
>>
>> Andrew
>>
>> Martin Taal wrote:
>>
>>> Hi Andrew,
>>> It depends on your requirements. The latest Teneo build (30 minutes
>>> ago) has
>> a new feature called
>>> External (see below) which maybe of help in combining Teneo and CDO. Here
>> are also some other
>>> thoughts to help you in your questions. Let me know.
>>
>>> There are two ways to use Teneo: standalone (the standard way), embedded
>> within CDO.
>>
>>> Teneo standalone is very good in J2ee/webserver/webservice
>>> environments as
>> it allows you to work
>>> with generated model objects on the server so that you can write your
>>> custom
>> server-side logic using
>>> those model objects. Teneo with EMF has great value in a webservice
>> environment, Xml/xmi
>>> transformation of EMF combined with persisting using Teneo/Hibernate
>>> has a
>> really good fit there.
>>> Other nice Teneo features:
>>> - works with advanced models (ecore itself for example)
>>> - supports most xsd constructs (like choices, union, lists, extension,
>> substitution groups) and all
>>> ecore constructs (featuremaps for example)
>>> - allows you to really fine-tune the relational model to your needs using
>> annotations
>>> - can operate on an existing relational schema (using annotations to
>>> match
>> up the model and the
>>> relational schema)
>>> - very unintruisive, Teneo works with your model without changing the
>> genmodel or changing the way
>>> you generate your code
>>> - supports advanced querying through hibernate
>>
>>> Teneo embedded within CDO: here the mapping functionality of Teneo is
>>> used
>> so you have advantages of
>>> being able to really influence how the relational schema looks like
>>> (using
>> annotations).
>>> Unfortunately the Teneo embedded in CDO functionality is not mature
>>> enough
>> at the moment (because of
>>> lack of time on my part). So I would wait for this solution a bit longer.
>>
>>> Then I added a new feature in Teneo which should allow you to work with
>> CDOresources from Teneo
>>> standalone:
>>
>>> The latest maintenance build of Teneo now has support for external
>> references to non-teneo-persisted
>>> objects. See here for more info:
>>> http://www.elver.org/hibernate/hibernate_relations.html#exte rnal
>>
>>> This new feature also allows you to specify your own Hibernate
>>> UserType when
>> persisting or loading
>>> references. This allows you to completely customize how the external
>> reference is stored in the
>>> database (if even there) and also how it is resolved when loading the
>>> owner
>> from the db.
>>
>>> Let me know if this new annotation helps a bit, and if not what can be
>> changed to make it more usefull.
>>
>>> So with the above I think you can combine static data in CDO with
>> transactional data in Teneo. Or
>>> static data in a custom garbage collecting resource with transactional
>>> data
>> in Teneo.
>>
>>> Btw, as I said I think Teneo with EMF has great value in a webservice
>> environment. So if you use
>>> Teneo in that area I am very interested to hear about your experience
>>> there.
>> I am also interested in
>>> contributions in that area, so if there are opportunities let me know.
>>
>>> gr. Martin
>>
>>> Andrew H wrote:
>>>> Thanks Martin
>>>>
>>>> So if we used CDO w Teneo for static data and straight Teneo for
>>>> transactional data this is not going to be a seamless experience.
>>>> i.e. its not going to be as clean as if we went straight teneo for
>>>> both or CDO w Teneo for both?
>>>>
>>>> cheers
>>>> Andrew
>>>>
>>>> Martin Taal wrote:
>>>>
>>>>> Hi Andrew,
>>>>> For your info, currently Teneo does not out-of-the-box support external
>>>> references to other objects
>>>>> persisted by other persistence engines. You can make references
>>>>> transient
>>>> (using an annotation) to
>>>>> prevent them from being persisted in the database and then use some
>>>>> other
>>>> string field to persist
>>>>> the uri. But this can be improved I think.
>>>>> So triggered by this discussion (and that the same thing has come up
>>>>> before)
>>>> I have started to
>>>>> implement a special annotation @URI which makes it possible to
>>>>> denote an
>>>> EReference as an external
>>>>> reference. This should work for both Teneo standalone and is a good
>>>>> basis
>>>> for adding the same
>>>>> support to Teneo embedded in CDO. A first implementation should be
>>>>> in the
>>>> next build (this week
>>>>> somewhere).
>>>>
>>>>> gr. Martin
>>>>
>>>>> Eike Stepper wrote:
>>>>>> Andreww,
>>>>>>
>>>>>> Comments below...
>>>>>>
>>>>>> Andrew H schrieb:
>>>>>>> Hi Eike
>>>>>>>
>>>>>>> [...] On the surface it certainly sounds like CDO will be very
>>>>>>> useful to us. Particularly for our static data. For our
>>>>>>> transactional data we tend to send the whole transaction around
>>>>>>> all the time so a simple EMF / Teneo solution would probably work
>>>>>>> well. But anything that processes our transactions would need to
>>>>>>> resolve the associated static data and it seems like CDO would do
>>>>>>> this nicely.
>>>>>>>
>>>>>>> Just out of curiosity, if it turned out that CDO was a good fit
>>>>>>> for our static data but not for our transactional, can that work?
>>>>>>> i.e. our transactional models would not be CDO models (just normal
>>>>>>> EMF) but our static data ones would be.
>>>>>>>
>>>>>>> I'm not saying this is necessarily likely but would be interested
>>>>>>> to know if its an all or nothing proposition.
>>>>>> In CDO 2.0 you can mix different types of Resources in a single
>>>>>> ResourceSet and have cross references between them. It's possible
>>>>>> to use multiple CDOResources (possibly from/to multiple model
>>>>>> repositories) together with XMLResources, TeneoResource and others.
>>>>>>
>>>>>> CDO generally supports these "external" references, while some of
>>>>>> our back-ends are not yet up to date with this framework feature.
>>>>>> We're planning to work on this issue until 2.0 GA.
>>>>>>
>>>>>> That said, in your case it looks as if you want to reference the
>>>>>> static data (possibly CDO-managed) from the transactional data
>>>>>> (possibly Teneo-managed). Wouldn't that require Teneo to be able to
>>>>>> handle these Teneo-external references?
>>>>>>
>>>>>> Of course it's your own decision and I know that Teneo is generally
>>>>>> a good choice, but if you have already decided to use CDO for some data,
>>>>>> why not use it for all data? You could still choose different CDO
>>>>>> back-end types for different resources...
>>>>>>
>>>>>>>
>>>>>>> Also, I noticed there was an audit view capability. That might be
>>>>>>> very useful to us to be able to see what the static data looked
>>>>>>> like when the transaction was created. Does this work only off
>>>>>>> timestamps or is there a way to associate a transaction with a
>>>>>>> particular version of our static data?
>>>>>> Auditing is currently only supported by our proprietary JDBC
>>>>>> back-end adapter (although we discussed support in the other
>>>>>> adapters as well). I guess adding this feature to our
>>>>>> Teneo/Hibernate adapter will be a particular challenge ;-)
>>>>>>
>>>>>> Regarding timestamps and versions in historical data, CDO works
>>>>>> like this: Each committed transaction creates a set of revisions
>>>>>> for the new and changed objects. The ID of such a transaction (==
>>>>>> revision set) is the timestamp of the commit operation. Each
>>>>>> revision carries its own version number together with the
>>>>>> timestamps of two transactions, the one that created the revision
>>>>>> and the one that created the following revision (valid from ->
>>>>>> until).
>>>>>>
>>>>>> You could for any EObject determine the validity range (e.g. the
>>>>>> creation time) and then open an audit view to look at the whole
>>>>>> object graph as it was exactly at that time.
>>>>>>
>>>>>> Cheers
>>>>>> /Eike
>>>>>>
>>>>>> ----
>>>>>> http://thegordian.blogspot.com
>>>>
>>>>
>>>>
>>>>
>>
>>
>>
>>
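For readers following the auditing part of the quoted reply: the revision model Eike describes (each commit is identified by its timestamp, and each revision carries a version number plus "valid from" and "valid until" timestamps) can be sketched in plain Java. This is only an illustration under stated assumptions, not CDO's API. The names Revision, RevisionHistory, UNREVISED and revisionAt are made up for this sketch, and treating the "until" timestamp as exclusive is an assumption as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical type for illustration only; this is not a CDO class.
final class Revision {
    // Marker for "no following revision yet" (an assumption of this sketch).
    static final long UNREVISED = Long.MAX_VALUE;

    final int version;    // version number of this revision
    final long created;   // timestamp of the commit that created it (valid from)
    final long revised;   // timestamp of the commit that replaced it (valid until)
    final String state;   // stand-in for the object's attribute values

    Revision(int version, long created, long revised, String state) {
        this.version = version;
        this.created = created;
        this.revised = revised;
        this.state = state;
    }

    // Assumes the validity range is [created, revised).
    boolean isValidAt(long time) {
        return created <= time && time < revised;
    }
}

// Hypothetical per-object history: one revision per commit that touched the object.
final class RevisionHistory {
    private final List<Revision> revisions = new ArrayList<>();

    // Records a commit: closes the current revision and opens a new one.
    Revision commit(long commitTime, String state) {
        int nextVersion = 1;
        if (!revisions.isEmpty()) {
            int lastIndex = revisions.size() - 1;
            Revision last = revisions.get(lastIndex);
            if (commitTime <= last.created) {
                throw new IllegalArgumentException("commit times must increase");
            }
            // The previous revision is valid until this commit.
            revisions.set(lastIndex,
                    new Revision(last.version, last.created, commitTime, last.state));
            nextVersion = last.version + 1;
        }
        Revision current = new Revision(nextVersion, commitTime, Revision.UNREVISED, state);
        revisions.add(current);
        return current;
    }

    // What an "audit view" at the given time would see, or null if the
    // object did not exist yet.
    Revision revisionAt(long time) {
        for (Revision revision : revisions) {
            if (revision.isValidAt(time)) {
                return revision;
            }
        }
        return null;
    }
}

final class RevisionDemo {
    public static void main(String[] args) {
        RevisionHistory person = new RevisionHistory();
        person.commit(100L, "Person as first stored");
        person.commit(200L, "Person after an update");

        // A transaction created at time 150 would see the first revision.
        Revision seen = person.revisionAt(150L);
        System.out.println("version " + seen.version + ": " + seen.state);
    }
}
```

In terms of the original question, a transaction would only need to remember its own creation time; looking up the static data "as it was" then becomes a lookup by that timestamp rather than by an explicit version reference.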