Eclipse Community Forums
[CDO] Streaming of binary data [message #431646] Mon, 20 July 2009 15:00
Cyril Jaquier
Hi all,

Is there a way to efficiently stream binary data in EMF/CDO? I'm looking
for something similar to MTOM attachments [1] which allow the use of
streams to read and write the data from and to disk.

Let's take a (dummy) example. If I create a model with the following
"Message" definition:

=========== Message ==========
messageSender : string
messageRecipient : string
messageText : string
messageAttachment : byte[]
messageAttachmentMime : string

I would like to be able to stream the content of "messageAttachment"
without having to load all the bytes into memory. The attached file
could be really big (video, picture, etc).
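
To make the memory concern concrete, here is a minimal sketch in plain Java (no EMF/CDO API involved). With a byte[] attribute the whole attachment has to exist as one array, whereas a stream-based accessor lets the caller move it through a small fixed-size buffer. The class and method names are illustrative assumptions, and the in-memory streams only stand in for a file or a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class StreamCopyDemo {

    /** Copies all bytes from in to out through a fixed-size buffer and returns the byte count. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    /** Demo helper: pushes payloadSize bytes through copy() and reports how many arrived. */
    static long copyDemo(int payloadSize, int bufferSize) {
        // In-memory streams are used only to keep the demo self-contained.
        try (InputStream in = new ByteArrayInputStream(new byte[payloadSize]);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            copy(in, out, bufferSize);
            return out.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copyDemo(100_000, 8192));
    }
}
```

Only the buffer (8 KB here) is held by the copy loop itself, independent of the attachment size.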

Is this possible with EMF/CDO in a way similar to MTOM? Should I use a
different approach like sending/receiving the bytes with e.g. HTTP and
storing the URL in the model? Could Net4J help me?

In a more general way, how do you guys handle big byte arrays (files,
videos, pictures, etc) in your models? How do you integrate them into
EMF/CDO?

As you probably noticed, I'm really new to all these EMF/CDO
technologies so please, be tolerant ;) Thanks.

Regards,

Cyril

[1] http://cwiki.apache.org/CXF20DOC/mtom-attachments.html
Re: [CDO] Streaming of binary data [message #431654 is a reply to message #431646] Mon, 20 July 2009 19:34
Eike Stepper

Cyril,

Net4j is perfectly suited to stream large amounts of data. Documentation
(other than the code itself) is a bit rare on Net4j but I wrote a blog
that gives a hint on streaming data:
http://thegordian.blogspot.com/2008/12/remoting-with-iprogressmonitor.html
and there's a file share example in CVS (see attached PSF).

CDO uses that internally but has no particular support for streamed
EDataTypes.

I'm not so sure what you want to achieve in general, is it more about
just transferring data or is it about storing and distributing model data?

Cheers
/Eike

----
http://thegordian.blogspot.com
http://twitter.com/eikestepper




Attachment: net4j-fshare-example.psf

<?xml version="1.0" encoding="UTF-8"?>
<psf version="2.0">
<provider id="org.eclipse.team.cvs.core.cvsnature">
<project reference="1.0,:extssh:dev.eclipse.org:/cvsroot/modeling,org.eclipse.emf/org.eclipse.emf.net4j/examples/org.eclipse.net4j.examples.fshare,org.eclipse.net4j.examples.fshare"/>
<project reference="1.0,:extssh:dev.eclipse.org:/cvsroot/modeling,org.eclipse.emf/org.eclipse.emf.net4j/examples/org.eclipse.net4j.examples.fshare.common,org.eclipse.net4j.examples.fshare.common"/>
<project reference="1.0,:extssh:dev.eclipse.org:/cvsroot/modeling,org.eclipse.emf/org.eclipse.emf.net4j/examples/org.eclipse.net4j.examples.fshare.server,org.eclipse.net4j.examples.fshare.server"/>
<project reference="1.0,:extssh:dev.eclipse.org:/cvsroot/modeling,org.eclipse.emf/org.eclipse.emf.net4j/examples/org.eclipse.net4j.examples.fshare.ui,org.eclipse.net4j.examples.fshare.ui"/>
</provider>
</psf>
Re: [CDO] Streaming of binary data [message #431661 is a reply to message #431654] Tue, 21 July 2009 09:10
Cyril Jaquier
Hi Eike,

Thanks for your quick reply.

> Net4j is perfectly suited to stream large amounts of data. Documentation
> (other than the code itself) is a bit rare on Net4j but I wrote a blog
> that gives a hint on streaming data:
> http://thegordian.blogspot.com/2008/12/remoting-with-iprogressmonitor.html
> and there's a file share example in CVS (see attached PSF).
>

Indeed, the example you describe in your blog would meet our needs
perfectly. But...

> CDO uses that internally but has no particular support for streamed
> EDataTypes.
>
> I'm not so sure what you want to achieve in general, is it more about
> just transferring data or is it about storing and distributing model data?
>

... I was looking for some kind of integration with EMF/CDO. Instead of
"getAttachment(): byte[]" and "setAttachment(byte[])" methods (this is
what you get if you define an EByteArray), I would like to have
"getAttachment(): stream" and "setAttachment(stream)". And yes, it is
about storing and distributing model data with EMF/CDO.

EByteArray works for a small amount of data. But now if I want to work
with huge byte arrays (let's say 100 MB), the "setBytes(byte[])" and
"getBytes(): byte[]" approach is not possible.

If EMF/CDO does not support this natively, should I:

1/ Use something like "setAttachment(URI)" and "getAttachment(): URI"
and download/upload the bytes separately using Net4j.

2/ Create a "EStream" type. Because EMF/CDO is completely new to me, I
can't say if this is possible/a good approach.
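
Option 1 could look roughly like the following sketch in plain Java. The Message class here is a hypothetical stand-in for the generated model class, and the bytes are fetched through java.net.URL only to keep the example self-contained; in the setup discussed in this thread the transfer would go through Net4j or HTTP instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class UriAttachmentDemo {

    /** Hypothetical stand-in for the modeled Message: only the URI is part of the model. */
    static final class Message {
        private URI attachment;

        URI getAttachment() {
            return attachment;
        }

        void setAttachment(URI attachment) {
            this.attachment = attachment;
        }
    }

    /** Opens the attachment bytes out-of-band, i.e. outside the model's own persistence. */
    static InputStream openAttachment(Message message) throws IOException {
        return message.getAttachment().toURL().openStream();
    }

    /** Demo helper: stores content in a temp file, references it by URI, and reads it back. */
    static String roundTrip(String content) {
        try {
            Path file = Files.createTempFile("attachment", ".bin");
            try {
                Files.write(file, content.getBytes(StandardCharsets.UTF_8));
                Message message = new Message();
                message.setAttachment(file.toUri());
                try (InputStream in = openAttachment(message)) {
                    // readAllBytes is acceptable for a tiny demo payload; large data would be copied in chunks.
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello attachment"));
    }
}
```

The trade-off of this design is that the model and the binary data are stored and transferred separately, so keeping the two consistent becomes the application's job.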

I hope my question is a bit clearer now :-) I could summarize it like
this: "How to use huge byte arrays (the ones that would kill your JVM if
you try to load them in memory) in EMF/CDO?"

Regards,

Cyril
Re: [CDO] Streaming of binary data [message #431668 is a reply to message #431661] Tue, 21 July 2009 12:05
Eike Stepper
Cyril,

I like the idea of enhancing CDO in a way that really large pieces of
data can be used by streaming them but I'm not so sure about the right
way to do it. I'd have to think about it and it would be good to file a
bugzilla so that we don't forget about it. It would also help if you,
based on the requirements of your application, come up with a design
proposal. I know that this requires you to learn many details about CDO
but others have tried it before ;-)

Cheers
/Eike

Re: [CDO] Streaming of binary data [message #431706 is a reply to message #431668] Wed, 22 July 2009 11:51
Kai Schlamp
> I like the idea of enhancing CDO in a way that really large pieces of
> data can be used by streaming them but I'm not so sure about the right
> way to do it. I'd have to think about it and it would be good to file a
> bugzilla so that we don't forget about it. It would also help if you,
> based on the requirements of your application, come up with a design
> proposal. I know that this requires you to learn many details about CDO
> but others have tried it before ;-)

Do we have a bugzilla for this one?
A solution I can think of is to have the additional two EMF types:
ECharStream and EByteStream. I am not sure how easy it is to add those
two custom types. Nor if those make any sense in the EMF context or only
in CDO.
Both types could now provide inputstream and outputstream by their
getter and setter.
Just a thought.

Regards,
Kai
Re: [CDO] Streaming of binary data [message #431707 is a reply to message #431706] Wed, 22 July 2009 11:54
Kai Schlamp
Kai Schlamp wrote:
>> I like the idea of enhancing CDO in a way that really large pieces of
>> data can be used by streaming them but I'm not so sure about the right
>> way to do it. I'd have to think about it and it would be good to file a
>> bugzilla so that we don't forget about it. It would also help if you,
>> based on the requirements of your application, come up with a design
>> proposal. I know that this requires you to learn many details about CDO
>> but others have tried it before ;-)
>
> Do we have a bugzilla for this one?
> A solution I can think of is to have the additional two EMF types:
> ECharStream and EByteStream. I am not sure how easy it is to add those
> two custom types. Nor if those make any sense in the EMF context or only
> in CDO.
> Both types could now provide inputstream and outputstream by their
> getter and setter.

Correction ... the EByteStream should provide inputstream and
outputstream, the ECharStream should provide reader and writer.
Re: [CDO] Streaming of binary data [message #431708 is a reply to message #431706] Wed, 22 July 2009 12:05
Cyril Jaquier
>> I like the idea of enhancing CDO in a way that really large pieces of
>> data can be used by streaming them but I'm not so sure about the right
>> way to do it. I'd have to think about it and it would be good to file a
>> bugzilla so that we don't forget about it. It would also help if you,
>> based on the requirements of your application, come up with a design
>> proposal. I know that this requires you to learn many details about CDO
>> but others have tried it before ;-)
>
> Do we have a bugzilla for this one?

I haven't created a bugzilla yet. Kai, could you create one please? You
certainly have a much better understanding of the implications in
EMF and CDO. I will add comments to the bug if needed.

Regards,

Cyril
Re: [CDO] Streaming of binary data [message #431714 is a reply to message #431708] Wed, 22 July 2009 16:22
Eike Stepper
Cyril,

I think it's essential that you create the bugzilla because you are
providing the requirements. It's not necessary to provide the solution
(although we love patches :P). After having done so, please post a link
to the bugzilla here.

Kai, I had a similar thing like your new EDataTypes in mind. My main
problem is that I don't fully understand how to provide the streams at
client-side and when to provide the data for them. Maybe the names for
the new data types you proposed are misleading. Since they end in
"...Stream", they create the impression that the data of this data type
*is* a stream. Even from a Java perspective it is hard to imagine that a
piece of data is both an InputStream *and* an OutputStream.

In the context of CDO and data transfer and storage I think we should
tend to talk about a kind of "Blob", thereby focussing on the
characteristics of the data rather than on the characteristics of the
*access* to this data. In fact the java.sql.Blob is pretty much like
what we want to have, isn't it? It's a potentially huge piece of
arbitrary data with access methods optimized for this kind of data. So,
what about an EBlob, and maybe an EClob?
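
As a rough illustration of the idea, the following plain-Java sketch models a blob as data with stream-based access methods, loosely following java.sql.Blob (length(), getBinaryStream(), setBinaryStream(long)). The EBlob interface, its method names, and the file-backed implementation are assumptions made for this example; none of them is existing EMF or CDO API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class EBlobSketch {

    /** Hypothetical type: describes the data (a large opaque value), not one particular stream. */
    interface EBlob {
        long length() throws IOException;

        InputStream getInputStream() throws IOException;

        /** Returns a stream that replaces the blob's content. */
        OutputStream getOutputStream() throws IOException;
    }

    /** File-backed implementation, so the content never has to fit into memory. */
    static final class FileBlob implements EBlob {
        private final Path file;

        FileBlob(Path file) {
            this.file = file;
        }

        @Override
        public long length() throws IOException {
            return Files.size(file);
        }

        @Override
        public InputStream getInputStream() throws IOException {
            return Files.newInputStream(file);
        }

        @Override
        public OutputStream getOutputStream() throws IOException {
            return Files.newOutputStream(file); // creates or truncates by default
        }
    }

    /** Demo helper: writes size bytes in 1 KB chunks and returns the resulting blob length. */
    static long writeAndMeasure(int size) {
        try {
            Path file = Files.createTempFile("eblob", ".bin");
            try {
                EBlob blob = new FileBlob(file);
                try (OutputStream out = blob.getOutputStream()) {
                    byte[] chunk = new byte[1024];
                    int remaining = size;
                    while (remaining > 0) {
                        int n = Math.min(chunk.length, remaining);
                        out.write(chunk, 0, n);
                        remaining -= n;
                    }
                }
                return blob.length();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndMeasure(5000));
    }
}
```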

Then we need to think about *when* to transfer the data. Well for a blob
that was locally modified it's clear that we transfer at commit time.
But I don't think we should fully load the blob data on revision load
time. Hence we need an additional protocol to lazily load blob chunks while
the EBlob.getInputStream() is being used. Lazy chunk loading leads to
the question of chunk prefetching. Ok, this can be seen as an
optimization and thought of later on.
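
Lazy chunk loading could be sketched as an InputStream that requests the next chunk only when the reader has consumed the current one. This is plain Java under stated assumptions: ChunkLoader is a hypothetical callback standing in for a client-to-server request, and nothing here reflects an actual CDO protocol.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Objects;

public class LazyChunkDemo {

    /** Hypothetical callback standing in for a chunk request sent to the server. */
    interface ChunkLoader {
        /** Returns up to maxLength bytes starting at offset, or an empty array at end of data. */
        byte[] load(long offset, int maxLength) throws IOException;
    }

    /** Stream that fetches one chunk at a time, and only when the reader needs more data. */
    static final class LazyChunkInputStream extends InputStream {
        private final ChunkLoader loader;
        private final int chunkSize;
        private byte[] chunk = new byte[0];
        private int position; // read position inside the current chunk
        private long offset;  // absolute offset of the next chunk to request
        private boolean eof;

        LazyChunkInputStream(ChunkLoader loader, int chunkSize) {
            this.loader = loader;
            this.chunkSize = chunkSize;
        }

        /** Ensures the current chunk has unread bytes; returns false at end of data. */
        private boolean fill() throws IOException {
            if (eof) {
                return false;
            }
            if (position < chunk.length) {
                return true;
            }
            chunk = loader.load(offset, chunkSize);
            position = 0;
            offset += chunk.length;
            if (chunk.length == 0) {
                eof = true;
                return false;
            }
            return true;
        }

        @Override
        public int read() throws IOException {
            return fill() ? (chunk[position++] & 0xFF) : -1;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            Objects.checkFromIndexSize(off, len, b.length);
            if (len == 0) {
                return 0;
            }
            if (!fill()) {
                return -1;
            }
            int n = Math.min(len, chunk.length - position);
            System.arraycopy(chunk, position, b, off, n);
            position += n;
            return n;
        }
    }

    /** Demo helper: reads size bytes from an in-memory "server"; returns {bytesRead, chunkRequests}. */
    static long[] demo(int size, int chunkSize) {
        byte[] server = new byte[size];
        int[] requests = {0};
        ChunkLoader loader = (offset, maxLength) -> {
            requests[0]++;
            int from = (int) Math.min(offset, server.length);
            int to = Math.min(from + maxLength, server.length);
            return Arrays.copyOfRange(server, from, to);
        };
        long total = 0;
        try (InputStream in = new LazyChunkInputStream(loader, chunkSize)) {
            byte[] buffer = new byte[100];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] {total, requests[0]};
    }
}
```

Prefetching would fit into fill(): instead of waiting until the current chunk is exhausted, it could request the next chunk ahead of time.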

Does this seem to make a nice feature? ;-)

Cheers
/Eike

Re: [CDO] Streaming of binary data [message #431716 is a reply to message #431714] Wed, 22 July 2009 17:47
Cyril Jaquier
Eike,

> I think it's essential that you create the bugzilla because you are
> providing the requirements. It's not necessary to provide the solution
> (although we love patches :P). After having done so, please post a link
> to the bugzilla here.
>

I would really like to contribute some code but I first need to write at
least 1 line of EMF/CDO code ;)

https://bugs.eclipse.org/bugs/show_bug.cgi?id=284307

Sorry, Eike, I miswrote your name in the bug report :(

> In the context of CDO and data transfer and storage I think we should
> tend to talk about a kind of "Blob", thereby focussing on the
> characteristics of the data rather than on the characteristics of the
> *access* to this data. In fact the java.sql.Blob is pretty much like
> what we want to have, isn't it? It's a potentially huge piece of
> arbitrary data with access methods optimized for this kind of data. So,
> what about an EBlob, and maybe an EClob?
>

That would perfectly meet our needs :) What about the server side? Store
support? I guess most databases have BLOB support, so it shouldn't
be a big problem.

> Does this seem to make a nice feature? ;-)
>

Yes it does :)

Regards,

Cyril
Re: [CDO] Streaming of binary data [message #431719 is a reply to message #431714] Wed, 22 July 2009 19:17
Kai Schlamp
> In the context of CDO and data transfer and storage I think we should
> tend to talk about a kind of "Blob", thereby focussing on the
> characteristics of the data rather than on the characteristics of the
> *access* to this data. In fact the java.sql.Blob is pretty much like
> what we want to have, isn't it? It's a potentially huge piece of
> arbitrary data with access methods optimized for this kind of data. So,
> what about an EBlob, and maybe an EClob?

After a second thought, I don't think that additional EMF data types are
the way to go.
CDO already supports Clob and Blob (at least in the DB Store) through
annotations, but streaming is not supported. So in the end, however, we
are focusing on data access.
A completely different way would be to provide some utility methods. For
example, the user annotates (using the already existing annotations) an
EString so that the underlying data store treats it as a Clob. One can
then (but doesn't have to) access that attribute through
DBUtil.getInputStream(myObject, theFeatureStructure), or something
like that.
From the application's point of view, the data is a string or some
bytes. Users of CDO shouldn't have to think about database types like
Clobs or Blobs in the first place. But they could use streams if they
want to optimize existing data fields.
The bottom line in my opinion, we don't need any further EMF data types,
but additional methods to access the existing ones.
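
A plain-Java sketch of this utility-method approach follows. The attribute keeps its declared type, and stream access is an optional extra entry point next to the normal getter. FeatureStreamSource, InMemorySource and getInputStream are hypothetical names invented for this example; the sketch does not use the real CDO DBUtil class, and the in-memory map only stands in for a store that would read from a Clob/Blob column.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

public class FeatureStreamSketch {

    /** Hypothetical hook that a store could implement to stream a single attribute value. */
    interface FeatureStreamSource {
        InputStream open(String objectId, String featureName) throws IOException;
    }

    /** In-memory stand-in for a store, used only to keep the demo self-contained. */
    static final class InMemorySource implements FeatureStreamSource {
        private final Map<String, byte[]> values = new HashMap<>();

        void put(String objectId, String featureName, byte[] value) {
            values.put(objectId + "#" + featureName, value);
        }

        @Override
        public InputStream open(String objectId, String featureName) throws IOException {
            byte[] value = values.get(objectId + "#" + featureName);
            if (value == null) {
                throw new IOException("No value for " + objectId + "#" + featureName);
            }
            return new ByteArrayInputStream(value);
        }
    }

    /** Hypothetical utility entry point; the attribute's declared type stays byte[] or String. */
    static InputStream getInputStream(FeatureStreamSource source, String objectId, String featureName)
            throws IOException {
        return source.open(objectId, featureName);
    }

    /** Demo helper: streams one attribute in 1 KB steps and returns the number of bytes seen. */
    static long countBytes(int size) {
        InMemorySource source = new InMemorySource();
        source.put("message-1", "messageAttachment", new byte[size]);
        long total = 0;
        try (InputStream in = getInputStream(source, "message-1", "messageAttachment")) {
            byte[] buffer = new byte[1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```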

Regards,
Kai
Re: [CDO] Streaming of binary data [message #431720 is a reply to message #431719] Wed, 22 July 2009 19:30
Cyril Jaquier
> The bottom line in my opinion, we don't need any further EMF data types,
> but additional methods to access the existing ones.
>

I agree with this. If you look at how JAXB deals with MTOM [1] you will
notice that the data type doesn't change (xsd:base64Binary) but that a
new attribute is used in the schema (xmime:expectedContentTypes).
Instead of creating a byte array, JAXB will create a DataHandler which
can be used to stream the bytes. So the type doesn't change, just the
way you get and set the data does.

Cheers,

Cyril

[1] http://cwiki.apache.org/CXF20DOC/mtom-attachments-with-jaxb.html
Re: [CDO] Streaming of binary data [message #431721 is a reply to message #431720] Wed, 22 July 2009 21:39
Kai Schlamp
@all ... let's discuss further options in the bug report.
(https://bugs.eclipse.org/bugs/show_bug.cgi?id=284307)

Re: [CDO] Streaming of binary data [message #431728 is a reply to message #431716] Thu, 23 July 2009 11:07
Eike Stepper
Cyril Jaquier schrieb:
> Eike,
>
>> I think it's essential that you create the bugzilla because you are
>> providing the requirements. It's not necessary to provide the solution
>> (although we love patches :P). After having done so, please post a link
>> to the bugzilla here.
>>
>
> I would really like to contribute some code but I first need to write
> at least 1 line of EMF/CDO code ;)
Yeah, let the fun begin!

>
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=284307
>
> Sorry, Eike, I miswrote your name in the bug report :(
No worries, there was not much opportunity for confusion ;-)
>
>> In the context of CDO and data transfer and storage I think we should
>> tend to talk about a kind of "Blob", thereby focussing on the
>> characteristics of the data rather than on the characteristics of the
>> *access* to this data. In fact the java.sql.Blob is pretty much like
>> what we want to have, isn't it? It's a potentially huge piece of
>> arbitrary data with access methods optimized for this kind of data. So,
>> what about an EBlob, and maybe an EClob?
>>
>
> That would perfectly meet our needs :) What about the server side?
> Store support? I guess most databases have BLOB support, so it
> shouldn't be a big problem.
Well, it requires a change to the IStoreAccessor API, but that's ok.

Cheers
/Eike
