Re: New Project Proposal: EMF Index [message #132147 is a reply to message #132136]
Mon, 12 January 2009 07:12
Eclipse User
Sven,
My guess would be that the word "limit" is interpreted as a way to
suggest that the solution won't scale to indexing all resources, as in
all resources in the workspace. The intended meaning though is that
clients are responsible for bounding the space of resources to be
indexed and a workspace, however big, is a great way to define bounds
for such a space. The point this part of the proposal is trying to make
is that because EMF can work with resources directly on the network,
it's clearly not possible to index all network resources (EMF Index
isn't Google!) or even possible to discover all network resources that
might be a good target for indexing. As such, establishing bounds, i.e.,
an initial set of URIs as the starting point for crawling, must be the
client's responsibility.
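
The idea of a client-bounded crawl can be illustrated with a small, language-neutral sketch in Python. It is not EMF Index API: `crawl`, `references`, `in_scope` and the toy `GRAPH` are hypothetical names chosen for illustration, and the only assumption is that the client supplies both the seed URIs and a predicate that bounds the space.

```python
from collections import deque


def crawl(seed_uris, references, in_scope):
    """Breadth-first crawl that starts from client-supplied seed URIs.

    references: hypothetical callback returning the URIs a resource refers to.
    in_scope:   hypothetical client predicate that bounds what may be indexed.
    """
    indexed = []
    seen = set()
    queue = deque(uri for uri in seed_uris if in_scope(uri))
    while queue:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.add(uri)
        indexed.append(uri)
        for target in references(uri):
            # Out-of-scope targets are never followed, so the crawl stays bounded.
            if target not in seen and in_scope(target):
                queue.append(target)
    return indexed


# Toy reference graph standing in for real cross-resource references.
GRAPH = {
    "platform:/resource/a/A.ecore": [
        "platform:/resource/a/B.ecore",
        "http://example.org/remote.ecore",
    ],
    "platform:/resource/a/B.ecore": ["platform:/resource/a/A.ecore"],
    "http://example.org/remote.ecore": ["http://example.org/more.ecore"],
}


def workspace_only(uri):
    # Client-defined bound: only workspace resources are indexed.
    return uri.startswith("platform:/resource/")


result = crawl(
    ["platform:/resource/a/A.ecore"],
    lambda uri: GRAPH.get(uri, []),
    workspace_only,
)
```

With the workspace predicate, the crawl visits the two workspace resources and never follows the reference out to the network.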
Sven Efftinge wrote:
> Hi Chetan,
>
> what's the problem with the mentioned statement?
>
> regards,
> Sven
>
> Chetan Kumar wrote:
>> I do have one concern about a statement in the proposal document
>> though. The "Index Scope" sub-section of the document says "clients
>> should limit the set of Indexed Resources."

Re: New Project Proposal: EMF Index [message #132229 is a reply to message #132147]
Wed, 14 January 2009 03:18
Eclipse User
That's the confusion I had too, about the correct meaning of "limit"
in the statement. Now I clearly understand it, thanks to Ed's way of
putting it: "EMF Index isn't Google!" ;)
Ed Merks wrote:
> Sven,
> My guess would be that the word "limit" is interpreted as a way to
> suggest that the solution won't scale to indexing all resources, as in
> all resources in the workspace. The intended meaning though is that
> clients are responsible for bounding the space of resources to be
> indexed and a workspace, however big, is a great way to define bounds
> for such a space. The point this part of the proposal is trying to make
> is that because EMF can work with resources directly on the network,
> it's clearly not possible to index all network resources (EMF Index
> isn't Google!) or even possible to discover all network resources that
> might be a good target for indexing. As such, establishing bounds, i.e.,
> an initial set of URIs as the starting point for crawling, must be the
> client's responsibility.
> Sven Efftinge wrote:
>> Hi Chetan,
>>
>> what's the problem with the mentioned statement?
>>
>> regards,
>> Sven
>>
>> Chetan Kumar wrote:
>>> I do have one concern about a statement in the proposal document
>>> though. The "Index Scope" sub-section of the document says "clients
>>> should limit the set of Indexed Resources."

Re: New Project Proposal: EMF Index [message #132320 is a reply to message #132147]
Mon, 19 January 2009 05:05
Eclipse User
Thanks for the clarification, Ed.
Indeed, one natural choice for the scope would be the workspace. But we
plan to provide APIs for defining other scopes or extending/restricting
existing ones. That would allow clients to implement a crawler like in
Ed's example.
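
As a rough illustration of what defining, extending and restricting scopes could look like, here is a sketch in Python. The `Scope` type and its `extend`/`restrict` methods are hypothetical and are not the planned EMF Index API; the sketch only assumes that a scope can be modeled as a predicate over resource URIs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scope:
    """Hypothetical scope type: a predicate over resource URIs."""

    contains: Callable[[str], bool]

    def extend(self, other: "Scope") -> "Scope":
        # Union: a URI is in scope if either scope accepts it.
        return Scope(lambda uri: self.contains(uri) or other.contains(uri))

    def restrict(self, other: "Scope") -> "Scope":
        # Intersection: a URI is in scope only if both scopes accept it.
        return Scope(lambda uri: self.contains(uri) and other.contains(uri))


# Three example scopes; the URI prefixes are illustrative only.
workspace = Scope(lambda uri: uri.startswith("platform:/resource/"))
ecore_only = Scope(lambda uri: uri.endswith(".ecore"))
shared_library = Scope(lambda uri: uri.startswith("http://example.org/models/"))

# Workspace Ecore files, plus everything under one known remote location.
scope = workspace.restrict(ecore_only).extend(shared_library)
```

A crawler would then follow a reference only if `scope.contains(uri)` holds for its target.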
Best regards
Jan
Ed Merks wrote:
> Sven,
>
> My guess would be that the word "limit" is interpreted as a way to
> suggest that the solution won't scale to indexing all resources, as in
> all resources in the workspace. The intended meaning though is that
> clients are responsible for bounding the space of resources to be
> indexed and a workspace, however big, is a great way to define bounds
> for such a space. The point this part of the proposal is trying to make
> is that because EMF can work with resources directly on the network,
> it's clearly not possible to index all network resources (EMF Index
> isn't Google!) or even possible to discover all network resources that
> might be a good target for indexing. As such, establishing bounds, i.e.,
> an initial set of URIs as the starting point for crawling, must be the
> client's responsibility.
>
>
> Sven Efftinge wrote:
>> Hi Chetan,
>>
>> what's the problem with the mentioned statement?
>>
>> regards,
>> Sven
>>
>> Chetan Kumar wrote:
>>> I do have one concern about a statement in the proposal document
>>> though. The "Index Scope" sub-section of the document says "clients
>>> should limit the set of Indexed Resources."

Re: New Project Proposal: EMF Index [message #133987 is a reply to message #133960]
Wed, 11 March 2009 06:06
Eclipse User
Hi Chris,
we will definitely have a closer look at Lucene in the coming weeks.
Nevertheless, the current index implementation - used in Xtext - does
*not* use Lucene, for various reasons. Note that the following reflects
my personal experience, and I'd be happy to be corrected:
- AFAICS, Lucene is a wonderful technology for indexing text. From a
computer's point of view, text is quite unstructured. So the focus is on
extracting information from something with sparse informational
content and executing rather weakly typed queries against a fuzzy thing
(well, that's a bit exaggerated, but I think the idea is clear).
Although Lucene also allows indexing metadata on these files, which
usually has a lot more structure, this does not seem to be the primary
target.
- Models are the complete opposite of text: they're all about structure.
For modeling tools, e.g. for linking cross-references or for code
completion, we should leverage that fact to allow more efficient
implementations and more strongly typed queries. So we index only
specific properties and match queries strictly.
- I made some experiments using Lucene for indexing models about two
years ago, and it neither felt like a match for the use case nor
performed adequately. Of course, that could have been due to my own
skills. Hearing now that ikv++ has built something that way sounds very
interesting, and I am pretty keen to have a look at it and share
experiences.
- The last important thing is that we want the storage backend to be
exchangeable, i.e., there should be an efficient implementation for CDT.
I don't know yet (and will investigate) whether Lucene is capable of
keeping its data in a database rather than in memory or files, and of
mapping its queries to database queries. We must make sure not to
degrade scalability instead of improving it.
- I do like Lucene's query API a lot, but I also think it might be a
lot more than what is targeted for EMF Index. Note that we don't want to
support a full-fledged query language, as this would mean storing the
whole model in the index, turning it into a repository. Rather than
that, we want to improve the access to model resources to allow more
responsive and better-scaling modeling tools.
So, to sum it up, I currently think we are targeting a completely
different use case, but we're open to investigating and discussing other
technologies, ideas and views within the EMF Index project.
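
To make two of the points above concrete (indexing only specific properties with strict matching, and an exchangeable storage backend), here is a minimal sketch in Python. It does not show the actual Xtext or EMF Index implementation; `IndexStore`, `InMemoryStore` and `ModelIndex` are hypothetical names, and the in-memory store simply stands in for whatever database-backed store a client might plug in.

```python
from abc import ABC, abstractmethod


class IndexStore(ABC):
    """Hypothetical storage interface; a database-backed store could replace it."""

    @abstractmethod
    def put(self, key, uri):
        ...

    @abstractmethod
    def get(self, key):
        ...


class InMemoryStore(IndexStore):
    """Simplest possible backend: a dictionary from key to a set of URIs."""

    def __init__(self):
        self._entries = {}

    def put(self, key, uri):
        self._entries.setdefault(key, set()).add(uri)

    def get(self, key):
        return set(self._entries.get(key, ()))


class ModelIndex:
    def __init__(self, store, indexed_properties):
        self._store = store
        # Only these (type, property) pairs are indexed; the rest stays in the model.
        self._indexed = set(indexed_properties)

    def add(self, uri, type_name, properties):
        for name, value in properties.items():
            if (type_name, name) in self._indexed:
                self._store.put((type_name, name, value), uri)

    def find(self, type_name, name, value):
        # Strict match on type, property and exact value; no fuzzy or ranked search.
        return self._store.get((type_name, name, value))


index = ModelIndex(InMemoryStore(), {("EClass", "name")})
index.add("lib.ecore#//Book", "EClass", {"name": "Book", "abstract": False})
index.add("lib.ecore#//Writer", "EClass", {"name": "Writer", "abstract": False})
```

Because `ModelIndex` only talks to the `IndexStore` interface, swapping the backend does not touch the indexing or query logic, and properties that were never declared as indexed (here `abstract`) cannot be queried at all.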
Best regards
Jan
Chris Aniszczyk wrote:
> Matthias Erche wrote:
>> Hi Jan,
>>
>> we here at ikv++ technologies have already implemented a product
>> specific indexing solution based on Apache Lucene/Solr. So it's a very
>> interesting proposal for us. Maybe we could share experiences and code.
>
> To echo Matthias' point, does EMF Index plan to use Lucene under the
> covers for indexing which is a pretty powerful technology instead of
> trying to reinvent the wheel? I would love to see collaboration done
> between the projects, as developers, we tend to reinvent the wheel a lot
> when a little work can go into improving existing stuff.
>
> Other than that, good luck with the new project!
>
> Cheers,
>
> Chris Aniszczyk | EclipseSource Austin | +1 860 839 2465
> http://twitter.com/eclipsesource | http://twitter.com/caniszczyk

Re: New Project Proposal: EMF Index [message #620562 is a reply to message #131919]
Tue, 06 January 2009 10:08
Eclipse User
This project seems very interesting.
I have a few questions:
- Will it be possible to use an external index (like the one in a
relational database, or a custom one)? I really hope so.
- Are you going to work with the OCL project to make use of the index when
it is useful? Are you going to use a specific language for the queries?
Simon
Jan Koehnlein wrote:
> Hi everyone,
>
> at the Modeling Symposium at ESE Europe 2008 the need for an indexing
> technology for EMF models has been discussed. The project proposal is
> online at:
>
> http://www.eclipse.org/proposals/emf-index/
>
> I am looking forward to your comments and feedback.
>
> Best regards
> Jan Köhnlein

Re: New Project Proposal: EMF Index [message #620563 is a reply to message #131919]
Wed, 07 January 2009 03:50
Eclipse User
Hi Jan,
I'm looking forward to seeing such a project in Eclipse Modeling. I have implemented a basic, non-configurable indexing solution for our own product. I'd be happy to share insights, requirements, and code with your EMF indexing project.
Cheers,
Achim
Jan Koehnlein wrote:
> Hi everyone,
>
> at the Modeling Symposium at ESE Europe 2008 the need for an indexing
> technology for EMF models has been discussed. The project proposal is
> online at:
>
> http://www.eclipse.org/proposals/emf-index/
>
> I am looking forward to your comments and feedback.
>
> Best regards
> Jan Köhnlein

Re: New Project Proposal: EMF Index [message #620573 is a reply to message #131919]
Mon, 12 January 2009 01:33
Eclipse User
Hi Jan,
Indexing for EMF models is definitely a good idea, especially if the
meta-model is based on a schema which is itself huge. Maybe EMF
Index and EMF Query can be made to work together for fast object (data)
retrieval.
I do have one concern about a statement in the proposal document
though. The "Index Scope" sub-section of the document says "clients should
limit the set of Indexed Resources."
- Chetan

Re: New Project Proposal: EMF Index [message #620574 is a reply to message #132121]
Mon, 12 January 2009 04:07
Eclipse User
Hi Chetan,
what's the problem with the mentioned statement?
regards,
Sven
Chetan Kumar wrote:
> I do have one concern about a statement in the proposal document
> though. The "Index Scope" sub-section of the document says "clients
> should limit the set of Indexed Resources."

Re: New Project Proposal: EMF Index [message #620583 is a reply to message #131919]
Thu, 15 January 2009 09:41
Eclipse User
Hi Jan,
we here at ikv++ technologies have already implemented a product
specific indexing solution based on Apache Lucene/Solr. So it's a very
interesting proposal for us. Maybe we could share experiences and code.
Regards,
Matthias
--
Dipl.-Inf. Matthias Erche
ikv++ technologies ag
Dessauer Strasse 28/29, D-10963 Berlin
e-mail: erche@ikv.de, web: http://www.ikv.de
phone: +49 30 34 80 77 92, fax: +49 30 34 80 78 0

Re: New Project Proposal: EMF Index [message #620586 is a reply to message #132188]
Mon, 19 January 2009 04:51
Eclipse User
Thanks a lot for your interest in EMF Index. I'll add you to the
interested parties.
The bug description looks like you could really benefit from EMF Index.
Did you also have a look at EMF Search?
Regards
Jan
Kaloyan Raev wrote:
> Hi Jan,
> I am very excited to see this project proposed. There is an idea in the
> Web Tools Platform (WTP) project for implementing an index for the Java
> EE artifacts. The Java EE model is based on EMF. The idea is tracked in
> bug: https://bugs.eclipse.org/233505
> You can add the WTP project as an interested party.
> Greetings,
> Kaloyan
>

Re: New Project Proposal: EMF Index [message #620587 is a reply to message #132257]
Mon, 19 January 2009 04:59
Eclipse User
Hi Matthias,
great to hear you're interested.
From the technologies you used I guess you have implemented something
like a generic keyword search for models. Sounds very interesting.
EMF Index is focusing on a strongly typed search using predefined
queries, like "Find all instances of type X whose attribute Y contains
the value Z". I think that's a bit different, but there should be enough
overlapping topics.
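
A predefined query of that shape could be sketched roughly as follows, in Python for brevity. `Entry` and `find_instances` are hypothetical names rather than EMF Index API, and the sketch reads "contains the value Z" as a substring match on the attribute's string form, which is an assumption; an exact-match reading would work the same way with `==`.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One index entry: an object's URI, its type name and its indexed attributes."""

    uri: str
    type_name: str
    attributes: dict


def find_instances(entries, type_name, attribute, value):
    """Predefined query: instances of `type_name` whose `attribute` contains `value`."""
    return [
        entry.uri
        for entry in entries
        if entry.type_name == type_name
        and value in str(entry.attributes.get(attribute, ""))
    ]


# Toy index contents; the URIs are illustrative only.
ENTRIES = [
    Entry("lib.ecore#//Book", "EClass", {"name": "Book"}),
    Entry("lib.ecore#//BookShelf", "EClass", {"name": "BookShelf"}),
    Entry("lib.ecore#//Book/title", "EAttribute", {"name": "title"}),
]

matches = find_instances(ENTRIES, "EClass", "name", "Book")
```

The type check comes first, so the query never looks at entries of other types; that is the "strongly typed" part, in contrast to a keyword search across all indexed text.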
Best regards
Jan
Matthias Erche wrote:
> Hi Jan,
>
> we here at ikv++ technologies have already implemented a product
> specific indexing solution based on Apache Lucene/Solr. So it's a very
> interesting proposal for us. Maybe we could share experiences and code.
>
> Regards,
>
> Matthias
>

Re: New Project Proposal: EMF Index [message #620715 is a reply to message #132257]
Tue, 10 March 2009 13:31
Eclipse User
Matthias Erche wrote:
> Hi Jan,
>
> we here at ikv++ technologies have already implemented a product
> specific indexing solution based on Apache Lucene/Solr. So it's a very
> interesting proposal for us. Maybe we could share experiences and code.
To echo Matthias' point, does EMF Index plan to use Lucene, which is a
pretty powerful technology, under the covers for indexing instead of
trying to reinvent the wheel? I would love to see collaboration
between the projects; as developers, we tend to reinvent the wheel a lot
when a little work could go into improving existing stuff.
Other than that, good luck with the new project!
Cheers,
Chris Aniszczyk | EclipseSource Austin | +1 860 839 2465
http://twitter.com/eclipsesource | http://twitter.com/caniszczyk