more updates on registry resolution [message #25231] Wed, 28 May 2003 09:14
Eclipse UserFriend
Originally posted by: Rafael_Chaves.ca.ibm.com_do-not-spam-me

Hi all,

This is to keep everybody updated with what is going on regarding the
dynamic registry resolution.

We have made good progress. The existing registry resolver was changed to
call the new dependency resolution algorithm instead of doing it itself,
and so far we've been able to run the Eclipse registry resolution test
cases with a low failure rate (8 in 189), and those failures are related
to known limitations/differences in the new algorithm regarding conflict
resolution and cycle detection. The good news is that some tests known to
fail with the current implementation pass in the new one. The performance
for running Eclipse/WSAD is equivalent to the current implementation, and
it has been shown to scale well for huge registries. Also, for the incremental
cases (addition, removal of plug-in versions), the resolution time is
proportional to the number of affected plug-ins, being really fast for
small changes.

Regarding the tests, the initial version failed in 57 of 189 tests - this
was because the current implementation handles library plug-ins (plug-ins
with no extensions nor extension points) differently than we planned to
do. So, for now, we decided to change the version selection algorithm to
be compatible with the current implementation (for library plug-ins, pick
only the versions which are required by other plug-ins, or pick the
highest version if none is required). Later, we may change it back to our
original intent (resolve *all* library plug-ins that can have their
pre-requisites satisfied), if we think it is safe.
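
To make the interim rule concrete, here is a minimal Java sketch of the
version selection for a single library plug-in. It is only an illustration:
the class LibraryVersionSelector, the select method, and the use of plain
integers as version numbers are assumptions made for this example, not code
from the actual resolver.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of the Eclipse registry resolver.
class LibraryVersionSelector {

    /**
     * Picks the versions of one library plug-in that should be resolved.
     *
     * @param available versions whose prerequisites can be satisfied
     * @param required  versions specifically required by other plug-ins
     */
    static SortedSet<Integer> select(Collection<Integer> available,
                                     Collection<Integer> required) {
        // Keep only the available versions that some other plug-in requires.
        SortedSet<Integer> selected = new TreeSet<>(available);
        selected.retainAll(required);
        // A "root" library plug-in (required by nobody): pick the highest version.
        if (selected.isEmpty() && !available.isEmpty()) {
            selected.add(Collections.max(available));
        }
        return selected;
    }
}
```

With versions 1, 2 and 3 available, requiring 1 and 3 selects both of them,
while requiring nothing selects only version 3.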

Here are some comments/questions from Keith:

<keith>
I didn't understand your comment regarding changing back to original
intent - do library plug-ins currently not get resolved if they have their
prerequisites satisfied? Does this only happen when the library is not
required by other plugins?
</keith>

<rafael>
Library plug-ins may have multiple versions resolved simultaneously, while
non-library plug-ins can have only one. The current implementation only
resolves more than one version of a library plug-in when they are
specifically required by other plug-ins. If a library plug-in is not
required by any other plug-in (it is a "root plug-in" in the current
implementation's terminology), then the highest version is picked.

The idea we had initially was that all library plug-in versions should be
resolved (we wouldn't have to pick versions), or only those required by
other plug-ins should be resolved (in other words, there would be no root
library plug-ins). But for now we are doing as above.
</rafael>

<keith>
I'm really looking forward to seeing your code.
</keith>

<rafael>I intend to finish it soon. But I can send the prototype
implementation to you.</rafael>

<keith>
What have you done with the extension/extension point resolver?
</keith>

<rafael>
I haven't changed anything related to validating plug-in descriptors,
adding fragments, linking extensions to extension points, or trimming the
registry. The only thing I changed was to replace in the
RegistryResolver#resolve() method the code that decided the resolution
status of each plug-in descriptor (between the calls to
resolvePluginFragments() and resolvePluginRegistry()) by the creation of a
corresponding dependency system and its resolution (and analysis of the
outcome to make corresponding changes to descriptors' states).
</rafael>

<keith>
What are your thoughts regarding permitting/disallowing the
enablement/disablement of plugins? (...) my thoughts are that we should
refactor the resolver so that we can run it without affecting a running
registry (this is not too hard with the current resolver).

Consider enable(plugin-list). My thoughts are that this should only be
allowed if:
- No currently resolved plugins go unresolved.
- Corresponding extension points of extensions implemented by this plugin
  are able to handle dynamic plugins. For example, if the plugin
  provides a menu, the menu needs to get displayed.
I'm not sure if there's an issue the other way - if a previously
unresolved plugin provides an extension point, is it ok if there were
dangling extensions?
My thoughts are you run the resolver, see if a desirable end state is
reached, and then carefully add the new plugins to the registry while
keeping it consistent. This would take some refactoring but I believe it
could be done.
</keith>

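The "run the resolver first, commit only if nothing breaks" idea above can be
sketched as follows. This is a rough illustration under stated assumptions:
TrialEnable and tryEnable are hypothetical names, plug-ins are reduced to
plain id strings, and the resolver is passed in as a function from the set of
installed plug-ins to the set of resolved ones. The real resolver does not
have this shape.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: trial resolution on a copy, commit only on success.
class TrialEnable {

    /**
     * Tries to enable the given plug-ins. The running registry ("installed")
     * is modified only if no currently resolved plug-in would go unresolved.
     */
    static boolean tryEnable(Set<String> installed, Set<String> toEnable,
                             Function<Set<String>, Set<String>> resolver) {
        // What is resolved today, computed without touching the registry.
        Set<String> before = resolver.apply(Collections.unmodifiableSet(installed));

        // Resolve a copy of the registry that includes the new plug-ins.
        Set<String> trial = new HashSet<>(installed);
        trial.addAll(toEnable);
        Set<String> after = resolver.apply(Collections.unmodifiableSet(trial));

        // Reject the change if something resolved before is no longer resolved.
        if (!after.containsAll(before)) {
            return false;
        }
        installed.addAll(toEnable); // commit
        return true;
    }
}
```

The second condition (extension points being able to cope with dynamic
plug-ins) is not modelled here; it would be an additional check before the
commit step.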
<rafael>
The resolution of the dependency system is (mostly) separate from the plugin
registry, so it helps in that direction. But adding fragments
modifies the registry and may be harder to handle - unless
we say that fragments are not dynamic, i.e. they cannot affect existing
plug-ins, only those that are being added - which makes a lot of sense to
me.
But so far I have been concerned mostly with dependency resolution,
so I haven't thought much about registry resolution as a whole. I
will probably have more comments on that later.
</rafael>

Thanks for the feedback, Keith.

Regards,

Rafael
Re: more updates on registry resolution [message #25313 is a reply to message #25231] Wed, 28 May 2003 15:49
Eclipse UserFriend
Originally posted by: kduffey.marketron.com

Good stuff. Question: I thought Equinox was going to do away with the
startup (static) resolution of plugins, extensions, libraries, etc. and
resolve these as needed at runtime? That is, when the JVM asks a plugin's
classloader to "find a class", the classloader code would look in the
engine's registry of plugins, ask each plugin's loader to find the class, etc.
Once the class is found it is "cached" and no longer needs the lookup. I am
assuming it would be at this point that extensions and extension points
would also be resolved. The reason I ask is that I am curious how the
unload/reload facility will work if late binding resolution is not
performed. If you unload a plugin, how will it remove its
extensions/extension points, notify any plugins dependent on any extension
points that they no longer exist, clean up any menus, buttons, and so
forth? Likewise, if a plugin is loaded (or reloaded), how does it add its
extensions/extension points in such a way that other dependent plugins can
now "re-connect" to them for their use?

One thought I had was to have the engine "pause" all dependent plugins of a
given plugin IF it is being reloaded, such that the engine notifies all
dependent plugins, they "freeze" their status somehow, so that they no
longer work until notified by the engine that the plugin is now reloaded. In
the same cycle, the engine would then reconnect all dependent plugins'
extensions to any extension points the reloaded plugin provides, then the
engine would resume all dependent plugins.

Is this basically how it is working? Or are we still using static
resolution, and somehow also adding dynamic resolution as plugins are loaded
at runtime? How will reloads work in terms of unresolving then re-resolving
dependencies, libraries, etc?

Last question, when you say a library plugin, are you referring to one that
provides a library that it depends on, such as in the /lib/xerces.jar type
of style? OR can it be one that uses another plugin's library(ies) and does
not provide its own? I never quite understood how it works out if plugin A
and B both provide xerces.jar, and B relies on a later version than A, does
the engine force A and B to use the same version, or do they both use their
own and both versions of xerces are loaded? From the OSGi spec I read, it
appears, unless I misunderstood, that the "latest" version is chosen by the
OSGi implementation and all bundles must use that version. I don't see how
this can be good though because one plugin may not function properly with a
later version of a library it depends on, supposing a method is removed or
something.
Re: more updates on registry resolution [message #25354 is a reply to message #25313] Wed, 28 May 2003 16:39
Eclipse UserFriend
Originally posted by: Rafael_Chaves.no-spam-ca.ibm.com

Kevin,

The registry resolution is changing to be incremental, i.e., instead of
happening only once at platform start-up (actually only when there are
changes in the registry between Eclipse sessions), it will also happen
during an Eclipse session, when plug-in versions are being
installed/updated/removed. But at these moments the resolution (either
initial or incremental) always runs in full, and it is completely separate
from the loading of plug-in classes.

I am not aware of how the deactivation/removal of a plug-in will be
handled, though. I would say that once we decide that the registry changes
should be made, we should deactivate all active plug-ins that will be
affected (because they - or a plug-in version they directly/indirectly
require - will be unresolved/replaced), and then the registry changes will
actually take place (old extensions/extension points removed, new ones added).
Then, only after the registry changes have been committed, the new
versions will eventually be activated.
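
The set of plug-ins to deactivate in the first step (everything that directly
or indirectly requires a plug-in that is about to change) can be computed with
a breadth-first walk over the inverted dependency graph. The sketch below is
only an illustration: AffectedPlugins and affectedBy are hypothetical names,
and plug-ins are reduced to id strings with no version information.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not Eclipse code.
class AffectedPlugins {

    /**
     * @param requires plug-in id -> ids of the plug-ins it directly requires
     * @param changed  plug-ins that will be unresolved or replaced
     * @return the changed plug-ins plus everything that depends on them
     */
    static Set<String> affectedBy(Map<String, Set<String>> requires,
                                  Set<String> changed) {
        // Invert the edges: prerequisite id -> plug-ins that require it.
        Map<String, Set<String>> dependents = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : requires.entrySet()) {
            for (String prerequisite : entry.getValue()) {
                dependents.computeIfAbsent(prerequisite, k -> new HashSet<>())
                          .add(entry.getKey());
            }
        }
        // Breadth-first walk from the changed plug-ins to their dependents.
        Set<String> affected = new TreeSet<>(changed);
        Deque<String> queue = new ArrayDeque<>(changed);
        while (!queue.isEmpty()) {
            String id = queue.remove();
            for (String dependent : dependents.getOrDefault(id, Collections.emptySet())) {
                if (affected.add(dependent)) {
                    queue.add(dependent);
                }
            }
        }
        return affected;
    }
}
```

Once the edge map is inverted, the walk itself only visits plug-ins that are
actually affected, which fits the goal of small changes being cheap.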

Regards,

Rafael
library plug-ins [message #25394 is a reply to message #25313] Wed, 28 May 2003 17:02
Eclipse UserFriend
Originally posted by: Rafael_Chaves.ca.ibm.com

> Last question, when you say a library plugin, are you referring to one that
> provides a library that it depends on, such as in the /lib/xerces.jar type
> of style?

Exactly. We usually restrict the number of different versions of a plug-in
that can be resolved at a given time to 1 (which one is picked is decided by
the registry resolver), because having more than one usually does not make
sense. But for plug-ins that do not provide extensions/extension points, and
are probably intended only to provide libraries to other plug-ins, this
restriction is relaxed so that more than one version can be resolved if they
are required.

DJ made a point one of these days saying that even library plug-ins can
cause problems if multiple versions are activated simultaneously and they
try to access their state location (under .metadata/.plugins).

Maybe we should make this concept of library plug-ins more explicit
through a specific attribute in the plug-in manifest or by requiring
library plug-ins to not have a custom plug-in class. And we could change
Plugin#getStateLocation() for library plug-ins to return null (library
plug-ins should not be able to.

Comments?

Rafael
Re: more updates on registry resolution [message #25435 is a reply to message #25354] Wed, 28 May 2003 17:08
Eclipse UserFriend
Originally posted by: kduffey.marketron.com

Alrighty, that makes some sense. I happen to be doing this very same thing
in my own engine, and my approach to it is similar. I too have not got it
working yet. My engine is nowhere near as complex as Eclipse, however, but I
find it does things similarly. I have borrowed the extension idea from
Eclipse, but am also adding the Services model from OSGi, although in both
cases I have done it differently. I have yet to look at or use any code from
either project. I am working on my own engine for a few reasons. One, to
learn! I have learned a lot from reading in these forums, especially about
class loaders and such. Two, Eclipse is just too big to use as a "generic"
plugin engine, and I have yet to figure out how to decouple it from SWT and
other aspects. In other words, I don't see how it can be a headless "simple"
engine to be used for any application. Equinox I believe is hopefully moving
towards that goal. I am curious what the "size" of equinox will be once it
is done. Currently, my engine is 32K in size with extension support, built
in xmlpull parser, dependency resolution (at startup) and lazy creation of
plugin classes. I was aiming to keep it very small for use in small devices,
but I recently learned that the J2ME doesn't support classloaders, so I am
not sure yet how I will be able to get it to work in J2ME. Perhaps a
replacement for the ClassLoader to use Class.forName() which apparently does
exist, will do the job, but I am not sure yet. Also, I wanted to build a
very simple ui framework around my own engine, and lastly, I started my
engine almost a year ago long before I ever knew Eclipse existed (don't ask
how I didn't know about it!). Even so, I find I want to work on Equinox
still, as I believe Equinox has a greater value to the java community at
large. I am hoping I can contribute by figuring out the xmlpull parsing for
eclipse/equinox plugins.

Anyway, what I thought might work in this case for late resolution and no
startup/static resolution at all, is to have the code in the classloader (in
my engine there is only one classloader, not several like in Eclipse), such
that the classloader asks the engine to find a given class by searching
through every plugins classloader. This would work, I am sure of it, but my
biggest fear is scalability. If you have 1000 plugins loaded, it could take
quite some time to look through all those plugin classloaders to find a
class, right? I guess the reason I consider this a viable solution is that,
after watching the output of the classloader, it appears that every single
plugin makes TONS of calls when looking for classes like java.* and javax.*.
I'd imagine it takes quite a bit of time to find all the java classes
for every plugin, although at runtime it seems like it is plenty fast. So my
thought here was that looking through, say, 100's of plugins (in the rare case
an application actually has 100's of plugins loaded) would not take too
long, and once the class(es) are found, they are cached. Therefore, I ask you
(and anybody else who knows), if this is in fact a plausible problem, the
issue of performance finding classes by resolving them at runtime when they
are needed, as opposed to trying to resolve dependencies (and possibly
extensions) as needed? At startup, you at least get it all done ahead of
time. But I have noticed the more plugins, the longer the app takes to
start. Even though plugin code is not being created, the plugin's
classloaders have to be instantiated so that finding classes in dependent
plugins is possible. Instantiating 100's of classloaders, building a
registry of plugins, extension points, extensions, resolving all those
dependencies, that takes time at startup that would be great if we could
avoid and do at runtime. I guess this is one of those catch 22 issues. We
can trade off startup delays to have potentially slower runtime each time a
plugin class has to be found/instantiated.

I should add that what I had thought of was to "cache" dependent loaders once a
class is found, so that before looking in ALL plugins loaders, it first
looks in dependent loaders that have already found classes. It is likely
that more than one class will be used from a dependent plugin.
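
A rough Java sketch of that lookup order, for illustration only. It is not how
Eclipse or my engine is actually written: LazyClassLocator and findProvider
are made-up names, and each plug-in's class loader is replaced by a plain set
of the class names it can supply, so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "search all plug-ins lazily, then cache the result".
class LazyClassLocator {
    private final Map<String, Set<String>> classesByPlugin;
    // class name -> plug-in that supplied it
    private final Map<String, String> providerCache = new HashMap<>();
    // plug-ins that have already supplied a class, searched first next time
    private final Set<String> knownProviders = new LinkedHashSet<>();
    int probes = 0; // number of plug-ins inspected so far, for measurement only

    LazyClassLocator(Map<String, Set<String>> classesByPlugin) {
        this.classesByPlugin = classesByPlugin;
    }

    /** Returns the id of the plug-in providing the class, or null if none does. */
    String findProvider(String className) {
        String cached = providerCache.get(className);
        if (cached != null) {
            return cached; // already located once, no search needed
        }
        // Try plug-ins that already supplied classes before all the others.
        List<String> searchOrder = new ArrayList<>(knownProviders);
        for (String plugin : classesByPlugin.keySet()) {
            if (!knownProviders.contains(plugin)) {
                searchOrder.add(plugin);
            }
        }
        for (String plugin : searchOrder) {
            probes++;
            if (classesByPlugin.get(plugin).contains(className)) {
                providerCache.put(className, plugin);
                knownProviders.add(plugin);
                return plugin;
            }
        }
        return null;
    }
}
```

The worst case is still a scan over every plug-in for the first class loaded
from a new provider, which is exactly the scalability worry above.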

Sorry this was so long. Sometimes I feel these "detailed" aspects require a
bit more information to make a point.
Re: library plug-ins [message #25474 is a reply to message #25394] Wed, 28 May 2003 17:18
Eclipse UserFriend
Originally posted by: kduffey.marketron.com

Hmm, I would still be concerned with the issue where plugin A needs
xerces.jar version 1.1 and plugin B needs 2.0. You can't, nor should you
automatically force plugin A to use the 2.0 version. But then, what happens
if plugin C depends on xerces.jar indirectly by depending on plugin A. It is
forced to use the 1.1 version as well? Or does the engine give it 2.0?

Frankly, I am not quite sure what the best approach would be for supporting
plugins with libraries. It would be great if you could just make a
classloader instance that can somehow be the parent loader to loaders of
plugins that all use xerces.jar, for example, so that they all "share" the
xerces.jar by finding its classes via the parent loader. Of course, building
a classloader hierarchy like this isn't exactly easy, depending on the
complexity of plugins being loaded and their uses. I imagine it could get
quite messy. What if plugin A and B share xerces.jar, and A and C share
servlet.jar. Then how does A have a parent loader for both xerces.jar and
servlet.jar? Thus, I think it would be pretty difficult to manage that
correctly, if at all, not to mention why would we want to do that?!

So what is the solution presently? How do Eclipse and Equinox handle the
same library of different versions amongst plugins? OSGi, as I understand
it, always picks the latest version of the library, so that ALL bundles end
up using the same version. Again, I fail to see how that is a good design
because some bundles may require the earlier version. Maybe there is a way
to enforce it? For example, maybe we can have the manifest specify "loose"
binding or "tight" binding, such that loose binding would allow the engine
to configure what library is finally used, and tight binding indicates that
the plugin MUST use the version of the library it provides (or, through a
dependent plugin that provides the library, use that version).
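
To show what I mean by loose versus tight binding, here is a small Java
sketch. Everything in it is hypothetical: the Binding enum, the
LibraryBinding class, integer version numbers, and the policy that a loosely
bound plug-in gets the highest available version are all assumptions for the
sake of the example, not something Eclipse, Equinox or OSGi defines.

```java
import java.util.Collection;
import java.util.Collections;

// Hypothetical manifest setting, as proposed above.
enum Binding { LOOSE, TIGHT }

// Hypothetical sketch of how an engine could honour the setting.
class LibraryBinding {

    /**
     * @param binding   binding mode declared by the plug-in
     * @param declared  library version the plug-in provides or was built against
     * @param available library versions the engine currently has (non-empty)
     * @return the library version the plug-in will be wired to
     */
    static int versionFor(Binding binding, int declared, Collection<Integer> available) {
        if (binding == Binding.TIGHT) {
            // Tight binding: the plug-in MUST get exactly the version it declared.
            if (!available.contains(declared)) {
                throw new IllegalStateException("required version " + declared + " is not available");
            }
            return declared;
        }
        // Loose binding: the engine decides; this sketch simply picks the highest.
        return Collections.max(available);
    }
}
```

With versions 1 and 2 available, a tightly bound plug-in that declares 1 stays
on 1, while a loosely bound one is given 2.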
Re: more updates on registry resolution [message #25514 is a reply to message #25313] Wed, 28 May 2003 23:17
Eclipse UserFriend
Originally posted by: pascal_rapicault.yahoo.fr

There are documents available on the newsgroup that explain how we see
(although our heads are probably more full of details than the documents
available) the deactivation, disablement, etc.

By the way, would you mind posting your engine so we can see what you are
talking about?

P
Re: more updates on registry resolution [message #25554 is a reply to message #25514] Thu, 29 May 2003 01:58
Eclipse UserFriend
My engine is nowhere near done now that I am reworking it. The one I use
does the job, but doesn't handle reloading, etc. Once I get the new one in
working order I'll put up a link to it.

I have been following the various threads, but as you said, my head is full
at this time and I can't recall every aspect of how it will all work.

I had a thought today about one issue that seems to plague Eclipse, one that
some have said is a problem. The use of string values for everything invariably
results in large numbers of strings in the string table and, more so,
requires quite a few string comparisons even at runtime. The ultimate would
be to replace strings with int or long GUID values. Ideally the engine
would dynamically generate these ID values. This would probably work for
static plugins like Eclipse has now, but in order to support unloadable and
reloadable plugins, there would need to be some way to look up the string
values when resolving the newly loaded strings into GUIDs.

The solution I believe would work, if it is possible, is to use some sort of
serialVersionUID type of function that can take the same exact String and
ALWAYS generate the same exact int or long value. I always forget if int is
16-bit or 32-bit and long is 32-bit or 64-bit (or is it per platform?).
Anyway, I think a 16-bit value with 65536 distinct values would be plenty to
handle even large plugin deployments, although using a 32-bit value on a
32-bit register cpu may prove to be faster. I forget how all that works at
the machine level. Anyway, so long as whatever is used can always produce the
same unique GUID from the same string, there should be no reason why GUID
values that are then compared with == can't be used. I think it would
eliminate the string table lookups/comparisons, speeding things up quite a
bit. What do you think? I am thinking of implementing this in my engine to do
away with storing fully qualified class names, ID values, etc.

The hard part is dynamically loading a class from a GUID. The formula would
have to work in reverse as well, which I can imagine might be a harder target
to figure out. Anyone know of some way to do this? Hell, that would be some
feat. Taking any length string and creating a 16-bit value for it. Haha, not
likely. We'd end up devising the best compression routine in the world! Oh
well, it was a thought.

My other thought then, since now that I think about it that probably won't
work, is to create a temporary table in memory, keyed on the GUID generated
by the engine, where the value would be the string ID value from the
plugin.xml file. Once resolutions are done, we dump the data as a properties
file to disk. Since unloading and reloading are not likely to occur that
often, when they do, we simply reload the properties table of GUIDs and
string values for the given session. We can then use this temporarily to look
up existing GUIDs for strings already handled by the engine. Of course, if a
plugin is unloaded, we'd have to "remove" the property(ies) value for it in
the file as well. At least this would still use == to compare things, instead
of string comparisons, as well as eliminate the large String table issue.

Because the loading/reloading of plugins will most likely require a finite
amount of time anyway, it is likely that performance is not an issue when
this step occurs, so loading a properties table into memory to help resolve
dependencies, extensions, etc. wouldn't be all that bad. As for class names,
well, we'd most likely still have to keep a list of those in memory;
otherwise creating instances of classes, or at least the plugin lifecycle
class, might be a problem at runtime for lazy creation.
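
A minimal Java sketch of the table idea, for illustration. The names IdTable,
intern and nameOf are made up for this example, and ids are handed out
sequentially rather than computed from the string, since a fixed-width hash
cannot be reversed and two different strings may hash to the same value. For
what it is worth, in Java an int is always 32 bits and a long always 64 bits,
regardless of platform.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a two-way table between string ids and small ints.
class IdTable {
    private final Map<String, Integer> idsByName = new HashMap<>();
    private final List<String> namesById = new ArrayList<>();

    /** Returns the int id for a string, assigning the next free id on first use. */
    int intern(String name) {
        Integer id = idsByName.get(name);
        if (id == null) {
            id = namesById.size();
            idsByName.put(name, id);
            namesById.add(name);
        }
        return id;
    }

    /** Reverse lookup, e.g. to recover a class name before loading the class. */
    String nameOf(int id) {
        return namesById.get(id);
    }
}
```

The same string always yields the same int within one table, so ids can be
compared with ==. The string comparison cost does not disappear entirely,
though: it moves into the one map lookup done when a string is first turned
into an id.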
Re: more updates on registry resolution [message #25594 is a reply to message #25435] Tue, 03 June 2003 08:11
Eclipse UserFriend
Originally posted by: d3l3t3-heavy-n0sp4m.ungoverned.org

Kevin wrote:
> Anyway, what I thought might work in this case for late resolution and no
> startup/static resolution at all, is to have the code in the classloader (in
> my engine there is only one classloader, not several like in Eclipse),

Multiple class loaders provide for multiple name spaces, how do you deal
with name conflicts with only one class loader?

> such
> that the classloader asks the engine to find a given class by searching
> through every plugins classloader. This would work, I am sure of it, but my
> biggest fear is scalability.

With respect to OSGi, you have import and export meta-data, so it is not
necessary to search everything. When you want to find a class, you must
resolve from where you want to load the class, because you might have
multiple exporters. Thus, in OSGi, you choose one of the potential
exporters and you are done.

The only overhead is choosing one exporter. From that point on, anyone
subsequently loading classes from the same package will use the same
exporter and there is no need to choose the exporter any longer.

This is what OSGi refers to as "resolving" a bundle.
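
As a rough illustration of "choose once, then reuse", here is a small Java
sketch. It is not the OSGi framework API: PackageWiring and exporterFor are
hypothetical names, bundles are plain strings, and picking the first
candidate is just a placeholder policy for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of per-package exporter selection.
class PackageWiring {
    // package name -> bundles that export it (from import/export meta-data)
    private final Map<String, List<String>> exportersByPackage;
    // package name -> the one exporter chosen for it
    private final Map<String, String> chosen = new HashMap<>();

    PackageWiring(Map<String, List<String>> exportersByPackage) {
        this.exportersByPackage = exportersByPackage;
    }

    /** Returns the bundle to load the class from, or null if nobody exports its package. */
    String exporterFor(String className) {
        int lastDot = className.lastIndexOf('.');
        String packageName = lastDot < 0 ? "" : className.substring(0, lastDot);
        // The choice is made only the first time a package is requested.
        return chosen.computeIfAbsent(packageName, name -> {
            List<String> candidates =
                    exportersByPackage.getOrDefault(name, Collections.emptyList());
            return candidates.isEmpty() ? null : candidates.get(0);
        });
    }
}
```

Every later class from the same package goes straight to the recorded
exporter, so no search across bundles is needed.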

-> richard
Re: more updates on registry resolution [message #25630 is a reply to message #25594] Wed, 04 June 2003 02:25
Eclipse UserFriend
I must have stated that in a way that was difficult to understand. What I
meant is that I only have a single class called PluginClassLoader. Every plugin does
receive its own instance of the PluginClassLoader, thus each has its own
name space. Sorry for the misunderstanding.
String performance (was Re: more updates on registry resolution [message #25666 is a reply to message #25554] Wed, 11 June 2003 15:37
Eclipse User
Kevin wrote:
>
> I had a thought today about one issue that seems to plague Eclipse, one some
> have said is a problem. The use of string values for everything invariably
> results in large numbers of strings in the string table, and more so,
> requires string comparisons quite a bit even at runtime. The ultimate would
> be able to replace strings with int or long guid values. Ideally the engine
> would dynamically generate these ID values. This would probably work for
> static plugins like Eclipse is now, but in order to support unloadable and
> reloadable plugins, there would need to be some way to look up the string
> values when resolving the newly loaded strings into GUIDs. The solution I
> believe would work if it is possible is to use some sort of serialVersionUID
> type of function, that can take the same exact String and ALWAYS generate
> the same exact int or long value. I always forget if int is 16-bit or 32-bit
> and long is 32-bit or 64-bit (or is it per platform?).

In Java, int==32 bits, long==64 bits on all platforms.

> Anyway, I think a
> 16-bit value with 65535 total numbers would be plenty to handle even large
> plugin deployments, although using a 32-bit value on a 32-bit register cpu
> may prove to be faster..forget how all that works at the machine level.

The simple hash used for the String.hashCode() method produces a 32-bit int
(using a multiplier of 31) and rarely collides for typical Strings. Hashcodes
must meet the contract that if A.equals(B) then A.hashCode() == B.hashCode().
However it is *not* required that if A.hashCode() == B.hashCode() then
A.equals(B). Thus it is possible for unequal objects to have identical
hashcodes. In the case of the String class, such collisions are uncommon in
practice, though they can occur even for short strings ("Aa" and "BB" both
hash to 2112). The algorithm used (since Java 1.2) is:

public int hashCode() {
    int h = hash;
    if (h == 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

[Note - String is immutable, so I don't know why they don't cache the
resulting hash. That would greatly speed up hash tables using strings.]

A characteristic of this and other good hash algorithms is that they scatter
the results broadly across the hash range. For example, "dog" and "dogs" end
up far apart:

"dog".hashCode=99644
"dogs".hashCode=3089079

Although "dog" and "doh" (99645) would still be next to each other. In
general, though, this is good enough.
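
The values above can be reproduced with a few lines. This is a small sketch that recomputes the same polynomial (h = 31*h + c) outside the String class; the class and method names are made up for the example.

```java
/** Recomputes the String hash polynomial by hand and compares it to hashCode(). */
final class StringHashDemo {
    /** Applies h = 31*h + c to every char, starting from 0. */
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"dog", "dogs", "doh"}) {
            // Both columns should agree, since hashCode() is specified as this polynomial.
            System.out.println(s + " -> " + polynomialHash(s) + " / " + s.hashCode());
        }
        // prints:
        // dog -> 99644 / 99644
        // dogs -> 3089079 / 3089079
        // doh -> 99645 / 99645
    }
}
```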

> Anyway, so long as whatever is used can always produce the same unique GUID
> from the same string, there should be no reason generating GUID values that
> are then compared with == can't be used. I think it would eliminate the
> string table lookups/comparisons, speeding things up quite a bit. What do
> you think? I am thinking of implementing this in my engine to do away with
> the use of storing fully qualified class names, ID values, etc.

Caching hashCode values in a table and then using "==" for fast string
comparisons instead of equals() would indeed be faster in some cases.
However, that is essentially how a hash map works internally. That is,
given a string as a key, it uses the hashcode for the key to find the
correct 'bucket' (a bucket is basically a range of hash values) in which
to look for the value. If there is more than one value in the bucket
(even if your buckets are only one hash value wide, remember, more than
one object can have the same hash), it then resorts to 'equals()' to do
the final match. If you have a good hash algorithm that scatters
objects far and wide across the hashcode range, then you will tend to
only have very few objects within a given bucket. If there is only one
value in the bucket, then you never have to do equals(). That is very
often the case with Strings.
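
To make the bucket lookup concrete, here is a simplified sketch of a chained hash table keyed by Strings. It is not the java.util.HashMap source; TinyStringMap, the fixed 16 buckets, and the missing resize logic are simplifications for illustration. One detail worth noting: like java.util.HashMap, the sketch compares the stored int hash first and then confirms the match with an identity check before falling back to equals(), so an interned key that is the very same instance never reaches equals().

```java
/** Simplified chained hash table with String keys (illustration only, no resizing). */
final class TinyStringMap {
    private static final class Entry {
        final String key;
        final int hash;   // hash stored with the entry so it is compared as an int
        Object value;
        Entry next;

        Entry(String key, int hash, Object value, Entry next) {
            this.key = key;
            this.hash = hash;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry[] buckets = new Entry[16];

    /** Maps a hash value to a bucket index; many hash values share one bucket. */
    private int indexFor(int hash) {
        return (hash & 0x7fffffff) % buckets.length;
    }

    /** Cheap int comparison first, then identity, and equals() only as a last resort. */
    private static boolean matches(Entry e, String key, int hash) {
        return e.hash == hash && (e.key == key || e.key.equals(key));
    }

    void put(String key, Object value) {
        int hash = key.hashCode();
        int i = indexFor(hash);
        for (Entry e = buckets[i]; e != null; e = e.next) {
            if (matches(e, key, hash)) {
                e.value = value;
                return;
            }
        }
        buckets[i] = new Entry(key, hash, value, buckets[i]);
    }

    Object get(String key) {
        int hash = key.hashCode();
        for (Entry e = buckets[indexFor(hash)]; e != null; e = e.next) {
            if (matches(e, key, hash)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyStringMap map = new TinyStringMap();
        map.put("Aa", "first");   // "Aa" and "BB" have the same hashCode (2112)
        map.put("BB", "second");  // so they share a bucket and need the key check
        System.out.println(map.get("Aa") + " " + map.get("BB"));
    }
}
```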

To do as you suggest would gain us a bit of speed mainly because the
String.hashCode() method is not caching the hash that it generates and
so it gets regenerated each time you need it. Your scheme would have us
use a hash/UID externally in place of the string for lookups and such.
We'd need to replace the use of Hashtables and HashMaps with structures
that accept integer primitives as keys, though, or we'd lose all our
performance gains because we'd have to wrap the integers in Integer
objects in order to use them as keys.

I'm not sure we have that much to gain here.

If we want to improve performance, we might do better to look at how to
avoid the memory cost of duplicate Strings being loaded into memory. I
suspect that much of the information loaded for each plugin.xml is
recurring. That is, the same individual strings occur again and
again in the plugin manifests. That means that when
the plugin.xml files are loaded, multiple instances of Strings with the same
values are being created and maintained in memory. For example, two
plugins providing extensions to the same extension point will have
identical strings in their DOMs. Because Strings are immutable, this
is totally avoidable by using the String.intern() method as you parse
the plugin.xml files, just prior to loading the values into the DOM.
That would ensure that duplicate String values result in only one
instance in memory. I don't know if we are currently doing this
optimization or not.
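
A small sketch of what interning during parsing could look like. The parseValue helper and the sample id are hypothetical and only stand in for whatever the manifest parser actually does with its character buffer.

```java
/** Shows how intern() collapses equal String values into one shared instance. */
final class InternDemo {
    /** Hypothetical parser helper: builds a value from the buffer and interns it. */
    static String parseValue(char[] buffer, int start, int length) {
        return new String(buffer, start, length).intern();
    }

    public static void main(String[] args) {
        char[] fromPluginA = "org.example.views".toCharArray();
        char[] fromPluginB = "org.example.views".toCharArray();

        // Without interning: two separate objects holding the same characters.
        String rawA = new String(fromPluginA);
        String rawB = new String(fromPluginB);
        System.out.println(rawA == rawB);        // false
        System.out.println(rawA.equals(rawB));   // true

        // With interning: both parses yield the same canonical instance.
        String a = parseValue(fromPluginA, 0, fromPluginA.length);
        String b = parseValue(fromPluginB, 0, fromPluginB.length);
        System.out.println(a == b);              // true
    }
}
```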

Cheers,

Mel
Re: String performance (was Re: more updates on registry resolution [message #25722 is a reply to message #25666] Wed, 11 June 2003 21:12
Eclipse User
Originally posted by: jeff_mcaffer_REMOVE.ca.ibm.com

Eclipse currently seeks both to reduce the number of strings actually loaded
and to intern() the ones that are likely to be duplicated. However, this is a
very modest attempt. What is really needed is a systematic rethink of using
strings as representations/ids. Java strings cost roughly 44 + 2N bytes
(where N = # chars in the string). Constants cost another >N bytes (UTF8
encoding of the chars). Literals likewise, for every class containing the literal.
Even things like message bundles are consuming massive amounts of memory in
an average Java program.

In many cases strings are used for convenience and readability. There is
generally no need for them in code.

Anyway, there are some design decisions in this area but mostly we want to
minimize the structures which we are forced to keep in memory. Being able
to load/flush this data from disk on demand seems to fit most of our needs.

Jeff
Re: String performance (was Re: more updates on registry resolution [message #25816 is a reply to message #25722] Wed, 11 June 2003 22:25
Eclipse User
Hey, so my idea of dumping it off to disk, then loading it as needed has
some merit? This is how I figured I would reduce the string overhead in my
engine. Another thought is that it seems the name, id and class names are
all using the same packages. The intern() should reduce this, right? I read
(and someone posted here as well) that if all have the same package name,
then the same "partial" string is stored only one time with refs to the
start of different parts of that string to make up the rest of a string.
Does this mean:

org.eclipse.core.XXX and org.eclipse.core.YYY are two completely different
strings, or does the org.eclipse.core. portion get stored only once in
memory, with XXX and YYY somehow "concatenated" to form the full strings,
so that XXX and YYY would be two other strings in memory?
Re: String performance (was Re: more updates on registry resolution [message #25854 is a reply to message #25666] Mon, 16 June 2003 15:55
Eclipse User
Mel Martinez wrote:
>
> [Note - String is immutable, so I don't know why they don't cache the
> resulting hash. That would greatly speed up hash tables using strings.]
>

What a maroon. I don't know what I was looking at when I wrote that
because clearly the hash is indeed being cached in String. This means
that for all intents and purposes, when used with Hashtables & HashMaps,
Strings are being compared as ints most of the time. This does
reinforce my opinion that there is probably not a ton to be gained by
replacing the use of Strings with id numbers unless you never plan to
use the actual string literal forms. This assumes rigorous application
of String.intern(), of course.
Re: String performance (was Re: more updates on registry resolution [message #25936 is a reply to message #25854] Tue, 17 June 2003 00:43
Eclipse User
Is there any way to prove this? I mean, perhaps a simple example app that
can take 10000 strings (from say a property file with a max of 100 different
strings), intern them, compare them in some way, then do the same for int
values. That is an interesting issue, to say the least. I would agree that
if 100's of string comparisons via intern()'d strings is almost as fast as
comparing int values, which would probably need to be Integer objects anyway
to be a key for a Map, well, then I'd say fork the idea of using int's over
strings. But, does the length of a String have any penalty?
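
A rough sketch of the kind of test app described above: 10000 values drawn from 100 distinct ones, compared as interned Strings and as ints. The class name, the sample ids, and the round count are assumptions, and this is not a rigorous benchmark (no JIT warm-up control, no statistics), so the timings only give a first impression. On the length question: String.equals() returns immediately when both references are the same instance, so length only matters when two distinct instances have to be compared character by character.

```java
/** Rough comparison of equals(), == on interned Strings, and == on ints. */
final class CompareBench {
    static final int DISTINCT = 100;
    static final int TOTAL = 10_000;

    /** Builds TOTAL strings from DISTINCT values, each interned to one shared copy. */
    static String[] makeInternedStrings() {
        String[] out = new String[TOTAL];
        for (int i = 0; i < TOTAL; i++) {
            out[i] = ("org.example.id" + (i % DISTINCT)).intern();
        }
        return out;
    }

    static int[] makeInts() {
        int[] out = new int[TOTAL];
        for (int i = 0; i < TOTAL; i++) {
            out[i] = i % DISTINCT;
        }
        return out;
    }

    static int countEquals(String[] values, String target) {
        int n = 0;
        for (String v : values) {
            if (v.equals(target)) n++;
        }
        return n;
    }

    /** Identity comparison; only valid because every value was interned. */
    static int countIdentical(String[] values, String target) {
        int n = 0;
        for (String v : values) {
            if (v == target) n++;
        }
        return n;
    }

    static int countInts(int[] values, int target) {
        int n = 0;
        for (int v : values) {
            if (v == target) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String[] strings = makeInternedStrings();
        int[] ints = makeInts();
        String target = ("org.example.id" + 7).intern();
        int rounds = 2_000;
        long sink = 0; // keeps the results live so the loops are not optimized away

        long t0 = System.nanoTime();
        for (int r = 0; r < rounds; r++) sink += countEquals(strings, target);
        long t1 = System.nanoTime();
        for (int r = 0; r < rounds; r++) sink += countIdentical(strings, target);
        long t2 = System.nanoTime();
        for (int r = 0; r < rounds; r++) sink += countInts(ints, 7);
        long t3 = System.nanoTime();

        System.out.println("String.equals(): " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("interned ==:     " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("int ==:          " + (t3 - t2) / 1_000_000 + " ms");
        System.out.println("checksum: " + sink);
    }
}
```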

"Mel Martinez" <melm@us.ibm.com> wrote in message
news:bcl7t6$qnk$1@rogue.oti.com...
> Mel Martinez wrote:
> >
> > [Note - String is immutable, so I don't know why they don't cache the
> > resulting hash. That would greatly speed up hash tables using strings.]
> >
>
> What a maroon. I don't know what I was looking at when I wrote that
> because clearly the hash is indeed being cached in String. This means
> that for all intents and purposes, when used with Hashtables & HashMaps,
> Strings are being compared as ints most of the time. This does
> reinforce my opinion that there is probably not a ton to be gained by
> replacing the use of Strings with id numbers unless you never plan to
> use the actual string literal forms. This assumes rigorous application
> of String.intern(), of course.
>
Re: String performance (was Re: more updates on registry resolution [message #26062 is a reply to message #25854] Tue, 17 June 2003 17:55
Eclipse User
Originally posted by: jeff_mcaffer_REMOVE.ca.ibm.com

Random thoughts...

- intern() is not free, either initially or on subsequent lookups.
Also, it affects the overall performance of the system, since classloading
does interns and on many VMs the fuller the table, the slower the lookup.
We have crashed VMs by intern()'ing too much (100,000's of strings).

- I would propose tossing Strings and hash tables in favour of ints and
arrays. Most of these strings are just there for dev time simplicity. No
reason why some tool couldn't convert them to packed arrays and change uses.

- Note that the strings I am talking about are the keys for the properties
or plugin ids or... These strings are either NEVER used (as strings, just
as keys) or are seldom used and can be loaded on demand.
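
One possible shape for the "ints and arrays" idea, as a sketch only. The IdTable class and its methods are hypothetical; in the proposal above, a build-time tool would assign the ids, so the name-to-id map here stands in for that tool and would not need to stay in memory at runtime.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical id table: each distinct key string gets a dense int id. */
final class IdTable {
    // Needed only while ids are being assigned (stand-in for a build-time tool).
    private final Map<String, Integer> idsByName = new HashMap<>();
    // Reverse mapping for the rare case where the string form is wanted.
    private final List<String> namesById = new ArrayList<>();

    /** Returns the existing id for the name, or assigns the next free one. */
    int idFor(String name) {
        Integer id = idsByName.get(name);
        if (id == null) {
            id = namesById.size();
            idsByName.put(name, id);
            namesById.add(name);
        }
        return id;
    }

    String nameOf(int id) {
        return namesById.get(id);
    }

    int size() {
        return namesById.size();
    }

    public static void main(String[] args) {
        IdTable table = new IdTable();
        int views = table.idFor("org.example.views");
        int editors = table.idFor("org.example.editors");

        // Values live in a plain array indexed by id: no hashing, no equals().
        String[] labels = new String[table.size()];
        labels[views] = "Views";
        labels[editors] = "Editors";

        System.out.println(labels[table.idFor("org.example.views")]);
        System.out.println(table.nameOf(editors));
    }
}
```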

Jeff