Last modified: February 24, 2005
Large-scale development issues
One of the major development themes for Eclipse 3.1 is to improve support for
"Large-scale development" in Eclipse. This includes improving collaboration for
large, distributed teams, but it also encompasses support for large workspaces.
This document captures requirements submitted in bug reports, mailing lists,
and other discussions from people using Eclipse for large-scale project development.
Not all of these issues are committed to be solved in Eclipse 3.1, but this list presents,
in no particular order, a problem scope from which work items can be chosen. Some
of these items are already present on the Eclipse 3.1 plan, but are included here
1. Memory footprint
Eclipse imposes a significant RAM footprint when working with a large workspace.
Identify principal areas of memory consumption and explore opportunities to
reduce current footprint.
Extension registry footprint. Eclipse maintains the registry of plug-ins, extensions,
and extension points in memory. As the number of plug-ins and extensions
grows, so does this footprint. Convert the registry to a cache structure that
stores infrequently referenced portions on disk and brings extensions into memory
in a lazy and transient manner.
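The lazy, transient loading described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Eclipse registry implementation: the class name LazyRegistryCache and the diskLoader function are hypothetical, and the loader stands in for whatever code would read a registry element from disk.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: entries are held softly and reloaded on demand. */
final class LazyRegistryCache<K, V> {
    private final Map<K, SoftReference<V>> cache = new HashMap<>();
    private final Function<K, V> diskLoader; // assumption: reads one element from disk
    private int loads;                       // how often the loader was invoked

    LazyRegistryCache(Function<K, V> diskLoader) {
        this.diskLoader = diskLoader;
    }

    synchronized V get(K key) {
        SoftReference<V> ref = cache.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Never loaded, or reclaimed by the garbage collector: load again.
            value = diskLoader.apply(key);
            loads++;
            cache.put(key, new SoftReference<>(value));
        }
        return value;
    }

    synchronized int loadCount() {
        return loads;
    }
}
```

Soft references let the garbage collector discard entries under memory pressure, which is one possible way to make cached extensions transient; the cost is a reload from disk the next time a discarded entry is requested.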
Workspace tree footprint. The workspace is represented in memory as a tree
containing various data such as resource names, attributes, markers, etc.
Explore reducing the amount of data stored in memory for each resource, and
other optimizations such as uniquification of strings.
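String uniquification can be illustrated with a small pool that hands back one canonical instance per distinct value. The sketch below is an assumption about what such an optimization might look like; the StringPool class is hypothetical and not taken from the Eclipse code base.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: shares one String instance per distinct value. */
final class StringPool {
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the canonical instance equal to s, registering s if it is new. */
    String unique(String s) {
        if (s == null) {
            return null;
        }
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }
}
```

If many resources share names such as "src" or "bin", storing the pooled instance instead of a fresh copy means each distinct name occupies memory only once.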
Team/CVS metadata footprint. The CVS plug-ins store significant information
in memory about the synchronization and "dirty" state of each resource. Explore
reducing the footprint of this data or using lazy caching to only bring this information
into memory when needed.
Message bundles (bug 37712).
Most plug-ins store translated strings in ResourceBundle
objects. These bundles are not space-efficient, and often use lengthy string-based
keys for message lookup. Explore a more efficient representation, integer-based keys,
or a disk-based bundle for infrequently used messages.
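As a rough illustration of integer-based keys, the sketch below replaces string keys with array indexes. It is not the approach adopted by any Eclipse plug-in; the Messages class, the constant names, and the placeholder returned for a missing message are all hypothetical.

```java
/** Hypothetical sketch: messages are looked up by array index instead of string key. */
final class Messages {
    // Assumed message ids; in practice these might be generated from a properties file.
    static final int BUILD_STARTED = 0;
    static final int BUILD_FINISHED = 1;

    private final String[] table;

    Messages(String[] table) {
        this.table = table.clone();
    }

    /** Returns the message for id, or a visible placeholder if none is defined. */
    String get(int id) {
        if (id < 0 || id >= table.length || table[id] == null) {
            return "!" + id + "!";
        }
        return table[id];
    }
}
```

An array lookup avoids both the hashing of a long key and the memory needed to keep the key strings themselves.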
2. Performance of I/O-bound operations
Large teams often store development artifacts (code, diagrams, documentation)
on a network file system to increase reliability, facilitate backup and restoration of data,
and simplify integration and building. I/O-bound operations in Eclipse are typically
much slower in such environments. Explore optimization of I/O-bound operations,
and moving lengthy operations into a background thread.
Project creation. Creating a project at a file system location that contains
a large number of existing files and folders requires significant I/O to discover
all the files and to gather local information such as time stamps. This project
discovery can be moved into a background thread.
Resource copy, move, and delete. Most operations that act on trees of resources
still cannot be run in the background. When these operations take a long time,
the user is forced to wait until they complete. These should be converted to "user" jobs
that can optionally be run in the background.
Recursive deletion (bug 10628).
Java provides no API for recursively deleting a directory
containing files and other directories. This means deleting a large resource tree
requires two native I/O calls per directory (one to list the children, one to delete), and
one native I/O call per file. This particularly impacts compilation, which often needs
to delete large trees of resources in the output (bin) folder. Consider adding a native
method to improve recursive deletion performance.
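The cost described above is visible in a plain java.io.File implementation of recursive deletion. The sketch below is illustrative only; the RecursiveDelete class and its counters are hypothetical, the counters track logical list and delete operations rather than measured native calls, and symbolic links are not treated specially.

```java
import java.io.File;

/** Hypothetical sketch: deletes a tree and counts list and delete operations. */
final class RecursiveDelete {
    int listings;  // directory listings performed
    int deletions; // delete calls issued

    boolean delete(File root) {
        boolean childrenDeleted = true;
        File[] children = root.listFiles(); // null when root is not a directory
        if (children != null) {
            listings++;
            for (File child : children) {
                childrenDeleted &= delete(child);
            }
        }
        deletions++;
        // A directory can only be removed once it is empty.
        return root.delete() && childrenDeleted;
    }
}
```

Each directory costs one listing plus one delete, and each file costs one delete, so the number of operations grows with the size of the tree.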
3. Project interchange
Eclipse has always emphasized first-class support for integration of repository tools,
and has treated repositories as the primary vehicle for code sharing among team members.
This leaves behind groups that either don't use a repository, or don't use a repository
that has Eclipse integration plug-ins. The Import/Export wizards are typically used
by such groups to share code. Some improvements to the import and export tools
would make them more powerful as a project interchange (sharing) mechanism.
Import multiple projects at once
If you have unzipped, untarred, or checked out a large group of projects from a repository,
there is no way to load them all into a workspace at once. The current "existing project"
import wizard only allows importing one project at a time.
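One way a multi-project import could find its candidates is to scan a directory tree for folders containing a .project description file. The sketch below assumes that approach; the ProjectScanner class is hypothetical, and the decision not to look for projects nested inside another project is an assumption made for simplicity.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: finds directories that contain a ".project" file. */
final class ProjectScanner {
    static List<File> findProjects(File root) {
        List<File> found = new ArrayList<>();
        collect(root, found);
        return found;
    }

    private static void collect(File dir, List<File> found) {
        if (new File(dir, ".project").isFile()) {
            found.add(dir);
            return; // assumption: do not search inside a project for nested projects
        }
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            if (child.isDirectory()) {
                collect(child, found);
            }
        }
    }
}
```

A wizard built on such a scan could present the resulting list with check boxes so that all discovered projects are imported in one step.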
Import project inside zip file
The Export wizard allows you to export an entire project into a ZIP or JAR file. The corresponding
Import wizards don't allow you to import that ZIP or JAR file back into a workspace
as a top-level project. The user has to unzip the file and import as existing project,
or create a new project with the same name and import the contents. This should
be made easier. Similarly, it should be possible to import a ZIP containing multiple projects.
Rename project on import
The "import existing project" wizard doesn't allow you to import a project but
pick a different name for the project in the workspace. This is often needed
by users who check out projects from a repository into the file system, and then
want to call them something different in the workspace (one common example
is when working on multiple streams of the same project in a single workspace).
4. Support for non-incremental builders
The workspace builder infrastructure is designed primarily with efficient
incremental compilers in mind. Auto-build is turned on by default, and this
is only realistic for fast builders. The workspace should have support
for inherently non-incremental or slower builders (such as C compilers and Ant-based builders).
In particular, we need to support users working in a heterogeneous environment with
some fast incremental builders and some slow non-incremental builders, sometimes
with both on the same project.
Read the proposal.
5. Improved working sets
With very large workspaces, working sets are often used to filter the amount of
information shown in various views, and for scoping long-running tasks such
as builds and searches. The current working set support has some problems:
Shared notion of a working set. Each view has to be explicitly and manually
scoped to a given working set. Each long-running search or build also needs to have
the working set manually chosen. One particularly bad example is the Java browsing
perspective, which has four views that each need their working set specified manually.
Consider adding a global notion of a current working set, or a current working set
Dynamic working sets. A working set is defined as a static list of elements.
Consider adding mechanisms to make working sets more flexible, such as
wildcards (bug 62646),
exclusion (bug 22362),
and tracking project creations (bug 15941)
and moves (bug 15938).
Aggregating or nesting working sets would also be useful for very large scale workspaces.
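A working set defined by rules rather than by a static list could look roughly like the sketch below, which combines wildcard inclusion with exclusion. The DynamicWorkingSet class is hypothetical and not part of the Eclipse API, and matching on project names with '*' and '?' wildcards is an assumption chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch: membership is computed from include and exclude wildcards. */
final class DynamicWorkingSet {
    private final List<Pattern> includes = new ArrayList<>();
    private final List<Pattern> excludes = new ArrayList<>();

    DynamicWorkingSet include(String wildcard) {
        includes.add(compile(wildcard));
        return this;
    }

    DynamicWorkingSet exclude(String wildcard) {
        excludes.add(compile(wildcard));
        return this;
    }

    /** Evaluated on demand, so a newly created project needs no manual update. */
    boolean contains(String projectName) {
        return matchesAny(includes, projectName) && !matchesAny(excludes, projectName);
    }

    private static boolean matchesAny(List<Pattern> patterns, String name) {
        for (Pattern p : patterns) {
            if (p.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    // Translates '*' (any run of characters) and '?' (one character) into a regex.
    private static Pattern compile(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }
}
```

For example, a set built with include("org.eclipse.ui*") and exclude("*.tests") would contain org.eclipse.ui.workbench but not org.eclipse.ui.tests, and would pick up a newly created project whose name starts with org.eclipse.ui without any manual update.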