Overview
“Data
Tools” is a vast domain, yet there are a fairly small number
of foundational requirements when developing with or managing
data-centric systems. A developer is interested in an environment
that is easy to configure -- one in which the challenges of
application development are due to the problem domain, not the
complexity of the tools employed. Data management tooling, whether
used by a developer working on an application or by an administrator
maintaining or monitoring a production system, should also provide
a consistent, highly usable environment that works well with
associated technologies.
Such
an environment starts with key frameworks designed both for use
and extensibility. Examples include location and management of
data source drivers, and configurations for access to particular
data source instances. Once a connection is successfully made, the
next task often is to explore the data source, making changes as
required. Some of these operations might be carried out by GUI
actions, others directly through commands. For example, users –
both developers and administrators – typically will create,
edit, and test SQL for these commands. Assistance in editing SQL
through code completion, formatting, and dialect specialization
greatly enhances productivity. Further, the ability to execute or
debug commands, both SQL and stored procedures, rounds out the
rapid development process that Eclipse supports so well. Finally,
bridging chasms, whether between relational, object, or other
structures, presents challenges that data management tooling
should address.
Mission
The
Data Tools Platform (DTP) project will include extensible
frameworks and exemplary tools, enabling a diverse set of plug-in
offerings specific to particular data-centric technologies and
supported by the DTP ecosystem. In the spirit of Eclipse, the
project will be guided by the following values:
Vendor
neutrality: We intend to provide data management frameworks
and tools not biased toward any vendor. Our intention is that DTP
be leveraged to provide the Eclipse community with the widest
range of choices possible. To this end, we seek community
involvement in formulating key framework interfaces, so that the
largest possible constituency is represented.
Extensibility:
We recognize both the common need for data tooling infrastructure
and the desire to extend the offerings in new and innovative
ways. To support these efforts, our components will be designed
for, and make good use of, extensibility mechanisms supported by
Eclipse.
Community
Involvement: Success for DTP, as with other eclipse.org
projects, depends as much on community involvement as on the
technical merit of its components. We strongly believe that DTP
will achieve its full potential only as the result of deep and
broad cooperation with the Eclipse membership-at-large. Thus, we
will make every effort to accommodate collaboration, reach
acceptable compromises, and provide a project management
infrastructure that includes all contributors, regardless of
their affiliation, location, interests, or level of involvement.
Regular meetings covering all aspects of DTP, open communication
channels, and equal access to process will be key areas in
driving successful community involvement.
Transparency:
As with all projects under the eclipse.org banner, key
information and discussions at every level – such as
requirements, design, implementation, and testing – will be
easily accessible to the Eclipse membership-at-large.
Agile
development: We will strive to incorporate into our planning
process innovations that arise once a project is underway, and
the feedback from our user community on our achievements to date.
We think an agile planning and development process, in which
progress is incremental, near-term deliverables are focused, and
long-term planning is flexible, will be the best way to achieve
this.
Scope
Data-centric
applications are those having an association with a data source,
and a mapping from a data source to an in-memory model. The
distinguishing characteristic of such applications is that their
domain of capabilities is no more specific than data-centric.
For instance, while a Java source file and its in-memory
representation could be considered data, the domain of Java
development is clearly present, and is more specific than just
“data.” Thus, data-centric delineates an
abstract, foundational domain, which is superseded by more
specific domains when the application manipulates something more
than just “data.” An application with a more specific
domain is data-dependent rather than data-centric.
Data-dependent applications are not within the scope of
this project.
Using
a model-driven approach, the Data Tools Platform (DTP) consists of
extensible frameworks and exemplary tools for data-centric
applications. These include:
In-memory
representation: Models providing a domain-based interaction with
data, such as database definitions, query models, result sets,
and objects. These models provide the basis upon which all other
DTP components are constructed.
Management:
Administration of data sources including both generic and
vendor-specific configuration options. Examples include adding
and removing tables from a database, setting type information for
contained data, and setting performance parameters.
Data-centric
model transformation: Changing data from one format to another is
a common task in data-centric application development. For
example, there are several popular dialects of the SQL standard.
A query that uses one vendor’s dialect extensions may fail when
run against another vendor’s data source. Hence, there is a
need to translate between dialects.
Extract-Transform-Load:
Obtaining data from and supplying data to data sources, typically
using large-scale batch movements and involving data checking and
validation. It often includes operations on the extracted data
before it is loaded into the target data source.
Data
Mapping: Mapping between data source and in-memory
representation, used for bridging between domains such as object,
relational, hierarchical, and multi-dimensional data structures.
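The dialect problem noted under data-centric model transformation can be made concrete with row limiting, which MySQL writes as LIMIT, SQL Server as TOP, and SQL:2008 as FETCH FIRST. The following is a minimal sketch, not DTP code: `Dialect`, `LimitedQuery`, and `render` are hypothetical names chosen for illustration, and a real query model would cover far more than a column list, one table, and a row limit.

```java
// Hypothetical sketch; none of these types come from DTP.
enum Dialect { SQL_2008, MYSQL, SQL_SERVER }

/** A tiny in-memory query model: columns, one table, and a row limit. */
final class LimitedQuery {
    private final String columns;
    private final String table;
    private final int maxRows;

    LimitedQuery(String columns, String table, int maxRows) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must be >= 0");
        }
        this.columns = columns;
        this.table = table;
        this.maxRows = maxRows;
    }

    /** Renders the same model as SQL text for the requested dialect. */
    String render(Dialect dialect) {
        String select = "SELECT " + columns + " FROM " + table;
        switch (dialect) {
            case MYSQL:
                return select + " LIMIT " + maxRows;
            case SQL_SERVER:
                // TOP sits between SELECT and the column list.
                return "SELECT TOP (" + maxRows + ") " + columns + " FROM " + table;
            default:
                return select + " FETCH FIRST " + maxRows + " ROWS ONLY";
        }
    }
}
```

Rendering each dialect from one shared model, rather than rewriting SQL strings from one dialect into another, is the design choice this sketch is meant to illustrate.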
Although
the scope of DTP includes exemplary connectors for popular open
source and commercial data sources, these are not necessarily
intended to be the definitive connectors. Instead, they are
intended to serve two purposes. First, they are intended to enable
users to immediately use these data sources, although possibly
without exploiting all their features. Second, they are intended
to serve as examples to both commercial and open source developers
who want to integrate data sources into Eclipse. It is consistent
with the goals of this project that the exemplary connectors
become superseded by more complete implementations provided by
third parties, both commercial and open source.
Projects
Initially
DTP will contain the following three projects, and emphasize
relational data sources and structures.
Model Base
The
Model Base project provides the foundation for DTP. Using industry
best practices such as model-driven development with UML, and
taking advantage of the Eclipse Modeling Framework (EMF),
initially included are models for:
Key
features supported and benefits provided include:
Connectivity
The
Connectivity project includes components for defining, connecting
to, and working with data sources. These include:
Driver Management Framework
Access to the appropriate drivers
is a prerequisite for programmatic interaction with data
sources. The Driver Management Framework (DMF) supplies an
Eclipse preference page enabling users to create driver
definitions based on supplied templates. A number of templates
are provided in the base installation, and additional templates
can be added by component developers contributing to DMF
extension points.
Connection Management Framework
The Connection Management Framework
(CMF) is the foundation upon which specific connection types
are created. The connection types, called Connection
Profiles (CPs), are contributed to the CMF through extension
points. Users then connect to data source instances by creating
and configuring a CP for that data source type. Standard data
source configuration parameters, such as the
connection URL, user name, and password, are provided on CP
instance creation and stored as secure meta-data for the CP. CPs
allow for host connectivity checks (“ping”),
connection, auto-connect on CP startup, and disconnect.
Further, CP Extensions enable additional functionality and
content to be added to a CP. For reuse of CP instance
configuration, base import/export functionality is provided by
CMF and surfaced in tools such as the Data Source Explorer (see below).
Data source CPs then become the connection providers through
which other DTP tooling accesses data source instances.
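The relationship between contributed connection types and configured profile instances can be sketched as follows. This is an illustration only, under stated assumptions: `ProfileSketch`, `Registry`, `Profile`, and their methods are hypothetical names, the registry is a plain in-memory stand-in for Eclipse extension points, and settings are held in an ordinary map rather than in secured storage.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the real CMF relies on Eclipse extension points.
final class ProfileSketch {

    /** A configured profile instance for one data source. */
    static final class Profile {
        final String typeId;
        final String name;
        private final Map<String, String> settings;
        private boolean connected;

        Profile(String typeId, String name, Map<String, String> settings) {
            this.typeId = typeId;
            this.name = name;
            // Plain copy for illustration; real storage would be secured.
            this.settings = new HashMap<>(settings);
        }

        String setting(String key) { return settings.get(key); }
        void connect() { connected = true; }      // a real profile would open the connection here
        void disconnect() { connected = false; }
        boolean isConnected() { return connected; }
    }

    /** Stand-in for the extension point: maps a type id to its required settings. */
    static final class Registry {
        private final Map<String, String[]> requiredKeys = new HashMap<>();

        void contribute(String typeId, String... keys) {
            requiredKeys.put(typeId, keys.clone());
        }

        Profile create(String typeId, String name, Map<String, String> settings) {
            String[] keys = requiredKeys.get(typeId);
            if (keys == null) {
                throw new IllegalArgumentException("Unknown profile type: " + typeId);
            }
            for (String key : keys) {
                if (!settings.containsKey(key)) {
                    throw new IllegalArgumentException("Missing setting: " + key);
                }
            }
            return new Profile(typeId, name, settings);
        }
    }
}
```

A caller would first contribute a type, for example `registry.contribute("jdbc", "url", "user", "password")`, and then create and connect instances of it; creation fails if a required setting is absent.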
JDBC connection support
DTP will include a JDBC driver
template and CP, as a means of enabling database connectivity,
and serving as an example for further CP development.
Database-specific capabilities can then be surfaced as CP
extensions, allowing for specialization and presentation of
differentiating database functionality directly in that
database's CP.
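For orientation, the kind of connection a JDBC CP would manage looks roughly like this when written directly against the standard `java.sql` API. This is a sketch, not DTP code: `JdbcProfileSketch`, `credentials`, and `describe` are hypothetical names, and it assumes a JDBC driver matching the URL is available on the classpath.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper; only the java.sql calls are standard API.
final class JdbcProfileSketch {

    /** Packs the user name and password into the properties JDBC drivers expect. */
    static Properties credentials(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    /** Connects, reads the database product name and version, then disconnects. */
    static String describe(String url, Properties props) throws SQLException {
        // try-with-resources closes the connection even if the meta-data call fails.
        try (Connection connection = DriverManager.getConnection(url, props)) {
            DatabaseMetaData metaData = connection.getMetaData();
            return metaData.getDatabaseProductName() + " "
                    + metaData.getDatabaseProductVersion();
        }
    }
}
```

If no registered driver accepts the URL, `DriverManager.getConnection` throws an `SQLException`, which is one reason driver management is treated as a prerequisite for connection support.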
Data Source Explorer
The Data Source Explorer (DSE) is an
Eclipse view housing CP instances. From this view, CP
capabilities are surfaced, and data source content is
presented. The type and level of detail for any one instance are
constrained only by the CP itself. DSE also provides CP
instance data to clients through mechanisms such as drag and drop
and API calls. This allows data tooling requiring connection
management to use the DSE as a mediator to CP instances.
Open Data Access
The Open Data Access (ODA) component is an
open and flexible data access framework that allows
applications to access data from both standard and custom data
sources. It enables data connectivity between data consumers
and data source providers through published run-time and
design-time interfaces. In addition, the framework also
includes an ODA driver management package that helps an ODA
consumer application to manage diverse behavior of individual
ODA data drivers.
A data driver is created simply by
implementing the run-time interfaces defined by the framework.
The run-time interfaces include support for establishing a
connection, accessing meta-data, and executing queries to
retrieve data. A driver can define internal data source
connection profiles and/or work with the CMF's Connection
Profiles extensions. Once developed, the driver can be
registered through an extension point with individual ODA
consumer components to enable data connectivity. The framework
also provides design-time interfaces to integrate custom query
builders within an application designer tool.
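The three run-time responsibilities named above (establishing a connection, accessing meta-data, and executing a query) can be sketched with a toy driver over an in-memory table. The interfaces below are hypothetical and deliberately simplified; they are not the ODA interfaces, and registration through an extension point is not modeled. In this sketch the "query text" is simply the name of one column to return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these are not the ODA interfaces themselves.
final class OdaSketch {

    interface Driver {
        SourceConnection open(Map<String, String> properties);
    }

    interface SourceConnection extends AutoCloseable {
        Query prepare(String queryText);
        @Override void close();
    }

    interface Query {
        List<String> columnNames();       // meta-data access
        List<List<String>> execute();     // data retrieval
    }

    /** A custom data source: an in-memory table whose query text names one column. */
    static final class InMemoryDriver implements Driver {
        private final List<String> header;
        private final List<List<String>> rows;

        InMemoryDriver(List<String> header, List<List<String>> rows) {
            this.header = header;
            this.rows = rows;
        }

        @Override
        public SourceConnection open(Map<String, String> properties) {
            return new SourceConnection() {
                @Override
                public Query prepare(String queryText) {
                    final int index = header.indexOf(queryText.trim());
                    if (index < 0) {
                        throw new IllegalArgumentException("Unknown column: " + queryText);
                    }
                    return new Query() {
                        @Override
                        public List<String> columnNames() {
                            return List.of(header.get(index));
                        }

                        @Override
                        public List<List<String>> execute() {
                            List<List<String>> result = new ArrayList<>();
                            for (List<String> row : rows) {
                                result.add(List.of(row.get(index)));
                            }
                            return result;
                        }
                    };
                }

                @Override
                public void close() {
                    // Nothing to release for an in-memory source.
                }
            };
        }
    }
}
```

A consumer written against `Driver`, `SourceConnection`, and `Query` alone would work unchanged with any other implementation of those interfaces, which is the separation between data consumers and data source providers that the paragraph describes.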
SQL Development Tools
The
SQL Development Tools project provides frameworks and tools for
deep and broad SQL support. The frameworks include:
SQL Query Parser
Although SQL is defined by a standard,
several major dialects exist. Thus, while the standard must be
supported, practicality also demands flexibility in adjustment
to dialects. The SQL Query Parser meets these needs by
providing an extensible framework, enabling dialect-aware SQL
components and tools.
SQL Execution Plan Framework
The ability to understand how a
SQL evaluation engine will execute a given query is vital in
tuning queries to optimize performance. The SQL Execution Plan
Framework will provide a means for capturing and presenting
execution plans in a generic fashion, enabling extensions to
customize support for specific SQL execution engines.
The
tools include:
SQL Editor
The SQL Editor will be an exemplary tool for
standard text-based editing of SQL statements. With
content assist tied to the SQL Model, syntax colorization, and
multiple statement support, this editor will be an
essential tool for data-centric development.
Visual SQL Builder
The Visual SQL Builder allows for graphical
editing of SQL, raising the level of abstraction, increasing
developer productivity, and making query construction possible
for a wider user base.
Script History
Typically a number of scripts will be executed
repeatedly during the course of data-centric development. The
ability to retain a history of these scripts, and thereby
quickly repeat their execution and view results, increases
productivity. The Script History is a view meeting these needs,
based on a development session.
Other Terms
This Charter inherits all terms not otherwise
defined herein from the Eclipse Standard Top-Level Charter
(/org/processes/Eclipse_Standard_TopLevel_Charter_v1.0.html).
This includes, but is not limited to, sections on the Program
Management Committee, Roles, Project Organization, The Development
Process, and Licensing.