RE: [eclipse-dev] What's happening after 2.1?

If the suggestion is to get more ambitious I'd like to see the work on
the 'Extended Java Family' extended towards support for generative
programming.  By restructuring the compiler it should be possible to
allow people to take greater advantage of it.  For instance the code
generation back end could be reusable with different scanner/parsers
generating ASTs from which .class files are generated.  More ambitiously
the scanner could be opened up to support extended syntaxes and language

I've attached a small document that outlines these ideas.  I can't see
myself having the time to develop these ideas any further in the near

Regarding the compatibility issues, could these be addressed, or at least
mitigated, by producing adapter plugins that would allow 2.1 plugins
to continue working?  This would fatten the footprint but could be an
acceptable interim solution until plugin providers are able to migrate.

Dave Wathen
Phone: +44 (0)20 8660 5171
Mobile: +44 (0)7968 167934
Fax: +44 (0)870 051 7664

-----Original Message-----
From: eclipse-dev-admin@xxxxxxxxxxx
[mailto:eclipse-dev-admin@xxxxxxxxxxx]On Behalf Of Kevin Haaland
Sent: 22 April 2003 21:10
To: eclipse-dev@xxxxxxxxxxx
Subject: [eclipse-dev] What's happening after 2.1?

Late last year we posted an initial draft 2.2 plan which we were viewing
as an incremental improvement over Eclipse 2.1. Now that we've started
work on 2.2, we feel we should be setting even more ambitious goals for
the next major release.

We've put together a rough outline of what we'd like to tackle in this
next release of Eclipse:

We'd appreciate community feedback to help us in firming up the next
revision of the development plan.

Please send your comments to the eclipse-dev@xxxxxxxxxxx developer mailing list.


D. Wathen – August 2002

This document outlines an idea for easier definition of new programming languages.  
The target user group being considered here for such a facility is Java developers 
although the ideas could also be applied to C# developers or any other language with 
a similar runtime model.


General-purpose programming languages, such as Java, can be used to express a great 
range of programs.  Huge libraries that provide access to many resources supplement 
the language.  Using a single language has advantages in terms of developer training 
and availability.  Why not just use a single language to express all problems?

Anyone who has worked in application development for long will be able to relate
to the following scenario.  (As will many application users!)  An application has been
developed and now its users request some changes.  The users ask for two changes in
particular.  One - they say - is a small change but the other is large.  The developers
assess the changes required to the system and reach the same conclusion.  However,
the change the users perceive to be small is assessed by the developers to be large and
vice versa.

Why does this happen?  The problem is that, in order to express the application in a
general-purpose language, the developers have created a document (the source), the
structure of which doesn't relate well to the users' view of the application.  The
developers may have anticipated some of the users' requirements and coded flexibility
into the system, so some changes to requirements are easy.  Other changes of
requirement may mandate a large number of changes all over the application's code
or even a major restructuring of the code.

These problems have been known for a long time, and various attempts have been made
to address them.  Refactoring [Fow99] catalogues processes for controlling
code restructuring, thus increasing the efficiency of the change process.  This
approach suggests not attempting to predict future requirements but instead accepting
change and concentrating efforts on the changes when they actually occur.

Application programmers commonly utilise and create tools to reduce repetition
within their codebase.  These tools include precompilers and source code generators.
Some of these tools are supplied by tool providers, for example rmic [Sun-1], whilst
others are home-grown [Cle01].  The idea is that by generating code instead of writing
it by hand, broad-scope changes can be facilitated through less demanding changes to
the generator's input or by changes to the generator itself.
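
As a minimal sketch of the generator idea, assuming a home-grown tool written in
Java: the class name BeanGenerator and its input format (a map of field names to
types) are hypothetical and are not taken from any of the tools cited above.  A
change to the field list, or to the generator itself, then replaces many hand edits.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical generator: turns a field list into Java source for a simple value class.
final class BeanGenerator {
    // fields maps each field name to its Java type, in declaration order.
    static String generate(String className, Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(className).append(" {\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String name = field.getKey();
            String type = field.getValue();
            // Capitalise the first letter to form the accessor name.
            String cap = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            out.append("    private ").append(type).append(' ').append(name).append(";\n");
            out.append("    public ").append(type).append(" get").append(cap)
               .append("() { return ").append(name).append("; }\n");
        }
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "String");
        fields.put("balance", "long");
        System.out.print(generate("Customer", fields));
    }
}
```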

Generative Programming [CE00] provides a detailed survey of techniques for 
generating code from other representations.  The approaches discussed make a much 
deeper attack on the problems of code expressing the intention of the design.  These 
approaches tackle the separation of concerns in software engineering and lead to a 
more diverse decomposition of the problem space.


In short, development of computer languages is a non-trivial activity.  This work is 
usually carried out by specialists and the workings of compilers are a mystery to many 
applications developers.

Generating source code attempts to utilise existing technology (that is, compilers) to
allow easy creation of new languages.  The source language for the compiler is 
generated from some higher-level language.  This approach is much more amenable 
to non-specialists but there are problems.  Tools beyond the compiler (for example 
debuggers and profilers) are unaware of the generative process.   This makes it 
difficult to relate problems in the runtime to the higher-level source and to 
differentiate problems at that level from problems in the generator.


The challenge is to produce a means of creating languages as first-class citizens of the
development tool set.  The approach should be simple enough to allow for its use by
application developers as a normal part of their job.

The opportunity to incorporate the more advanced approaches of Generative 
Programming should not be missed.  These approaches promise significant benefits.   
Building in support for these will facilitate their integration into application 
development and allow their practical application to be studied.


The proposal is to research the possibility of making language creation simpler whilst 
at the same time eliminating the problems associated with the generative approach.  
Ideally the result should be that language creation is a skill attainable by all 
programmers and easy enough to be used in day-to-day programming.

In order to achieve this it is proposed to take advantage of the Java runtime 
environment and to place certain reasonable limits on the creatable languages.

Given the target audience the languages created should be similar to Java.  This will 
both simplify their specification and reduce the learning curve.  The created languages 
then may be extensions to Java or may reuse Java features.

4.1 Support from the Java runtime

4.1.1 Compilation of Java programs results in a representation known as bytecodes.  
The bytecodes represent instructions for an imaginary processor.  Platforms wishing 
to execute Java programs are provided with a JVM (Java Virtual Machine) that acts as 
an interpreter of the bytecodes.

The interesting point here is that the Java compiler creates simple, largely
unoptimised bytecode representations of the Java source.  Optimisation is carried out
by the JVM.  The latest JVM designs perform complex dynamic optimisations ([Sun-2]
and [IBM-1]).

4.1.2 The class files into which the bytecodes are placed allow for extra data to be 
stored.  This can allow other tools, such as debuggers, to operate at the appropriate 
level.  The Java Platform Debugger Architecture now includes support for the 
debugging of non-Java programs [Sun-3].

4.1.3 Using the Java runtime also brings other advantages.  These include cross-
platform support, interoperability with Java and the Java libraries, and a protected 
runtime environment (bounds checking, no pointer arithmetic, type checking, 
bytecode verification). 

4.2 The creation of languages

Compilers traditionally follow a phased approach. The phases are (broadly) lexical 
analysis, syntactical analysis, semantic analysis and generation of the object code. 

The proposal is to let the user specify the tokens and the syntactic rules, which
together define the first two phases.  There are many products in existence for generating
these (Lex, YACC, JavaCC, Jikes Parser Generator, ANTLR, etc.).  Obviously there 
is much to be learnt from these.

The proposed tool should leverage this knowledge but should aim for simplification.  
The opportunity for simplification comes from limiting the languages to Java-like 

At the lexical level, the Java Language Specification [GJSB00] defines tokens in
categories of identifiers, keywords, literals, separators, and operators.  Using such a
taxonomy could lead to clearer definitions in the language specification (and could
assist tools such as colour-coding editors).  Also, languages may be able to import
parts or all of the token grammar of Java, thus reducing the specification task.  Lastly,
since the target audience is Java developers, it may well be more suitable to allow Java
to be used for the definition of some token types (numeric literals would be one
possible example).
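
The five token categories can be illustrated with a small classifier.  This is only a
sketch under stated assumptions: the class MiniLexer is hypothetical, and its keyword,
separator and operator sets are deliberately tiny subsets of Java's, standing in for
the token definitions a created language would specify or import.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a tiny lexer that tags tokens with the JLS token categories.
final class MiniLexer {
    enum Category { IDENTIFIER, KEYWORD, LITERAL, SEPARATOR, OPERATOR }

    static final class Token {
        final Category category;
        final String text;
        Token(Category category, String text) { this.category = category; this.text = text; }
        @Override public String toString() { return category + ":" + text; }
    }

    // Assumed subsets only; a real definition would import the full Java token grammar.
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("if", "else", "while", "return", "int"));
    private static final String SEPARATORS = "(){}[];,.";
    private static final String OPERATOR_CHARS = "=+-*/<>!&|";

    static List<Token> tokenize(String src) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (Character.isJavaIdentifierStart(c)) {
                // Words are keywords if listed, identifiers otherwise.
                while (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) i++;
                String word = src.substring(start, i);
                tokens.add(new Token(
                        KEYWORDS.contains(word) ? Category.KEYWORD : Category.IDENTIFIER, word));
            } else if (Character.isDigit(c)) {
                // Only decimal integer literals are recognised in this sketch.
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                tokens.add(new Token(Category.LITERAL, src.substring(start, i)));
            } else if (SEPARATORS.indexOf(c) >= 0) {
                i++;
                tokens.add(new Token(Category.SEPARATOR, String.valueOf(c)));
            } else if (OPERATOR_CHARS.indexOf(c) >= 0) {
                // Greedy: adjacent operator characters form one token (a simplification).
                while (i < src.length() && OPERATOR_CHARS.indexOf(src.charAt(i)) >= 0) i++;
                tokens.add(new Token(Category.OPERATOR, src.substring(start, i)));
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at " + i);
            }
        }
        return tokens;
    }
}
```

A colour-coding editor could key its styles directly off the Category value, which is
the kind of tool support the taxonomy is meant to enable.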

Similarly, syntax specification could be simplified by specifying in terms familiar to
Java programmers.  For example, operator type (prefix, postfix, binary, ternary) and
precedence could be directly expressed to allow for the definition of expressions.
Also, importation could again be useful to integrate features directly from Java or even
from other languages created using this scheme.
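
One way to express operator type and precedence directly is a declarative operator
table driving a precedence-climbing parser.  The sketch below assumes this technique;
the proposal does not prescribe it.  The class OperatorTableParser and its declare
method are hypothetical, input tokens are assumed to be space-separated, and only
prefix and binary operators are handled (postfix and ternary are omitted for brevity).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: expressions defined by an operator table, parsed by precedence climbing.
final class OperatorTableParser {
    enum Fixity { PREFIX, BINARY_LEFT, BINARY_RIGHT }

    static final class Op {
        final Fixity fixity;
        final int precedence;
        Op(Fixity fixity, int precedence) { this.fixity = fixity; this.precedence = precedence; }
    }

    // Separate tables so one symbol (such as "-") can be both prefix and binary.
    private final Map<String, Op> binary = new HashMap<>();
    private final Map<String, Op> prefix = new HashMap<>();
    private String[] tokens;
    private int pos;

    void declare(String symbol, Fixity fixity, int precedence) {
        (fixity == Fixity.PREFIX ? prefix : binary).put(symbol, new Op(fixity, precedence));
    }

    // Returns a fully parenthesised rendering of the parse tree.
    String parse(String spaceSeparatedTokens) {
        tokens = spaceSeparatedTokens.trim().split("\\s+");
        pos = 0;
        String tree = parseExpression(0);
        if (pos != tokens.length) {
            throw new IllegalArgumentException("Unexpected token: " + tokens[pos]);
        }
        return tree;
    }

    private String parseExpression(int minPrecedence) {
        String left = parseOperand();
        while (pos < tokens.length) {
            Op op = binary.get(tokens[pos]);
            if (op == null || op.precedence < minPrecedence) break;
            String symbol = tokens[pos++];
            // Left-associative operators require a strictly tighter right-hand side.
            int next = op.fixity == Fixity.BINARY_LEFT ? op.precedence + 1 : op.precedence;
            String right = parseExpression(next);
            left = "(" + left + " " + symbol + " " + right + ")";
        }
        return left;
    }

    private String parseOperand() {
        if (pos >= tokens.length) throw new IllegalArgumentException("Unexpected end of input");
        String token = tokens[pos++];
        Op op = prefix.get(token);
        if (op != null) return "(" + token + parseExpression(op.precedence) + ")";
        if (binary.containsKey(token)) {
            throw new IllegalArgumentException("Operator where operand expected: " + token);
        }
        return token;
    }
}
```

With "=" declared right-associative at precedence 1, "+" and binary "-" left-associative
at 10, "*" left-associative at 20 and prefix "-" at 30, the input "a = b + c * - d - e"
parses as (a = ((b + (c * (-d))) - e)).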

As previously noted, the generative model is more accessible to many programmers.
The semantic analysis, then, should be specifiable in terms of translation into Java
source.  This is unlikely to be adequate for all purposes, however.  The practices of
Generative Programming involve the manipulation of one or more modules by others.
This is probably best performed as manipulations of the abstract syntax trees of those
modules and could be expressed using an XPath-like addressing mechanism plus
suitable manipulations.
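
A rough sketch of what path-based addressing of a syntax tree might look like follows.
Everything here is an assumption made for illustration: the AstNode class, the node
kinds ("class", "method", "body", "statement") and the slash-separated path syntax are
hypothetical and far simpler than XPath proper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a generic syntax-tree node with XPath-like child selection.
final class AstNode {
    final String kind;   // for example "class", "method", "body", "statement"
    final String text;   // a name for named nodes, or source text for leaves
    final List<AstNode> children = new ArrayList<>();

    AstNode(String kind, String text) {
        this.kind = kind;
        this.text = text;
    }

    AstNode add(AstNode child) {
        children.add(child);
        return this;
    }

    // Selects descendants by a slash-separated path of node kinds, relative to this node.
    // Each step matches direct children only; "*" matches any kind.
    List<AstNode> select(String path) {
        List<AstNode> current = new ArrayList<>();
        current.add(this);
        for (String step : path.split("/")) {
            List<AstNode> next = new ArrayList<>();
            for (AstNode node : current) {
                for (AstNode child : node.children) {
                    if (step.equals("*") || step.equals(child.kind)) next.add(child);
                }
            }
            current = next;
        }
        return current;
    }

    // Example manipulation: insert a statement at the start of every method body.
    static void prependToMethodBodies(AstNode classNode, String statement) {
        for (AstNode body : classNode.select("method/body")) {
            body.children.add(0, new AstNode("statement", statement));
        }
    }
}
```

Here one module (for instance an auditing concern) addresses parts of another module's
tree with "method/body" and rewrites them, without the target module's source
mentioning the concern at all.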

Once we (logically) have Java source, standard Java compilation takes over,
supplemented only by the need to preserve mappings to the original source for later
use in debugging, etc.
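
The mapping back to the original source could, as one possibility, be kept as a table
from regions of generated Java lines to positions in the higher-level source.  The
sketch below is an assumption about how such a table might look in memory; the class
SourceLineMap is hypothetical and says nothing about how the mapping would be stored
in the class file.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: maps lines of generated Java back to the original source.
final class SourceLineMap {
    static final class Origin {
        final String file;
        final int line;
        Origin(String file, int line) { this.file = file; this.line = line; }
        @Override public String toString() { return file + ":" + line; }
    }

    // Key: first generated line of a region; value: the original position it came from.
    private final TreeMap<Integer, Origin> regions = new TreeMap<>();

    // Records that generated lines from firstGeneratedLine onward (until the next
    // recorded region) were produced from the given original line.
    void record(int firstGeneratedLine, String originalFile, int originalLine) {
        regions.put(firstGeneratedLine, new Origin(originalFile, originalLine));
    }

    // Returns the origin of the region containing generatedLine,
    // or null if no region starts at or before that line.
    Origin lookup(int generatedLine) {
        Map.Entry<Integer, Origin> entry = regions.floorEntry(generatedLine);
        return entry == null ? null : entry.getValue();
    }
}
```

A debugger or stack-trace filter could then call lookup with the generated line number
and report the position in the higher-level source instead.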


5.1 We need to take account of compilation dependencies.  The interoperability with 
Java means that code specified in a created language may depend on Java classes and 
vice versa.  In fact there can also be dependencies between two programs specified in 
the same or different created languages.  This probably means that the entire 
compilation process should be undertaken by a single, extensible, compiler (including 
the compilation of standard Java code).

5.2 Java class files are not the only necessary result of the compilation process.  To be
useful, many languages also need to generate configuration files (consider for instance
languages to define JSP Tag Extensions or Enterprise Java Beans) and other
representations (a database schema for example).  Also, the Java principle of
documentation in the source (Javadoc) is very useful and should be carried into the
created languages.

5.3 The embedding of created languages within each other adds another dimension.  
Embedding complicates the parsing process.  XML namespaces show one manner in 
which this can be handled.  ANTLR has an interesting approach [Antlr-1] using token 
stream multiplexing (multiple lexers creating a single token stream). A suitable 
mechanism that fits the structure of the creatable languages should be considered.  

5.4 The separation of concerns in Generative Programming complicates the
relationship between source and object files.  Instead of one:many, the relationship
becomes many:many.  This complicates dependency resolution, especially when
the requirement for both incremental and batch compilation is considered.
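
To make the many:many relationship concrete, the sketch below tracks which outputs
each source contributes to, and which sources each output is built from.  The class
BuildGraph and the file names used with it are hypothetical; a real incremental
compiler would also need to follow dependencies transitively.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a bidirectional source/output index for incremental builds.
final class BuildGraph {
    private final Map<String, Set<String>> outputsBySource = new HashMap<>();
    private final Map<String, Set<String>> sourcesByOutput = new HashMap<>();

    // Records that the given source contributes to the given output.
    void link(String source, String output) {
        outputsBySource.computeIfAbsent(source, k -> new TreeSet<>()).add(output);
        sourcesByOutput.computeIfAbsent(output, k -> new TreeSet<>()).add(source);
    }

    // Outputs that must be regenerated when the given source changes.
    Set<String> staleOutputs(String changedSource) {
        return new TreeSet<>(
                outputsBySource.getOrDefault(changedSource, Collections.<String>emptySet()));
    }

    // Every source that must be re-read in order to regenerate those outputs.
    Set<String> sourcesToReread(String changedSource) {
        Set<String> result = new TreeSet<>();
        for (String output : staleOutputs(changedSource)) {
            result.addAll(sourcesByOutput.get(output));
        }
        return result;
    }
}
```

With a one:many relationship the second query would always return just the changed
source.  With many:many, a change to one source can force several otherwise unchanged
sources to be processed again, which is what makes incremental compilation harder.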

5.5 It should be possible to allow for non-Java-like languages (for instance embedded
SQL).  The specification of these languages would not benefit from the advantages 
outlined in this proposal but it should be possible to hook such languages into the 
overall scheme.

5.6 One of the much-touted benefits of domain-specific languages is the scope for
domain-specific optimisations.  The discussion above did not cover how the semantic
analysis would specify the translation to Java source.  The most straightforward
approach would be to state a translation for each BNF production (or its equivalent in
the simplified specification).  This form of specification would be inadequate for the
specification of domain-specific optimisations.

5.7 The expression of the creatable languages is a language itself.  It should therefore
be possible to express it using itself.  Therefore, after some initial work, a tool
implementing this proposal should itself be developed using that tool.


The overall vision here is not dissimilar to that of Intentional Programming (IP) (see
[CE00]).  IP is attempting to create a more encompassing environment.  This means
that language creation remains a separate discipline.

What this proposal aims for is to obtain the benefits of languages which more clearly 
state the intention of the design whilst taking advantage of the feature-rich Java 
environment and bringing the benefits of language creation to a broader community.

The definition of languages using features from other languages (described as the
importation of features above) could lead in interesting directions.  For instance,
one could imagine an application being developed by building out languages until the
application's specification is directly represented.

This would be well in the future, however.  Initially the environment would probably
be used to better express existing technologies and to provide a test bed for more 
advanced development approaches.

[CE00] Generative Programming, Czarnecki and Eisenecker, 2000, Addison Wesley
[Cle01] Program Generators with XML and Java, Cleaveland, 2001, Prentice Hall 
[Fow99] Refactoring, Fowler, 1999, Addison Wesley
[GJSB00] The Java Language Specification, Gosling, Joy, Steele and Bracha, 2000, Addison Wesley
