[ptp-dev] Re: [photran] Telecon
On May 7, 2005, at 6:30 AM, Ralph Johnson wrote:
On 5/3/05 10:58 AM, "Craig Rasmussen" <crasmussen@xxxxxxxx> wrote:
3. Discuss integration of outside tools (parsing primarily) and their
relationship to the existing Photran parser. We have been discussing
tools for static analysis to do performance monitoring with the
University of Oregon (and University of Munich in two weeks). These
tools require commercial-quality compilers to munge LANL codes. We have
been thinking that outside parsers could be used with Photran as an
option, in addition to the normal Photran parser. The University of
Munich uses NAG for parsing, and we have been trying to get the IBM
Eclipse folk to encourage their Fortran compiler group to output their
IR in XML format.
I assume you really mean "integrate a Fortran front end" and not really
"Fortran parser". You can't really integrate just a parser unless it is
written in the same language. A parser doesn't do much. We need a
representation of Fortran programs, so at the very least we need an
abstract syntax tree representation. It would be possible to have a
front end that produces a program representation in XML (or something
more compact). But it would take a lot of work to hack a front end to
produce this.
See comments below on EDG, Cleanscape, and NAG. We have also
implemented a rudimentary tool based on the --dump-parse-tree output
of gfortran (but we want to redo this and output the AST directly).
I was talking with Bjarne Stroustrup this week about a project of his
to make a standard intermediate format for C++ and a set of tools that
manipulate it. He has been working at this project off and on for the
better part of a decade. He is working with two compiler groups to hack
their systems so that they produce the necessary information. It is
clearly the right long-term strategy for C++. It should have been done
years ago. But it is very hard. It takes good people many years to do
it. Fortran is not as hard to compile as C++, but it is harder than C
or Java.
It would be cool if Bjarne succeeds. We are trying to work our
contacts at IBM to do this for the XLF Fortran compiler, and I've
talked with other Fortran vendors at J3 meetings.
We looked at various Fortran compilers when the project started. We are
on our third parser right now. We got the grammar from someone else
(the hard part) and are generating the parser automatically. In that
sense, we are using someone else's parser, but it was still a lot of
work and doesn't deal with the lexical issues that are so important to
Fortran. I suppose you have better contacts than we do and could get
source to front ends that we could not get. However, they won't be
written in Java and so integrating them with Eclipse tools will take a
lot of work.
Do you know how hard it would be to input an xml representation
of a Fortran AST (produced by an external tool) into Java?
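
For what it's worth, reading XML into Java is the easy part; the DOM
parser ships with the JDK. A minimal sketch follows, assuming a made-up
AST layout: the element names ("program", "assign", "var", "literal")
and the "offset" attribute are purely illustrative, not the output
format of NAG, XLF, or any other real front end.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XmlAstReader {
    // Hypothetical XML AST; every element and attribute name is an assumption.
    static final String SAMPLE =
        "<program name=\"main\" offset=\"0\">"
      + "<assign offset=\"19\">"
      + "<var name=\"x\" offset=\"19\"/>"
      + "<literal value=\"1\" offset=\"23\"/>"
      + "</assign>"
      + "</program>";

    // Parse an XML string with the JDK's DOM parser and return the root element.
    static Element parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return builder.parse(new ByteArrayInputStream(bytes)).getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalStateException("could not read XML AST", ex);
        }
    }

    // Count AST nodes by walking element children recursively.
    static int countNodes(Element e) {
        int count = 1;
        NodeList children = e.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                count += countNodes((Element) child);
            }
        }
        return count;
    }

    // Append an indented listing of node kinds and their source offsets.
    static void dump(Element e, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(e.getTagName()).append(" @").append(e.getAttribute("offset")).append('\n');
        NodeList children = e.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                dump((Element) child, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        dump(parse(SAMPLE), 0, out);
        System.out.print(out);
    }
}
```

The real work would be agreeing on the schema and mapping these generic
DOM nodes onto whatever AST classes the Eclipse tools expect; for large
codes a streaming parser (SAX or StAX) might be a better fit than DOM.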
If compiler vendors will work with us, we could have them emit an
intermediate representation that all the other tools would use. First,
we would have to define that intermediate representation. Different
projects would have different requirements. For example, our
refactoring project requires that we know exactly where each token came
from. In other words, each token needs to have its offset in the
original file. This is so we can pretty print without messing up the
original formatting. Most projects do not require this information, so
compiler vendors are unlikely to produce it unless we ask for it.
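
To illustrate why the offsets matter, here is a minimal sketch of a
rename that edits the original text in place instead of regenerating it
from a tree. The tokenizer is a toy that only recognizes runs of
letters, digits, and underscores; it is an assumption made for the
example and is not the Photran lexer (it would, for instance, also
rename matches inside comments and strings).

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetRename {
    // A token remembers its text and where it started in the original file.
    static final class Token {
        final String text;
        final int offset;

        Token(String text, int offset) {
            this.text = text;
            this.offset = offset;
        }
    }

    static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Toy tokenizer (assumption): words only, everything else is skipped.
    static List<Token> tokenize(String src) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            if (isWordChar(src.charAt(i))) {
                int start = i;
                while (i < src.length() && isWordChar(src.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(src.substring(start, i), start));
            } else {
                i++;
            }
        }
        return tokens;
    }

    // Rename by patching the source at token offsets, working from the last
    // token to the first so earlier offsets stay valid after each replacement.
    static String rename(String src, String oldName, String newName) {
        StringBuilder out = new StringBuilder(src);
        List<Token> tokens = tokenize(src);
        for (int k = tokens.size() - 1; k >= 0; k--) {
            Token t = tokens.get(k);
            if (t.text.equalsIgnoreCase(oldName)) { // Fortran names are case-insensitive
                out.replace(t.offset, t.offset + t.text.length(), newName);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(rename("      X   =  X+1  ! odd   spacing\n", "x", "total"));
    }
}
```

Everything outside the renamed tokens, including the odd spacing and
the comment, comes through untouched, which is exactly what a line
number alone would not allow.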
Apparently there is a standard XML representation for an AST (but I
don't know much about it). I'll find out more information next week.
The token offset (at least the line number) is also needed by the
current Oregon and TUM (Technical University of Munich) tools.
What are the projects at Oregon and Munich? Do they work with existing
compilers, or do they have their own parsers? What kind of information
do they need?
The University of Oregon has a project, Program Database Toolkit
(PDT, http://www.cs.uoregon.edu/research/paracomp/proj/pdtoolkit/)
that uses a common and fairly terse IR for both C++ and Fortran, called
the Program Database (PDB). They use the EDG and Cleanscape
compiler front ends.
Munich (TUM) is interested in similar things (performance monitoring
tools). They use NAG's Fortran front end to create a standard XML IR
format.
At LANL we have large legacy codes in F77 (and earlier) and
some codes in modern Fortran. Whatever tools we use must be
able to work with LANL codes. In particular, they must be able
to parse fixed format files (and other F77 madness); see attached file.
Our worry is that custom parser tools won't work with LANL codes.
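
As a small illustration of what fixed format demands before parsing
even starts, here is a sketch of the column rules: a comment marker in
column 1, a continuation marker in column 6, and statement text in
columns 7 through 72. It is deliberately simplified (no tab handling,
no "!" comments starting past column 1, no blanks-inside-tokens
handling), and the class and method names are made up for the example.

```java
public class FixedFormLine {
    enum Kind { COMMENT, CONTINUATION, INITIAL }

    // Classify one source line by the fixed-form column rules.
    static Kind classify(String line) {
        if (line.trim().isEmpty()) {
            return Kind.COMMENT; // blank lines are treated as comments
        }
        char c1 = line.charAt(0);
        if (c1 == 'C' || c1 == 'c' || c1 == '*' || c1 == '!') {
            return Kind.COMMENT; // comment marker in column 1
        }
        if (line.length() >= 6) {
            char c6 = line.charAt(5);
            if (c6 != ' ' && c6 != '0') {
                return Kind.CONTINUATION; // any other character in column 6
            }
        }
        return Kind.INITIAL;
    }

    // Statement text lives in columns 7-72; anything past column 72 is ignored.
    static String statementField(String line) {
        if (line.length() <= 6) {
            return "";
        }
        return line.substring(6, Math.min(line.length(), 72));
    }

    public static void main(String[] args) {
        String[] lines = {
            "C     A comment line",
            "      X = A",
            "     &  + B",
            "  10  CONTINUE"
        };
        for (String line : lines) {
            System.out.println(classify(line) + " | " + statementField(line));
        }
    }
}
```

A grammar-generated parser sees none of this unless a front-end pass
like the one above (and a good deal more) runs first.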
Regards,
Craig
Attachment:
junk.f
Description: Binary data