Using DTP to analyze data flow [message #594234] |
Wed, 14 January 2009 19:00 |
James Synge Messages: 2 Registered: July 2009 |
Junior Member |
|
|
Last summer Brian Fitzpatrick asked the question "How are YOU using DTP?"
on his blog. I finally have a potentially novel/interesting answer. :-)
I needed to know which columns of base tables were being read by a set of
several hundred database views, and to which view columns they
contributed. To answer this I wrote a tool that builds on DTP, and does
the following:
a) Reads the schema from a text file and create corresponding DTP
objects (i.e. Database, Schema, Table, Column).
I didn't have access to the database (Teradata at a customer
site), so I couldn't make a connection to it in order to get
this data.
b) Read a file of records, each containing a DDL statement defining
a VIEW.
c) For each view definition, I did a very trivial parse to get the
query within the DDL. In addition, I had to hack on the query
to replace SQL3 and Teradata syntax with expressions that DTP
could parse.
d) After parsing the query with DTP's SQL Parser, the tool then
walks the resulting tree, determining all the columns that
contribute to each result column, transitively, which requires
handling nested and combined queries. This is by far the most
complicated part, due to challenges 1 and 2 below.
The key challenges were:
1) Very little documentation exists, so I had to explore the parsed
queries to understand the data structures. It would be helpful
if the (EMF?) model from which the data structures were generated
had at least a sentence about each field.
2) Walking the parse tree required a lot of down-casting; i.e. lots of:
if (o instanceof X) { X x = (X)o; handleX(x); }
else if (o instanceof Y) ...
I'm not clear on how best to handle this, but this pattern is one
of those "smells" that says a redesign might be in order.
3) The parser couldn't handle the SQL3 syntax in the view definitions,
so I had to "pre-process" the SELECT statement to replace such
syntax with something simpler. Note that I didn't specify the type
of database (Teradata) as I didn't have the appropriate vendor specific
plug-in.
I don't know if the vendor specific plug-ins have the ability to
extend/replace the SQL parser with vendor extensions.
James Synge
|
|
|
Powered by
FUDForum. Page generated in 0.02152 seconds