SAX parser in WST? [message #216475]
Tue, 01 July 2008 05:40
Eclipse User
I am using the internal WST DOM model
(org.eclipse.wst.xml.core.internal.provisional.document.IDOMModel) to
parse XML documents, because it is the only one I could find that can
retrieve the original textual offsets of an XML node in the source
document. I couldn't figure out a way to do this with w3c.dom or JDOM.
Is there perhaps a SAX parser with the same functionality hidden somewhere
in the WST packages? I don't need the XML context, and it would make the
parsing much faster.
Regards,
Gerrit
Re: SAX parser in WST? [message #216746 is a reply to message #216700]
Thu, 03 July 2008 09:46
Eclipse User
Originally posted by: dcarver.starstandard.org
Nitin Dahyabhai wrote:
> David Carver wrote:
>> WST does implement its own SAX parser API, which keeps track of
>> the beginning column and ending column number of a tag. It's buried
>> within the XML validation routines as an internal class, so it isn't
>> callable directly.
>>
>> As you noticed the w3c.dom and jdom themselves don't keep track of
>> this information.
>
> I wouldn't say it's SAX exactly, since it doesn't even try to implement
> that API, and actually has more in common with StAX (which came along a
> lot later).
I was referring to the Xerces validation routines, which override and
implement a SAX parser that keeps track of line numbers. They extend the
SAX pieces when they instantiate the Xerces parser for validation against
a grammar.
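For anyone who only needs line and column positions, the standard SAX API already exposes them through org.xml.sax.Locator, with no WST internals involved. What follows is a minimal sketch and not the WST validator's code: the names SaxPositions and Position are made up for illustration, and it assumes the JDK's bundled parser hands the handler a Locator (the SAX documentation only strongly encourages parsers to do so). Per the Locator documentation, the reported position is the first character after the text of the event, and it is a line/column pair rather than a character offset.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of WST or Xerces.
final class SaxPositions {

    /** Name of a start tag plus the position the parser reported for it. */
    static final class Position {
        final String name;
        final int line;
        final int column;

        Position(String name, int line, int column) {
            this.name = name;
            this.line = line;
            this.column = column;
        }

        @Override
        public String toString() {
            return name + " @ line " + line + ", column " + column;
        }
    }

    /** Parses the XML text and records one Position per start tag. */
    static List<Position> startTagPositions(String xml) {
        final List<Position> positions = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator; // stays null if the parser offers no Locator

            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // The Locator is only valid during the callback, so read it now.
                int line = locator == null ? -1 : locator.getLineNumber();
                int column = locator == null ? -1 : locator.getColumnNumber();
                positions.add(new Position(qName, line, column));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return positions;
    }

    public static void main(String[] args) {
        for (Position p : startTagPositions("<a>\n  <b/>\n</a>")) {
            System.out.println(p);
        }
    }
}
```

Turning a line/column pair into a document offset would still require a separate index of line start offsets, which is part of why the WST models remain attractive for editor work.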
> If you make use of the platform's file buffers APIs, the
> documents in the ITextFileBuffers for XML files implement
> IStructuredDocument. That then breaks down into
> IStructuredDocumentRegions, which for XML represent the start and end
> tags individually. Inside of those are ITextRegions marking the
> positions of the various syntactic tokens that make up the tag.
>
> http://help.eclipse.org/stable/topic/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/filebuffers/ITextFileBufferManager.html
> would be the logical starting point, then.
The problem with the above approach is that, while it can be done, it's not
one of the typical APIs that most XML programmers are going to be
familiar with. This is again one of the reasons I keep strongly urging
WTP to leverage the existing XML APIs as much as
possible instead of reinventing the wheel. That makes it much easier for
adopters familiar with these technologies to implement. It may be just a
shell that wraps the underlying Eclipse API, but having these wrapper
classes can make adoption easier, just as has been partially done with
the DOM implementation of the SSE.
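
As one illustration of what the standard APIs already offer here: StAX, which came up earlier in this thread, reports a Location for each event, including a character offset where the implementation tracks one. This is a rough sketch that assumes the StAX implementation bundled with the JDK. StaxOffsets and Tag are invented names, and the offset is wherever the reader currently stands (it may be -1 if unknown), so it is not guaranteed to match the node start offsets that the WST DOM provides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.Location;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of WST.
final class StaxOffsets {

    /** Start tag name with the line and character offset reported by the reader. */
    static final class Tag {
        final String name;
        final int line;
        final int offset; // -1 when the implementation does not track offsets

        Tag(String name, int line, int offset) {
            this.name = name;
            this.line = line;
            this.offset = offset;
        }

        @Override
        public String toString() {
            return name + " @ line " + line + ", offset " + offset;
        }
    }

    /** Streams through the XML text and records one Tag per start element. */
    static List<Tag> startTags(String xml) {
        List<Tag> tags = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    // Copy the values immediately: the Location may be a live view
                    // that changes once the reader advances.
                    Location loc = reader.getLocation();
                    tags.add(new Tag(reader.getLocalName(),
                            loc.getLineNumber(), loc.getCharacterOffset()));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return tags;
    }

    public static void main(String[] args) {
        for (Tag t : startTags("<a>\n  <b/>\n</a>")) {
            System.out.println(t);
        }
    }
}
```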