Eclipse Community Forums
SAX parser in WST? [message #216475] Tue, 01 July 2008 05:40
Gerrit
I am using the internal WST DOM model
(org.eclipse.wst.xml.core.internal.provisional.document.IDOMModel) to
parse XML documents, because it is the only one I could find that is able
to retrieve the original textual offsets of an XML node in the source
document. I couldn't figure out a way to do this with w3c.dom or jdom.

Is there maybe a SAX parser with the same functionality hidden somewhere
in the WST packages? I don't need the XML context, and it would make the
parsing much faster.

Regards,
Gerrit
Re: SAX parser in WST? [message #216486 is a reply to message #216475] Tue, 01 July 2008 10:06
Eclipse User
Originally posted by: dcarver.starstandard.org

Gerrit wrote:
> I am using the internal WST DOM model
> (org.eclipse.wst.xml.core.internal.provisional.document.IDOMModel) to
> parse XML documents, because it is the only one I could find that is
> able to retrieve the original textual offsets of an XML node in the
> source document. I couldn't figure out a way to do this with w3c.dom or
> jdom.
>
> Is there maybe a SAX parser with the same functionality hidden somewhere
> in the WST packages? I don't need the XML context, and it would make the
> parsing much faster.

WST does implement its own SAX parser API, which keeps track of the
beginning and ending column numbers of a tag. It's buried within the
XML validation routines as an internal class, so it isn't callable
directly.

As you noticed, w3c.dom and jdom themselves don't keep track of this
information.
Re: SAX parser in WST? [message #216700 is a reply to message #216486] Thu, 03 July 2008 02:41
Nitin Dahyabhai
David Carver wrote:
> WST does implement its own SAX parser API, which keeps track of the
> beginning and ending column numbers of a tag. It's buried within the
> XML validation routines as an internal class, so it isn't callable
> directly.
>
> As you noticed, w3c.dom and jdom themselves don't keep track of this
> information.

I wouldn't say it's SAX exactly, since it doesn't even try to
implement that API, and it actually has more in common with StAX
(which came along a lot later). If you make use of the platform's
file buffer APIs, the documents in the ITextFileBuffers for XML files
implement IStructuredDocument. Each document breaks down into
IStructuredDocumentRegions, which for XML represent the start and
end tags individually. Inside those are ITextRegions marking the
positions of the various syntactic tokens that make up the tag.

http://help.eclipse.org/stable/topic/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/filebuffers/ITextFileBufferManager.html
would be the logical starting point, then.
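
To make the StAX comparison concrete, here is a minimal sketch using the JDK's standard javax.xml.stream API (not anything inside WST). A pull parser hands back one event at a time and exposes a Location for it, without building a DOM. The class, nested type, and method names below are made up for illustration, and whether the reported position is the start or the end of the tag depends on the StAX implementation in use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.Location;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: collects the parser-reported position of every start tag.
class StaxOffsets {

    // Simple value holder for one start tag (illustrative, not a WST type).
    static final class TagPosition {
        final String name;
        final int line;        // 1-based line reported by the parser
        final int charOffset;  // offset reported by the parser, or -1 if unknown

        TagPosition(String name, int line, int charOffset) {
            this.name = name;
            this.line = line;
            this.charOffset = charOffset;
        }
    }

    static List<TagPosition> startTags(String xml) {
        List<TagPosition> result = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // Pull the next event; only start tags are of interest here.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    Location loc = reader.getLocation();
                    result.add(new TagPosition(reader.getLocalName(),
                            loc.getLineNumber(), loc.getCharacterOffset()));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        for (TagPosition p : startTags("<root>\n  <item id=\"1\"/>\n</root>")) {
            System.out.println(p.name + " line=" + p.line + " offset=" + p.charOffset);
        }
    }
}
```

This only shows the event-by-event style; it does not give the token-level ITextRegion detail described above.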

--
---
Nitin Dahyabhai
Eclipse WTP Source Editing
IBM Rational


Re: SAX parser in WST? [message #216746 is a reply to message #216700] Thu, 03 July 2008 09:46
Eclipse User
Originally posted by: dcarver.starstandard.org

Nitin Dahyabhai wrote:
> David Carver wrote:
>> WST does implement its own SAX parser API, which keeps track of
>> the beginning and ending column numbers of a tag. It's buried
>> within the XML validation routines as an internal class, so it
>> isn't callable directly.
>>
>> As you noticed, w3c.dom and jdom themselves don't keep track of
>> this information.
>
> I wouldn't say it's SAX exactly, since it doesn't even try to implement
> that API, and it actually has more in common with StAX (which came
> along a lot later).

I was referring to the Xerces validation routines, which override and
implement a SAX parser that keeps track of line numbers. They extend
the SAX pieces when instantiating the Xerces parser for validation
against a grammar.
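
For readers who only need line numbers, the same idea can be sketched with plain SAX from the JDK. This is not the internal WST validation class, just an illustration of a handler that keeps the Locator the parser supplies and reads it in each callback. The class and method names are hypothetical. Per the SAX contract, the locator reports the position just after the text that triggered the event, so for a start tag it is where the tag ends.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records "elementName@line" for every start tag.
class SaxLineTracker extends DefaultHandler {
    private Locator locator;
    final List<String> seen = new ArrayList<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        // Parsers that support locators call this once, before startDocument.
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // -1 marks "unknown" if the parser did not supply a locator.
        int line = (locator != null) ? locator.getLineNumber() : -1;
        seen.add(qName + "@" + line);
    }

    static List<String> track(String xml) {
        SaxLineTracker handler = new SaxLineTracker();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
        return handler.seen;
    }

    public static void main(String[] args) {
        System.out.println(track("<root>\n  <item id=\"1\"/>\n</root>"));
    }
}
```

Locator also offers getColumnNumber(), but neither value is a document-wide character offset, which is what the original question asked for.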



> If you make use of the platform's file buffers APIs, the
> documents in the ITextFileBuffers for XML files implement
> IStructuredDocument. That then breaks down into
> IStructuredDocumentRegions, which for XML represent the start and end
> tags individually. Inside of those are ITextRegions marking the
> positions of the various syntactic tokens that make up the tag.
>
> http://help.eclipse.org/stable/topic/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/filebuffers/ITextFileBufferManager.html
> would be the logical starting point, then.

The problem with the above approach is that, while it can be done, it's
not one of the typical APIs that most XML programmers are going to be
familiar with. It's one of the reasons I keep strongly urging WTP to
leverage the existing XML APIs as much as possible instead of
re-inventing the wheel. That makes things much easier for adopters
already familiar with these technologies. It may only be a shell that
wraps the underlying Eclipse API, but having such wrapper classes can
make adoption easier, just as has been partially done with the DOM
implementation of the SSE.