Eclipse Community Forums: ServerTools (WTP)

SAX parser in WST? [message #216475]

Tue, 01 July 2008 09:40

Gerrit

Messages: 30
Registered: July 2009

Member

I am using the internal WST DOM model
(org.eclipse.wst.xml.core.internal.provisional.document.IDOMModel) to
parse XML documents, because it is the only one I could find that is able
to retrieve the original textual offsets of an XML node in the source
document. I couldn't figure out a way to do this with w3c.dom or jdom.

Is there maybe a SAX parser with the same functionality hidden somewhere
in the WST packages? I don't need the XML context, and it would make the
parsing much faster.

Regards,
Gerrit


Re: SAX parser in WST? [message #216486 is a reply to message #216475]

Tue, 01 July 2008 14:06

Eclipse User

Originally posted by: dcarver.starstandard.org

Gerrit wrote:
> I am using the internal WST DOM model
> (org.eclipse.wst.xml.core.internal.provisional.document.IDOMModel) to
> parse XML documents, because it is the only one I could find that is
> able to retrieve the original textual offsets of an XML node in the
> source document. I couldn't figure out a way to do this with w3c.dom or
> jdom.
>
> Is there maybe a SAX parser with the same functionality hidden somewhere
> in the WST packages? I don't need the XML context, and it would make the
> parsing much faster.

WST does implement its own SAX parser API, which keeps track of the
beginning and ending column numbers of a tag. It's buried within
the XML validation routines as an internal class, so it isn't callable
directly.

As you noticed, w3c.dom and jdom themselves don't keep track of this
information.
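
For reference, plain SAX in the JDK can report line and column numbers (not character offsets) through org.xml.sax.Locator. The following is only a minimal sketch of that standard mechanism, not the internal WST class mentioned above; the class and method names (SaxPositions, startTagPositions) are made up for illustration. Per the SAX documentation, the locator reports the position just after the text of the current event, so for a start tag it points at the end of the tag, not its beginning.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects "name@line:column" for every start tag.
final class SaxPositions {

    static List<String> startTagPositions(String xml) {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator; // supplied by the parser, may stay null

            @Override
            public void setDocumentLocator(Locator l) {
                locator = l;
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                // The locator describes the END of the start tag being reported.
                int line = locator == null ? -1 : locator.getLineNumber();
                int col = locator == null ? -1 : locator.getColumnNumber();
                out.add(qName + "@" + line + ":" + col);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : startTagPositions("<a>\n  <b/>\n</a>")) {
            System.out.println(s);
        }
    }
}
```

Turning a line/column pair into a document offset still needs a separate pass over the source text, which is one reason this does not fully replace the offsets that the WST model provides.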


Re: SAX parser in WST? [message #216700 is a reply to message #216486]

Thu, 03 July 2008 06:41

Nitin Dahyabhai

Messages: 4435
Registered: July 2009

Senior Member

David Carver wrote:
> WST does implement its own SAX parser API, which keeps track of the
> beginning and ending column numbers of a tag. It's buried within
> the XML validation routines as an internal class, so it isn't callable
> directly.
>
> As you noticed, w3c.dom and jdom themselves don't keep track of this
> information.

I wouldn't say it's SAX exactly, since it doesn't even try to
implement that API, and actually has more in common with StAX (which
came along a lot later). If you make use of the platform's file
buffers APIs, the documents in the ITextFileBuffers for XML files
implement IStructuredDocument. That then breaks down into
IStructuredDocumentRegions, which for XML represent the start and
end tags individually. Inside of those are ITextRegions marking the
positions of the various syntactic tokens that make up the tag.

http://help.eclipse.org/stable/topic/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/filebuffers/ITextFileBufferManager.html
would be the logical starting point, then.
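
The region model described above needs an Eclipse runtime, so it is not shown here. As a rough stand-in for the StAX comparison, here is a sketch that uses only the JDK's javax.xml.stream API to walk start and end tags as separate events and read a character offset for each one from Location.getCharacterOffset(). The class and method names (StaxOffsets, tagOffsets) are invented for this example. Whether the offset marks the start or the end of an event, and whether it is available at all (-1 means unknown), depends on the StAX implementation, so treat the numbers as approximate.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: lists each start/end tag with the offset the parser reports.
final class StaxOffsets {

    static List<String> tagOffsets(String xml) {
        List<String> out = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                boolean start = event == XMLStreamConstants.START_ELEMENT;
                boolean end = event == XMLStreamConstants.END_ELEMENT;
                if (start || end) {
                    // Offset semantics are implementation-defined; -1 means unknown.
                    int offset = reader.getLocation().getCharacterOffset();
                    out.add((start ? "START " : "END ") + reader.getLocalName() + " @" + offset);
                }
            }
            reader.close();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : tagOffsets("<a><b/></a>")) {
            System.out.println(s);
        }
    }
}
```

Like the structured document regions, this reports start and end tags individually and builds no tree, but it does not expose the finer-grained token positions inside a tag.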

--
---
Nitin Dahyabhai
Eclipse WTP Source Editing
IBM Rational



Re: SAX parser in WST? [message #216746 is a reply to message #216700]

Thu, 03 July 2008 13:46

Eclipse User

Originally posted by: dcarver.starstandard.org

Nitin Dahyabhai wrote:
> David Carver wrote:
>> WST does implement its own SAX parser API, which keeps track of
>> the beginning and ending column numbers of a tag. It's buried
>> within the XML validation routines as an internal class, so it isn't
>> callable directly.
>>
>> As you noticed the w3c.dom and jdom themselves don't keep track of
>> this information.
>
> I wouldn't say it's SAX exactly, since it doesn't even try to implement
> that API, and actually has more in common with StAX (which came along a
> lot later).

I was referring to the Xerces validation routines, which do override and
implement a SAX parser that keeps track of line numbers. It extends the
SAX pieces when it instantiates the Xerces parser for validation against
a grammar.

> If you make use of the platform's file buffers APIs, the
> documents in the ITextFileBuffers for XML files implement
> IStructuredDocument. That then breaks down into
> IStructuredDocumentRegions, which for XML represent the start and end
> tags individually. Inside of those are ITextRegions marking the
> positions of the various syntactic tokens that make up the tag.
>
> http://help.eclipse.org/stable/topic/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/filebuffers/ITextFileBufferManager.html
> would be the logical starting point, then.

The problem with the above approach is that, while it can be done, it's
not one of the typical APIs that most XML programmers are going to be
familiar with. It's again one of the reasons I keep strongly urging
WTP to leverage more of the existing XML APIs wherever possible instead
of re-inventing the wheel. That makes adoption much easier for people
already familiar with these technologies. It may be a shell that wraps
the underlying Eclipse API, but having these wrapper classes can make
adoption easier, just as has been partially done with the DOM
implementation of the SSE.

