Partition rule to ignore inner end sequences? [message #1032562]
Wed, 03 April 2013 02:32
Eclipse User
Hi
I'm attempting to partition a document of the form:
OUTER
    ...stuff...
    INNER
        ...stuff...
    END
    INNER
        ...stuff...
    END
    ...stuff...
END
where OUTER can contain any number of INNER blocks, or none at all.
I want to create a content type for the OUTER block. As expected, a MultiLineRule with start sequence "OUTER" and end sequence "END" stops at the END of the first contained INNER block instead of at the END of the enclosing OUTER block.
Can anyone advise me of any strategies or point me to some examples on how to correctly detect OUTER's end sequence?
Thanks in advance
Delfina
[SOLVED] Partition rule to ignore inner end sequences? [message #1033989 is a reply to message #1032562]
Thu, 04 April 2013 18:20
Eclipse User
Hi
For anyone who's interested, I've solved this as follows:
// Imports needed by the enclosing compilation unit
import org.eclipse.jface.text.rules.ICharacterScanner;
import org.eclipse.jface.text.rules.IToken;
import org.eclipse.jface.text.rules.MultiLineRule;

private class OuterBlockRule extends MultiLineRule {

    public OuterBlockRule(IToken token) {
        super("OUTER", "END", token);
    }

    @Override
    protected boolean endSequenceDetected(ICharacterScanner scanner) {
        // Start at 1: the read() that returns EOF also advances the
        // scanner, so it has to be rewound along with the other reads
        int readCount = 1;
        int c;
        while ((c = scanner.read()) != ICharacterScanner.EOF) {
            if ((fEndSequence.length > 0 && c == fEndSequence[0])
                    && sequenceDetected(scanner, fEndSequence, fBreaksOnEOF)
                    && correctEndSequenceDetected(scanner)) {
                return true;
            }
            ++readCount;
        }
        for ( ; readCount > 0; readCount--) {
            scanner.unread(); // Rewind scanner back to where we started
        }
        return false;
    }

    /**
     * Returns <code>true</code> if this end sequence belongs to
     * <code>OUTER</code>, not <code>INNER</code>. The scanner must be
     * positioned just after the candidate END and is left there on return.
     */
    private boolean correctEndSequenceDetected(ICharacterScanner scanner) {
        StringBuilder buffer = new StringBuilder();
        // Scan all the way back to OUTER, one character per iteration.
        // This relies on OUTER preceding the current position, which holds
        // because the rule only gets here after matching the start sequence.
        do {
            scanner.unread(); // Step back onto the previous char
            buffer.insert(0, (char) scanner.read()); // Re-read it
            scanner.unread(); // Step back before it again
        } while (!startsWithOuter(buffer));
        // Reset scanner back to the position where we started. The loop
        // above moved back exactly buffer.length() chars, so reading that
        // many returns to just after END (not one char beyond it).
        for (int i = 0, n = buffer.length(); i < n; i++) {
            scanner.read();
        }
        // Count occurrences of INNER and END at the start of a line
        int innerCount = 0;
        int endCount = 0;
        for (String line : buffer.toString().split("\n")) {
            String l = line.trim();
            if (l.startsWith("INNER")) {
                ++innerCount;
            } else if (l.startsWith("END")) {
                ++endCount;
            }
        }
        // If there is 1 more occurrence of END than INNER, this END must
        // belong to OUTER
        return (endCount == (innerCount + 1));
    }

    private boolean startsWithOuter(StringBuilder buffer) {
        // Ignore words which appear in an unterminated string
        return buffer.toString().startsWith("OUTER")
                && allStringsTerminated(buffer);
    }

    /**
     * Returns <code>true</code> if all strings are terminated, i.e. an even
     * number of ' chars are found
     */
    private boolean allStringsTerminated(StringBuilder buffer) {
        int count = 0;
        for (int i = 0; i < buffer.length(); i++) {
            if (buffer.charAt(i) == '\'') {
                ++count;
            }
        }
        return ((count % 2) == 0);
    }
}
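In case anyone wants to try the counting check without an Eclipse workbench, here is a minimal standalone sketch of just that part. The class name EndCountCheck and the method endBelongsToOuter are made up for illustration and are not part of JFace. It assumes the input string runs from OUTER up to and including the candidate END, which is what the rule collects in its buffer.

```java
class EndCountCheck {

    /**
     * Hypothetical helper: text is assumed to run from OUTER up to and
     * including a candidate END. Returns true if that END closes OUTER.
     */
    public static boolean endBelongsToOuter(String text) {
        int innerCount = 0;
        int endCount = 0;
        for (String line : text.split("\n")) {
            String l = line.trim();
            if (l.startsWith("INNER")) {
                ++innerCount;
            } else if (l.startsWith("END")) {
                ++endCount;
            }
        }
        // Every INNER consumes one END; the one left over closes OUTER
        return endCount == innerCount + 1;
    }

    public static void main(String[] args) {
        // Candidate END closes the INNER block, so it is rejected
        System.out.println(endBelongsToOuter("OUTER\nx\nINNER\ny\nEND")); // prints false
        // Second END is the one that closes OUTER
        System.out.println(endBelongsToOuter("OUTER\nx\nINNER\ny\nEND\nz\nEND")); // prints true
    }
}
```

Like the rule itself, this only looks at the start of each trimmed line, so a line beginning with a longer word such as ENDING would be counted as an END.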