Re: [asciidoc-lang-dev] Whitespace handling

From: Lex Trotman <exciidoc@xxxxxxxxx>
Date: Sun, 7 Mar 2021 11:11:51 +1000

On Fri, 5 Mar 2021 at 23:30, Sylvain Leroux <sylvain@xxxxxxxxxxx> wrote:

On 03/03/2021 02:14, Lex Trotman wrote:
> Interesting question since the spacing is context, not part of the
> markup itself, just like the character on the other side.
Correct. In my own experiments, to identify the context, I used the
lookahead/lookbehind features of the Parsing Expression Grammar (PEG [1])
I implemented. This adds some context sensitivity on top of an
otherwise context-free grammar.

[1]: http://www.inf.puc-rio.br/%7Eroberto/docs/peg.pdf
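
To illustrate the idea, here is a minimal sketch of non-consuming predicates written as Python parser combinators. It is not the implementation described above: all function names are hypothetical, the "constrained opening mark" rule is a simplified assumption, and `behind` is modelled as a test on the previous character because the PEG formalism in [1] defines only lookahead predicates (`&e` and `!e`).

```python
# A parser is a function (text, pos) -> new position, or None on failure.

def lit(s):
    """Match the literal string s and consume it."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers):
    """Match each parser in order, threading the position through."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def not_ahead(p):
    """PEG not-predicate (!p): succeed, consuming nothing, iff p fails here."""
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse

def behind(pred):
    """Assumed lookbehind extension: test the previous character (None at start)."""
    def parse(text, pos):
        prev = text[pos - 1] if pos > 0 else None
        return pos if pred(prev) else None
    return parse

# Hypothetical rule: '*' opens a constrained span only when it is not preceded
# by an alphanumeric character and not followed by a space.
open_star = seq(
    behind(lambda c: c is None or not c.isalnum()),
    lit("*"),
    not_ahead(lit(" ")),
)
```

Only `lit("*")` consumes input here; the two predicates inspect the neighbouring characters and leave them for the surrounding rules, which is what makes the mark context-dependent.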

Yes, I'm also using a PEG, but for my experiments I separated the lexer (LEG) and the parser (PEG) so there can be a clear table of markup tokens. Those tokens do indeed make copious use of preceding and following non-consuming operators :-).
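
A table of that kind could look roughly like the sketch below. It uses Python's `re` module instead of a LEG grammar, and the token names and the exact context rules are assumptions made for illustration only, not the table from these experiments.

```python
import re

# Hypothetical token table. The lookbehind/lookahead assertions check the
# characters on either side of the mark without consuming them.
TOKEN_TABLE = [
    ("STRONG_OPEN", r"(?<!\w)\*(?=\S)"),   # not preceded by a word char, followed by non-space
    ("STRONG_CLOSE", r"(?<=\S)\*(?!\w)"),  # preceded by non-space, not followed by a word char
    ("TEXT", r"[^*]+|\*"),                 # anything else, including a lone '*'
]

# Earlier entries win, so the markup tokens are tried before plain text.
LEXER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_TABLE))

def tokenize(text):
    """Return a list of (token_name, lexeme) pairs covering the whole input."""
    return [(m.lastgroup, m.group()) for m in LEXER.finditer(text)]
```

With these assumed rules, `tokenize("a *b* c")` marks the two asterisks as `STRONG_OPEN` and `STRONG_CLOSE`, while in `"2*3*4"` every asterisk falls through to `TEXT`. On Python 3 strings `\w` already matches Unicode letters, so the same table treats `é` or `д` as word characters.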

I found I needed to extend the PEG to handle nesting of sections and lists without writing out rules for a fixed set of depths. How do you address it?
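
One way to picture such an extension is a rule that takes the nesting depth as a parameter. The sketch below is an assumption about the general technique, not the extension actually used: it is a hand-written recursive function, and it simply treats the number of leading `=` characters as the depth.

```python
import re

HEADING = re.compile(r"^(=+) (.+)$")

def parse_sections(lines, pos=0, level=1):
    """Parse sibling sections at `level`; return (nodes, next_pos).

    The `level` argument plays the role of a rule parameter, so a single
    rule covers every depth instead of one rule per depth.
    """
    nodes = []
    while pos < len(lines):
        m = HEADING.match(lines[pos])
        if m is None or len(m.group(1)) != level:
            break  # not a heading at this depth: hand control back to the caller
        node = {"title": m.group(2), "body": [], "children": []}
        pos += 1
        # Body lines run until the next heading of any depth.
        while pos < len(lines) and HEADING.match(lines[pos]) is None:
            node["body"].append(lines[pos])
            pos += 1
        # Subsections are the same rule, one level deeper.
        node["children"], pos = parse_sections(lines, pos, level + 1)
        nodes.append(node)
    return nodes, pos
```

For example, `parse_sections(["= A", "text", "== B", "=== C", "== D", "= E"])` yields two top-level sections, with `B` and `D` nested under `A` and `C` nested under `B`. A heading that skips a level stops this sketch early, so the returned position should be compared with `len(lines)`.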

>
> Thinking about it, (as well as some defined code points) the non-spacing
> context character must be able to be any Unicode letter code point or it
> prevents the markup being used in some non-English languages, and so I
> don't see why the spacing context should not be any code point with the
> appropriate spacing Unicode property as well. If non-ASCII context on
> one side is valid, there is no reason it should not be valid on both sides.
The valid spacing/non-spacing characters around constrained markup need
clarification, in my view, especially if we consider non-Latin scripts.
This is something we should discuss in its own thread. Or shouldn't we?
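
As a starting point for that discussion, the property-based classification suggested above can be sketched with Python's standard `unicodedata` module. The helper names and the choice of general categories are assumptions, not an agreed rule.

```python
import unicodedata

def is_letter(ch):
    """True for any code point whose general category is a letter (Lu, Ll, Lt, Lm, Lo)."""
    return unicodedata.category(ch).startswith("L")

def is_spacing(ch):
    """True for space separators (category Zs) and other whitespace such as tab or newline."""
    return unicodedata.category(ch) == "Zs" or ch.isspace()

def context_ok(before, after):
    """Hypothetical rule for an opening constrained mark.

    `before` and `after` are the neighbouring characters, or None at the
    start or end of the text. The mark may open when it is preceded by
    spacing (or nothing) and followed by a letter.
    """
    left_ok = before is None or is_spacing(before)
    right_ok = after is not None and is_letter(after)
    return left_ok and right_ok
```

Under this assumed rule, a no-break space (U+00A0) or an ideographic space (U+3000) before the mark counts as spacing, and `é`, `д`, or `中` after it counts as a letter, so both sides accept non-ASCII context.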

Yes, thread started.

Cheers

Lex

PS Sylvain, can you please configure your mailer to reply only to the list? Otherwise we get two replies to the same mail, and it's easy to reply to the wrong one and go off-list.
