Re: [asciidoc-lang-dev] Avoiding Implementation Specifics Thoughts

I personally think it would be great if everyone who is working on an AsciiDoc grammar or parser would continually publish the current state of their work, no matter how little it does and how much it does wrong.

> On Feb 22, 2021, at 4:03 AM, Lex Trotman <exciidoc@xxxxxxxxx> wrote:
> One thing that has been mentioned several times is the concept of "extension points".  This needs careful definition and consideration if it's to be included in any specification, to avoid implementation specifics.
> Asciidoctor makes good use of the dynamism of its development environment, Ruby, to allow customisation.  And no markup is likely to cover every eventuality, so some extensibility seems important (although as devil's advocate I point out that using extensions likely makes documents _not_ Asciidoc Specification compliant).

Well, since the behavior of Asciidoctor is the current spec, and Asciidoctor supports extensions, they are spec compliant :-)  More seriously, this is a big question.  Perhaps another way to state it is, "Does the set of installed extensions affect whether a document is grammatical?"  This is especially complicated by inline macros defined using a regex.
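
To make the question concrete, here is a minimal Python sketch.  Everything in it is hypothetical: the `issue:` macro, its regex, and the `parse_with_macros` function are made up for illustration and are not Asciidoctor's extension API.  It only shows that the same text yields a different parse depending on which regex-defined macros are registered.

```python
import re

def parse_with_macros(text, macros):
    """Split text into ("text", str) and ("macro", name, target) nodes.

    `macros` maps a macro name to a compiled regex with one capture group.
    This is a toy registry, not a real extension mechanism.
    """
    nodes, pos = [], 0
    while pos < len(text):
        # Find the registered macro that matches earliest from `pos`.
        best = None
        for name, pattern in macros.items():
            m = pattern.search(text, pos)
            if m and (best is None or m.start() < best[1].start()):
                best = (name, m)
        if best is None:
            nodes.append(("text", text[pos:]))
            break
        name, m = best
        if m.start() > pos:
            nodes.append(("text", text[pos:m.start()]))
        nodes.append(("macro", name, m.group(1)))
        pos = m.end()
    return nodes

# Hypothetical inline macro: issue:<number>[]
ISSUE_MACRO = {"issue": re.compile(r"issue:(\d+)\[\]")}

source = "See issue:42[] now."
without_extension = parse_with_macros(source, {})
with_extension = parse_with_macros(source, ISSUE_MACRO)
```

With no extensions the whole line is plain text; with the hypothetical extension registered, `issue:42[]` becomes a macro node.  Whether the first result counts as "grammatical but different" or "ungrammatical" is exactly the open question.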
> But languages like C or other fully compiled languages are not so amenable to being changed dynamically.  It would seem to be not a good thing to specify the capability in a way that implementations in those languages cannot reasonably be made to comply.

You can always use CORBA... but there must be something more usable by now.
> Another implementation specific is the handling of the simplest markup, constrained and unconstrained quotes.  The current implementations (AFAIK all of them) perform the recognition and replacement in a fixed order, rather than the order in which the markup occurs in the document (this is the source of the problem, regardless of whether you use regexes or lexers to recognise the tokens).
> Replacing that with a recursive descent parser giving an in-order AST is simple (I already have an experimental one, along with lots of other experiments, that I hope to publish to GitHub in a few weeks, depending on how my "real" world goes :-) ), but the results are different from those of existing processors.  So for Asciidoc 1.0 this might need to stay as is for compatibility, and documents that depend on it could be deprecated in preparation for an Asciidoc 2.0 that changes the processing order.
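
A small Python sketch of the difference Lex describes, under loud assumptions: the markup here is a toy with only `*bold*` and `_italic_`, the substitution order (bold first, then italic) is chosen for illustration, and none of this reproduces the real constrained/unconstrained rules of any AsciiDoc processor.  The first function applies regex replacements in a fixed order; the second is a tiny recursive descent parser that builds a tree in document order.

```python
import re

# Toy fixed-order pipeline: every bold is replaced before any italic.
FIXED_ORDER = [
    (re.compile(r"\*(.+?)\*"), r"<b>\1</b>"),
    (re.compile(r"_(.+?)_"), r"<i>\1</i>"),
]

def fixed_order_subs(text):
    for pattern, repl in FIXED_ORDER:
        text = pattern.sub(repl, text)
    return text

TAGS = {"*": "b", "_": "i"}

def parse_inline(text, pos=0, closer=None):
    """Recursive descent: return (nodes, next_pos, closed)."""
    nodes = []
    while pos < len(text):
        ch = text[pos]
        if ch == closer:
            return nodes, pos + 1, True
        if ch in TAGS:
            # Try to parse a nested span; fall back to a literal if unclosed.
            children, nxt, closed = parse_inline(text, pos + 1, ch)
            if closed:
                nodes.append((TAGS[ch], children))
                pos = nxt
                continue
        nodes.append(ch)
        pos += 1
    return nodes, pos, False

def render(nodes):
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
        else:
            tag, children = node
            out.append(f"<{tag}>{render(children)}</{tag}>")
    return "".join(out)

def in_order(text):
    nodes, _, _ = parse_inline(text)
    return render(nodes)

nested = "*a _b_ c*"      # properly nested: both approaches agree
crossing = "*a _b* c_"    # overlapping marks: the approaches diverge
```

On the nested input both functions give `<b>a <i>b</i> c</b>`.  On the crossing input the fixed-order pipeline emits `<b>a <i>b</b> c</i>`, which is not well nested, while the toy parser emits `*a <i>b* c</i>`.  Neither output is claimed to be "right"; the point is only that the two strategies can disagree on the same document, which is the compatibility problem raised above.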
> The same issue occurs at the next higher level as well: the ordered recognition of inline markup (special characters, quotes, replacements, etc.) and its sidekick, the infamous "subs=" attribute.  That makes the Asciidoc language not only context dependent (in structures like sections and lists), but, through "subs=", _content_ dependent.  Not many (actually not any AFAIK) programming languages allow the source code to specify which language constructs are allowed in parts of the program, which makes normal formal computer language methods difficult to apply.  Imagine if a programming language source could say "no, the `while` construct isn't to be parsed in this part of the program"; that is what "subs=" does.

I’ve wondered whether ANTLR lexer modes could provide a solution to this, but I haven’t seriously investigated.
> It will be interesting to see how this is formalised.

Yes indeed!

Thanks for posting!
David Jencks
> Cheers
> Lex