I think this thread demonstrates a solid understanding of the current relationship between the preprocessor and parser. So I don't have anything to add there. But I do want to share a thought that came to my mind when I first read Guillem's question in hopes it can frame the effort going forward:
> Guillem asked:
> in current AsciiDoc it's not possible to have an actual preprocessing as a separate step. Include expansion needs to be performed at the same time of (block) parsing (as David says). Or am I missing something?
You're not missing anything. The behavior of the preprocessor is currently coupled to the parser (despite what its name suggests), as described in this thread in various ways. Decoupling the preprocessor from the parser is going to be one of the greatest challenges this project will need to solve, and also a very important one. I didn't fully grasp the implications of this behavior until I started to think more deeply about the grammar. (The mechanism does inherit from AsciiDoc Python's streaming process.)
This decoupling is not likely something we can do in 1.0 since that risks breaking backwards compatibility. (No, we are not just thoughtlessly standardising Asciidoctor, as Lex suggested. We are trying not to splinter the community of users in the effort to define and evolve AsciiDoc).
I have raised a separate thread so the intention can be clarified. I was only observing that Asciidoctor behaviour seemed to be used as the reference when testing, not suggesting that this was the intention of the project.
It's very possible we can tighten some of the rules, such as not allowing an include to leave a block half open. But once we cross 1.0, we should try to go all the way with the decoupling as soon as possible. At that point, I don't think we should rule out introducing an alternate syntax if it allows us to break free.
Another approach is to use new syntax for differing behaviour, so that it is clear what the writer intends, rather than having existing syntax change behaviour between versions of the spec. Clearly in some places (like quotes) that is not possible, but "macro-like" syntax has a lot of room, so maybe "import::" instead of "include::", for example.
For 1.0, it's probably going to be necessary to have a preprocessor parser that can parse (or at least recognize) the structure down to the block level (so it knows when to process an include). Once that first parsing phase is done, then the main grammar-based parser can kick in and do its thing. But that's just one possible approach that comes to my mind.
Yes, a preprocessor needs to recognise the constructs that affect it (attribute definitions, literal blocks, etc.), so those are part of its "parser", if you like, just as "include::" is. That doesn't mean it's impossible to separate.
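To make the idea of a separate first pass concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not how Asciidoctor or any spec-defined processor works. The names (preprocess, unclosed_delimiter, the in-memory "files" mapping) are hypothetical, delimiters are assumed to be exactly four characters, only verbatim-style blocks are tracked, and include attributes, conditionals and cycle detection are left out. It shows the pass recognising just enough structure to tell a real attribute entry from a look-alike line inside a verbatim block, resolving attribute references in include targets, and enforcing the tightened rule that an include may not leave a block half open.

```python
import re
from collections import deque

# Assumption: only these four-character delimiters are recognised.
VERBATIM_DELIMS = ("----", "....", "++++", "////")
ATTR_ENTRY = re.compile(r"^:([\w-]+):(?:\s+(.*))?$")
INCLUDE = re.compile(r"^include::(\S+?)\[(.*)\]$")
ATTR_REF = re.compile(r"\{([\w-]+)\}")


def unclosed_delimiter(lines):
    """Return the delimiter still open after `lines`, or None if balanced."""
    opened = None
    for line in lines:
        stripped = line.rstrip()
        if opened is None:
            if stripped in VERBATIM_DELIMS:
                opened = stripped
        elif stripped == opened:
            opened = None
    return opened


def preprocess(lines, files, attrs=None):
    """Expand includes in one pass; `files` maps target -> text (hypothetical)."""
    attrs = {} if attrs is None else attrs
    pending = deque(lines)
    out = []
    opened = None  # delimiter of the verbatim block we are inside, if any
    while pending:
        line = pending.popleft()
        stripped = line.rstrip()
        include = INCLUDE.match(stripped)
        if include:
            # Resolve {name} references using attributes seen so far.
            target = ATTR_REF.sub(
                lambda m: attrs.get(m.group(1), m.group(0)), include.group(1)
            )
            included = files[target].splitlines()
            # Tightened rule: outside a block, an include must be balanced.
            if opened is None and unclosed_delimiter(included) is not None:
                raise ValueError(f"{target} leaves a block half open")
            pending.extendleft(reversed(included))  # process included lines next
            continue
        if opened is None:
            if stripped in VERBATIM_DELIMS:
                opened = stripped
            else:
                entry = ATTR_ENTRY.match(stripped)
                if entry:
                    attrs[entry.group(1)] = entry.group(2) or ""
        elif stripped == opened:
            opened = None
        out.append(line)
    return out
```

The output of such a pass is a flat list of lines that a grammar-based parser could then consume without knowing about includes at all. Whether includes inside verbatim blocks should be balanced or expanded differently is exactly the kind of rule that would need to be decided.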
As I said above, this is going to be a key challenge this project will need to address. And we'll need to call on all the expertise we can get to solve it.
Yes, it's not easy and there is no one "right" answer.
Dan Allen, Vice President | OpenDevise Inc.
Pronouns: he, him, his