Re: [asciidoc-lang-dev] Whitespace handling

From: Lex Trotman <exciidoc@xxxxxxxxx>
Date: Sun, 28 Feb 2021 08:11:06 +1000
Delivered-to: asciidoc-lang-dev@xxxxxxxxxxx

On Sat, 27 Feb 2021 at 21:06, Dan Allen <dan@xxxxxxxxxxxxxx> wrote:

Sylvain,

I'm glad you brought this up as it needs to be addressed by the spec.

What we know to be undoubtedly true is that, with the exception of verbatim blocks, sequential space characters (which include ASCII spaces, tabs, and line feeds) are normalized to a single space in the output format produced from AsciiDoc. Some converters, such as the built-in HTML and DocBook converters in Asciidoctor, rely on the viewer (e.g., browser) to perform this normalization. Other converters, such as the PDF converter, take on the work of performing this normalization since the viewer does not offer this feature. As far as what the reader sees, runs of space characters should only show up as a single space.
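
For illustration only, the normalization described above could be sketched roughly as follows. This is a minimal sketch, not how Asciidoctor or any converter actually implements it: the function name `normalize_spaces` is hypothetical, and it assumes "space characters" means exactly the three listed in the message (ASCII space, tab, and line feed). Verbatim blocks would have to bypass it.

```python
import re

# Assumption: "space characters" are ASCII space, tab, and line feed,
# as listed in the message. Other characters are left untouched.
_SPACE_RUN = re.compile(r"[ \t\n]+")


def normalize_spaces(text: str) -> str:
    """Collapse each run of space, tab, or line feed into one space.

    Hypothetical helper; a converter would skip verbatim blocks
    before calling something like this.
    """
    return _SPACE_RUN.sub(" ", text)


print(normalize_spaces("one  two\t\tthree\n  four"))  # prints: one two three four
```

A converter that targets a viewer without this behavior (the PDF case mentioned above) would apply such a step itself, while an HTML converter could leave the runs in place and let the browser collapse them.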

I have long wrestled with whether this normalization should be performed eagerly when the document is parsed. As Lex points out, AsciiDoc Python did not do so, and Asciidoctor retained that behavior. It's debatable whether the spaces are valuable to keep. However, removing them would certainly make the sourcemap more complicated since it would have to account for their absence.

I don't think the spec needs to mandate whether or not spaces should be normalized. What it should do, however, is mandate that the spaces should be normalized when the converted document is viewed. This allows converters to rely on functionality of the viewer application, only stepping in when that functionality isn't available. And even then, it should probably always be the responsibility of the converter.

Best Regards,

-Dan

p.s. I ask that you use the term "spaces" rather than "whitespace". The color specifier serves no purpose and isn't relevant anyway for people who use a dark theme.

It's not just "space", which is an ASCII character; it is all of https://en.wikipedia.org/wiki/Whitespace_character.
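
To make the distinction concrete, here is a small sketch comparing an ASCII-only rule with a Unicode-aware one. The function names are hypothetical, and the sketch only shows that the two character sets differ; whether a spec would want characters such as the no-break space collapsed at all is a separate question that this example does not answer.

```python
import re


def collapse_ascii(text: str) -> str:
    """Collapse runs of ASCII space, tab, and line feed only."""
    return re.sub(r"[ \t\n]+", " ", text)


def collapse_unicode(text: str) -> str:
    """Collapse runs of anything Python's regex engine treats as
    whitespace in a str pattern (ASCII whitespace plus Unicode
    whitespace such as U+00A0 and U+2003)."""
    return re.sub(r"\s+", " ", text)


# Two no-break spaces, one em space, then two ASCII spaces.
sample = "a\u00a0\u00a0b\u2003c  d"

print(collapse_ascii(sample) == sample)  # False: only the ASCII run changed
print(collapse_unicode(sample))          # prints: a b c d
```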

Cheers

Lex

--
Dan Allen, Vice President | OpenDevise Inc.
Pronouns: he, him, his
Content ∙ Strategy ∙ Community
opendevise.com