Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[asciidoc-lang-dev] Whitespace handling
  • From: Sylvain Leroux <sylvain@xxxxxxxxxxx>
  • Date: Sun, 28 Feb 2021 13:10:40 +0100
  • Autocrypt: addr=sylvain@xxxxxxxxxxx; keydata= xsFNBFdFUf4BEACl0a/nxBGmY4eqGLMYQTVTaUt+Z7SXkaYiiMx00suDDJpCsE3f6Qet4zaC 1EBBseb0x/164kC92cc8ZV5NN00qOKWEkf05/JrVEFFq4le78l/9yO5GTE9ORnrOEqbYrFYf +3ArkXHnxFmR1SCRyFGKTtgE2nGqbKicQgjOYQFS4DfRVkEyPfKsr7/J1GUUTHu/sD7nnNik +7trfLwva9D6EetRUnd+H/AV6QVw3jhgR9klpKMo7+bXi35IZShnYAN+kvuAvoCQDjv1L2L5 XkOf9gGNLJAdEKbBcK0UiQ80RvO6Vr0FejpA0tmRGGIqB5m6WNxRxpeFhgK32l1+pInjGIP3 1to6xf0+pJWuWL5ZfQq8+8+4J+5ibX/klD5D6b78aNV/B/NTO+wE2B1Umw1JWthnKlTbKLCj t4IvAXsQCJWXi55pyz2S2m2vMd1ffHKPl59jIJzUXy2nM9sQhFTzLeKUZ0V6RBUF9lGDAWwh 3pR0OaIvQzuBEf1qEdLBsjMsI9SJdMY4VOKWMCuSMm+KlaF3jsEPkgu+GymUDCbvv2ZIGwwK kXQbs2gqpicPUKXwiszbgx43wiwpTLQ+6ZRlaoKlbVlHoCC/eO2fMvfasUOJZzLZSHOPPsOr xCtygLrSBx5hLdAA7syJv1GVGQaE8IfQPM7P+5QPHVhgQ/mJEQARAQABzSRTeWx2YWluIExl cm91eCA8c3lsdmFpbkBjaGljb3JlZS5mcj7CwYIEEwEIACwCGyMFCQlmAYAHCwkIBwMCAQYV CAIJCgsEFgIDAQIeAQIXgAUCV+WKiQIZAQAKCRCrWB8dH2HFIpzYD/9KVcvI3xAlR+Ahxlvl AnxzwT1ZIhRT1YPbX3Fwr6l7lBuFfp8sGHejY9XNsGMDM/C4h+GxHKiY87KMLTI2P5TfHy2j MYHW4x2VhXTqOmUMtTO1/4DfamlTF/xwaXTy+jx5Z3ghaZDWWflaNXpbwB1j/gl0TjXCSeiK 7GPGFTPJt04JmTDxuTKXqdwHUpKQSZ5pqdufP2po+W/uxgamRXjHD7z8X04+xK5E7ic5pgaE YtquzZDRfnil3W4GSodX6dKdnhCN2r8tDqV0FsRSp3qRuvzBJ692WCH5FmXmvqiNpVCo+Fj1 T45TYB49yiRAzyJZwgZnEB0vH/HzybPmJC9z3wjPaoFmGOUp2imbHlu3ABWRnqPtdYcbDHBF Mrpop7oFAGxhxxiCGv30eEPYdHWgj0pwgja4Z/dauS1NlHBBAdOtG1ixV0+KgW4mP2RrA8aa epUinq7PydEAS9NoYSeSRaBeFjrZPCS+En6/2jyON5nmlgcnRFbTQWjnhRj5tNXPC/QKNBOd 55m+mZkolkF8wkx44bv+jQ8mmgtQGbrBFF9PAaPidPs4C3t7duIeW8zVXmqFH5lF1KmTsljf j79DhHbz3H5gg1UXFe+NYNVEC3rbTFYkdeuFnAOsWUbXl2B+yJ5KR899aKF5yz6pEWPcwjGk jKOx3wzbebkbVvvHX87BTQRXRVH+ARAAoOcKbTwX/+5hwyqgxF//jDo3eMwQUdXUdi5JkiRA dEmJAlAAAfL6IL03rcrKCViPD9W/hL8coa4uUTko5EXkVFLIvq2Npmlr26lGnE5Ae+L4KHn+ qtUUm5Mg9xjtUoukhYjBv6IDXuONcI1iC93tpTsHbNmqG3QXjRWwVs3cCflZLvpKqoC7cXYt 7bKcb/B7lAD3aYqo+plr6zlqSHKTigGIO64eu/TfcUAQxU+/wGfSv1wekHauvFgRumfPJxU0 s4VLUCtAN9huRuET3iqVRtQk1TayLyZDeryxVJhcMTs6qs2n/9s4aZHRBM1iPbFqZ5YXVF03 ySgCj0fXSZ40PY8tqjMSuowRUSA8979EBMi94j4MLGmBwwbp4P1RaNbvvSyYebr2nV+LPDqc oDEI3BpJDz5PCYJOoKZWc2vTWnCjjzufybhZfzRWfzALupdbKq5XkQwMXxlx40GBngpvXc9P yPp8XkbkeEjx4Z2LWU6SUuZmmzoTDzo7J9KA4X3Shdxjdev8xlhSOCooHre3yi1VfPkeuggn 3JYycrio1uJqGUE01XtKKqmqe0sPNgBA+YyV+QNLsDRzk/qTDvbfjq76onYllZTl5mTEN94B uTmS6vKbqg5wiL9usGzOM9MdLzZ2VEUd2y3FqoUMngNRzpotsTqICNFYTzu7mOr1ji8AEQEA AcLBZQQYAQgADwUCV0VR/gIbDAUJCWYBgAAKCRCrWB8dH2HFIh6ID/9s+rRqmUPJm95gMamc W2qvfXmB60xP+Pcbt9tiJEvHF9PdwfEaREH7DxDrq/URgBJ/EYhcDdKJgOzMzV8dGE/EbuO4 KgpEDwT6P8ZjEhEdGouyPYL9SX0nBoxigI7RCmk+4WJ8S4RNcI6guOgGYKSKo/CdGBQhlhK+ 2PoviUaWpy/pBzMwCr6V74qifu0VS2kneOUYOB5UzI/dOy7akFZl7U1Wk8gtJg+Vcvik+UPg T59MWQU+NVJt2ehllXccjC3ImApufu5Yq4GIFEZ/zmAYCdD4TzgfvknDFC4ibyKkddv+eJHd Vn2bWK24s8f/JekOdOboWEBRPJg1XuGVdiB2o79KOhx42/wxZrnG07+1sUyhcpszruLbGn6H 1sjcPL/ELVoicVB3VcguXw+t3ZrnPSnuwBBNkJsQbA4rcBxbYlHV9BINbaV3W7+7FBnhPMT3 7FZ/xDGcGKlOpQVkuNhP7Awa8DPqPbO63mjnrYhkCQe5ySvNdpMxHVd/j6TWg4XE/fJx+62X NFeLWXsl9tKrrYx0Eqbay7NpodCZ/YhijGi8im46VVXBUH+jA7GLm9D8+afmOCadJj6MQZh1 LO60K3XtOlvoG+1DpnQpb982/zPVmr66FyzD4wHDOtU76+fC7GwnbnoEZIUYnIrLom+qdbsP ZVTXbkoKWnXazv6EYQ==
  • Delivered-to: asciidoc-lang-dev@xxxxxxxxxxx
  • List-archive: <https://dev.eclipse.org/mailman/private/asciidoc-lang-dev/>
  • List-help: <mailto:asciidoc-lang-dev-request@eclipse.org?subject=help>
  • List-subscribe: <https://dev.eclipse.org/mailman/listinfo/asciidoc-lang-dev>, <mailto:asciidoc-lang-dev-request@eclipse.org?subject=subscribe>
  • List-unsubscribe: <https://dev.eclipse.org/mailman/options/asciidoc-lang-dev>, <mailto:asciidoc-lang-dev-request@eclipse.org?subject=unsubscribe>
  • Openpgp: preference=signencrypt
  • User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0 Thunderbird/60.9.0


On 28/02/2021 11:37, Dan Allen wrote:
>> Lex:
>>   "Asciidoc spacing characters" is fine by me, if a bit long, but
> {asc} can fix that :-)
> 
> Excellent. We will certainly define it.

FWIW, it suits me well too ;)

>> > Dan wrote:
>> One thing we will need to be careful about, though, is that AsciiDoc
> doesn't support *all* spacing characters. So we'll just need to
> emphasize that in our definition / usage.
>> Lex wrote:
>> I wonder if the standard should consider changing that?
> 
> Perhaps. Though I think one of the values of AsciiDoc is that it's not
> cryptic. So limiting spacing characters to space, tab, and line feed has
> merit. When we start to allow other spacing characters, then I think it
> introduces inconsistently from one document to the next. I think a case
> has to be made for each individual character we permit. That way, we
> know why we permitted it.

I already discussed that idea with Dan elsewhere. I assume it's not a
problem if I quote him here: "A valid space character is a space, a tab,
or a line feed (aka newline). It's questionable whether a non-breaking
space should be allowed. But it definitely shouldn't extend beyond that."

My understanding at that time was a parser only has to deal with Ascii
characters (Formally, the C0 Controls and Basic Latin block [1]). So the
only spacing allowed between markups and user's content were \u0009 and
\u0020. That won't prevent any other spacing character inside the text,
at least up to the DOM.

Dan also reminded me at that time `asciidoctor` "normalizes" lines early
in the document processing to remove trailing spaces ([2]). Once again,
I understood "to remove trailing \u0009 and \u0020." That also means
trailing spaces will not make their way into the DOM.

We didn't discuss the case of the `\r`. AFAIK, asciidoctor only
recognizes `\n` as the line terminator. So, we might extend the
normalization process to remove `\r` at the end of a line.


[1]: https://unicode.org/charts/PDF/U0000.pdf
[2]:
https://discuss.asciidoctor.org/Section-without-a-title-tp8466p8468.html

- Sylvain



Attachment: signature.asc
Description: OpenPGP digital signature


Back to the top