Re: [asciidoc-lang-dev] Whitespace handling



On Sun, 28 Feb 2021 at 18:17, Dan Allen <dan@xxxxxxxxxxxxxx> wrote:
> We need a collective name for the set of spacing characters. "space-like" ?

I've given this some thought today and also polled Twitter for some feedback. What I came up with was "spacing character(s)". If we want to be purely academic, the definition (from Merriam-Webster) fits extremely well:

>> the act of providing with spaces or placing at intervals

We are talking about the act of spacing visible characters & words for the purpose of placing them at intervals (either setting them apart in a line or organizing them on separate lines). I don't think we should try to invent a word when there's already one available with a suitable definition.

As I noted, it needs to be defined, especially for the reason you mention below: only some characters are supported. So we should probably be clear that it is _AsciiDoc's_ definition. "AsciiDoc spacing characters" is fine by me, if a bit long, but {asc} can fix that :-)

 

Speaking of Unicode, there *is* a property for this group of characters. The name of that group is "Space". Try /\p{Space}/ with any regular expression engine that supports Unicode properties to see what it matches. However, I think the concern about this term being too easily confused with the space character (\u0020) from ASCII / Basic Latin is valid.

I'm not sure which regex engine that is, but the ECMAScript standard defers to Unicode for the regex \p property names. Unicode does have a name for it, White_Space, hidden well enough that I didn't find it until I followed the link from ECMAScript (https://unicode.org/reports/tr44/#White_Space), along with the even better hidden aliases "WSpace" and lower-case "space". So Unicode is no help; we are on our own.
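
For anyone who wants to try this, here is a minimal sketch in TypeScript that checks a few characters against the Unicode White_Space property using an ECMAScript property escape. The sample list and the helper name isWhiteSpace are my own, chosen for illustration only; they are not AsciiDoc's definition of a spacing character, and other regex engines may spell the property name differently.

```typescript
// Unicode binary property White_Space, queried through an ECMAScript
// property escape. The "u" flag is required for \p{...} to be recognized.
const whiteSpace = new RegExp("^\\p{White_Space}$", "u");

// Hypothetical helper: true if the single character has the White_Space property.
function isWhiteSpace(ch: string): boolean {
  return whiteSpace.test(ch);
}

// Illustrative sample set (an assumption, not AsciiDoc's list).
const samples: Array<[string, string]> = [
  ["SPACE", "\u0020"],
  ["CHARACTER TABULATION", "\u0009"],
  ["LINE FEED", "\u000A"],
  ["NO-BREAK SPACE", "\u00A0"],
  ["IDEOGRAPHIC SPACE", "\u3000"],
  ["ZERO WIDTH SPACE", "\u200B"],
  ["LATIN SMALL LETTER A", "a"],
];

for (const [name, ch] of samples) {
  console.log(`${name}: ${isWhiteSpace(ch)}`);
}
```

Note that ZERO WIDTH SPACE reports false: despite its name it does not carry the White_Space property, which is one more reason the term needs an explicit definition rather than being left to intuition.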
 

One thing we will need to be careful about, though, is that AsciiDoc doesn't support *all* spacing characters. So we'll just need to emphasize that in our definition / usage.


I wonder if the standard should consider changing that?
 
Best Regards,

-Dan

Sylvain:

> > Perhaps "blank space" could be used, but it would need to be written down and defined.

> Amusingly enough, from a French-speaking perspective, "blank" is more questionable than "white".

Well there you go, everyone needs to take a deep breath when objecting to things. Cultural and personal limits of what is objectionable differ around the world, even between native English speakers like Dan and me, let alone across other languages and cultures. Nothing should be considered automatically objectionable unless it is written down where contributors will easily find it. Otherwise the same thing will keep happening, especially when a term is in conventional usage in the context, as "whitespace" is. And objections should be raised gently, as Dan did in his PS (so gently I missed his point the first time :-)

Cheers
Lex


--
Dan Allen, Vice President | OpenDevise Inc.
Pronouns: he, him, his
Content ∙ Strategy ∙ Community
opendevise.com
