> Date: Tue, 5 Sep 2023 20:19:40 +0200
> From: Patrice Dumas <pertu...@free.fr>
> Cc: bug-texinfo@gnu.org
>
> On Tue, Sep 05, 2023 at 09:09:18PM +0300, Eli Zaretskii wrote:
> > > Date: Tue, 5 Sep 2023 20:01:53 +0200
> > > From: Patrice Dumas <pertu...@free.fr>
> > >
> > > Currently, when counting the width of a line of character, we count
> > > control characters that are also spaces as having a width of 1.  I think
> > > that it is not good, as control characters either should not have a
> > > width, for end of line, form feed, carriage return, or have a width that
> > > is not well defined for vertical and horizontal tab.  I suggest to
> > > consider all the control characters as having a width of 0.  This will
> > > be consistent with libunistring u8_strwidth, which I intend to use in C
> > > code equivalent to perl code.
> >
> > Please define "control characters" for this purpose.  Some of them are
> > definitely not zero-width, for example, TAB.
>
> Characters whose unicode codepoints in decimal are in the range 0 to 31,
> and also 127 (Delete).  This includes the horizontal tab.  It
> corresponds to the [:cntrl:] character class.

Then I guess I still don't understand: how is TAB a zero-width character?

> > Also, depending on how control characters are displayed, their width
> > could be even 4, for example if they are displayed as \nnn octal
> > escapes.
>
> It is in a context where they are displayed as encoded bytes.

So what is the context of this discussion, if it is not display of bytes?
I really don't understand, could you elaborate?

Control characters can also be displayed as ^C, for example, in which
case they take 2 columns.
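
To make the disagreement concrete, here is a minimal Python sketch of the three
conventions mentioned in this thread: width 0 (the proposal), 2 columns (^C
notation), and 4 columns (\nnn octal escapes).  It is not Texinfo or
libunistring code; the names is_control, CONTROL_WIDTH and line_width are
hypothetical, and counting every other character as 1 column is a simplifying
assumption that ignores wide and combining characters.

```python
def is_control(ch):
    """True for code points 0-31 and 127, the set named in this thread."""
    cp = ord(ch)
    return cp <= 31 or cp == 127


# Columns occupied by one control character under each display convention.
# The keys are hypothetical labels, not options of any real tool.
CONTROL_WIDTH = {
    "zero": 0,   # the proposal: control characters take no columns
    "caret": 2,  # displayed as ^C
    "octal": 4,  # displayed as \nnn
}


def line_width(text, convention="zero"):
    """Count columns in text under the chosen convention.

    Assumption: every non-control character counts as exactly 1 column.
    """
    ctrl = CONTROL_WIDTH[convention]
    return sum(ctrl if is_control(ch) else 1 for ch in text)
```

For the string "a\tb", this gives 2 columns under "zero", 4 under "caret", and
6 under "octal", which is why the answer depends on the display context.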