Re: utf8 hack for ls

2015-10-27 Thread Ted Unangst
Anthony J. Bentley wrote: > "Ted Unangst" writes: > > Fixing citrus is a pretty massive effort in itself. I'd prefer to see the > > replacement code prove itself as a separate API first, then we can remove > > citrus and change the wchar functions to use the new code. I'm less > > confident > > in

Re: utf8 hack for ls

2015-10-27 Thread Nicholas Marriott
Hi. I can tell you for certain that I would use mbwidth() and mbvis() in tmux. Functions to answer things like "is this string valid UTF-8?" and "how many codepoints is this string?" would also be good. If the consensus is to use the locale.h goo I will do that, but I would prefer something simpler

Re: utf8 hack for ls

2015-10-27 Thread Anthony J. Bentley
"Ted Unangst" writes: > Fixing citrus is a pretty massive effort in itself. I'd prefer to see the > replacement code prove itself as a separate API first, then we can remove > citrus and change the wchar functions to use the new code. I'm less confident > in a "meet in the middle" effort where we c

Re: utf8 hack for ls

2015-10-27 Thread Ted Unangst
Anthony J. Bentley wrote: > Stefan Sperling writes: > > On Mon, Oct 26, 2015 at 03:58:58PM -0600, Anthony J. Bentley wrote: > > > "Ted Unangst" writes: > > > > it only gets deeper and thicker... > > > > > > Indeed. > > > > > > Here's a shorter implementation. Like colorls(1), it uses wide > > > c

Re: utf8 hack for ls

2015-10-26 Thread Anthony J. Bentley
Stefan Sperling writes: > On Mon, Oct 26, 2015 at 03:58:58PM -0600, Anthony J. Bentley wrote: > > "Ted Unangst" writes: > > > it only gets deeper and thicker... > > > > Indeed. > > > > Here's a shorter implementation. Like colorls(1), it uses wide > > characters (only within the putname() functio

Re: utf8 hack for ls

2015-10-26 Thread Stefan Sperling
On Mon, Oct 26, 2015 at 03:58:58PM -0600, Anthony J. Bentley wrote: > "Ted Unangst" writes: > > it only gets deeper and thicker... > > Indeed. > > Here's a shorter implementation. Like colorls(1), it uses wide > characters (only within the putname() function) but is slightly cleaned > up and simp

Re: utf8 hack for ls

2015-10-26 Thread Anthony J. Bentley
"Ted Unangst" writes: > it only gets deeper and thicker... Indeed. Here's a shorter implementation. Like colorls(1), it uses wide characters (only within the putname() function) but is slightly cleaned up and simplified. Index: ls.c ==

Re: utf8 hack for ls

2015-10-26 Thread Theo de Raadt
> Damien Miller wrote: > > rather than scattering hacks in each program that needs to > > output utf8 to the console, how about making something > > for libutil that they all can use? > > Yes, that is certainly the plan, but I think it's easier to see what's needed > if we convert a few programs f

Re: utf8 hack for ls

2015-10-26 Thread Ted Unangst
Damien Miller wrote: > rather than scattering hacks in each program that needs to > output utf8 to the console, how about making something > for libutil that they all can use? Yes, that is certainly the plan, but I think it's easier to see what's needed if we convert a few programs first to identi

Re: utf8 hack for ls

2015-10-26 Thread Damien Miller
rather than scattering hacks in each program that needs to output utf8 to the console, how about making something for libutil that they all can use? On Sun, 25 Oct 2015, Ted Unangst wrote: > it only gets deeper and thicker... > > this decodes chars and prints ? for bytes it doesn't like, as well

Re: utf8 hack for ls

2015-10-25 Thread Ted Unangst
it only gets deeper and thicker... this decodes chars and prints ? for bytes it doesn't like, as well as codepoints (128-159) it doesn't like. (this is extracted from some old utf8 code i had lying around. it's a bit simpler than the stringprep stuff but it seems to handle the case of some incor
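The diff itself is not part of this preview, so what follows is only a minimal sketch of the behaviour described in the message: decode UTF-8, and emit ? for bytes that do not decode and for codepoints 128-159. It is not the code from the patch. The names u8decode_sketch() and u8sanitize_sketch() are made up for illustration, and the choice to reject overlong forms and surrogates is an assumption.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: decode one UTF-8 sequence from s (len bytes
 * available). Stores the code point in *cp and returns the number of
 * bytes consumed, or 0 if the bytes do not form a valid sequence.
 */
static size_t
u8decode_sketch(const unsigned char *s, size_t len, unsigned int *cp)
{
	size_t i, need;
	unsigned int c, min;

	if (len == 0)
		return 0;
	c = s[0];
	if (c < 0x80) {
		*cp = c;
		return 1;
	} else if ((c & 0xe0) == 0xc0) {
		need = 1; min = 0x80; c &= 0x1f;
	} else if ((c & 0xf0) == 0xe0) {
		need = 2; min = 0x800; c &= 0x0f;
	} else if ((c & 0xf8) == 0xf0) {
		need = 3; min = 0x10000; c &= 0x07;
	} else
		return 0;	/* stray continuation byte or invalid lead byte */
	if (len < need + 1)
		return 0;	/* sequence cut short */
	for (i = 1; i <= need; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return 0;
		c = (c << 6) | (s[i] & 0x3f);
	}
	/* Assumption: reject overlong forms, surrogates and out-of-range values. */
	if (c < min || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
		return 0;
	*cp = c;
	return need + 1;
}

/*
 * Hypothetical helper: copy src (len bytes) into dst, writing '?' for
 * each undecodable byte and for each code point in 128-159. Output is
 * always NUL terminated; returns the number of bytes written before the NUL.
 */
static size_t
u8sanitize_sketch(const char *src, size_t len, char *dst, size_t dstsize)
{
	const unsigned char *s = (const unsigned char *)src;
	size_t i = 0, o = 0, n;
	unsigned int cp;

	if (dstsize == 0)
		return 0;
	while (i < len) {
		n = u8decode_sketch(s + i, len - i, &cp);
		if (n == 0 || (cp >= 128 && cp <= 159)) {
			if (n == 0)
				n = 1;	/* skip exactly one bad byte */
			if (o + 1 >= dstsize)
				break;
			dst[o++] = '?';
		} else {
			if (o + n >= dstsize)
				break;
			memcpy(dst + o, s + i, n);
			o += n;
		}
		i += n;
	}
	dst[o] = '\0';
	return o;
}
```

Note that this sketch passes ASCII control characters through untouched; whether ls should filter those is the separate isprint() question raised elsewhere in the thread.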

Re: utf8 hack for ls

2015-10-25 Thread Ted Unangst
Ted Unangst wrote: > Christian Weisgerber wrote: > > On 2015-10-23, "Ted Unangst" wrote: > > > > >> To what degree should tools like ls protect terminals from escape codes? > > > > > > I think this is beyond the scope of what ls should care about. du doesn't > > > have > > > such a check. Does t

Re: utf8 hack for ls

2015-10-23 Thread Ted Unangst
Christian Weisgerber wrote: > On 2015-10-23, "Ted Unangst" wrote: > > >> To what degree should tools like ls protect terminals from escape codes? > > > > I think this is beyond the scope of what ls should care about. du doesn't > > have > > such a check. Does the shell perform a check before tab

Re: utf8 hack for ls

2015-10-23 Thread Christian Weisgerber
On 2015-10-23, "Ted Unangst" wrote: >> To what degree should tools like ls protect terminals from escape codes? > > I think this is beyond the scope of what ls should care about. du doesn't have > such a check. Does the shell perform a check before tab completing? Our ksh, bash, and tcsh all esc

Re: utf8 hack for ls

2015-10-23 Thread Nicholas Marriott
Ah right, this makes sense to me. On Fri, Oct 23, 2015 at 09:13:02AM -0400, Ted Unangst wrote: > Nicholas Marriott wrote: > > Hi > > > > This doesn't account for UTF-8 double width characters, so they will > > still throw the column widths off? > > right. maybe we will steal some code from tmux f

Re: utf8 hack for ls

2015-10-23 Thread Ted Unangst
Nicholas Marriott wrote: > Hi > > This doesn't account for UTF-8 double width characters, so they will > still throw the column widths off? right. maybe we will steal some code from tmux for that :). but getting u8len() into the right places is the first step. i don't think we want an isu8cont()

Re: utf8 hack for ls

2015-10-23 Thread Stefan Sperling
On Fri, Oct 23, 2015 at 02:52:13PM +0200, Peter Hessler wrote: > As a different approach to ls, I wrote this a while ago. This uses the > wchar_t functions, but only in putname(). That's like the colorls port does it. I'm not sure if that's the best answer either, this diff would already be in ba

Re: utf8 hack for ls

2015-10-23 Thread Ted Unangst
Stefan Sperling wrote: > This removes the isprint() check entirely. Do we really want that? > > To what degree should tools like ls protect terminals from escape codes? I think this is beyond the scope of what ls should care about. du doesn't have such a check. Does the shell perform a check befo

Re: utf8 hack for ls

2015-10-23 Thread Ted Unangst
Peter Hessler wrote: > As a different approach to ls, I wrote this a while ago. This uses the > wchar_t functions, but only in putname(). This will correct the alignment of columns, but if you have a filename like pöp the columns will be super wide instead of nicely sized.

Re: utf8 hack for ls

2015-10-23 Thread Nicholas Marriott
Hi. This doesn't account for UTF-8 double width characters, so they will still throw the column widths off? On Fri, Oct 23, 2015 at 08:42:52AM -0400, Ted Unangst wrote: > So, third diff to ponder as we evaluate this approach. This one also uses a > u8len() function to help get the column widths
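To make the double-width concern concrete, here is a rough sketch of a per-codepoint width lookup. It is not code from tmux or from any diff in this thread. The name cpwidth_sketch() and both tables are assumptions for illustration, and the ranges are deliberately incomplete; a real table would be generated from the Unicode data files.

```c
#include <stddef.h>

struct u8range {
	unsigned int first;
	unsigned int last;
};

/* Illustrative, incomplete: code points that occupy no column. */
static const struct u8range zero_width[] = {
	{ 0x0300, 0x036f },	/* combining diacritical marks */
	{ 0x200b, 0x200d },	/* zero width space, non-joiner, joiner */
};

/* Illustrative, incomplete: code points that occupy two columns. */
static const struct u8range double_width[] = {
	{ 0x1100, 0x115f },	/* Hangul Jamo initial consonants */
	{ 0x2e80, 0x303e },	/* CJK radicals, symbols and punctuation */
	{ 0x3041, 0xa4cf },	/* kana through Yi, including CJK ideographs */
	{ 0xac00, 0xd7a3 },	/* Hangul syllables */
	{ 0xf900, 0xfaff },	/* CJK compatibility ideographs */
	{ 0xff00, 0xff60 },	/* fullwidth forms */
};

/* Linear search is fine for tables this small. */
static int
in_ranges(const struct u8range *t, size_t n, unsigned int cp)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (cp >= t[i].first && cp <= t[i].last)
			return 1;
	}
	return 0;
}

/* Return the number of terminal columns assumed for code point cp. */
static int
cpwidth_sketch(unsigned int cp)
{
	if (in_ranges(zero_width,
	    sizeof(zero_width) / sizeof(zero_width[0]), cp))
		return 0;
	if (in_ranges(double_width,
	    sizeof(double_width) / sizeof(double_width[0]), cp))
		return 2;
	return 1;
}
```

Summing this over the decoded code points of a name, instead of counting each as 1, is what would keep columns aligned when names contain CJK or combining characters.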

Re: utf8 hack for ls

2015-10-23 Thread Peter Hessler
As a different approach to ls, I wrote this a while ago. This uses the wchar_t functions, but only in putname(). On 2015 Oct 23 (Fri) at 08:42:52 -0400 (-0400), Ted Unangst wrote: :So, third diff to ponder as we evaluate this approach. This one also uses a :u8len() function to help get the colu

Re: utf8 hack for ls

2015-10-23 Thread Stefan Sperling
On Fri, Oct 23, 2015 at 08:42:52AM -0400, Ted Unangst wrote: > So, third diff to ponder as we evaluate this approach. This one also uses a > u8len() function to help get the column widths correct. > > (Still not dealing with combining or otherwise not 1 width glyphs.) > > Index: ls.c > ==

utf8 hack for ls

2015-10-23 Thread Ted Unangst
So, third diff to ponder as we evaluate this approach. This one also uses a u8len() function to help get the column widths correct. (Still not dealing with combining or otherwise not 1 width glyphs.) Index: ls.c === RCS file: /cvs/sr
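The preview is cut off before the diff, so the following is only a minimal sketch of what a u8len()-style helper could look like, not the function from the patch. The name u8len_sketch() is made up for illustration. It counts code points by skipping UTF-8 continuation bytes and, as the message says, it treats every code point as one column wide.

```c
#include <stddef.h>

/*
 * Hypothetical sketch, not the u8len() from the diff: count code points
 * in a NUL-terminated string by counting every byte that is not a UTF-8
 * continuation byte (10xxxxxx). No validation is done, and combining or
 * double-width glyphs are counted as one column each.
 */
static size_t
u8len_sketch(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++) {
		if (((unsigned char)*s & 0xc0) != 0x80)
			n++;
	}
	return n;
}
```

With this, a name such as pöp (4 bytes) measures 3 columns, where strlen() would report 4 and pad the column one cell too wide.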