"Ted Unangst" writes:
> Fixing citrus is a pretty massive effort in itself. I'd prefer to see the
> replacement code prove itself as a separate API first, then we can remove
> citrus and change the wchar functions to use the new code. I'm less confident
> in a "meet in the middle" effort where we convert to wchar while
> simultaneously hoping the wchar code gets better.

So how about a compromise? Use wchar_t where necessary, and no more
than that. After several programs get converted, come up with a better
wchar_t-free API, because at that time it becomes clear what kind of
functions are necessary.

> In the mean time, maybe we should also look at a few more utilities. We seem
> to have some intuition that mbwidth() etc. will be useful, and maybe mbvis(),
> but we don't really know. So far I've poked at ul, rs, and ls. Then we've
> gone around in circles polishing those particular turds, but not much effort
> was spent looking at other utilities.

schwarze@, zhuk@ and czarkoff@ (and maybe others) spent time looking at
assorted utilities and classifying them. The difference between ls and
the other programs you mentioned is that it calculates *and uses* column
widths of frequently-UTF-8 data. Counting codepoints isn't adequate--most
UTF-8 filenames on my system contain double-width characters. Until we
come up with a better API, the *only* way to check widths is wcwidth()
and friends. And the best way to come up with a better replacement is to
have real programs that count columns. Or would you prefer to add a full
range of Unicode width tables to ls?
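
To make "counting columns" concrete, here is a rough sketch of doing it
with the existing API. display_width() is a hypothetical helper, not
something in libc or in ls today, and counting invalid bytes and
non-printable characters as one column each is an arbitrary choice made
for illustration.

```c
/* Needed to get the wcwidth() prototype on some systems, e.g. glibc. */
#define _XOPEN_SOURCE 700

#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: return the number of terminal columns that the
 * multibyte string s occupies in the current LC_CTYPE locale.
 * Invalid bytes and non-printable characters count as one column each.
 */
int
display_width(const char *s)
{
	mbstate_t mbs;
	wchar_t wc;
	size_t len, left;
	int w, total = 0;

	memset(&mbs, 0, sizeof(mbs));
	left = strlen(s);
	while (left > 0) {
		len = mbrtowc(&wc, s, left, &mbs);
		if (len == (size_t)-1 || len == (size_t)-2) {
			/* Invalid or truncated sequence: skip one byte. */
			memset(&mbs, 0, sizeof(mbs));
			len = 1;
			w = 1;
		} else if ((w = wcwidth(wc)) == -1) {
			/* Non-printable character. */
			w = 1;
		}
		s += len;
		left -= len;
		total += w;
	}
	return total;
}
```

The caller is assumed to have done setlocale(LC_CTYPE, "") already;
without that, everything is decoded in the "C" locale. In a UTF-8 locale
a double-width character should contribute 2 columns here, which is
exactly what a codepoint count misses.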

> What about wc? file? du? top? cut? vis???
> 
> Will adding utf-8/unicode support to each of those look the same? Or will
> every utility be different?

Again--there are a few broad categories that will require different
approaches. ls(1) is in the "needs to count columns" category.

> Additionally, I think all diffs to fix ls should be accompanied by before
> and after ktrace output. :)

That's a snappy line, and one that exposes a real problem in our wchar_t
implementation, but it doesn't help ls work in ~/anime/...
