Ingo Schwarze wrote:
> I think that way we can actually start committing such patches and
> improve our userland.
>
> Two final notes:
>
>  1. It turns out each of the three programs needs exactly one
>     multibyte-character helper function in utf8.c, and each helper
>     function uses mbtowc(3) and mbwidth(3), but all three helper
>     functions have different functionality.  So we don't know yet
>     exactly which functionality will be needed most.  But that's
>     no problem because these functions are very small can comfortably
>     live in utf8.c.

One other program which I think will provide some insight is cut(1). It cares, perhaps even more than the others, about bytes, chars, and bytes in the middle of chars. I took a poke at it the other day but didn't get very far. I'd feel more confident moving forward if cut were settled too.
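
To make the bytes-versus-chars concern concrete, here is a minimal sketch of the kind of bookkeeping cut would need. It is not taken from any patch in this thread: the names chars_to_bytes(), splits_char() and use_utf8_locale() are hypothetical, the locale names tried are assumptions that vary by system, and treating each invalid byte as one character is only one possible policy.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/*
 * Hypothetical helper: try to select a UTF-8 LC_CTYPE.
 * Which locale names exist is system-dependent (assumption).
 */
int
use_utf8_locale(void)
{
	return setlocale(LC_CTYPE, "en_US.UTF-8") != NULL ||
	    setlocale(LC_CTYPE, "C.UTF-8") != NULL;
}

/*
 * Hypothetical helper: return how many bytes at the start of s
 * (len bytes long) hold the first nchars characters, without ever
 * splitting a character.  Invalid or incomplete sequences and
 * embedded NULs are counted as one single-byte character each.
 */
size_t
chars_to_bytes(const char *s, size_t len, size_t nchars)
{
	size_t off = 0;
	int n;

	(void)mbtowc(NULL, NULL, 0);	/* reset conversion state */
	while (off < len && nchars > 0) {
		n = mbtowc(NULL, s + off, len - off);
		if (n == -1)
			(void)mbtowc(NULL, NULL, 0);
		if (n <= 0)
			n = 1;
		off += (size_t)n;
		nchars--;
	}
	return off;
}

/*
 * Hypothetical helper: return 1 if byte offset pos falls in the
 * middle of a character of s, 0 if it sits on a character boundary.
 */
int
splits_char(const char *s, size_t len, size_t pos)
{
	size_t off = 0;
	int n;

	assert(pos <= len);
	(void)mbtowc(NULL, NULL, 0);	/* reset conversion state */
	while (off < pos) {
		n = mbtowc(NULL, s + off, len - off);
		if (n == -1)
			(void)mbtowc(NULL, NULL, 0);
		if (n <= 0)
			n = 1;
		off += (size_t)n;
	}
	/* We walked whole characters; overshooting pos means pos was inside one. */
	return off > pos;
}
```

For the four-byte UTF-8 string "a", 0xc3, 0xa9, "b" in a UTF-8 locale, the first two characters occupy three bytes, and byte offset 2 lands inside the two-byte character. That is exactly the case where a byte-oriented selection and a character-oriented selection disagree about where to cut.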