On 04/06/2014 12:56 PM, Dan Douglas wrote:
> On Sunday, April 06, 2014 01:24:58 PM Jan Novak wrote:
>> To solve this problem I suppose to add "wide" switch to printf
>> or to add "%S" format (similarly to wprintf(3) )
>
> ksh93 already has this feature using the "L" modifier:
>
>   ksh -c "printf '%.3Ls\n' $'\u2605\u2605\u2605\u2605\u2605'"
>   ★★★
>   bash -c "printf '%.3Ls\n' $'\u2605\u2605\u2605\u2605\u2605'"
>   ★
>
> Also, zsh does this by default with no special option. I tend to lean towards
> going by character anyway because that's what most shell features such as
> "read -N" do, and most work directly involving the shell is with text not
> binary data.

So we can count bytes, chars or cells (graphemes).

Thinking a bit more about it, I think shell level printf
should be dealing in text of the current encoding and counting cells.
In the edge case where you want to deal in bytes one can do:

  LC_ALL=C printf ...

I see that ksh behaves as I would expect and counts cells,
though it requires the explicit %L enabler:

  $ ksh -c "printf '%.3Ls\n' $'a\u0301\u2605\u2605\u2605'"
  á★★
  $ ksh -c "printf '%.3Ls\n' $'A\u2605\u2605\u2605'"
  A★
  $ ksh -c "printf '%.3Ls\n' $'AA\u2605\u2605\u2605'"
  A

zsh seems to just count characters:

  $ zsh -c "printf '%.3Ls\n' $'a\u0301\u2605\u2605\u2605'"
  á★
  $ zsh -c "printf '%.3s\n' $'a\u0301\u2605\u2605\u2605'"
  á★
  $ zsh -c "printf '%.3Ls\n' $'A\u2605\u2605\u2605'"
  A★★

GNU awk seems to just count characters:

  $ awk 'BEGIN{printf "%.3s\n", "A★★★"}'
  A★★

I see that dash gives an "invalid directive" error for any of %ls %Ls %S.

Pity there is no consensus here.
Personally I would go for:

  printf '%3s' 'blah'         # count cells
  printf '%3Ls' 'blah'        # count chars
  LANG=C printf '%3Ls' 'blah' # count bytes
  LANG=C printf '%3s' 'blah'  # count bytes

Pádraig.
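
P.S. To make the bytes vs chars distinction above concrete, here is a rough sketch in portable sh. The helper names count_bytes and count_chars are made up for illustration; they are not part of any shell. It assumes the input is valid UTF-8, and counts characters by dropping continuation bytes (0x80-0xBF) so that it does not depend on a UTF-8 locale being installed. Counting cells is not attempted, since that needs wcwidth(3)-style width data.

```shell
# Hypothetical helpers, for illustration only.

# Bytes: wc -c counts bytes; tr strips the padding some wc versions add.
count_bytes() {
    printf '%s' "$1" | LC_ALL=C wc -c | tr -d ' '
}

# Characters (code points), assuming valid UTF-8 input:
# delete continuation bytes (octal 200-277), then count what remains.
count_chars() {
    printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | LC_ALL=C wc -c | tr -d ' '
}

# 'a' + U+0301 (combining acute) + three U+2605 stars, spelled as
# octal UTF-8 bytes so the string does not depend on $'\u...' support.
sample=$(printf 'a\314\201\342\230\205\342\230\205\342\230\205')

printf 'bytes=%s chars=%s\n' "$(count_bytes "$sample")" "$(count_chars "$sample")"
# prints: bytes=12 chars=5
```

A terminal would typically show that string in fewer cells than it has characters, because the combining accent takes no column of its own. That gap is why counting cells and counting chars give different truncation points.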