Hi, I am a Japanese speaker, and "this is a 文件 vi 打的" looks like 6 words to me.
(Please note that both Japanese and Chinese rarely use spaces as word separators, so this should be parsed with English syntax. Also, Japanese tends to treat a pair of Kanji as one word. For example, "Kanji" is 漢字.)

I counted it with my brain, so I think it should be correct :)

Seriously, the question is: what do you want to do?

On Sat, Feb 07, 2009 at 01:16:56AM +0100, Samuel Thibault wrote:
> Neo Anderson, le Fri 06 Feb 2009 15:50:34 -0800, a écrit :
> > If I remember correctly that there is a mapping table, so possibly
> > this can be done. But of course, perhaps this is just my wishful
> > thinking.
>
> The problem is that posix says
>
> `The wc utility shall consider a word to be a non-zero-length string
> of characters delimited by white space.'
>
> So that the -w behavior can't be changed, that'd need to be another
> option.

I agree it cannot be changed.

The way a human counts words depends on the grammar of the human language. I am not sure it is worth adding such complication to a simple base tool such as wc.

Another tool that counts words the way a human does may be interesting by itself. It cannot be simple Chinese character counting and space checking. It would be a utility that is part of a morphological analysis system.

I see the following in our archive:

  chasen
  juman
  mecab
  lttoolbox
  ...

They may have such a thing, or a base for such a thing.

I hope this helps.

Osamu
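
To illustrate the difference between the two counting rules discussed above, here is a minimal sketch in Python. It is only an illustration, not how wc is implemented. The function names are hypothetical, and the "naive" rule (one word per CJK ideograph in the range U+4E00 to U+9FFF) is an assumption chosen to show why simple character counting gives a different answer from a human. Note also that Python's str.split() splits on Unicode white space, which may not match what wc treats as white space in a given locale.

```python
import re


def posix_style_word_count(text: str) -> int:
    """Count non-empty strings delimited by white space (the POSIX wording)."""
    # str.split() with no argument splits on runs of white space
    # and never returns empty strings.
    return len(text.split())


def naive_cjk_word_count(text: str) -> int:
    """Hypothetical naive rule: each CJK ideograph is one word,
    and any other run of non-space characters is one word."""
    tokens = re.findall(r"[\u4e00-\u9fff]|[^\s\u4e00-\u9fff]+", text)
    return len(tokens)


sample = "this is a 文件 vi 打的"
print(posix_style_word_count(sample))
print(naive_cjk_word_count(sample))
```

For the sample sentence, the white-space rule agrees with the count done by hand (6), while the naive per-character rule does not (8). Neither rule knows anything about grammar, which is why a morphological analyzer would be needed for text that has no spaces at all.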