Bug#271397: enamdict: add frequency statistic

2011-08-11 Thread Jim Breen
Good evening,

2011/8/11 Osamu Aoki :
> I have found data as below in CSV format for family names.
> Anyway, the raw data has a bit over 100,600 names.
> Given names are a bit difficult.

Yes, but family names are a great start.

> It looks like
>
> "sei","rank","number"
> "佐藤","1位",481980
> "鈴木","2位",426804
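For illustration only, here is a minimal sketch of how rows in the quoted CSV layout could be read. It assumes the three columns shown above ("sei" for the family name, "rank" with a trailing 位 suffix, "number" as a plain count); the function name `load_family_names` and the output field names are hypothetical and are not part of ENAMDICT or of the original data set.

```python
import csv
import io

# The three lines quoted in the message, used here as sample input.
SAMPLE = '''"sei","rank","number"
"佐藤","1位",481980
"鈴木","2位",426804
'''


def load_family_names(text):
    """Parse CSV text in the quoted layout into a list of dicts.

    Assumption: "rank" is an integer followed by the 位 suffix and
    "number" is an unquoted integer count, as in the sample rows.
    """
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": rec["sei"],
            "rank": int(rec["rank"].rstrip("位")),  # "1位" becomes 1
            "count": int(rec["number"]),
        })
    return rows


names = load_family_names(SAMPLE)
for entry in names:
    print(entry["name"], entry["rank"], entry["count"])
```

How such a rank or count would actually be attached to ENAMDICT entries is not decided in this thread; the sketch only covers reading the data.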

Bug#271397: enamdict: add frequency statistic

2011-08-11 Thread Osamu Aoki
Hi,

On Thu, Aug 11, 2011 at 06:00:55PM +1000, Jim Breen wrote:
> I would be quite happy to add some sort of frequency metric
> to given and family names in the ENAMDICT file. The trouble
> is I have no time spare to go digging out the data.

I have found data as below in CSV format for family n

Bug#271397: enamdict: add frequency statistic

2011-08-11 Thread Jim Breen
I would be quite happy to add some sort of frequency metric to given and family names in the ENAMDICT file. The trouble is I have no time spare to go digging out the data. If someone else were prepared to compile it, I'd be glad to add it.

Jim Breen

2011/8/11 Osamu Aoki :
> Hi,
>
> This is about:

Bug#271397: enamdict: add frequency statistic

2011-08-10 Thread Osamu Aoki
Hi,

This is about: http://bugs.debian.org/271397

Mr. Tashiro is quite obvious. (% of population using the name, popularity rank)

田代 (0.061%, #287) - I pick this without second thought.
田城 (0.001%, #6981) - mozc Japanese input listed this too.

Not that popular names but this names popular than 田代 cove