I would be quite happy to add some sort of frequency metric to given and family names in the ENAMDICT file. The trouble is that I have no spare time to go digging out the data. If someone else were prepared to compile it, I'd be glad to add it.
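For anyone willing to take this on, the compilation step itself is small. Below is a minimal sketch, assuming the raw data can be reduced to (name, count) pairs. The function name `rank_names`, the `NameStat` fields, and the sample counts are all hypothetical and not taken from any real data source or from the ENAMDICT format.

```python
from typing import Iterable, NamedTuple


class NameStat(NamedTuple):
    name: str
    rank: int          # 1 = most common
    share: float       # fraction of all counted people with this name
    cumulative: float  # fraction covered by this name and all more common ones


def rank_names(counts: Iterable[tuple[str, int]]) -> list[NameStat]:
    """Turn raw (name, count) pairs into rank, share and cumulative coverage."""
    # Most common name first; ties keep their input order (sorted() is stable).
    ordered = sorted(counts, key=lambda pair: pair[1], reverse=True)
    total = sum(count for _, count in ordered)
    if total == 0:
        return []
    stats = []
    running = 0
    for rank, (name, count) in enumerate(ordered, start=1):
        running += count
        stats.append(NameStat(name, rank, count / total, running / total))
    return stats


# Made-up counts, for illustration only.
sample = [("佐藤", 500), ("田代", 30), ("鈴木", 450), ("田城", 20)]
for stat in rank_names(sample):
    print(f"{stat.name}\t#{stat.rank}\t{stat.share:.1%}\t{stat.cumulative:.1%}")
```

The cumulative column is what answers a question like "how much of the population is covered by the names at least this common", which is a reasonable basis for a coarse frequency tag.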
Jim Breen

2011/8/11 Osamu Aoki <os...@debian.org>:
> Hi,
>
> This is about: http://bugs.debian.org/271397
>
> For Mr. Tashiro the choice is quite obvious (% of population using the name, popularity rank):
>   田代 (0.061%, #287) - I pick this without a second thought.
>   田城 (0.001%, #6981) - the mozc Japanese input method listed this too.
>
> It is not that popular a name, but the names more popular than 田代 cover 50% of
> the Japanese population.
>
> I got these basic facts using data from 城岡研究室,
> 静岡大学 人文学部 言語文化学科比較言語文化コース
> http://www.ipc.shizuoka.ac.jp/~jjksiro/shiro.html
> (after UTF-8 conversion, using OpenOffice Calc)
>
> There is a page
> http://www.ipc.shizuoka.ac.jp/~jjksiro/kensaku.html
> (You can read the JavaScript source and identify the list location as:
> http://www.ipc.shizuoka.ac.jp/~jjksiro/sei.csv)
>
> Since he seems to love using old BSD tools (sed/awk/...), he may agree to
> license this data under BSD :-) Just sweet-talk him ... Jim, I think
> you have a good chance.
>
> Japanese copyright law now allows copying in order to analyze data:
>
> (Reproduction, etc. for information analysis)
>
> Article 47-7: A work may be recorded on a recording medium or adapted
> (including the recording of a derivative work created thereby), to the
> extent deemed necessary, where the purpose is information analysis by
> computer (meaning the extraction of information concerning the language,
> sound, images or other elements constituting such information from a large
> number of works or other large volumes of information, and the performance
> of comparison, classification or other statistical analysis; the same
> applies hereinafter in this Article). However, this does not apply to
> database works created for the use of persons who perform information
> analysis.
>
> Old electronic phone books, I guess, did not have the obnoxious restrictions
> they have now, so he could do this.
>
> There is also a top-100 list of popular names published by 明治安田生命
> (Meiji Yasuda Life) in 2008:
> http://www.meijiyasuda.co.jp/profile/release/2008/pdf/20080924.pdf
>
> Osamu

-- 
Jim Breen
Adjunct Snr Research Fellow, Clayton School of IT, Monash University
Webmaster: Hawthorn Rowing Club, Treasurer: Japanese Studies Centre
Graduate student: Language Technology Group, University of Melbourne