I would be quite happy to add some sort of frequency metric
to given and family names in the ENAMDICT file. The trouble
is I have no time spare to go digging out the data. If someone
else were prepared to compile it, I'd be glad to add it.
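
As a rough illustration only, a frequency metric of the kind discussed in
the quoted message (share of population, popularity rank, cumulative
coverage) could be derived from raw per-name counts along these lines. This
is a minimal sketch: the function name frequency_metrics, the input shape
(a dict of name to count) and the sample counts are all assumptions made for
the example, and nothing here reflects the actual layout of ENAMDICT or of
the sei.csv file.

```python
from itertools import accumulate


def frequency_metrics(counts):
    """Map each name to (share_percent, rank, cumulative_percent).

    `counts` is a hypothetical dict of name -> number of people.
    Rank 1 is the most common name; ties are broken alphabetically.
    """
    total = sum(counts.values())
    # Sort by descending count, then by name for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Running total of counts in rank order, for cumulative coverage.
    running = accumulate(count for _, count in ordered)
    return {
        name: (100.0 * count / total, rank, 100.0 * cum / total)
        for rank, ((name, count), cum) in enumerate(zip(ordered, running), start=1)
    }


# Placeholder counts for illustration only (not real surname data).
sample = {"A": 50, "B": 30, "C": 15, "D": 5}
metrics = frequency_metrics(sample)
for name, (share, rank, cum) in metrics.items():
    print(f"{name}: {share:.1f}% of total, rank {rank}, cumulative {cum:.1f}%")
```

The cumulative column is what would answer a question like "how much of the
population is covered by names at least this popular".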

Jim Breen

2011/8/11 Osamu Aoki <os...@debian.org>:
> Hi,
>
> This is about: http://bugs.debian.org/271397
>
> The choice for Mr. Tashiro is quite obvious (% of population using the
> name, popularity rank):
>  田代(0.061%,  #287) - I pick this without a second thought.
>  田城(0.001%, #6981) - mozc Japanese input listed this too.
>
> It is not that popular a name, but the names more popular than 田代
> together cover 50% of the Japanese population.
>
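> I got these basic facts using data from 城岡研究室 (the 城岡 laboratory),
> Shizuoka University (静岡大学), Faculty of Humanities, Department of
> Language and Culture, Comparative Language and Culture Course:
> http://www.ipc.shizuoka.ac.jp/~jjksiro/shiro.html
> (converted to UTF-8 and processed with OpenOffice Calc)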
>
> There is a page:
> http://www.ipc.shizuoka.ac.jp/~jjksiro/kensaku.html
> You can read its JavaScript source and identify the list location as:
> http://www.ipc.shizuoka.ac.jp/~jjksiro/sei.csv
>
> Since he seems to love using old BSD tools (sed/awk/...), he may agree to
> license this data under a BSD license :-)  Just sweet-talk him ... Jim, I
> think you have a good chance.
>
> Japanese copyright law now allows copying for the purpose of data analysis:
>
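> (Reproduction, etc. for information analysis)
>
> Article 47-7: For the purpose of information analysis by computer (meaning
> the extraction of information concerning the language, sound, images, or
> other elements constituting such information from a large number of works
> or other large volumes of information, and the comparison, classification,
> or other statistical analysis thereof; the same applies hereinafter in
> this Article), a work may be recorded on a recording medium or adapted
> (including the recording of a derivative work created by such adaptation)
> to the extent deemed necessary. However, this does not apply to database
> works created for use by persons who perform information analysis.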
>
> Old electronic phone books, I guess, did not have the obnoxious
> restrictions they have now, so he could do this.
>
> There is also a top-100 list of popular names, published by 明治安田生命
> (Meiji Yasuda Life) in 2008:
> http://www.meijiyasuda.co.jp/profile/release/2008/pdf/20080924.pdf
>
> Osamu
>
>
>



-- 
Jim Breen
Adjunct Snr Research Fellow, Clayton School of IT, Monash University
Webmaster: Hawthorn Rowing Club, Treasurer: Japanese Studies Centre
Graduate student: Language Technology Group, University of Melbourne


