Hi,

Yamane-san
> Then, set priority=30 in unidic-mecab is better, right?

No.  We need a priority higher than 100 to ensure it takes precedence over
naist-jdic (UTF-8).

 $ sudo update-alternatives --config mecab-dictionary
There are 6 choices for the alternative mecab-dictionary (providing 
/var/lib/mecab/dic/debian).

  Selection    Path                                 Priority   Status
------------------------------------------------------------
* 0            /var/lib/mecab/dic/unidic             100       auto mode
  1            /var/lib/mecab/dic/ipadic             70        manual mode
  2            /var/lib/mecab/dic/ipadic-utf8        80        manual mode
  3            /var/lib/mecab/dic/juman-utf8         40        manual mode
  4            /var/lib/mecab/dic/naist-jdic         100       manual mode
  5            /var/lib/mecab/dic/naist-jdic-eucjp   90        manual mode
  6            /var/lib/mecab/dic/unidic             100       manual mode
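
As a rough sketch of the priority point (not tested against the actual
package), registering unidic above naist-jdic could look like the following.
The link and alternative name are taken from the listing above; the value 110
is only an assumed example of "more than 100", and the helper name
register_dictionary and the UPDATE_ALTERNATIVES override are hypothetical,
added so the call can be dry-run without root.

```shell
#!/bin/sh
# Tool path is overridable so the sketch can be dry-run without root.
UPDATE_ALTERNATIVES="${UPDATE_ALTERNATIVES:-update-alternatives}"

# Link and alternative name as shown in the listing above.
LINK=/var/lib/mecab/dic/debian
NAME=mecab-dictionary

# register_dictionary PATH PRIORITY
# Registers PATH as a candidate for the mecab-dictionary alternative.
register_dictionary() {
    "$UPDATE_ALTERNATIVES" --install "$LINK" "$NAME" "$1" "$2"
}

# Example call (needs root); 110 is an assumed value above 100:
#   register_dictionary /var/lib/mecab/dic/unidic 110
```

In a real package this call would normally live in the postinst, with a
matching --remove in the prerm.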

But why is unidic not the default dictionary? ... Alas, this package lacks a
binary dictionary installation step in its packaging, so I see:

 $ ls -la /var/lib/mecab/dic/unidic
total 8
drwxr-xr-x 2 root root 4096 Dec  2 18:27 .
drwxr-xr-x 8 root root 4096 Feb 19 00:22 ..

Nothing.  We can create the binary dictionary data via postinst with

  /usr/lib/mecab/mecab-dict-index -d ${srcdir} -o ${dstdir} -t ${encoding}

but considering this is going to be a huge CPU load, we may as well install
the binary dictionaries from the upstream package, by tweaking debian/install
and adding debian/links so that dicrc is available on both the /usr/... and
/var/... sides.
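
For reference, the postinst approach mentioned above could be sketched as
follows.  This is only an illustration under assumptions: the helper name
build_dictionary and the MECAB_DICT_INDEX override are hypothetical, and the
source directory, destination directory and encoding are left as arguments
because their real values are not settled here.

```shell
#!/bin/sh
set -e

# Indexer path as given above; overridable so the sketch can be dry-run.
MECAB_DICT_INDEX="${MECAB_DICT_INDEX:-/usr/lib/mecab/mecab-dict-index}"

# build_dictionary SRCDIR DSTDIR ENCODING
# Compiles the dictionary sources in SRCDIR into binary data in DSTDIR.
build_dictionary() {
    srcdir="$1"
    dstdir="$2"
    encoding="$3"
    # Make sure the output directory exists before indexing.
    mkdir -p "$dstdir"
    "$MECAB_DICT_INDEX" -d "$srcdir" -o "$dstdir" -t "$encoding"
}

# In a postinst this would run only on "configure", for example:
#   case "$1" in
#       configure) build_dictionary "$srcdir" "$dstdir" "$encoding" ;;
#   esac
```

Every install or upgrade would then pay the indexing cost on the user's
machine, which is the CPU load concern raised above.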

Hmmm... since I am a member of the nlp team, I think I can update ...
Yamane-san may be busy....

Anyway, this is a 7GB tarball when unzipped.  This is absolutely the biggest deb.

Building the deb may take time ...

Osamu
