RE: Indexing Japanese & English

2008-02-07 Thread Paul Clegg
lto:[EMAIL PROTECTED] Sent: Thursday, February 07, 2008 11:05 AM To: solr-user@lucene.apache.org Subject: RE: Indexing Japanese & English Here are the comments for CJKTokenizer. First, is this what you want? Remember, there are three Japanese writing systems. /** * CJKTokenizer was m

RE: Indexing Japanese & English

2008-02-07 Thread Lance Norskog
; will token as letter * for more info on Asia language(Chinese Japanese Korean) text segmentation: * please search google (http://www.google.com/search?q=word+chinese+segment) * * @author Che, Dong */ -Original Message- From: Paul Clegg [mailto:[EMAIL PROTECTED] Sent: Thursda

Indexing Japanese & English

2008-02-07 Thread Paul Clegg
I hate asking stupid questions immediately after joining a mailing list, but I'm in a bit of a pinch here. I'm using Solr/Tomcat for a Ruby on Rails project (acts_as_solr) and I've had a lot of success getting it working -- for English. The problem I'm running into is that our primary customer