What we usually do to reindex is:

1. stop Solr
2. rm -r data (that is, remove everything in /opt/solr/data/)
3. mkdir data
4. start Solr
5. start the reindex

This way we're sure that no old copies of the index are left behind.
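The directory reset in the middle of that procedure can be sketched as a small shell function. This is only an illustration, not a script we actually run: the name reset_solr_data is made up, the default path is the /opt/solr mentioned above, and stopping and starting Solr are deliberately left out because they depend on the servlet container in use.

```shell
#!/bin/sh
# Sketch only: wipe and recreate the Solr data directory.
# reset_solr_data is a hypothetical helper name; /opt/solr is the
# path from the steps above. Stop Solr before calling this and
# start it again afterwards (how to do that depends on your setup).

reset_solr_data() {
    solr_home="${1:-/opt/solr}"
    data_dir="$solr_home/data"

    # Refuse to continue if the Solr home does not exist, so a typo
    # in the path cannot create or delete anything unexpected.
    if [ ! -d "$solr_home" ]; then
        echo "no such directory: $solr_home" >&2
        return 1
    fi

    # rm -r removes the directory and its contents; rmdir would fail
    # here because it only removes empty directories.
    rm -rf -- "$data_dir"

    # Recreate an empty data directory for the fresh index.
    mkdir -- "$data_dir"
}
```

Calling reset_solr_data with no argument works on /opt/solr; passing another path as the first argument lets you try it on a scratch copy first.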
To check the index size we do:

    cd data
    du -sh

Otis Gospodnetic wrote:
>
> I can't tell what that analyzer does, but I'm guessing it uses n-grams?
> Maybe consider trying https://issues.apache.org/jira/browse/LUCENE-1629
> instead?
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
>> From: Fer-Bj <fernando.b...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, June 4, 2009 2:20:03 AM
>> Subject: Re: indexing Chienese langage
>>
>> We are trying SOLR 1.3 with Paoding Chinese Analyzer, and after
>> reindexing the index size went from 1.5 Gb to 2.7 Gb.
>>
>> Is that some expected behavior?
>>
>> Is there any switch or trick to avoid having a double + index file size?
>>
>> Koji Sekiguchi-2 wrote:
>> >
>> > CharFilter can normalize (convert) traditional chinese to simplified
>> > chinese or vice versa, if you define mapping.txt. Here is the sample
>> > of Chinese character normalization:
>> >
>> > https://issues.apache.org/jira/secure/attachment/12392639/character-normalization.JPG
>> >
>> > See SOLR-822 for the detail:
>> >
>> > https://issues.apache.org/jira/browse/SOLR-822
>> >
>> > Koji
>> >
>> > revathy arun wrote:
>> >> Hi,
>> >>
>> >> When I index chinese content using chinese tokenizer and analyzer in
>> >> solr 1.3, some of the chinese text files are getting indexed but
>> >> others are not.
>> >>
>> >> Since chinese has got many different language subtypes as in standard
>> >> chinese, simplified chinese etc which of these does the chinese
>> >> tokenizer support and is there any method to find the type of chiense
>> >> language from the file?
>> >>
>> >> Rgds
>> >>
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/indexing-Chienese-langage-tp22033302p23864358.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
View this message in context: http://www.nabble.com/indexing-Chienese-langage-tp22033302p23879730.html
Sent from the Solr - User mailing list archive at Nabble.com.