What we usually do to reindex is:

1. stop Solr
2. rm -r data  (this removes everything in /opt/solr/data/)
3. mkdir data
4. start Solr
5. start the reindex. This way we're sure no old copies of the index
are left behind.
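
As a rough sketch, the cleanup part of these steps could be scripted like
this. The SOLR_HOME variable and the reset_data_dir name are only
illustrative, and how you stop and start Solr depends on your servlet
container, so those steps are left as comments:

```shell
#!/bin/sh
# Sketch only: SOLR_HOME and reset_data_dir are illustrative names,
# not part of Solr. Adjust the path to your installation.
SOLR_HOME="${SOLR_HOME:-/opt/solr}"

# Remove the old index and recreate an empty data directory.
reset_data_dir() {
    solr_home="$1"
    # Refuse an empty path so we never end up touching "/data".
    [ -n "$solr_home" ] || return 1
    rm -rf "$solr_home/data" && mkdir -p "$solr_home/data"
}

# Typical sequence:
#   1. stop Solr (container-specific)
#   2. reset_data_dir "$SOLR_HOME"
#   3. start Solr (container-specific)
#   4. start the reindex
```

Only run the reset while Solr is stopped; otherwise Solr may still hold the
old index files open.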

To check the index size we do:
cd data
du -sh
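
To compare sizes before and after a reindex, something along these lines
may help. dir_size_kb is just a hypothetical helper name; it uses du -sk so
the result is a plain number of kilobytes rather than the human-readable
output of du -sh:

```shell
# Sketch only: dir_size_kb is an illustrative helper, not a standard tool.
# Prints the total size of a directory in kilobytes.
dir_size_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Example use around a reindex:
#   before=$(dir_size_kb /opt/solr/data)
#   ... reindex ...
#   after=$(dir_size_kb /opt/solr/data)
#   echo "index size changed by $((after - before)) KB"
```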



Otis Gospodnetic wrote:
> 
> 
> I can't tell what that analyzer does, but I'm guessing it uses n-grams?
> Maybe consider trying https://issues.apache.org/jira/browse/LUCENE-1629
> instead?
> 
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> ----- Original Message ----
>> From: Fer-Bj <fernando.b...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, June 4, 2009 2:20:03 AM
>> Subject: Re: indexing Chienese langage
>> 
>> 
>> We are trying Solr 1.3 with the Paoding Chinese Analyzer, and after
>> reindexing the index size went from 1.5 GB to 2.7 GB.
>> 
>> Is that expected behavior?
>> 
>> Is there any switch or trick to avoid nearly doubling the index size?
>> 
>> Koji Sekiguchi-2 wrote:
>> > 
>> > CharFilter can normalize (convert) Traditional Chinese to Simplified
>> > Chinese or vice versa, if you define mapping.txt. Here is a sample of
>> > Chinese character normalization:
>> > 
>> > https://issues.apache.org/jira/secure/attachment/12392639/character-normalization.JPG
>> > 
>> > See SOLR-822 for details:
>> > 
>> > https://issues.apache.org/jira/browse/SOLR-822
>> > 
>> > Koji
>> > 
>> > 
>> > revathy arun wrote:
>> >> Hi,
>> >>
>> >> When I index Chinese content using the Chinese tokenizer and analyzer
>> >> in Solr 1.3, some of the Chinese text files get indexed but others do
>> >> not.
>> >>
>> >> Since Chinese has several variants, such as Standard Chinese and
>> >> Simplified Chinese, which of these does the Chinese tokenizer support,
>> >> and is there any method to detect which variant a file uses?
>> >>
>> >> Rgds
>> >>
>> >>  
>> > 
>> > 
>> > 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/indexing-Chienese-langage-tp22033302p23864358.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/indexing-Chienese-langage-tp22033302p23879730.html
Sent from the Solr - User mailing list archive at Nabble.com.
