ISOLatin1AccentFilterFactory vs ASCIIFoldingFilterFactory
Hi all,

I'm new to the list (but not totally new to Solr). The documentation states that ISOLatin1AccentFilterFactory is deprecated in favour of ASCIIFoldingFilterFactory:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ISOLatin1AccentFilterFactory

I see problems with this. If I have understood ASCIIFoldingFilterFactory correctly, it folds both accented characters like 'é' to 'e' and national characters like 'ö' to 'o'. The former is desirable; the latter is very much not when indexing, for example, Scandinavian languages.

Is there a way to limit which characters are folded?

--
____
Nils Weinander
Re: ISOLatin1AccentFilterFactory vs ASCIIFoldingFilterFactory
On Tue, Jun 14, 2011 at 1:11 PM, Ahmet Arslan wrote:
> With MappingCharFilterFactory you have full control over which characters
> are folded. You can see the default mappings in the
> mapping-ISOLatin1Accent.txt file.
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.MappingCharFilterFactory

Thanks Ahmet! Exactly what I needed.

________
Nils Weinander
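The idea of folding only a chosen set of characters can be sketched in a few lines of Python. This is not Solr code and not the contents of mapping-ISOLatin1Accent.txt: the table FOLD_MAP and the function fold_selected are hypothetical names, and the entries are a small hand-picked subset chosen for illustration. It also assumes precomposed (NFC) input.

```python
# Illustrative sketch only: a hand-picked folding table, similar in spirit to
# a custom mapping file for a character filter. Names are hypothetical.
FOLD_MAP = {
    "\u00e9": "e",  # e with acute
    "\u00e8": "e",  # e with grave
    "\u00ea": "e",  # e with circumflex
    "\u00e1": "a",  # a with acute
    "\u00e0": "a",  # a with grave
    "\u00c9": "E",  # E with acute
}

# Characters deliberately left out of FOLD_MAP (a-ring, a-umlaut, o-umlaut)
# are not in the translation table and therefore pass through unchanged.
_TABLE = str.maketrans(FOLD_MAP)


def fold_selected(text: str) -> str:
    """Replace only the characters listed in FOLD_MAP; keep everything else."""
    return text.translate(_TABLE)


print(fold_selected("caf\u00e9"))      # cafe
print(fold_selected("G\u00f6teborg"))  # Göteborg
```

Leaving a character out of the table is what keeps it intact, which is the same choice one would make when trimming a mapping file down to the accents that should be folded.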
Re: Solr 4.0 indexing performance
Ah, thanks Markus! That's a good thing. I tried disabling the transaction log; the difference in performance is marginal. So I'll stick with transaction logging.

On Thu, Nov 15, 2012 at 11:02 AM, Markus Jelsma wrote:
> Hi - you're likely seeing a drop in performance because of durability,
> which is enabled by default via a transaction log. When disabled, 4.0 is
> iirc slightly faster than 3.x.
>
> -Original message-
> > From: Nils Weinander
> > Sent: Thu 15-Nov-2012 10:35
> > To: solr-user@lucene.apache.org
> > Subject: Solr 4.0 indexing performance
> >
> > I have just updated from Solr 3.6 to 4.0, using defaults in solrconfig.xml
> > for both versions. With 4.0, bulk indexing takes about twice the time it
> > did in 3.6. Is this to be expected, or the result of my lack of optimization
> > in the configuration?
> >
> > --
> > ________
> > Nils Weinander

--
Nils Weinander