On 11/22/2010 7:40 AM, Erick Erickson wrote:
> As I remember, PatternReplace... isn't in 1.4, so you'd have to move to 3.x
> or trunk.
>
> You could always write a custom class that did what you wanted, it's
> actually pretty easy.

PatternReplaceCharFilterFactory isn't in 1.4, but PatternReplaceFilterFactory 
is.  I'm using it in my 1.4.1 installation.  The CharFilter version gets 
applied before tokenization, which caused problems for me in my testing of 
branch_3x.  In situations where the order of operations isn't important, the 
CharFilter option would be great.
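
A rough sketch of the ordering difference, for illustration only. The field type names and the period-stripping pattern are hypothetical, and the whitespace tokenizer is an assumption rather than something taken from the schema described here.

```xml
<!-- Available in 1.4.x: the token filter runs on each token AFTER tokenization. -->
<fieldType name="text_pattern_filter" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Hypothetical pattern: strip every period from each token. -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\." replacement="" replace="all"/>
  </analyzer>
</fieldType>

<!-- Needs 3.x or trunk: the char filter rewrites the raw text BEFORE the tokenizer sees it. -->
<fieldType name="text_pattern_charfilter" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="\." replacement=""/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```

With a whitespace tokenizer both variants should turn J.R.R. into JRR. The results can diverge once the tokenizer itself splits on the characters being replaced, which is where the order of operations starts to matter.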

Based on their description, I'd think what they actually want is WordDelimiterFilterFactory with preserveOriginal and catenateWords turned on at a minimum. That should match on any likely representation of J.R.R. Tolkien. The other options can also be useful.

In my schema, the index analyzer has WordDelimiterFilterFactory with everything turned on except catenateAll, and the query analyzer is the same except all three catenate options are turned off.
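
Something along these lines is what that setup might look like in schema.xml. This is a sketch, not the actual schema: the field type name, the tokenizer, and the exact set of options counted as "everything" are assumptions, though the attribute names are the standard WordDelimiterFilterFactory ones.

```xml
<!-- Hypothetical field type name; the whitespace tokenizer is assumed. -->
<fieldType name="text_wdf" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Index side: everything on except catenateAll. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1" preserveOriginal="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Query side: same, but all three catenate options off. -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="1" preserveOriginal="1"/>
  </analyzer>
</fieldType>
```

The idea is that the index side stores the original token, the split parts, and the catenated form, so a query for any of the common spellings has something to match against.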

Shawn
