I have made progress on this by writing my own Analyzer.  It wraps the
tokenizer in the TokenFilters that each of the Solr factory classes creates.
I had to copy the WordDelimiterFilter source into my own code because, of
course, the class is package-private.
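
For anyone following along, here is a rough, dependency-free sketch of the
chaining idea: each stage takes the previous stage's tokens and hands its own
output to the next one.  It is not the Analyzer described above and it uses no
Lucene or Solr classes.  The class and method names (AnalyzerChainSketch,
joinAcrossDelimiters, and so on) are made up for illustration, the
word-delimiter stage is only a crude approximation of catenateWords="1" with
generateWordParts="0", and the Snowball stemming step is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: none of these names are Lucene or Solr APIs.
public class AnalyzerChainSketch {

    // Stage 1: stand-in for a whitespace tokenizer.
    static List<String> whitespaceTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stage 2: crude stand-in for the word-delimiter step. It only joins the
    // parts of a token across punctuation ("Wi-Fi" -> "WiFi"). Number parts,
    // case-change splitting, and token positions are ignored (assumption).
    static List<String> joinAcrossDelimiters(List<String> input) {
        List<String> tokens = new ArrayList<>();
        for (String t : input) {
            String joined = t.replaceAll("[^A-Za-z0-9]+", "");
            if (!joined.isEmpty()) {
                tokens.add(joined);
            }
        }
        return tokens;
    }

    // Stage 3: stand-in for the lower-case filter.
    static List<String> lowerCase(List<String> input) {
        List<String> tokens = new ArrayList<>();
        for (String t : input) {
            tokens.add(t.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    // The "analyzer": stages applied in the same order as in the schema.
    // A stemming stage would follow lowerCase; it is omitted here.
    static List<String> analyze(String text) {
        List<String> tokens = whitespaceTokenize(text);
        tokens = joinAcrossDelimiters(tokens);
        tokens = lowerCase(tokens);
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Wi-Fi PowerShot")); // prints [wifi, powershot]
    }
}
```

In a real Lucene Analyzer the same ordering would apply, with each TokenFilter
constructed around the TokenStream produced by the previous step.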



On Mon, Oct 4, 2010 at 3:05 PM, Max Lynch <ihas...@gmail.com> wrote:

> Hi,
> I asked this question a month ago on lucene-user and was referred here.
>
> I have content being analyzed in Solr using these tokenizers and filters:
>
> <fieldType name="text_standard" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English"
>             protected="protwords.txt"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English"
>             protected="protwords.txt"/>
>   </analyzer>
> </fieldType>
>
> Basically I want to be able to search against this index in Lucene with one
> of my background searching applications.
>
> My main reason for using Lucene over Solr for this is that I use the
> highlighter to keep track of exactly which terms were found, which feeds my
> own scoring system, and I always collect the whole set of found documents.
> I've messed around with using boosts, but they weren't fine-grained enough
> and I wasn't able to create an effective score threshold.  (Would creating
> my own scorer be a better idea?)
>
> Is it possible to use this analyzer from Lucene, or at least re-create it
> in code?
>
> Thanks.
>
>
