Re: Store input text after analyzers and token filters

JCodina Mon, 15 Mar 2010 02:59:13 -0700

OK, for Solr 1.5:
After looking around, going through the answers in this forum, and browsing
the code, I think I have managed it. I only had to write a few lines of code;
the problem was finding out which ones!
I wrote a new class, a subclass of CompressableField, that takes a new
parameter, preProcessType, which names another field type.
It then uses that type in the toInternal method to generate the input
string.
The class is AnalyzedField.
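
To make the idea concrete, here is a rough sketch of the kind of string that ends up stored when the pre-processing type produces bigrams. It does not use any Solr or Lucene classes and it is not the code in AnalyzedField.java: the class name BigramSketch, the method toBigrams, and the "_" used to glue the two words of a bigram together are all assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the pre-processing step; no Solr/Lucene API is used.
final class BigramSketch {

    // Assumed separator inside a bigram, so that a whitespace tokenizer
    // applied afterwards keeps each bigram as a single token.
    static final String JOIN = "_";

    // Splits the text on whitespace and returns the word bigrams,
    // separated by single spaces. Unigrams are never emitted.
    static String toBigrams(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        String[] words = trimmed.split("\\s+");
        List<String> bigrams = new ArrayList<>();
        for (int i = 0; i + 1 < words.length; i++) {
            bigrams.add(words[i] + JOIN + words[i + 1]);
        }
        return String.join(" ", bigrams);
    }

    public static void main(String[] args) {
        // Prints: please_divide divide_this this_sentence
        System.out.println(toBigrams("please divide this sentence"));
    }
}
```

In the real class the analyzer of the pre-processing field type does this work; the sketch only shows the shape of the value that is then stored and indexed.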


It can be used, for example, to store the bigrams generated by the
ShingleFilterFactory.
In the schema we first add a field type for the pre-processing:


<fieldType name="bigramsType" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>
        <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="false"/>
    </analyzer>
</fieldType>


<fieldType name="analyzedBigramsType" class="solr.AnalyzedField" positionIncrementGap="100" preProcessType="bigramsType">
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    </analyzer>
</fieldType>

 
And then create the field:

<field name="storedBigrams" type="analyzedBigramsType" indexed="true" stored="true" termVectors="true" multiValued="true"/>

AnalyzedField.java: http://old.nabble.com/file/p27902209/AnalyzedField.java
I also have a version for Solr 1.4; see the next post.


Enjoy it,
Joan Codina
-- 
View this message in context: 
http://old.nabble.com/Store-input-text-after-analyzers-and-token-filters-tp27792550p27902209.html
Sent from the Solr - User mailing list archive at Nabble.com.
