Hi,

Does anybody have any thoughts about this?
I'm really struggling with these problems, so any hints would be very welcome!

Regards,

Dirceu

On Fri, Feb 10, 2012 at 4:45 PM, Dirceu Vieira <dirceu...@gmail.com> wrote:

> Hi Guys,
>
> Would someone have time to help me understand what's happening here:
>
> I have a dynamic field called *prMeta_service*, and the value *"EHT2011-2012"*
> is indexed for various documents.
>
> When I search for the same exact value (*"EHT2011-2012"*), it ends up NOT
> matching the content.
> I have spent quite a lot of time lately trying to understand what happens,
> reading all the documentation I could find about the token filters that are
> used in this field, but I can't seem to find the answer.
>
> It seems to me that, for some reason, the parser is getting lost because
> the value contains both letters and numbers. I mention that because when I
> query only for *"2011-2012"* or *"20112012"*, I get the expected results.
>
> I am using Solr 1.4, and I haven't tried it in any other version.
>
> Another interesting point is that, for some reason, the
> SnowballPorterFilterFactory is removing a character from *"2011"*, so
> *"201"* is the value that is actually indexed.
> I don't believe that this last point is what actually causes
> my unsatisfactory results, but I wanted to know whether anybody has had any
> issues with Finnish language stemming.
>
>
> I would very much appreciate it if someone could spare some time to help me
> with this issue.
>
>
> My configuration looks like:
>
>
> *- Dynamic field:*
>
> <dynamicField name="prMeta_*" type="text" indexed="true" stored="true" multiValued="true"/>
>
> *- Field type:*
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="Finnish"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="Finnish"/>
>   </analyzer>
> </fieldType>
>
> *- The field analysis gives me this response:*
>
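> (one row per index-analyzer stage, in configuration order)
>
>   WhitespaceTokenizer:     EHT2011-2012
>   StopFilter:              EHT2011-2012
>   WordDelimiterFilter:     EHT 2011 2012 20112012
>   LowerCaseFilter:         eht 2011 2012 20112012
>   EnglishPorterFilter:     eht 2011 2012 20112012
>   RemoveDuplicatesFilter:  eht 2011 2012 20112012
>   SnowballPorterFilter:    eht 201 2012 20112012
>   EdgeNGramFilter:         e eh eht
>                            2 20 201
>                            2 20 201 2012
>                            2 20 201 2011 20112 201120 2011201 20112012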
>
> *- When I run the query in the admin UI in debug mode (&debugQuery=true),
> this is the result:*
>
> <str name="rawquerystring">
> prMeta_service:EHT2011-2012
> </str>
> <str name="querystring">
> prMeta_service:EHT2011-2012
> </str>
> <str name="parsedquery">
> PhraseQuery(prMeta_service:"eht 201 2012")
> </str>
> <str name="parsedquery_toString">
> prMeta_service:"eht 201 2012"
> </str>
>
>
> Thank you very much in advance!
>
> Best regards,
>
> --
> Dirceu Vieira Júnior
> -------------------------------------------------------------------
> +47 9753 2473
> dirceuvjr.blogspot.com
> twitter.com/dirceuvjr
>
>
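
To make the analysis output quoted above easier to follow, here is a minimal Python sketch that replays the index and query chains on "EHT2011-2012". It is only an approximation under stated assumptions: `word_delimiter`, `edge_ngrams`, `analyze`, and `OBSERVED_STEMS` are hypothetical names, not Solr APIs. The word-delimiter step is heavily simplified, the "stemmer" just replays the one change seen in the output ("2011" becoming "201"), and the stop, synonym, English Porter, and duplicate-removal filters are left out on the assumption that they do not change this input.

```python
import re

# Stand-in for the Finnish Snowball stemmer: it only replays the single
# change reported in the analysis output ("2011" -> "201"). Not a real stemmer.
OBSERVED_STEMS = {"2011": "201"}


def word_delimiter(token, catenate_numbers):
    """Simplified WordDelimiterFilter: split into letter runs and digit runs."""
    parts = re.findall(r"[A-Za-z]+|[0-9]+", token)
    out = list(parts)
    if catenate_numbers:
        run = []
        for part in parts + [""]:  # the trailing "" flushes the last run
            if part.isdigit():
                run.append(part)
            else:
                if len(run) > 1:
                    out.append("".join(run))  # e.g. "2011" + "2012"
                run = []
    return out


def edge_ngrams(token, min_size=1, max_size=25):
    """Front-anchored n-grams, as configured (minGramSize=1, maxGramSize=25)."""
    return [token[:n] for n in range(min_size, min(max_size, len(token)) + 1)]


def analyze(text, catenate_numbers, ngrams):
    """Whitespace split, word delimiter, lowercase, stem stand-in, optional n-grams."""
    terms = []
    for raw in text.split():
        for part in word_delimiter(raw, catenate_numbers):
            stem = OBSERVED_STEMS.get(part.lower(), part.lower())
            terms.extend(edge_ngrams(stem) if ngrams else [stem])
    return terms


# Index side: catenateNumbers="1" plus EdgeNGramFilterFactory.
indexed = analyze("EHT2011-2012", catenate_numbers=True, ngrams=True)
# Query side: catenateNumbers="0" and no n-gram filter.
queried = analyze("EHT2011-2012", catenate_numbers=False, ngrams=False)

print(indexed)
print(queried)
print(all(term in indexed for term in queried))
```

In this sketch every query-side term also appears among the index-side terms, so the terms themselves are not missing. A PhraseQuery additionally depends on token positions, which the sketch does not model.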


-- 
Dirceu Vieira Júnior
-------------------------------------------------------------------
+47 9753 2473
dirceuvjr.blogspot.com
twitter.com/dirceuvjr
