Hi Shawn,
Also, the tokenizer we use is very similar to the following.
ftp://zimbra.imladris.sk/src/HELIX-720.fbsd/ZimbraServer/src/java/com/zimbra/cs/index/analysis/UniversalTokenizer.java
ftp://zimbra.imladris.sk/src/HELIX-720.fbsd/ZimbraServer/src/java/com/zimbra/cs/index/analysis/UniversalLexer.jflex


From the looks of it, the text is being indexed as a single token rather than
being split on whitespace.
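
To illustrate why a single-token index would produce exactly the symptoms
quoted below, here is a minimal sketch in plain Java. It uses no Lucene or
Zimbra classes; the class and method names are hypothetical, and the
lowercasing is an assumption about what the analyzer does. It only models
"prefix query matches a term that starts with the prefix" and "*foo* matches
a term that contains foo".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class SingleTokenDemo {
    // Hypothetical stand-in for a tokenizer that fails to split:
    // the whole field value becomes one lowercased term.
    static List<String> singleToken(String text) {
        return Collections.singletonList(text.toLowerCase(Locale.ROOT));
    }

    // What we expect instead: one lowercased term per whitespace-separated word.
    static List<String> whitespaceTokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Models content:prefix* (some indexed term starts with the prefix).
    static boolean prefixMatches(List<String> terms, String prefix) {
        return terms.stream().anyMatch(t -> t.startsWith(prefix));
    }

    // Models content:*infix* (some indexed term contains the infix).
    static boolean infixMatches(List<String> terms, String infix) {
        return terms.stream().anyMatch(t -> t.contains(infix));
    }

    public static void main(String[] args) {
        String doc = "This is a test message";
        List<List<String>> indexes =
                Arrays.asList(singleToken(doc), whitespaceTokens(doc));
        for (List<String> terms : indexes) {
            System.out.println("terms=" + terms
                    + " this*=" + prefixMatches(terms, "this")
                    + " test*=" + prefixMatches(terms, "test")
                    + " message*=" + prefixMatches(terms, "message")
                    + " *message*=" + infixMatches(terms, "message"));
        }
    }
}
```

With the single-term list, only this* and *message* match, which is the
behaviour we see on 4.10.3; with the whitespace-split list, all four match.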

Thanks,
Rishi. 

 

 

-----Original Message-----
From: Shawn Heisey <apa...@elyograg.org>
To: solr-user <solr-user@lucene.apache.org>
Sent: Fri, Feb 20, 2015 11:52 am
Subject: Re: Strange search behaviour when upgrading to 4.10.3


On 2/20/2015 9:37 AM, Rishi Easwaran wrote:
> We are trying to upgrade from Solr 4.6 to 4.10.3. When testing search on
> 4.10.3, results are not being returned; it actually looks like only the
> first word in a sentence is getting indexed.
> Ex: inserting "This is a test message" only returns results when searching
> for content:this*. Searching for content:test* or content:message* does not
> work with 4.10. Only searching for content:*message* works. This leads me to
> believe there is something wrong with the behaviour of our analyzer and
> tokenizers.

<snip>

> <fields>
>   <field name="content" type="ourType" stored="false" indexed="true"
>          required="false" multiValued="true" />
> </fields>
>
> <fieldType name="ourType" indexed="true" class="solr.TextField">
>   <analyzer class="com.zimbra.cs.index.ZimbraAnalyzer" />
> </fieldType>
>  
> Looking at the release notes from Solr and Lucene:
> http://lucene.apache.org/solr/4_10_1/changes/Changes.html
> http://lucene.apache.org/core/4_10_1/changes/Changes.html
> Nothing really sticks out, at least to me. Any help to get it working with
> 4.10 would be great.

The links you provided lead to zero-byte files when I try them, so I
could not look deeper.

Have you recompiled your custom analysis components against the newer
versions of the Solr/Lucene libraries?  Anytime you're dealing with
custom components, you cannot assume that a component compiled to work
with one version of Solr will work with another version.  The internal
API does change, and there is less emphasis on avoiding API breaks in
minor Solr releases than there is with Lucene, because the vast majority
of Solr users are not writing their own code that uses the Solr API. 
Recompiling against the newer libraries may cause compiler errors that
reveal places in your code that require changes.

Thanks,
Shawn


 
