I am not 100% sure, but why did you not use the standard config for "text"?
<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
You are using:
<fieldtype name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"
               luceneMatchVersion="LUCENE_29" />
    <filter class="solr.StandardFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!--
    <filter class="solr.StopFilterFactory" luceneMatchVersion="LUCENE_29"/>
    <filter class="solr.EnglishPorterFilterFactory"/>
    -->
  </analyzer>
</fieldtype>
Can you try a more standard approach, for example
solr.WhitespaceTokenizerFactory followed by solr.LowerCaseFilterFactory?
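A minimal sketch of such a field type is below. The name "text_simple" is only a placeholder I made up, and I have not tested this against your document. It splits on whitespace and lowercases, with no stop word removal, so every word in the phrase keeps its own position. You would need to reindex after changing the field type.

```xml
<!-- Hypothetical field type name; whitespace tokenizer plus lowercasing only. -->
<fieldType name="text_simple" class="solr.TextField" positionIncrementGap="100">
  <!-- A single analyzer with no type attribute is used for both index and query. -->
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

If the long phrase matches with this field type, the difference is most likely in the analysis chain rather than in the nested span queries themselves.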
Thanks.
On Mon, Feb 28, 2011 at 2:38 AM, Ahsan |qbal <[email protected]> wrote:
> Hi Bill
> Any update..
>
> On Thu, Feb 24, 2011 at 8:58 PM, Ahsan |qbal <[email protected]>
> wrote:
>>
>> Hi
>> schema and document are attached.
>>
>> On Thu, Feb 24, 2011 at 8:24 PM, Bill Bell <[email protected]> wrote:
>>>
>>> Send schema and document in XML format and I'll look at it
>>>
>>> Bill Bell
>>> Sent from mobile
>>>
>>>
>>> On Feb 24, 2011, at 7:26 AM, "Ahsan |qbal" <[email protected]>
>>> wrote:
>>>
>>> > Hi
>>> >
>>> > To narrow down the issue I indexed a single document with one of the
>>> > sample
>>> > queries (given below) which was giving issue.
>>> >
>>> > *"evaluation of loan and lease portfolios for purposes of assessing the
>>> > adequacy of" *
>>> >
>>> > Now when i Perform a search query (*TextContents:"evaluation of loan
>>> > and
>>> > lease portfolios for purposes of assessing the adequacy of"*) the
>>> > parsed
>>> > query is
>>> >
>>> >
>>> > *spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([Contents:evaluation,
>>> > Contents:of], 0, true), Contents:loan], 0, true), Contents:and], 0,
>>> > true),
>>> > Contents:lease], 0, true), Contents:portfolios], 0, true),
>>> > Contents:for], 0,
>>> > true), Contents:purposes], 0, true), Contents:of], 0, true),
>>> > Contents:assessing], 0, true), Contents:the], 0, true),
>>> > Contents:adequacy],
>>> > 0, true), Contents:of], 0, true)*
>>> >
>>> > and search is not successful.
>>> >
>>> > If I remove '*evaluation*' from start OR *'assessing the adequacy of*'
>>> > from
>>> > end it works fine. Issue seems to come on relatively long phrases but I
>>> > have
>>> > not been able to find a pattern and its really mind boggling coz I
>>> > thought
>>> > this issue might be due to large position list but this is a single
>>> > document
>>> > with one phrase. So its definitely not related to size of index.
>>> >
>>> > Any ideas whats going on??
>>> >
>>> > On Thu, Feb 24, 2011 at 10:25 AM, Ahsan |qbal
>>> > <[email protected]>wrote:
>>> >
>>> >> Hi
>>> >>
>>> >> It didn't search.. (means no results found even results exist) one
>>> >> observation is that it works well even in the long phrases but when
>>> >> the long
>>> >> phrases contain stop words and same stop word exist two or more time
>>> >> in the
>>> >> phrase then, solr can't search with query parsed in this way.
>>> >>
>>> >>
>>> >> On Wed, Feb 23, 2011 at 11:49 PM, Otis Gospodnetic <
>>> >> [email protected]> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> What do you mean by "this doesn't work fine"? Does it not work
>>> >>> correctly
>>> >>> or is
>>> >>> it slow or ...
>>> >>>
>>> >>> I was going to suggest you look at Surround QP, but it looks like you
>>> >>> already
>>> >>> did that. Wouldn't it be better to get Surround QP to work?
>>> >>>
>>> >>> Otis
>>> >>> ----
>>> >>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>> >>> Lucene ecosystem search :: http://search-lucene.com/
>>> >>>
>>> >>>
>>> >>>
>>> >>> ----- Original Message ----
>>> >>>> From: Ahsan |qbal <[email protected]>
>>> >>>> To: [email protected]
>>> >>>> Sent: Tue, February 22, 2011 10:59:26 AM
>>> >>>> Subject: Question about Nested Span Near Query
>>> >>>>
>>> >>>> Hi All
>>> >>>>
>>> >>>> I had a requirement to implement queries that involves phrase
>>> >>> proximity.
>>> >>>> like user should be able to search "ab cd" w/5 "de fg", both
>>> >>>> phrases as
>>> >>>> whole should be with in 5 words of each other. For this I implement
>>> >>>> a
>>> >>> query
>>> >>>> parser that make use of nested span queries, so above query would
>>> >>>> be
>>> >>> parsed
>>> >>>> as
>>> >>>>
>>> >>>> spanNear([spanNear([Contents:ab, Contents:cd], 0, true),
>>> >>>> spanNear([Contents:de, Contents:fg], 0, true)], 5, false)
>>> >>>>
>>> >>>> Queries like this seems to work really good when phrases are small
>>> >>>> but
>>> >>> when
>>> >>>> phrases are large this doesn't work fine. Now my question, Is there
>>> >>>> any
>>> >>>> limitation of SpanNearQuery. that we cannot handle large phrases in
>>> >>> this
>>> >>>> way?
>>> >>>>
>>> >>>> please help
>>> >>>>
>>> >>>> Regards
>>> >>>> Ahsan
>>> >>>>
>>> >>>
>>> >>
>>> >>
>>
>
>