Re: Query for exact part of sentence

zarni aung Tue, 31 Jan 2012 09:27:47 -0800

Did you rebuild the index?  That would help since the index analyzer has
been changed.
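
One way to see why the index-time analyzer matters here is to model the
n-gram step on the two sample values from the original question. The sketch
below is a simplified Python model, not Solr code. It assumes whitespace
tokenization followed by character n-grams of length 3 to 15 (the
minGramSize/maxGramSize values in the quoted schema) and ignores the stop
word, word delimiter, lowercase, and stemming filters. The helper names
char_ngrams and index_terms are made up for illustration.

```python
def char_ngrams(token, min_size=3, max_size=15):
    """Return every contiguous substring of token whose length is in [min_size, max_size]."""
    grams = []
    for size in range(min_size, min(max_size, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def index_terms(text):
    """Model the index analyzer: split on whitespace, then n-gram each token."""
    terms = set()
    for token in text.split():
        terms.update(char_ngrams(token))
    return terms


# The two sample items from the original question (id -> content).
docs = {
    "1327497476": "123 456 789",
    "1327497521": "1234 5678 9011",
}

# The debug output shows the query parsed into separate terms, not a phrase.
query_terms = ["123", "456"]

# For each document, list the query terms found among its modelled index terms.
matches = {
    doc_id: [term for term in query_terms if term in index_terms(text)]
    for doc_id, text in docs.items()
}

for doc_id, hits in matches.items():
    print(doc_id, hits)
```

Under these assumptions the second item is indexed with the term 123 (a gram
of 1234), so a query parsed into the separate terms 123 and 456 can match it
on 123 alone. The admin/analysis page shows the real token stream for a given
field and is the place to confirm this.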


On Tue, Jan 31, 2012 at 9:53 AM, Arkadi Colson <ark...@smartbit.be> wrote:

> The text field in the schema configuration looks like this. I changed
> catenateNumbers to 0 but it still doesn't work as expected.
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> <analyzer type="index">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <!-- in this example, we will only use synonyms at query time
> <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
> ignoreCase="true" expand="false"/>
>        -->
> <!-- Case insensitive stop word removal.
>          add enablePositionIncrements=true in both the index and query
>          analyzers to leave a 'gap' for more accurate phrase queries.
>        -->
>
> <filter class="solr.StopFilterFactory"
>                ignoreCase="true"
>                words="stopwords_en.txt"
>                enablePositionIncrements="true"
>                />
> <filter class="solr.StopFilterFactory"
>                ignoreCase="true"
>                words="stopwords_du.txt"
>                enablePositionIncrements="true"
>                />
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="0" splitOnCaseChange="1"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt"/>
> <filter class="solr.SnowballPorterFilterFactory" language="Dutch" />
> <filter class="solr.NGramFilterFactory" minGramSize="3"
> maxGramSize="15"/>
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
> <filter class="solr.StopFilterFactory"
>                ignoreCase="true"
>                words="stopwords_en.txt"
>                enablePositionIncrements="true"
>                />
> <filter class="solr.StopFilterFactory"
>                ignoreCase="true"
>                words="stopwords_du.txt"
>                enablePositionIncrements="true"
>                />
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> catenateAll="0" splitOnCaseChange="1"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt"/>
> <filter class="solr.SnowballPorterFilterFactory" language="Dutch" />
> </analyzer>
> </fieldType>
>
>
>
> On 01/31/2012 03:03 PM, Erick Erickson wrote:
>
>> Unless you provide your schema configuration, there's
>> not much to go on here. Two things though:
>>
>> 1>  look at the admin/analysis page to see how your
>>      data is broken up into tokens.
>> 2>  at a guess you have WordDelimiterFilterFactory
>>      in your chain and perhaps catenateNumbers="1"
>>
>> Best
>> Erick
>>
>> On Mon, Jan 30, 2012 at 3:21 AM, Arkadi Colson <ark...@smartbit.be> wrote:
>>
>>> Hi
>>>
>>> I'm using the pecl PHP class to query SOLR and was wondering how to query
>>> for a part of a sentence exactly.
>>>
>>> There are 2 data items indexed in SOLR:
>>> 1327497476: 123 456 789
>>> 1327497521: 1234 5678 9011
>>>
>>> However, when running the query, both data items are returned, as you
>>> can see below. Any idea why?
>>>
>>> Thanks!
>>>
>>> SolrObject Object
>>> (
>>>    [responseHeader] =>    SolrObject Object
>>>        (
>>>            [status] =>    0
>>>            [QTime] =>    5016
>>>            [params] =>    SolrObject Object
>>>                (
>>>                    [debugQuery] =>    true
>>>                    [shards] =>
>>>  solr01:8983/solr,solr02:8983/solr,solr03:8983/solr
>>>                    [fl] =>
>>>  id,smsc_module,smsc_ssid,smsc_description,smsc_content,smsc_courseid,
>>>  smsc_date_created,smsc_date_edited,score,metadata_stream_size,
>>>  metadata_stream_source_info,metadata_stream_name,
>>>  metadata_stream_content_type,last_modified,author,title,subject
>>>                    [sort] =>    smsc_date_created asc
>>>                    [indent] =>    on
>>>                    [start] =>    0
>>>                    [q] =>    (smsc_content:\"123 456\" ||
>>> smsc_description:\"123 456\")&&    (smsc_module:Intradesk)&&
>>>  (smsc_date_created:[2011-12-25T10:29:51Z TO NOW])&&    (smsc_ssid:38)
>>>                    [distrib] =>    true
>>>                    [wt] =>    xml
>>>                    [version] =>    2.2
>>>                    [rows] =>    55
>>>                )
>>>
>>>        )
>>>
>>>    [response] =>    SolrObject Object
>>>        (
>>>            [numFound] =>    2
>>>            [start] =>    0
>>>            [docs] =>    Array
>>>                (
>>>                    [0] =>    SolrObject Object
>>>                        (
>>>                            [smsc_module] =>    Intradesk
>>>                            [smsc_ssid] =>    38
>>>                            [id] =>    1327497476
>>>                            [smsc_courseid] =>    0
>>>                            [smsc_date_created] =>    2011-12-25T10:29:51Z
>>>                            [smsc_date_edited] =>    2011-12-25T10:29:51Z
>>>                            [score] =>    10.028017
>>>                        )
>>>
>>>                    [1] =>    SolrObject Object
>>>                        (
>>>                            [smsc_module] =>    Intradesk
>>>                            [smsc_ssid] =>    38
>>>                            [id] =>    1327497521
>>>                            [smsc_courseid] =>    0
>>>                            [smsc_date_created] =>    2011-12-25T10:29:51Z
>>>                            [smsc_date_edited] =>    2011-12-25T10:29:51Z
>>>                            [score] =>    5.541335
>>>                        )
>>>
>>>                )
>>>
>>>        )
>>>    [debug] =>    SolrObject Object
>>>        (
>>>            [rawquerystring] =>    (smsc_content:\"123 456\" ||
>>> smsc_description:\"123 456\")&&    (smsc_module:Intradesk)&&
>>>  (smsc_date_created:[2011-12-25T10:29:51Z TO NOW])&&    (smsc_ssid:38)
>>>            [querystring] =>    (smsc_content:\"123 456\" ||
>>> smsc_description:\"123 456\")&&    (smsc_module:Intradesk)&&
>>>  (smsc_date_created:[2011-12-25T10:29:51Z TO NOW])&&    (smsc_ssid:38)
>>>            [parsedquery] =>    +(smsc_content:123 smsc_content:456
>>> smsc_description:123 smsc_content:456) +smsc_module:intradesk
>>> +smsc_date_created:[2011-12-25T10:29:51Z TO 2012-01-25T13:33:21.098Z]
>>> +smsc_ssid:38
>>>            [parsedquery_toString] =>    +(smsc_content:123
>>> smsc_content:456
>>> smsc_description:123 smsc_content:456) +smsc_module:intradesk
>>> +smsc_date_created:[2011-12-25T10:29:51 TO 2012-01-25T13:33:21.098]
>>> +smsc_ssid:`#8;#0;#0;#0;&
>>>            [QParser] =>    LuceneQParser
>>>            [timing] =>    SolrObject Object
>>>
>>>
>>
> --
> Smartbit bvba
> Hoogstraat 13
> B-3670 Meeuwen
> T: +32 11 64 08 80
> F: +32 89 46 81 10
> W: http://www.smartbit.be
> E: ark...@smartbit.be
>
>
