RE: Search a Phrase

Pragyanshis Pattanaik Sun, 10 Feb 2013 23:20:05 -0800

Thanks, Upayavira. Really helpful.

Cheers,
Pragyanshis


> From: u...@odoko.co.uk
> To: solr-user@lucene.apache.org
> Subject: Re: Search a Phrase
> Date: Sun, 10 Feb 2013 20:39:23 +0000
> 
> If you have a field of type 'text_general', searching for:
> 
>   q=good microwave
> 
> will find any documents with either 'good' or 'microwave' in them
> (assuming the default query operator is OR).
> 
> Searching for:
> 
>   q="good microwave"
> 
> will find any documents that contain both terms next to each other.
> 
>   q="good microwave"^5 good microwave
> 
> will find any documents that contain either term, but will boost
> documents that contain the terms next to each other above those that
> don't.
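
The three query forms above can be sketched as request URLs. This is a minimal illustration, not a definitive client: the Solr host, the core name "products", and the helper names are assumptions made for the example, and SearchableField1 is used as the default field only because it is the text_general copy field described later in this thread.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance with a core named "products".
SOLR_SELECT = "http://localhost:8983/solr/products/select"


def select_url(q: str, default_field: str = "SearchableField1") -> str:
    """Return a URL-encoded Solr select URL for the query string q."""
    params = {"q": q, "df": default_field, "fl": "*,score", "wt": "json"}
    return SOLR_SELECT + "?" + urlencode(params)


def boosted_phrase_query(terms: str, boost: int = 5) -> str:
    """Combine a boosted phrase clause with the loose term clauses."""
    return f'"{terms}"^{boost} {terms}'


# 1. Either term may match (with the default OR operator).
any_term = select_url("good microwave")

# 2. Both terms must appear next to each other.
phrase = select_url('"good microwave"')

# 3. Either term may match; adjacent matches are boosted.
boosted = select_url(boosted_phrase_query("good microwave"))

for url in (any_term, phrase, boosted):
    print(url)
```

Requesting the score in the field list (fl=*,score) makes it easy to compare how the same documents rank under each form.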
> 
> Note also, when Lucene scores a document, it uses a 'co-ordination
> factor' which takes into account the number of query terms that matched
> your document. Thus, a document matching both terms will score more
> highly than a document only matching one of them.
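
As a rough sketch of the co-ordination factor described above: in Lucene's classic TF-IDF similarity, it is the fraction of the query's optional clauses that a document matched, and it multiplies the rest of the score. The function below is a standalone illustration written for this note, not a call into Lucene itself, and it ignores every other scoring component.

```python
def coord(overlap: int, max_overlap: int) -> float:
    """Fraction of the query's terms that a document matched."""
    return overlap / max_overlap


# Two-term query: good microwave
both_terms = coord(2, 2)  # document contains 'good' and 'microwave'
one_term = coord(1, 2)    # document contains only one of the two

print(both_terms, one_term)
```

All else being equal, the document that matches only one of the two terms has its score halved relative to one that matches both.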
> 
> So, a part of the question is whether you wish to *only* show documents
> that include both terms, or whether you are happy for good matches to be
> prioritised.
> 
> Upayavira
> 
> On Sun, Feb 10, 2013, at 05:27 PM, Pragyanshis Pattanaik wrote:
> > Hi,
> > I tried a workaround to get all documents that contain "Good" or
> > "Microwave" or "Good Microwave" when I pass "Good Microwave" as the q
> > parameter. Please tell me whether I am going in the right direction.
> > I defined two field types (text_general and shingleString) in my schema,
> > as shown below:
> > <fieldType name="text_general" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="stopwords.txt" enablePositionIncrements="true"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="stopwords.txt" enablePositionIncrements="true"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > <fieldType name="shingleString" class="solr.TextField"
> >            positionIncrementGap="100" omitNorms="true">
> >   <analyzer type="index">
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.PositionFilterFactory"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> >     <filter class="solr.ShingleFilterFactory" outputUnigrams="true"
> >             outputUnigramIfNoNgram="true" maxShingleSize="99"/>
> >     <filter class="solr.PositionFilterFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> > Then, while indexing, I copy these fields into two different copy
> > fields, as shown below:
> > <field name="SearchableField" type="shingleString" indexed="true"
> >        stored="false" multiValued="true"/>
> > <copyField source="ProductName" dest="SearchableField"/>
> > <copyField source="ProductDesription" dest="SearchableField"/>
> > <copyField source="Product Feedback" dest="SearchableField"/>
> >
> > <field name="SearchableField1" type="text_general" indexed="true"
> >        stored="false" multiValued="true"/>
> > <copyField source="ProductName" dest="SearchableField1"/>
> > <copyField source="ProductDesription" dest="SearchableField1"/>
> > <copyField source="Product Feedback" dest="SearchableField1"/>
> > Now, if I query both fields, SearchableField and SearchableField1, I
> > get all the documents that contain "Good" or "Microwave" or "Good
> > Microwave". Below is the query I am using to get all the documents:
> >
> >   q=SearchableField%3AGood+Microwave%0ASearchableField1%3AGood+Microwave
> >
> > But the documents containing the whole phrase "Good Microwave" are
> > getting a very low score. Can anybody guide me on how to get a higher
> > score for the documents that contain the whole phrase, if my approach
> > is correct at all?
> > Or can anybody guide me on another way to achieve this?
> >
> > Thanks,
> > Pragyanshis
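
Decoding the URL-encoded q parameter above makes its structure easier to see. In the standard Lucene query syntax, a field prefix binds only to the term (or parenthesised group) immediately after it, so in this query "Microwave" is not qualified by either field and is searched in the default field instead. The sketch below only decodes and regroups the string; the helper name qualify is hypothetical, and the regrouped query is one possible reading of the intent, not the poster's actual query.

```python
from urllib.parse import unquote_plus

# The q value exactly as it appears in the message above.
raw_q = "SearchableField%3AGood+Microwave%0ASearchableField1%3AGood+Microwave"

# '+' decodes to a space, %3A to ':', and %0A to a newline.
decoded = unquote_plus(raw_q)

# Whitespace-separated clauses as the query parser would see them.
clauses = decoded.split()
print(clauses)


def qualify(field: str, terms: str) -> str:
    """Group the terms so the field prefix applies to all of them."""
    return f"{field}:({terms})"


# Hypothetical regrouping that applies each field to both terms.
regrouped = " ".join(
    qualify(field, "Good Microwave")
    for field in ("SearchableField", "SearchableField1")
)
print(regrouped)
```

Only the two "Good" clauses carry a field prefix in the decoded output; the regrouped form uses parentheses so that both terms are searched in each field.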
> > > From: pragyans...@outlook.com
> > > To: solr-user@lucene.apache.org
> > > Subject: Search a Phrase
> > > Date: Thu, 7 Feb 2013 19:29:04 +0530
> > > 
> > > 
> > > 
> > > 
> > > Hi,
> > > 
> > > My schema is like below
> > > 
> > > <fields>   
> > >     <field name="ProductId" type="int" indexed="true" stored="true" />    
> > >     <field name="ProductName" type="text_general" indexed="true" 
> > > stored="true" required="true" />
> > >     <field name="ProductDesription" type="string" indexed="true" 
> > > stored="true" required="true" />
> > >     <field name="Product Rating" type="int" indexed="true" stored="true" 
> > > required="true" />
> > >     <field name="Product Feedback" type="text_general" indexed="true" 
> > > stored="true" required="true" />
> > > </fields>
> > > 
> > > and my text_general field is like below
> > > 
> > > <fieldType name="text_general" class="solr.TextField" 
> > > positionIncrementGap="100">
> > >       <analyzer type="index">
> > >         <tokenizer class="solr.StandardTokenizerFactory"/>
> > >         <filter class="solr.StopFilterFactory" ignoreCase="true" 
> > > words="stopwords.txt" enablePositionIncrements="true" />        
> > >         <filter class="solr.LowerCaseFilterFactory"/>
> > >       </analyzer>
> > >       <analyzer type="query">
> > >         <tokenizer class="solr.StandardTokenizerFactory"/>
> > >         <filter class="solr.StopFilterFactory" ignoreCase="true" 
> > > words="stopwords.txt" enablePositionIncrements="true" />
> > >         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" 
> > > ignoreCase="true" expand="true"/>
> > >         <filter class="solr.LowerCaseFilterFactory"/>
> > >       </analyzer>
> > >     </fieldType>
> > > 
> > > How can I search for a phrase ("Good Microwave") over the
> > > ProductDesription and Product Feedback fields?
> > > Here, some documents might contain only "Good" and some might contain
> > > only "Microwave".
> > > 
> > > How do I get all documents that contain "Good" or "Microwave" or "Good
> > > Microwave" if I pass "Good Microwave" as the q parameter?
> > > 
> > > 
> > > 
> > > Thanks in advance
> > > 
> > > 
> > >                                     
> >
