Help with highlighting

2010-06-22 Thread noel
Hi, I need help highlighting the fields that match a query. So far, my 
results are only highlighted when the match is in all_text, and I would like 
other fields to be highlighted too. Simply turning highlighting on doesn't do 
that. Any ideas why it only applies to all_text? Here is my schema:

unique_key   
all_text

Re: Help with highlighting

2010-06-23 Thread noel
Here's my request:
q=ASA+AND+minisite_id%3A36&version=1.3&json.nl=map&rows=10&start=0&wt=json&hl=true&hl.fl=%2A&hl.simple.pre=%3Cspan+class%3D%22hl%22%3E&hl.simple.post=%3C%2Fspan%3E&hl.fragsize=0&hl.mergeContiguous=false

And here's what happened:
It didn't return any highlighted results, even when I used an asterisk for the 
fields to highlight. I tried naming other fields and that didn't work either; 
all_text is the only one that works. Any other ideas why the other fields won't 
highlight? Thanks.
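
For readers following along, here is a minimal sketch (Python standard library only, nothing Solr-specific) that decodes the request above into its individual parameters. It does not contact a server; it only shows what each URL-encoded value stands for, for example that hl.fl=%2A means hl.fl=* and that a '+' in the URL is an encoded space rather than a Lucene operator.

```python
from urllib.parse import parse_qs

# The request string exactly as posted above, split only for readability.
request = (
    "q=ASA+AND+minisite_id%3A36&version=1.3&json.nl=map&rows=10&start=0"
    "&wt=json&hl=true&hl.fl=%2A"
    "&hl.simple.pre=%3Cspan+class%3D%22hl%22%3E"
    "&hl.simple.post=%3C%2Fspan%3E&hl.fragsize=0&hl.mergeContiguous=false"
)

# parse_qs percent-decodes every value and turns '+' into a space.
# Each parameter occurs once here, so keep only the first value.
params = {name: values[0] for name, values in parse_qs(request).items()}

for name, value in params.items():
    print(f"{name} = {value}")
```

Decoded this way, the query is "ASA AND minisite_id:36" and the highlighter is asked for every field, wrapped in a span with class "hl".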

-Original Message-
From: "Erik Hatcher" 
Sent: Tuesday, June 22, 2010 9:49pm
To: solr-user@lucene.apache.org
Subject: Re: Help with highlighting

You need to share with us the Solr request you made, and any custom
request handler settings that it might map to. Chances are you just need
to twiddle with the highlighter parameters (see wiki for docs) to get
it to do what you want.

Erik

On Jun 22, 2010, at 4:42 PM, n...@frameweld.com wrote:

> Hi, I need help with highlighting fields that would match a query.  
> So far, my results only highlight if the field is from all_text, and  
> I would like it to use other fields. It simply isn't the case if I  
> just turn highlighting on. Any ideas why it only applies to  
> all_text? Here is my schema:
>
> 
>
> 
>   
>   
>   
>   
>sortMissingLast="true" omitNorms="true" />
>sortMissingLast="true" omitNorms="true" />
>   
>   
>omitNorms="true"/>
>
>   
>omitNorms="true"/>
>omitNorms="true"/>
>   
>   
>sortMissingLast="true" omitNorms="true"/>
>sortMissingLast="true" omitNorms="true"/>
>sortMissingLast="true" omitNorms="true"/>
>sortMissingLast="true" omitNorms="true"/>
>   
>   
>
>sortMissingLast="true" omitNorms="true"/>
>   
>   
>indexed="true" />
>   
>   
>positionIncrementGap="100">
>   
>class="solr.WhitespaceTokenizerFactory"/>
>   
>   
>
>   
>positionIncrementGap="100">
>   
>class="solr.WhitespaceTokenizerFactory"/>
>   
> 
> generateWordParts="1" generateNumberParts="1" catenateWords="1"  
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>   
> 
> protected="protwords.txt"/>
>class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   
>
>   
>class="solr.WhitespaceTokenizerFactory"/>
>synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> 
> generateWordParts="1" generateNumberParts="1" catenateWords="0"  
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>   
> 
> protected="protwords.txt"/>
>class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   
>   
>
>   
>positionIncrementGap="100" >
>   
>class="solr.WhitespaceTokenizerFactory"/>
>synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
>ignoreCase="true"  
> words="stopwords.txt"/>
> 
> generateWordParts="0" generateNumberParts="0" catenateWords="1"  
> catenateNumbers="1" catenateAll="0"/>
>   
> 
> protected="protwords.txt"/>
>class="solr.RemoveDuplicatesTokenFilterFactory"/>
>
>   
>   
>   
>positionIncrementGap="100" >
>   
>class="solr.StandardTokenizerFactory"/>
>   
>class="solr.RemoveDuplicatesTokenFilterFactory" />
>maxShingleSize="2"  
> outputUnigrams="false" />
>   
>
>   
>   
>sortMissingLast="true" omitNorms="true">
>   
>class="solr.KeywordTokenizerFactory"/>
>   
>   
>  pattern="([^a-z])" replacement="" 
> replace="all"
>   />
>   
>   
>
>class="solr.StrField" />
>

Re: Help with highlighting

2010-06-23 Thread noel
Thanks, that's exactly the problem. I've tried different types, even a 
fieldType that had no tokenizers, and that didn't work. However, the text type 
gives me the results I wanted.

-Original Message-
From: "dan sutton" 
Sent: Wednesday, June 23, 2010 12:06pm
To: solr-user@lucene.apache.org
Subject: Re: Help with highlighting

It looks to me like a tokenisation issue: all_text content and the query
text will match, but the string fieldtype fields 'might not', and therefore
will not be highlighted.
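
A rough illustration of the tokenisation point, in plain Python rather than Solr. It assumes a hypothetical field value and two deliberately simplified analyzers: one that keeps the whole value as a single token (roughly what a string field does) and one that splits on whitespace and lowercases (a much-reduced stand-in for the text type in the schema, whose real chain has more filters). Treat it as a sketch of the idea, not of the highlighter's actual code path.

```python
def string_analyzer(value):
    # A string field is indexed verbatim, as one single token.
    return [value]


def text_analyzer(value):
    # Simplified stand-in for a tokenized text field:
    # split on whitespace, then lowercase every token.
    return [token.lower() for token in value.split()]


def has_match(field_tokens, query_tokens):
    # Simplification: a field is a highlighting candidate only if
    # some query token is equal to one of its indexed tokens.
    return any(token in field_tokens for token in query_tokens)


field_value = "ASA annual meeting"   # hypothetical stored value
query_tokens = text_analyzer("ASA")  # the query is analyzed like all_text

print(has_match(text_analyzer(field_value), query_tokens))    # True
print(has_match(string_analyzer(field_value), query_tokens))  # False
```

The same stored value matches when tokenized but not when kept as one string token, which fits the behaviour reported in this thread.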


Sorting not working on a string field

2010-09-10 Thread noel
Hello, I seem to be having a problem with sorting. I have a string field 
(time_code) that I want to order by. When the results come back, their order 
differs from the relevance order, as I would expect, but they still aren't 
sorted by time_code. The data in time_code came from a numeric decimal with 
six-digit precision, if that makes a difference (ex: 1.00).

Here is the query I give it:

q=ceremony+AND+presentation_id%3A296+AND+type%3Ablob&version=1.3&json.nl=map&rows=10&start=0&wt=json&hl=true&hl.fl=text&hl.simple.pre=&hl.simple.post=<%2Fspan>&hl.fragsize=0&hl.mergeContiguous=false&&sort=time_code+asc


And here's the field schema:

Thanks for any help.



Re: Sorting not working on a string field

2010-09-13 Thread noel
You're right, it would be better to just give it a sortable numerical value. 
For now I gave time_code an sdouble type to see if it sorted, and it did. 
However, all the 0's are trimmed, but that shouldn't be a problem unless it were 
to truncate any values past the hundreds column.

Thanks.
- Noel

-Original Message-
From: "Jan Høydahl / Cominvent" 
Sent: Monday, September 13, 2010 5:31am
To: solr-user@lucene.apache.org
Subject: Re: Sorting not working on a string field

Hi,

Could you show us what result you actually get? Wouldn't it make more sense to 
choose a numeric fieldtype? To get a proper sort order for numbers in a string 
field, all numbers need to be exactly the same length, since the order will be 
lexicographical, i.e. "10" will come before "2", but after "02".

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
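
To illustrate the point about lexicographical order, a small Python sketch with hypothetical time_code values (the real data is not shown in the thread). It sorts the same values three ways: as plain strings, as numbers, and as strings zero-padded to a common width.

```python
# Hypothetical time codes stored as strings.
time_codes = ["10.000000", "2.000000", "1.500000"]

# Plain string sort compares character by character.
as_strings = sorted(time_codes)

# Numeric sort converts each value to a float for comparison only.
as_numbers = sorted(time_codes, key=float)

# Zero-padding to one common width makes string order agree with numeric order.
width = max(len(code) for code in time_codes)
padded = sorted(code.zfill(width) for code in time_codes)

print(as_strings)  # ['1.500000', '10.000000', '2.000000']
print(as_numbers)  # ['1.500000', '2.000000', '10.000000']
print(padded)      # ['01.500000', '02.000000', '10.000000']
```

Either a numeric field type or fixed-width padding gives the expected order; the unpadded string sort does not.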

On 10. sep. 2010, at 19.14, n...@frameweld.com wrote:

> Hello, I seem to be having a problem with sorting. I have a string field 
> (time_code) that I want to order by. When the results come up, it displays 
> the results differently from relevance which I would assume, but the results 
> aren't ordered. The data in time_code came from a numeric decimal with a six 
> digit precision if that makes a difference(ex: 1.00).
> 
> Here is the query I give it:
> 
> q=ceremony+AND+presentation_id%3A296+AND+type%3Ablob&version=1.3&json.nl=map&rows=10&start=0&wt=json&hl=true&hl.fl=text&hl.simple.pre=&hl.simple.post=<%2Fspan>&hl.fragsize=0&hl.mergeContiguous=false&&sort=time_code+asc
> 
> 
> And here's the field schema:
> 
> 
> 
> 
>  multiValued="true"/>
> 
> 
> 
>  allowDups="true" multiValued="true"/>
> 
>  allowDups="true"/>
> 
> 
> Thanks for any help.
> 





Searching solr with a two word query

2010-09-17 Thread noel
For some reason, when I run a query that has only two words in it, I get back 
repeated results for the last word only. If I search for something like 
"good tonight", I'll get results like:

good tonight
tonight good
tonight
tonight
tonight
tonight
tonight
tonight


Basically, the first word does have results when it is searched alone, but it 
doesn't appear anywhere else in the results unless it occurs together with the 
second word. I'm not exactly sure what this has to do with; help would be 
appreciated.



Re: Searching solr with a two word query

2010-09-20 Thread noel
Here is my raw query:
q=opening+excellent+AND+presentation_id%3A294+AND+type%3Ablob&version=1.3&json.nl=map&rows=10&start=0&wt=xml&hl=true&hl.fl=text&hl.simple.pre=&hl.simple.post=<%2Fspan>&hl.fragsize=0&hl.mergeContiguous=false&debugQuery=on

and here is what I get on the debugQuery:

opening excellent AND presentation_id:294 AND type:blob

opening excellent AND presentation_id:294 AND type:blob

all_text:open +all_text:excel +presentation_id:294 +type:blob

all_text:open +all_text:excel +presentation_id:€#0;Ħ +type:blob

3.1143723 = (MATCH) sum of:
  0.46052343 = (MATCH) weight(all_text:open in 4457), product of:
    0.5531408 = queryWeight(all_text:open), product of:
      5.3283896 = idf(docFreq=162, maxDocs=12359)
      0.10381013 = queryNorm
    0.8325609 = (MATCH) fieldWeight(all_text:open in 4457), product of:
      1.0 = tf(termFreq(all_text:open)=1)
      5.3283896 = idf(docFreq=162, maxDocs=12359)
      0.15625 = fieldNorm(field=all_text, doc=4457)
  0.74662465 = (MATCH) weight(all_text:excel in 4457), product of:
    0.7043054 = queryWeight(all_text:excel), product of:
      6.7845535 = idf(docFreq=37, maxDocs=12359)
      0.10381013 = queryNorm
    1.0600865 = (MATCH) fieldWeight(all_text:excel in 4457), product of:
      1.0 = tf(termFreq(all_text:excel)=1)
      6.7845535 = idf(docFreq=37, maxDocs=12359)
      0.15625 = fieldNorm(field=all_text, doc=4457)
  1.7987071 = (MATCH) weight(presentation_id:€#0;Ħ in 4457), product of:
    0.43211576 = queryWeight(presentation_id:€#0;Ħ), product of:
      4.1625586 = idf(docFreq=522, maxDocs=12359)
      0.10381013 = queryNorm
    4.1625586 = (MATCH) fieldWeight(presentation_id:€#0;Ħ in 4457), product of:
      1.0 = tf(termFreq(presentation_id:€#0;Ħ)=1)
      4.1625586 = idf(docFreq=522, maxDocs=12359)
      1.0 = fieldNorm(field=presentation_id, doc=4457)
  0.108517066 = (MATCH) weight(type:blob in 4457), product of:
    0.10613751 = queryWeight(type:blob), product of:
      1.0224196 = idf(docFreq=12084, maxDocs=12359)
      0.10381013 = queryNorm
    1.0224196 = (MATCH) fieldWeight(type:blob in 4457), product of:
      1.0 = tf(termFreq(type:blob)=1)
      1.0224196 = idf(docFreq=12084, maxDocs=12359)
      1.0 = fieldNorm(field=type, doc=4457)

2.06395 = (MATCH) product of:
  2.7519336 = (MATCH) sum of:
    0.84470934 = (MATCH) weight(all_text:excel in 4911), product of:
      0.7043054 = queryWeight(all_text:excel), product of:
        6.7845535 = idf(docFreq=37, maxDocs=12359)
        0.10381013 = queryNorm
      1.199351 = (MATCH) fieldWeight(all_text:excel in 4911), product of:
        1.4142135 = tf(termFreq(all_text:excel)=2)
        6.7845535 = idf(docFreq=37, maxDocs=12359)
        0.125 = fieldNorm(field=all_text, doc=4911)
    1.7987071 = (MATCH) weight(presentation_id:€#0;Ħ in 4911), product of:
      0.43211576 = queryWeight(presentation_id:€#0;Ħ), product of:
        4.1625586 = idf(docFreq=522, maxDocs=12359)
        0.10381013 = queryNorm
      4.1625586 = (MATCH) fieldWeight(presentation_id:€#0;Ħ in 4911), product of:
        1.0 = tf(termFreq(presentation_id:€#0;Ħ)=1)
        4.1625586 = idf(docFreq=522, maxDocs=12359)
        1.0 = fieldNorm(field=presentation_id, doc=4911)
    0.108517066 = (MATCH) weight(type:blob in 4911), product of:
      0.10613751 = queryWeight(type:blob), product of:
        1.0224196 = idf(docFreq=12084, maxDocs=12359)
        0.10381013 = queryNorm
      1.0224196 = (MATCH) fieldWeight(type:blob in 4911), product of:
        1.0 = tf(termFreq(type:blob)=1)
        1.0224196 = idf(docFreq=12084, maxDocs=12359)
        1.0 = fieldNorm(field=type, doc=4911)
  0.75 = coord(3/4)

1.9903867 = (MATCH) product of:
  2.653849 = (MATCH) sum of:
    0.74662465 = (MATCH) weight(all_text:excel in 4468), product of:
      0.7043054 = queryWeight(all_text:excel), product of:
        6.7845535 = idf(docFreq=37, maxDocs=12359)
        0.10381013 = queryNorm
      1.0600865 = (MATCH) fieldWeight(all_text:excel in 4468), product of:
        1.0 = tf(termFreq(all_text:excel)=1)
        6.7845535 = idf(docFreq=37, maxDocs=12359)
        0.15625 = fieldNorm(field=all_text, doc=4468)
    1.7987071 = (MATCH) weight(presentation_id:€#0;Ħ in 4468), product of:
      0.43211576 = queryWeight(presentation_id:€#0;Ħ), product of:
        4.1625586 = idf(docFreq=522, maxDocs=12359)
        0.10381013 = queryNorm
      4.1625586 = (MATCH) fieldWeight(presentation_id:€#0;Ħ in 4468), product of:
        1.0 = tf(termFreq(presentation_id:€#0;Ħ)=1)
        4.1625586 = idf(docFreq=522, maxDocs=12359)
        1.0 = fieldNorm(field=presentation_id, doc=4468)
    0.108517066 = (MATCH) weight(type:blob in 4468), product of:
      0.10613751 = queryWeight(type:blob), product of:
        1.0224196 = idf(docFreq=12084, maxDocs=12359)
        0.10381013 = queryNorm
      1.0224196 = (MATCH) fieldWeight(type:blob in 4468), product of:
        1.0 = tf(termFreq(type:blob)=1)
        1.0224196 = idf(docFreq=12084, m
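
The arithmetic in the explain output above can be reproduced by hand. The following Python sketch is only a re-computation from the printed numbers, not Solr or Lucene code: the function names are made up for illustration, and the formula (queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, tf = square root of termFreq, final score scaled by coord) is read off the output itself.

```python
import math

QUERY_NORM = 0.10381013  # queryNorm, the same for every clause in the output


def clause_weight(idf, term_freq, field_norm):
    # weight = queryWeight * fieldWeight, as in each "weight(...)" entry.
    query_weight = idf * QUERY_NORM
    field_weight = math.sqrt(term_freq) * idf * field_norm
    return query_weight * field_weight


def score(matched_clauses, total_clauses):
    # coord is the fraction of query clauses the document matched.
    coord = len(matched_clauses) / total_clauses
    return coord * sum(clause_weight(*clause) for clause in matched_clauses)


# (idf, termFreq, fieldNorm) copied from the explain entries.
doc_4457 = [
    (5.3283896, 1, 0.15625),  # all_text:open
    (6.7845535, 1, 0.15625),  # all_text:excel
    (4.1625586, 1, 1.0),      # presentation_id
    (1.0224196, 1, 1.0),      # type:blob
]
doc_4911 = [
    (6.7845535, 2, 0.125),    # all_text:excel (all_text:open did not match)
    (4.1625586, 1, 1.0),      # presentation_id
    (1.0224196, 1, 1.0),      # type:blob
]

print(round(score(doc_4457, 4), 4))  # compare with 3.1143723
print(round(score(doc_4911, 4), 4))  # compare with 2.06395
```

Document 4457 matches all four clauses, while 4911 matches only three of them (no all_text:open), which is where the 0.75 coord factor comes from.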

Re: Searching solr with a two word query

2010-09-20 Thread noel
I noticed that my defaultOperator is "OR", and that does have an effect on what 
comes up. If I change it to AND, the results match my query exactly, but I 
would also like similar matches with either word as a single result. Is there 
another value I can use? Or maybe I should use another query parser?

Thanks.
- Noel

-Original Message-
From: "Erick Erickson" 
Sent: Monday, September 20, 2010 10:05am
To: solr-user@lucene.apache.org
Subject: Re: Searching solr with a two word query

Here's an excellent description of the Lucene query operators and how they
differ from strict boolean logic:
http://www.gossamer-threads.com/lists/lucene/java-user/47928

But the short form (and boy, doesn't the fact that URL escaping turns spaces
into '+', which is also a Lucene operator, make looking at these interesting)
is that the first term is essentially a SHOULD clause in a Lucene
BooleanQuery and is matching your docs all by itself.

HTH
Erick


Re: Searching solr with a two word query

2010-09-20 Thread noel
Say if I had a two word query that was "opening excellent", I would like it to 
return something like:

opening excellent
opening
opening
opening
excellent
excellent
excellent

Instead of:
opening excellent
excellent
excellent
excellent

If I did a search, I would like the first word alone to also show up in the 
results, because currently my results show both words in one result and only 
the second word for the rest of the results. I've done a search on each word by 
itself, and there are results for them.

Thanks.

-Original Message-
From: "Erick Erickson" 
Sent: Monday, September 20, 2010 2:37pm
To: solr-user@lucene.apache.org
Subject: Re: Searching solr with a two word query

I'm missing what you really want out of your query; your
phrase "either word as a single result" just isn't connecting
in my grey matter. Could you give some example inputs and
outputs that demonstrate what you want?

Best
Erick


Re: Searching solr with a two word query

2010-09-21 Thread noel
Alright, this is making much more sense now, but there are still some problems. 
Removing the first AND in the query did solve a lot of things; beforehand I 
didn't know that it was requiring the word excel. Now I run the query as either 
of the two:

opening excellent +presentation_id:294 +type:blob
opening excellent presentation_id:294 AND type:blob

I do get the results I want, which have both words or either one, BUT I also get 
results that have neither. I think it may be grabbing every result that belongs 
to the specific presentation_id and is of type blob.

What I want is results that have both words, or one or the other.

Thanks,
- Noel

-Original Message-
From: "Tom Hill" 
Sent: Monday, September 20, 2010 6:39pm
To: solr-user@lucene.apache.org
Subject: Re: Searching solr with a two word query

It will probably be clearer if you don't use the pseudo-boolean
operators, and just use + for required terms.

If you look at your output from debug, you see your query becomes:

    all_text:open +all_text:excel +presentation_id:294 +type:blob

Note that "all_text:open" does not have a + sign, but
"all_text:excel" has one. So "all_text:open" is not required, but
"all_text:excel" is.

I think this is because AND marks both of its operands as required
(which puts the + on "all_text:excel"), but "all_text:open" has no explicit
operator, so it uses the default OR, which marks that term as optional.

What I would suggest you do is:

   opening excellent +presentation_id:294 +type:blob

which I think is much clearer.

I think you could also do
  opening excellent presentation_id:294 AND type:blob
but I think it's non-obvious how the result will differ from
  opening excellent AND presentation_id:294 AND type:blob
so I wouldn't use either of the last two.


Tom
p.s. Not sure what is going on with the last lines of your debug
output for the query. Is that really what shows up after presentation
ID? I see Euro, hash mark, zero, semi-colon, and "H with stroke"


all_text:open +all_text:excel +presentation_id:€#0;Ħ +type:blob
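
The rule described above (a bare term is optional under the default OR, while AND marks the terms on both sides of it as required) can be sketched in a few lines of Python. This is a toy model written for this thread, not the Lucene query parser: the function names are hypothetical, and OR, NOT, '-', phrases, parentheses, field defaults and stemming are all ignored.

```python
def mark_clauses(query):
    """Return (term, required) pairs for a whitespace-separated query."""
    clauses = []
    require_next = False
    for token in query.split():
        if token == "AND":
            # AND makes the term before it required...
            if clauses:
                term, _ = clauses[-1]
                clauses[-1] = (term, True)
            # ...and the term after it as well.
            require_next = True
            continue
        required = require_next or token.startswith("+")
        clauses.append((token.lstrip("+"), required))
        require_next = False
    return clauses


def render(clauses):
    # Write required clauses with a leading '+', optional ones without.
    return " ".join(("+" if required else "") + term for term, required in clauses)


print(render(mark_clauses("opening excellent AND presentation_id:294 AND type:blob")))
print(render(mark_clauses("opening excellent presentation_id:294 AND type:blob")))
```

Under this model the first query comes out as "opening +excellent +presentation_id:294 +type:blob", which has the same required and optional pattern as the parsed query in the debug output; the second leaves both words optional.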



Re: Searching solr with a two word query

2010-09-21 Thread noel
Thanks for all the help; this was exactly the solution I needed. It returned 
all results containing both words, as well as results containing a single word.

-Original Message-
From: "Thomas Joiner" 
Sent: Tuesday, September 21, 2010 2:53pm
To: solr-user@lucene.apache.org
Subject: Re: Searching solr with a two word query

I think what you want is a query like

all_text:(opening excellent) AND presentation_id:294 AND type:blob

which will require one of the all_text clauses to be true.
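
Why the grouping helps can be shown with a small model of boolean matching, written in Python for illustration only. The BoolQuery class and matches function below are hypothetical stand-ins, not Lucene's classes; the assumption they encode is that optional clauses are only needed for a match when a query has no required clauses at all.

```python
from dataclasses import dataclass, field


@dataclass
class BoolQuery:
    must: list = field(default_factory=list)    # every clause here has to match
    should: list = field(default_factory=list)  # optional clauses


def matches(query, doc_terms):
    # A plain string is a single term clause.
    if isinstance(query, str):
        return query in doc_terms
    if not all(matches(clause, doc_terms) for clause in query.must):
        return False
    if query.must:
        # With required clauses present, optional ones only affect ranking.
        return True
    # Without required clauses, at least one optional clause has to match.
    return any(matches(clause, doc_terms) for clause in query.should)


# opening excellent +presentation_id:294 +type:blob
flat = BoolQuery(must=["presentation_id:294", "type:blob"],
                 should=["opening", "excellent"])

# all_text:(opening excellent) AND presentation_id:294 AND type:blob
grouped = BoolQuery(must=[BoolQuery(should=["opening", "excellent"]),
                          "presentation_id:294", "type:blob"])

neither = {"presentation_id:294", "type:blob", "welcome"}
one_word = {"presentation_id:294", "type:blob", "opening"}

print(matches(flat, neither))      # True: both words are merely optional
print(matches(grouped, neither))   # False: the group itself is required
print(matches(grouped, one_word))  # True: one of the two words is enough
```

The flat form accepts documents containing neither word, which is the behaviour described earlier in the thread, while the grouped form requires at least one of them.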


Questions about Solr

2010-04-09 Thread noel
Hi, I would like to know the answer to the following:

- How can I use wildcard searches with Solr? Example: querying "Ado" and 
retrieving something like "Adolescent".

- Phrase searches containing stop words completely break the query and find no 
results. How can I query something like "To be or not to be" with stop words 
enabled?

- I use synonyms for certain keywords. However, when I search for a specific 
phrase that contains synonyms, results with the synonyms rank higher than 
the ones that have the exact term. How can that be fixed?

Thanks,
Noel



Highlighting a field with a certain value

2010-05-24 Thread noel
Hello,

How can I highlight a field that contains a specific value? If I have a 
field called type, how can I highlight the rows whose values contain 
something like "title"?