Bah.. nope this would miss documents that only match a subset of the given terms.

I'm going to have to go with Steven's approach as the right choice here.

Matt

On 10/26/2010 3:44 PM, Matthew Hall wrote:
Indeed, I'd missed the second part of his requirements; my AND solution is sadly insufficient for this task.

The combinatorial part of your solution worries me a bit though, Steven, because the documents on the larger side of his corpus would likely slow down query performance a bit while the filter calculates all of the possibilities for a given document.

I'm wondering if a slightly hybrid approach would be valid:

Have a filter that calculates the total number of terms for a given document. And then add a clause into your query at runtime that would match what the filter would come up with:

So:

text:"Nokia" AND text:"Mobile" AND text:"GPS" AND termCount: 3

Something like that anyhow.
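
A rough sketch of what such a query would accept, simulated in plain Python rather than in Solr. The function name `hybrid_match` and the extra document "GPS samsung android" are hypothetical and only there for illustration; `term_count` stands in for the proposed termCount field.

```python
def hybrid_match(doc_text, query_text):
    """Simulate: every query term ANDed together, plus termCount == number of query terms."""
    doc_terms = set(doc_text.lower().split())
    query_terms = set(query_text.lower().split())
    term_count = len(doc_terms)  # stand-in for the proposed termCount field
    return query_terms <= doc_terms and term_count == len(query_terms)


# Documents from the original question, plus one hypothetical full match.
docs = ["nokia n95", "GPS", "android", "samsung", "samsung android",
        "nokia android", "mobile with GPS", "GPS samsung android"]

matches = [d for d in docs if hybrid_match(d, "samsung android GPS")]
print(matches)
```

Under these assumptions only a document containing exactly the three query terms is returned; shorter documents such as "samsung" or "samsung android" are not.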

Matt

On 10/26/2010 3:35 PM, Dennis Gearon wrote:
I'm the LAST person anyone will ever need to worry about flame baiting. You did notice that I retracted what I said and supported your point of view?

Sorry if my cryptic comment sounded critical. I was wrong, you were right :-)
Dennis Gearon

Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life,
   otherwise we all die.


--- On Tue, 10/26/10, Steven A Rowe<sar...@syr.edu>  wrote:

From: Steven A Rowe<sar...@syr.edu>
Subject: RE: How do I this in Solr?
To: "solr-user@lucene.apache.org"<solr-user@lucene.apache.org>
Date: Tuesday, October 26, 2010, 12:27 PM
Hi Dennis,

You wrote:
If Solr is like Google, once documents matching only the ANDed items in the query ran out, then those that had only two of the terms, then only 1 of the terms, and then those close to it would start showing up.
[...]
Plus, if he wants terms that contain ONLY those words, and no others, an ANDed query would not do that, right? ANDed queries return results that must have ALL the terms listed, and could have lots of other words, right?

This is *exactly* what I just said: ANDed queries (i.e.,
requiring all query terms) will not satisfy Varun's
requirements.

Your participation in this thread looks an awful lot like
flame-baiting: Someone else asks a question, I answer with a
possible solution, you give a one-word "overkill" response,
I say why it's not overkill.  You then ask if anybody
knows the answer to the original question, and then parrot
my response to your "overkill" statement.  Really????

Get your shit together or shut up.  Please.

Steve

-----Original Message-----
From: Dennis Gearon [mailto:gear...@sbcglobal.net]
Sent: Tuesday, October 26, 2010 3:14 PM
To: solr-user@lucene.apache.org
Subject: RE: How do I this in Solr?



Dennis Gearon


--- On Tue, 10/26/10, Steven A Rowe<sar...@syr.edu>  wrote:

From: Steven A Rowe<sar...@syr.edu>
Subject: RE: How do I this in Solr?
To: "solr-user@lucene.apache.org"<solr-user@lucene.apache.org>
Date: Tuesday, October 26, 2010, 12:10 PM
Dennis,

Do you mean to say that you read my earlier post, and disagree that it would solve the problem?  Or have you simply not read it?

Steve

-----Original Message-----
From: Dennis Gearon [mailto:gear...@sbcglobal.net]
Sent: Tuesday, October 26, 2010 3:00 PM
To: solr-user@lucene.apache.org
Subject: RE: How do I this in Solr?

Good point. Since I might need such a query myself someday, how *IS* that done?


Dennis Gearon


--- On Tue, 10/26/10, Steven A Rowe<sar...@syr.edu>  wrote:

From: Steven A Rowe<sar...@syr.edu>
Subject: RE: How do I this in Solr?
To: "solr-user@lucene.apache.org"<solr-user@lucene.apache.org>
Date: Tuesday, October 26, 2010, 11:46 AM
Um, maybe I'm way off base, but when Varun said:

If I search with the text "samsung android GPS", search results should only contain "samsung", "GPS", "android" and "samsung android".

I interpreted that to mean that hit documents should contain terms from the query, and nothing else. Making all terms required doesn't do this.

Steve

-----Original Message-----
From: Matthew Hall [mailto:mh...@informatics.jax.org]
Sent: Tuesday, October 26, 2010 2:30 PM
To: solr-user@lucene.apache.org
Subject: Re: How do I this in Solr?

Um.. you could change your default clause to AND rather than OR.
That should do the trick.

Matt

On 10/26/2010 2:26 PM, Dennis Gearon wrote:
Overkill?

Dennis Gearon
I can't think of a way to do it without writing new analysis filters.

But I think you could do what you want with two filters (this is untested):

1. An index-time filter that outputs a single token consisting of all of the input tokens, sorted in a consistent way, e.g.:

      "mobile with GPS" -> "GPS mobile with"
      "samsung android" -> "android samsung"

2. A query-time filter that outputs one token per input term combination, sorted in the same consistent way as the index-time filter, e.g.:

      "samsung android GPS"
        -> "samsung","android","GPS",
           "android samsung","GPS samsung","android GPS",
           "android GPS samsung"
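
The two filters above can be sketched in plain Python to show the matching logic. This is only an illustration under stated assumptions, not a Solr analysis filter: the names `canonical_terms`, `index_key` and `query_keys` are hypothetical, whitespace splitting stands in for the real tokenizer, and the "consistent" sort order is assumed to be case-insensitive, which agrees with the examples given.

```python
from itertools import combinations


def canonical_terms(text):
    """Split on whitespace, drop duplicates, sort case-insensitively."""
    return sorted(set(text.split()), key=lambda t: (t.lower(), t))


def index_key(text):
    """Index time: collapse a document into one token of its sorted terms."""
    return " ".join(canonical_terms(text))


def query_keys(text):
    """Query time: one token per non-empty combination of the query terms."""
    terms = canonical_terms(text)
    return {
        " ".join(combo)  # combinations() keeps the sorted order of `terms`
        for size in range(1, len(terms) + 1)
        for combo in combinations(terms, size)
    }


# Documents and query from the original question (spelling normalized).
docs = ["nokia n95", "GPS", "android", "samsung", "samsung android",
        "nokia android", "mobile with GPS"]

keys = query_keys("samsung android GPS")
hits = [d for d in docs if index_key(d) in keys]
print(hits)
```

A query with n distinct terms produces 2^n - 1 tokens, so the cost grows with the length of the query; each document contributes only a single token at index time.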

Steve

-----Original Message-----
From: Varun Gupta [mailto:varun.vgu...@gmail.com]
Sent: Tuesday, October 26, 2010 9:08 AM
To: solr-user@lucene.apache.org
Subject: How do I this in Solr?

Hi,

I have a lot of small documents (each containing 1 to 15 words) indexed in Solr. For the search query, I want the search results to contain only those documents that satisfy this criterion: "All of the words of the search result document are present in the search query".

For example:
If I have the following documents indexed: "nokia n95", "GPS", "android", "samsung", "samsung android", "nokia android", "mobile with GPS"

If I search with the text "samsung android GPS", search results should only contain "samsung", "GPS", "android" and "samsung android".

Is there a way to do this in Solr?
--
Thanks
Varun Gupta





--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012

