On Nov 24, 2008, at 8:37 AM, David Santamauro wrote:
I need to search for something like
myText:billion AND guarantee
and have only those records returned where both words exist in
the same value (in this case only the first record), because in the
2nd record the two words are in different values.
Is it possible?
It's not possible with a purely boolean query like this, but it is
possible with a sloppy phrase query where the position increment
gap (see example schema.xml) is greater than the slop factor.
Erik
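
To illustrate the suggestion above, here is a minimal Python sketch of the idea, not Lucene's actual implementation. It assumes whitespace tokenization, a gap of 100 (set via positionIncrementGap on the field type in schema.xml), and a simplified notion of slop as the number of positions allowed between the two terms; Lucene's real slop is an edit distance and charges extra for out-of-order terms. The function names and the sample values are made up for illustration. In Solr query syntax the corresponding search would look like myText:"billion guarantee"~10.

```python
def positions(values, gap):
    """Map each term to its positions, adding `gap` between consecutive values."""
    index = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # virtual gap between two values of the same field
        for term in value.lower().split():
            index.setdefault(term, []).append(pos)
            pos += 1
    return index


def sloppy_match(values, term_a, term_b, slop, gap):
    """Simplified sloppy phrase: at most `slop` positions between the two terms."""
    index = positions(values, gap)
    return any(
        abs(pa - pb) - 1 <= slop
        for pa in index.get(term_a, [])
        for pb in index.get(term_b, [])
    )


# Hypothetical records: both words in one value vs. spread over two values.
same_value = ["a billion dollar guarantee", "unrelated text"]
split_values = ["one billion users", "money back guarantee"]

print(sloppy_match(same_value, "billion", "guarantee", slop=10, gap=100))    # True
print(sloppy_match(split_values, "billion", "guarantee", slop=10, gap=100))  # False
print(sloppy_match(split_values, "billion", "guarantee", slop=10, gap=0))    # True
```

The last call shows why the gap has to exceed the slop: with no gap, the two values run together and the phrase matches across the value boundary.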
I think what is needed here is the concept of SAME, i.e.,
myText:billion SAME guarantee. I know a few full-text engines that
can handle this operator one way or another. And without it, I don't
quite understand the usefulness of multiValued fields.
Yeah, multi-valued fields are a bit awkward to grasp fully in Lucene.
Especially in this context where it's a full-text field. Basically as
far as indexing goes, there's no such thing as a "multi-valued"
field. An indexed field gets split into terms, and terms have
positional information attached to them (thus a position increment gap
can be used to put a big virtual gap between the last term of one
field instance and the first term of the next one). A multi-valued
field gets stored (if it is set to be stored, that is) as separate
strings, and is retrievable as the separate values.
Multi-valued fields are handy for facets where, say, a product can
have multiple categories associated with it. In this case it's a bit
clearer. It's the full-text multi-valued fields that seem a bit
strange.
Erik
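
The facet case mentioned above can be sketched in a few lines of Python. This is only an illustration of why multiple values per field are natural there, not how Solr computes facets internally; the product records, the "category" field name, and the facet_counts helper are all hypothetical.

```python
from collections import Counter


def facet_counts(docs, field):
    """Count how many documents carry each value of a multi-valued field."""
    counts = Counter()
    for doc in docs:
        # set() so a value repeated within one document is counted once
        counts.update(set(doc.get(field, [])))
    return counts


# Hypothetical products, each with several categories.
products = [
    {"name": "tent", "category": ["outdoor", "camping"]},
    {"name": "stove", "category": ["camping", "kitchen"]},
    {"name": "kettle", "category": ["kitchen"]},
]

counts = facet_counts(products, "category")
print(counts["camping"], counts["kitchen"], counts["outdoor"])  # 2 2 1
```

Here each value is matched as a whole, so term positions and gaps never come into play, which is why this use of multi-valued fields is easier to reason about than the full-text one.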