Erick,

Thanks a lot for the explanation, makes sense now.

Tom

On Tue, Mar 3, 2015 at 5:54 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> bq: Does it mean that words between " symbols, such as "Orange ordered" are
> treated as a single term, with (implicitly) AND conjunction between them?
>
> Not at all. When you quote things, you're getting a "phrase query", perhaps
> one with slop. So something like "a b" means that 'a' must appear right next
> to 'b'. This is something like an AND in the sense that both terms must
> appear, but it is far more restrictive since it takes into account the
> position of the terms in the field.
>
> "a b"~10 means that both words must appear within 10 transpositions in the
> same field. You can think of "transposition" as how many intervening terms
> there are, so something like "a b"~2 would match docs with "a x b", but not
> "a x y z b".
>
> And this is where positionIncrementGap comes in. By putting 1000 in for it,
> you guarantee "a b"~999 won't match 'a' in one entry of the multiValued
> field and 'b' in another.
>
> By contrast, a AND b would match across successive MV entries no matter what
> the gap is.
>
> HTH,
> Erick
>
> On Tue, Mar 3, 2015 at 2:22 PM, Tom Devel <deve...@gmail.com> wrote:
> > Jack,
> >
> > This is exactly what I was looking for, thanks. I found the
> > positionIncrementGap attribute in the schema.xml for the text_en field
> > type.
> >
> > I was putting in "AND" because I read in the Solr documentation that "The
> > OR operator is the default conjunction operator."
> >
> > Does it mean that words between " symbols, such as "Orange ordered" are
> > treated as a single term, with (implicitly) AND conjunction between them?
> >
> > Where could I find more info about this?
> >
> > I am currently reading
> > https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser
> >
> > Thanks again
> >
> > On Tue, Mar 3, 2015 at 3:58 PM, Jack Krupansky <jack.krupan...@gmail.com>
> > wrote:
> >
> >> Just set the positionIncrementGap for the multivalued field to a much
> >> higher value, like 1000 or 5000. That's the purpose of this attribute: to
> >> ensure that reasonable proximity matches don't match across multiple
> >> values.
> >>
> >> Also, leave "AND" out of the query phrases - you're just trying to match
> >> the product name and availability.
> >>
> >>
> >> -- Jack Krupansky
> >>
> >> On Tue, Mar 3, 2015 at 4:51 PM, Tom Devel <deve...@gmail.com> wrote:
> >>
> >> > Hi,
> >> >
> >> > I am running Solr 5.0.0 and have a question about proximity search and
> >> > multiValued fields.
> >> >
> >> > I am indexing xml files of the following form, with foundField being a
> >> > field defined as multiValued and text_en in my schema.xml.
> >> >
> >> > <?xml version="1.0" encoding="UTF-8"?>
> >> > <add><doc>
> >> > <field name="id">8</field>
> >> > <field name="foundField">"Oranges from South California - ordered"</field>
> >> > <field name="foundField">"Green Apples - available"</field>
> >> > <field name="foundField">"Black Report Books - ordered"</field>
> >> > </doc></add>
> >> >
> >> > There are several such documents, and for instance, I would like to
> >> > query all documents having in the foundField "Oranges" and "ordered".
> >> > The following proximity query takes care of it:
> >> >
> >> > q=foundField:("oranges AND ordered"~2)
> >> >
> >> > However, a field could have more words, and I also cannot know the
> >> > proximity of the desired query words in advance. Setting the proximity
> >> > value too high results in false positives: the following query also
> >> > returns the document (although "available" was in the entry about
> >> > Apples):
> >> >
> >> > foundField:("oranges AND available"~200)
> >> >
> >> > I do not think that tweaking a proximity value is the correct approach.
> >> >
> >> > How can I search to match contents in a multiValued field per value, as
> >> > described above, without running into the problem?
> >> >
> >> > Many thanks for any help
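
The interaction between slop and positionIncrementGap discussed in this thread can be illustrated with a small toy model in Python. This is only a sketch under stated assumptions, not how Lucene or Solr is implemented: the function names `index_positions` and `sloppy_match` are hypothetical, a simple regex tokenizer stands in for the text_en analysis chain, the gap value of 100 is an assumed "small" setting chosen for illustration, and only in-order matches are considered.

```python
import re


def index_positions(values, gap):
    """Assign a position to every token of a multiValued field (toy model).

    Assumption: tokens are lowercase alphabetic runs, and each value after
    the first starts `gap` extra positions later than it otherwise would.
    """
    positions = {}
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # simulate positionIncrementGap between values
        for token in re.findall(r"[a-z]+", value.lower()):
            pos += 1
            positions.setdefault(token, []).append(pos)
    return positions


def sloppy_match(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` positions between them.

    Simplification: only in-order occurrences are checked.
    """
    return any(
        0 <= p2 - p1 - 1 <= slop
        for p1 in positions.get(first, [])
        for p2 in positions.get(second, [])
    )


# The three foundField values from the original question.
doc = [
    "Oranges from South California - ordered",
    "Green Apples - available",
    "Black Report Books - ordered",
]

small_gap = index_positions(doc, gap=100)    # assumed small gap
large_gap = index_positions(doc, gap=1000)   # value suggested in the thread

print(sloppy_match(small_gap, "oranges", "available", 200))  # True: matches across values
print(sloppy_match(large_gap, "oranges", "available", 200))  # False: gap exceeds the slop
print(sloppy_match(large_gap, "oranges", "ordered", 200))    # True: both in the same value
```

In this model, the false positive for "oranges available"~200 disappears once the gap is larger than any slop used in queries, while terms within a single value still match.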