: 1) Given my need, am I losing anything by writing my own copy-field in my
: Java code vs. using Solr's copyField in the schema?

nope.

: 2) How do I prevent the case where I copy data from fields A and B, where
: A has "Fable of the Throbbing" and B has "Genius of a Tank Town", and they get
: copied into group-X as "Fable of the Throbbing Genius of a Tank Town"?
: When this happens, a phrase search for "Throbbing Genius" will get me a hit
: (when in reality, it shouldn't).  If I were using copyField, wouldn't this
: problem still exist?

Each discrete field value is kept discrete at all levels -- so adding two 
String values "Foo Bar" and "Yik Yak" (either via solrj, or via 
copyField, or via something like the CloneFieldUpdateProcessorFactory) does 
*not* just result in one String value of "Foo Bar Yik Yak" -- instead it 
results in a (multivalued) field containing two string values (in the order 
specified).

If/when you search on a multivalued *text* field, phrase queries can in 
fact result in matches across multiple values, so a search for "Bar Yik" 
(or "Throbbing Genius") may result in a match -- depending on what the 
*positions* are of the terms in those field values, and what "slop" factor 
is specified in your search.

By default, the tokens resulting from the analysis of field values have 
sequential positions -- so "Throbbing" would have position 3, and 
"Genius" would have position 4 -- but the "positionIncrementGap" option 
can be specified on your fieldType to indicate how much of a "gap" you 
want to add to the position increment of the first token produced by 
each subsequent value in a multivalued field.

So, if you had a positionIncrementGap="100" for a fieldType, you would 
need a slop value on your phrase query of at least 100 to get any matches 
from tokens that originated in multiple source values.
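
To make the arithmetic concrete, here is a rough sketch in plain Java that 
mimics how positions get assigned across the values of a multivalued field. 
It is *not* Solr or Lucene code: the class and method names are made up for 
illustration, it assumes simple whitespace tokenization with no stop word 
removal, 0-based positions, and non-empty values, and a real analyzer chain 
may number things differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only -- not part of Solr, Lucene, or solrj.
public class PositionGapSketch {

    // Assign a position to every whitespace-separated token of every value.
    // Assumption: each token advances the position by 1, and "gap" is added
    // once before the first token of each value after the first.
    public static List<List<Integer>> positions(List<String> values, int gap) {
        List<List<Integer>> result = new ArrayList<>();
        int position = -1; // so the very first token lands on position 0
        boolean firstValue = true;
        for (String value : values) {
            if (!firstValue) {
                position += gap;
            }
            firstValue = false;
            List<Integer> forValue = new ArrayList<>();
            for (String token : value.trim().split("\\s+")) {
                position += 1;
                forValue.add(position);
            }
            result.add(forValue);
        }
        return result;
    }

    // Smallest slop that lets the two-term phrase "<last token of the first
    // value> <first token of the second value>" match across the boundary.
    public static int slopToBridge(List<String> values, int gap) {
        List<List<Integer>> all = positions(values, gap);
        List<Integer> first = all.get(0);
        List<Integer> second = all.get(1);
        int lastOfFirst = first.get(first.size() - 1);
        int firstOfSecond = second.get(0);
        // Adjacent tokens (positions differing by 1) need a slop of 0.
        return firstOfSecond - lastOfFirst - 1;
    }

    public static void main(String[] args) {
        List<String> values =
            List.of("Fable of the Throbbing", "Genius of a Tank Town");
        System.out.println("gap=0   slop needed: " + slopToBridge(values, 0));
        System.out.println("gap=100 slop needed: " + slopToBridge(values, 100));
    }
}
```

Under those assumptions, a gap of 0 puts "Throbbing" at 3 and "Genius" at 4, 
so an exact phrase query matches across the two values; a gap of 100 puts 
"Genius" at 104, so only a phrase query with a slop of 100 or more (e.g. 
"Throbbing Genius"~100) would match.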

https://cwiki.apache.org/confluence/display/solr/Field+Type+Definitions+and+Properties

https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser
(scroll to the description of "Proximity Searches")


-Hoss
http://www.lucidworks.com/