RE: Strange fuzzy behavior in 4.2.1

Ryan Wilson Thu, 16 May 2013 06:28:45 -0700

To answer your first questions: every change we’ve made has been followed
by a reindex.




The data that is being indexed generally looks something like this (<space>
indicating an actual space):



TIM <space> , <space> JULIO

JULIE <space> , <space> JIM



Based on what we see in the field’s top terms and in the analysis tool,
these records are being broken up at index time such that TIM , JULIO can
be found with tim or julio.
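
The index-time splitting described above can be sketched roughly as
follows. This is only an illustration, not the actual schema: the
analyze() helper is hypothetical and assumes the chain does nothing more
than split on whitespace and lowercase, which is consistent with the
<tim> <,> <julio> output mentioned in this message.

```python
def analyze(value):
    """Hypothetical stand-in for the index analysis chain.

    Assumption: whitespace tokenization followed by lowercasing.
    The real field type may apply other filters.
    """
    return [token.lower() for token in value.split()]


# The two sample values from this message, with <space> written as a space.
records = ["TIM , JULIO", "JULIE , JIM"]

# Toy inverted index: term -> set of record positions containing it.
index = {}
for doc_id, value in enumerate(records):
    for term in analyze(value):
        index.setdefault(term, set()).add(doc_id)

for term in sorted(index):
    print(term, sorted(index[term]))
```

Under those assumptions the comma survives as its own term, and each name
becomes a separate lowercase term that can be matched individually.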



Just to make sure I’m not misunderstanding something about Solr/Lucene:
when a record is indexed, the result of the index analysis chain (<tim> <,>
<julio>) is what is written to disk, correct? As far as I understand it,
it’s the query analysis chain that has the issue, because most filters are
not applied during wildcard and fuzzy queries.



Finally, some clarification, as I’ve realized my original email might not
have made this point well. I can have a particular record with a primary
key of X and a name value of LEWIS , JULIA and be able to find that exact
record with bulia~1 but not with aulia~1. Likewise, in GUERRERO , JULIAN
the term JULIAN can be found with julan~1 but not with julia~1. It’s not
that records go missing when searched for with fuzzy queries, but rather
that the fuzzy terms that will find them seem, to my eyes, inconsistent.
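
To make the inconsistency concrete, here is a small sketch that computes
the edit distance for each of the query/term pairs above. It uses a plain
Levenshtein implementation written for this message, not Lucene’s own
fuzzy matching code, and it assumes the indexed terms are the lowercase
forms julia and julian.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance.

    Counts insertions, deletions and substitutions only.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# (fuzzy query term, assumed indexed term) pairs from this message.
pairs = [
    ("bulia", "julia"),   # finds LEWIS , JULIA
    ("aulia", "julia"),   # does not
    ("julan", "julian"),  # finds GUERRERO , JULIAN
    ("julia", "julian"),  # does not
]

distances = {pair: levenshtein(*pair) for pair in pairs}
for (query, term), dist in distances.items():
    print(f"{query}~1 vs {term}: distance {dist}")
```

All four pairs are a single edit apart, so by edit distance alone I would
expect each of these ~1 queries to match.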



Regards,

Ryan Wilson
rpwils...@gmail.com
