ah, thanks for the link.

--
John Blythe

On Wed, Oct 4, 2017 at 9:23 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Check. The problem is they don't encode the exact length. I _think_
> this patch shows you'd be OK with shorter lengths, but check:
> https://issues.apache.org/jira/browse/LUCENE-7730.
>
> Note it's not the patch that counts here, just look at the table of
> lengths.
>
> Best,
> Erick
>
> On Wed, Oct 4, 2017 at 4:25 AM, John Blythe <johnbly...@gmail.com> wrote:
> > interesting idea.
> >
> > the field in question is one that can have a good deal of stray zeros
> based
> > on distributor skus for a product and bad entries from those entering
> them.
> > part of the matching logic for some operations look for these
> discrepancies
> > by having a simple regex that removes zeroes. so 400010 can match with
> > 40010 (and rightly so). issues come in the form of rare cases where 41
> is a
> > sku by the same distributor or manufacturer and thus can end up being an
> > erroneous match. having a means of looking at the length would help to
> know
> > that going from 6 characters to 2 is too far a leap to be counted as a
> > match.
> >
> > --
> > John Blythe
> >
> > On Wed, Oct 4, 2017 at 6:22 AM, alessandro.benedetti <
> a.benede...@sease.io>
> > wrote:
> >
> >> Are the norms a good approximation for you ?
> >> If you preserve norms at indexing time ( it is a configuration that you
> can
> >> operate in the schema.xml) you can retrieve them with this specific
> >> function
> >> query :
> >>
> >> *norm(field)*
> >> Returns the "norm" stored in the index for the specified field. This is
> the
> >> product of the index time boost and the length normalization factor,
> >> according to the Similarity for the field.
> >> norm(fieldName)
> >>
> >> This will not be the exact length of the field, but it can be a good
> >> approximation though.
> >>
> >> Cheers
> >>
> >>
> >>
> >> -----
> >> ---------------
> >> Alessandro Benedetti
> >> Search Consultant, R&D Software Engineer, Director
> >> Sease Ltd. - www.sease.io
> >> --
> >> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> >>
>

Reply via email to