Agreed.  From the moment I started using stop words, they caused complaints
and problems.  They may have been in place for less than a week before I
needed a re-index to fix everything they broke.
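
To illustrate the on/off versus proportional point in Walter's message
quoted below, here is a minimal Python sketch. The stopword list, the sample
titles, and the helper names are all made up for illustration, and the idf
formula is the textbook BM25-style one, not necessarily what any given Solr
version or similarity setting computes.

```python
import math

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "or", "the", "to", "be", "not", "of"}


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def remove_stopwords(tokens):
    """On/off approach: a term is either kept or dropped entirely."""
    return [t for t in tokens if t not in STOPWORDS]


def idf(term, docs):
    """Proportional approach: common terms get a small, non-zero weight.

    Textbook BM25-style idf: ln(1 + (N - df + 0.5) / (df + 0.5)).
    """
    n = len(docs)
    df = sum(1 for d in docs if term in d)
    return math.log(1 + (n - df + 0.5) / (df + 0.5))


# Made-up titles, not real catalog data.
titles = [
    "Vitamin A",
    "A vitamin guide",
    "To Be or Not to Be",
    "A history of the movies",
    "The making of a film",
]
docs = [set(tokenize(t)) for t in titles]

# With stopwords, one title vanishes and "Vitamin A" loses its "A".
print(remove_stopwords(tokenize("To Be or Not to Be")))  # []
print(remove_stopwords(tokenize("Vitamin A")))           # ['vitamin']

# With idf, "a" stays searchable but counts for less than "vitamin".
print(idf("a", docs), idf("vitamin", docs))
```

With the stopword filter, an all-stopword title has nothing left to index.
With idf, the common word is still indexed and just contributes less to the
score.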

On Thu, Jun 29, 2017 at 4:55 PM, Walter Underwood <wun...@wunderwood.org>
wrote:

> Ultraseek (and Infoseek) never used stopwords. They cause odd failures,
> like not being able to search for “Vitamin A”.
>
> Stopwords are an on/off approach to term frequency. idf is a proportional
> approach. Once you have idf, you don’t need stopwords.
>
> When I was bringing up Solr for Netflix, I started with an analysis chain
> that used stopwords. A surprising number of movie titles entirely
> disappeared. I wrote a blog post about it. Ten years ago!
>
> https://observer.wunderwood.org/2007/05/31/do-all-stopword-queries-matter/
>
> Mostly, stopwords were a performance hack back when people ran search
> engines on 16-bit machines. Neither disks nor RAM were big enough to hold
> the posting lists for common words.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Jun 29, 2017, at 1:46 PM, Rick Leir <rl...@leirtech.com> wrote:
> >
> > Walter
> > Sorry for the tangent, but the stopwords feature sounds useful. You say
> > you do not use this? Did Ultraseek not do it either?
> > Thanks
> > Rick
> >
> > On June 29, 2017 10:53:42 AM EDT, Walter Underwood
> > <wun...@wunderwood.org> wrote:
> >> Nope. Haven’t used stopwords for the last 20 years.
> >>
> >> I wonder if lowercaseOperators is true. The docs don’t give the default
> >> value for that in edismax.
> >>
> >> https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html
> >>
> >> wunder
> >> Walter Underwood
> >> wun...@wunderwood.org
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>
> >>> On Jun 29, 2017, at 4:42 AM, Rick Leir <rl...@leirtech.com> wrote:
> >>>
> >>> Stopwords?
> >>>
> >>> On June 28, 2017 5:13:43 PM EDT, Walter Underwood
> >>> <wun...@wunderwood.org> wrote:
> >>>> Is there some special casing in the highlighter to skip query syntax
> >>>> words? The words “and” and “or” don’t get highlighted.
> >>>>
> >>>> This is in 6.5.0.
> >>>>
> >>>>     <str name="hl.fl">question</str>
> >>>>     <str name="hl.encoder">html</str>
> >>>>     <str name="hl.fragsize">440</str>
> >>>>     <str name="hl.method">fastVector</str>
> >>>>     <str name="hl.snippets">1</str>
> >>>>
> >>>> wunder
> >>>> Walter Underwood
> >>>> wun...@wunderwood.org
> >>>> http://observer.wunderwood.org/  (my blog)
> >>>
> >>> --
> >>> Sorry for being brief. Alternate email is rickleir at yahoo dot com
> >
> > --
> > Sorry for being brief. Alternate email is rickleir at yahoo dot com
>
>
