Agreed. Stop words caused complaints and problems from the moment I started using them. They may have been in place for less than a week before a re-index was needed to fix all the problems they caused.

On Thu, Jun 29, 2017 at 4:55 PM, Walter Underwood <wun...@wunderwood.org> wrote:

> Ultraseek (and Infoseek) never used stopwords. They cause odd failures,
> like not being able to search for “Vitamin A”.
>
> Stopwords are an on/off approach to term frequency. idf is a proportional
> approach. Once you have idf, you don’t need stopwords.
>
> When I was bringing up Solr for Netflix, I started with an analysis chain
> that used stopwords. A surprising number of movie titles entirely
> disappeared. I wrote a blog post about it. Ten years ago!
>
> https://observer.wunderwood.org/2007/05/31/do-all-stopword-queries-matter/
>
> Mostly, stopwords were a performance hack back when people ran search
> engines on 16-bit machines. Neither disks nor RAM were big enough to hold
> the posting lists for common words.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>
> > On Jun 29, 2017, at 1:46 PM, Rick Leir <rl...@leirtech.com> wrote:
> >
> > Walter
> > Sorry for the tangent, but the stopwords feature sounds useful. You say
> > you do not use this? Did Ultraseek not do it either?
> > Thanks
> > Rick
> >
> > On June 29, 2017 10:53:42 AM EDT, Walter Underwood <wun...@wunderwood.org> wrote:
> >> Nope. Haven’t used stopwords for the last 20 years.
> >>
> >> I wonder if lowercaseOperators is true. The docs don’t give the default
> >> value for that in edismax.
> >>
> >> https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html
> >>
> >> wunder
> >> Walter Underwood
> >> wun...@wunderwood.org
> >> http://observer.wunderwood.org/ (my blog)
> >>
> >>
> >>> On Jun 29, 2017, at 4:42 AM, Rick Leir <rl...@leirtech.com> wrote:
> >>>
> >>> Stopwords?
> >>>
> >>> On June 28, 2017 5:13:43 PM EDT, Walter Underwood <wun...@wunderwood.org> wrote:
> >>>> Is there some special casing in the highlighter to skip query syntax
> >>>> words? The words “and” and “or” don’t get highlighted.
> >>>>
> >>>> This is in 6.5.0.
> >>>>
> >>>> <str name="hl.fl">question</str>
> >>>> <str name="hl.encoder">html</str>
> >>>> <str name="hl.fragsize">440</str>
> >>>> <str name="hl.method">fastVector</str>
> >>>> <str name="hl.snippets">1</str>
> >>>>
> >>>> wunder
> >>>> Walter Underwood
> >>>> wun...@wunderwood.org
> >>>> http://observer.wunderwood.org/ (my blog)
> >>>
> >>> --
> >>> Sorry for being brief. Alternate email is rickleir at yahoo dot com
> >
> > --
> > Sorry for being brief. Alternate email is rickleir at yahoo dot com
> > --
> > Sorry for being brief. Alternate email is rickleir at yahoo dot com
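
For anyone following along, here is a rough Python sketch of the stopwords-versus-idf point quoted above. It is not how Solr or Lucene implement anything: the toy corpus, the stopword list, and the function names are all made up for illustration, and the idf formula is just one common smoothed variant. It only tries to show that a stopword filter drops a term outright (so "Vitamin A" becomes "vitamin"), while idf keeps the term and merely gives it less weight.

```python
import math

# Hypothetical toy corpus; the titles are invented for this example.
DOCS = [
    "the benefits of vitamin a",
    "vitamin c and the common cold",
    "a guide to the solar system",
    "to be or not to be",
]

# Hypothetical stopword list, not the one shipped with Solr.
STOPWORDS = {"a", "and", "be", "not", "of", "or", "the", "to"}


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def remove_stopwords(tokens):
    """On/off approach: a term is either kept or dropped entirely."""
    return [t for t in tokens if t not in STOPWORDS]


def idf(term, docs):
    """Proportional approach: common terms get a small weight, never zero.

    Uses a smoothed form, log((N + 1) / (df + 1)) + 1, chosen here only
    for illustration; real engines use their own variants.
    """
    n = len(docs)
    df = sum(1 for d in docs if term in tokenize(d))
    return math.log((n + 1) / (df + 1)) + 1.0


if __name__ == "__main__":
    # With a stopword filter, part or all of the query disappears.
    print(remove_stopwords(tokenize("Vitamin A")))
    print(remove_stopwords(tokenize("to be or not to be")))

    # With idf, every term survives; common ones just count for less.
    for term in ("the", "vitamin", "solar"):
        print(term, round(idf(term, DOCS), 3))
```

With the stopword filter, the second query comes back empty, which is the "movie titles entirely disappeared" failure described in the quoted message; with idf alone, "the" still matches but contributes less than "vitamin" or "solar".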