I would agree with removing the stopword filter from the example configs. It is 
not a “best practice” or even a recommended practice.
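
A minimal sketch of the two points made further down the thread: a stopword filter can empty a query completely, whereas idf only down-weights common terms. The stopword list, the toy four-document corpus, and the smoothed idf formula are all assumptions chosen for illustration; none of them is Solr's shipped stopword file or Lucene's exact scoring.

```python
import math

# Hypothetical stopword list, for illustration only (not Solr's stopwords.txt).
STOPWORDS = {"a", "and", "be", "not", "or", "the", "to", "was", "who"}


def tokenize(text):
    """Lowercase the text and split on anything that is not alphanumeric."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def remove_stopwords(tokens):
    """On/off approach: a token is either kept or dropped entirely."""
    return [t for t in tokens if t not in STOPWORDS]


def idf(term, docs):
    """Proportional approach: a simple smoothed idf, log(1 + N / (1 + df)).

    This is an assumed textbook-style variant, not Lucene's exact formula.
    """
    df = sum(1 for d in docs if term in d)
    return math.log(1 + len(docs) / (1 + df))


# Queries taken from the examples in this thread.
queries = ["The Who", "Was (not Was)", "The The", "to be or not to be", "Vitamin A"]
surviving = {q: remove_stopwords(tokenize(q)) for q in queries}

# Toy corpus (made up) to show that a common term keeps a small, nonzero weight.
docs = [set(tokenize(t)) for t in [
    "The Who live at Leeds",
    "The The soul mining",
    "The history of vitamin A",
    "Vitamin C and the common cold",
]]

if __name__ == "__main__":
    for q, tokens in surviving.items():
        print(f"{q!r} -> {tokens}")
    for term in ("the", "vitamin", "who"):
        print(f"idf({term!r}) = {idf(term, docs):.3f}")
```

With this list, the first four queries are left with no tokens at all, and “Vitamin A” degrades to a search for “vitamin”. Under idf, “the” still contributes to matching, just with a lower weight than rarer terms.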

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Jun 29, 2017, at 8:01 PM, Rick Leir <rl...@leirtech.com> wrote:
> 
> Walter, Erick, David
> Thanks for the info. Maybe the default for stopwords should be disabled? 
> Cheers -- Rick
> 
> On June 29, 2017 5:14:16 PM EDT, Walter Underwood <wun...@wunderwood.org> wrote:
>> My blog post has a list of movie titles. I forgot to list the TV series
>> “Once and Again”.
>> 
>> Some bands that are not searchable with stopwords:
>> 
>> * The Who
>> * Was (not Was)
>> * The The
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>> 
>>> On Jun 29, 2017, at 2:09 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>> 
>>> bq: Mostly, stopwords were a performance hack back when people ran
>>> search engines on 16-bit machines
>>> 
>>> Ah, _those_ were the days when programmers were _real_ programmers.
>>> Actually I'm glad they're gone but that's another story.
>>> 
>>> "to be or not to be". Can't search that if you enable stopwords.
>>> 
>>> Chris Hostetter wrote a fun blog on the fact that Lucene query parsers
>>> are not strict boolean logic with the title "Why Not AND, OR, And NOT"
>>> purposely choosing that title as it's totally unsearchable if you're
>>> using stopwords.
>>> 
>>> FWIW,
>>> Erick
>>> 
>>> On Thu, Jun 29, 2017 at 1:57 PM, David Hastings
>>> <hastings.recurs...@gmail.com> wrote:
>>>> Agreed.  Stop words from the moment I started using them caused complaints
>>>> and problems right off the bat.  They may have been implemented less than a
>>>> week before needing a re-index to fix all the problems they caused.
>>>> 
>>>> On Thu, Jun 29, 2017 at 4:55 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>>>> 
>>>>> Ultraseek (and Infoseek) never used stopwords. They cause odd failures,
>>>>> like not being able to search for “Vitamin A”.
>>>>> 
>>>>> Stopwords are an on/off approach to term frequency. idf is a proportional
>>>>> approach. Once you have idf, you don’t need stopwords.
>>>>> 
>>>>> When I was bringing up Solr for Netflix, I started with an analysis chain
>>>>> that used stopwords. A surprising number of movie titles entirely
>>>>> disappeared. I wrote a blog post about it. Ten years ago!
>>>>> 
>>>>> https://observer.wunderwood.org/2007/05/31/do-all-stopword-queries-matter/
>>>>> 
>>>>> Mostly, stopwords were a performance hack back when people ran search
>>>>> engines on 16-bit machines. Neither disks nor RAM were big enough to hold
>>>>> the posting lists for common words.
>>>>> 
>>>>> wunder
>>>>> Walter Underwood
>>>>> wun...@wunderwood.org
>>>>> http://observer.wunderwood.org/  (my blog)
>>>>> 
>>>>> 
>>>>>> On Jun 29, 2017, at 1:46 PM, Rick Leir <rl...@leirtech.com> wrote:
>>>>>> 
>>>>>> Walter
>>>>>> Sorry for the tangent, but the stopwords feature sounds useful. You say
>>>>>> you do not use this? Did Ultraseek not do it either?
>>>>>> Thanks
>>>>>> Rick
>>>>>> 
>>>>>> On June 29, 2017 10:53:42 AM EDT, Walter Underwood <wun...@wunderwood.org> wrote:
>>>>>>> Nope. Haven’t used stopwords for the last 20 years.
>>>>>>> 
>>>>>>> I wonder if lowercaseOperators is true. The docs don’t give the default
>>>>>>> value for that in edismax.
>>>>>>> 
>>>>>>> https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html
>>>>>>> 
>>>>>>> wunder
>>>>>>> Walter Underwood
>>>>>>> wun...@wunderwood.org
>>>>>>> http://observer.wunderwood.org/  (my blog)
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jun 29, 2017, at 4:42 AM, Rick Leir <rl...@leirtech.com> wrote:
>>>>>>>> 
>>>>>>>> Stopwords?
>>>>>>>> 
>>>>>>>> On June 28, 2017 5:13:43 PM EDT, Walter Underwood <wun...@wunderwood.org> wrote:
>>>>>>>>> Is there some special casing in the highlighter to skip query syntax
>>>>>>>>> words? The words “and” and “or” don’t get highlighted.
>>>>>>>>> 
>>>>>>>>> This is in 6.5.0.
>>>>>>>>> 
>>>>>>>>>   <str name="hl.fl">question</str>
>>>>>>>>>   <str name="hl.encoder">html</str>
>>>>>>>>>   <str name="hl.fragsize">440</str>
>>>>>>>>>   <str name="hl.method">fastVector</str>
>>>>>>>>>   <str name="hl.snippets">1</str>
>>>>>>>>> 
>>>>>>>>> wunder
>>>>>>>>> Walter Underwood
>>>>>>>>> wun...@wunderwood.org
>>>>>>>>> http://observer.wunderwood.org/  (my blog)
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Sorry for being brief. Alternate email is rickleir at yahoo dot com
>>>>>> 
>>>>>> --
>>>>>> Sorry for being brief. Alternate email is rickleir at yahoo dot com
>>>>> 
>>>>> 
> 
> -- 
> Sorry for being brief. Alternate email is rickleir at yahoo dot com
