Make phrases into single tokens at indexing and query time. Let the engine do
the rest of the work.
For example, “subunits of the army” can become “subunitsofthearmy” or
“subunits_of_the_army”.
We used patterns to choose phrases, so “word word”, “word glue word”, or “word
glue glue word”
could b
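
A minimal sketch of the glue-word idea described above, not Infoseek's actual implementation. The glue-word list, the function name `phrase_tokens`, and the underscore joiner are assumptions made for illustration. It emits a phrase token for each "word word", "word glue word", or "word glue glue word" pattern:

```python
import re

# Assumed glue-word list for illustration only; a real one would be tuned.
GLUE_WORDS = frozenset({"of", "the", "a", "an", "and", "in", "for", "to"})


def phrase_tokens(text, glue=GLUE_WORDS, max_glue=2, joiner="_"):
    """Return phrase tokens of the form word (glue){0..max_glue} word."""
    words = re.findall(r"\w+", text.lower())
    phrases = []
    for i, word in enumerate(words):
        if word in glue:
            continue  # a phrase has to start on a non-glue word
        # Skip over at most max_glue glue words after the starting word.
        j = i + 1
        while j < len(words) and words[j] in glue and (j - i - 1) < max_glue:
            j += 1
        # Emit a phrase only if the run ends on another non-glue word.
        if j < len(words) and words[j] not in glue:
            phrases.append(joiner.join(words[i:j + 1]))
    return phrases


print(phrase_tokens("subunits of the army"))
```

The same function would have to run at both index time and query time so that the phrase tokens match. In Solr this would presumably live in a custom token filter rather than in client code.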
Interesting, I can't seem to find anything on Phrase IDF. Don't suppose you
have a link or two I could look at, by chance?
On Mon, Feb 17, 2020 at 1:48 PM Walter Underwood
wrote:
> At Infoseek, we used “glue words” to build phrase tokens. It was really
> effective.
> Phrase IDF is powerful stuff.
>
At Infoseek, we used “glue words” to build phrase tokens. It was really
effective.
Phrase IDF is powerful stuff.
Luckily for you, the patent on that has expired. :-)
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 17, 2020, at 10:46 AM, David Ha
I use stop words for building shingles into "interesting phrases" for my
machine teacher/students, so I wouldn't say there's no reason; however, my use
case is very specific. Otherwise, yeah, they're gone for all practical
reasons/search scenarios.
On Mon, Feb 17, 2020 at 1:41 PM Walter Underwood
wro
Why are you using stopwords? I would need a really, really good reason to use
those.
Stopwords are an obsolete technique from the days of 16-bit processors. I’ve
never used them, and I’ve been a search engineer since 1997.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my bl
Hi
I've run into an issue with creating a Managed Stopwords list that has the
same name as a previously deleted list. Going through the same flow with
Managed Synonyms doesn't result in this unexpected behaviour. Am I missing
something or did I discover a bug in Solr?
On a newly started Solr with