On Sep 3, 2019, at 1:13 PM, Audrey Lorberfeld - audrey.lorberf...@ibm.com
wrote:
>
> The main issue we are anticipating with the above strategy surrounds scoring.
> Since we will be increasing the frequency of accented terms, we might bias
> our page ranker...
You will not be increasing the f
Thanks, Alex! We'll look into this.
--
Audrey Lorberfeld
Data Scientist, w3 Search
IBM
audrey.lorberf...@ibm.com
On 9/3/19, 4:27 PM, "Alexandre Rafalovitch" wrote:
What about combining:
1) KeywordRepeatFilterFactory
2) An existing folding filter (need to check that it ignores keyword-marked words)
3) RemoveDuplicatesTokenFilterFactory
That may give what you are after without custom coding.
Regards,
Alex.
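To make the three-step chain above concrete, here is a rough Python simulation of what it would do to a token stream. This is only a sketch, not Solr code: the function names and the (text, position, is_keyword) tuple are made up for illustration, and it assumes the folding step really does skip keyword-marked tokens, which is exactly the open question flagged in step 2.

```python
import unicodedata
from typing import List, Tuple

# Hypothetical token model: (text, position, is_keyword).
# "position" stands in for a token position in the analyzed stream.
Token = Tuple[str, int, bool]


def keyword_repeat(words: List[str]) -> List[Token]:
    """Emit every word twice at the same position: keyword-marked, then plain."""
    out: List[Token] = []
    for pos, word in enumerate(words):
        out.append((word, pos, True))
        out.append((word, pos, False))
    return out


def fold_accents(tokens: List[Token]) -> List[Token]:
    """Strip combining marks, but only from tokens NOT marked as keywords.

    Assumption: the real folding filter honors the keyword flag.
    """
    out: List[Token] = []
    for text, pos, is_keyword in tokens:
        if not is_keyword:
            decomposed = unicodedata.normalize("NFKD", text)
            text = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append((text, pos, is_keyword))
    return out


def remove_duplicates(tokens: List[Token]) -> List[Token]:
    """Drop a token if the same text was already seen at the same position."""
    seen = set()
    out: List[Token] = []
    for text, pos, is_keyword in tokens:
        if (text, pos) not in seen:
            seen.add((text, pos))
            out.append((text, pos, is_keyword))
    return out


def analyze(words: List[str]) -> List[Tuple[str, int]]:
    """Run the three steps in order and return (text, position) pairs."""
    tokens = remove_duplicates(fold_accents(keyword_repeat(words)))
    return [(text, pos) for text, pos, _ in tokens]


if __name__ == "__main__":
    print(analyze(["caf\u00e9", "bar"]))
```

Under those assumptions an accented word ends up indexed as both the original and the folded form at the same position, while an unaccented word is indexed only once, because its two copies are identical and the last step drops the second one.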
On Tue, 3 Sep 2019 at 16:14, Audrey Lorberfeld -
audrey
Toke,
Thank you! That makes a lot of sense.
In other news -- we just had a meeting where we decided to try out a hybrid
strategy. I'd love to know what you & everyone else thinks...
- Since we are concerned with the overhead created by "double-fielding" all
tokens per language (because I'm not
Thank you, Erick!
--
Audrey Lorberfeld
Data Scientist, w3 Search
Digital Workplace Engineering
CIO, Finance and Operations
IBM
audrey.lorberf...@ibm.com
On 8/30/19, 3:49 PM, "Erick Erickson" wrote:
It Depends (tm). In this case on how sophisticated/precise your users are.
If your users