I haven’t removed stopwords since 1996, when I joined Infoseek. What is your
special case where you must remove them?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 22, 2019, at 9:51 PM, akash jayaweera
> wrote:
>
> Hello Walter,
>
> Thank y
Hello Walter,
Thank you for the reply.
But for some of my use cases I need to identify stopwords, so I need a better
way to identify domain-specific stopwords. I used TF-IDF to identify them, but
it has the issue I mentioned above.
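To make the issue concrete, here is a minimal sketch of the IDF part of the scoring, assuming whitespace-tokenized documents and a tiny made-up corpus. The function names (`idf_scores`, `stopword_candidates`) and the `max_idf` threshold are hypothetical, chosen only for illustration. It shows how a topical term such as "football" ends up with the same lowest score as a function word:

```python
import math
from collections import Counter

def idf_scores(docs):
    """Return {term: idf} for a list of tokenized documents."""
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    # Unsmoothed IDF: log(N / df). A term in every document scores 0.
    return {term: math.log(n_docs / count) for term, count in df.items()}

def stopword_candidates(docs, max_idf=0.0):
    """Terms whose IDF is at or below max_idf (hypothetical threshold)."""
    idf = idf_scores(docs)
    return sorted(term for term, score in idf.items() if score <= max_idf)

# Made-up sample corpus, for illustration only.
docs = [
    "the football match was the best".split(),
    "the football club signed a striker".split(),
    "a football fan watched the final".split(),
]
print(stopword_candidates(docs))
```

On this sample, both "the" and "football" appear in every document, so the score alone cannot separate a function word from a term that is simply the topic of the corpus.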
Regards,
*Akash Jayaweera.*
E akash.jayawe...@gmail.com
M +
Don’t remove stopwords. That was a useful hack when we were running search
engines on 16-bit machines. These days, it causes more problems than it solves.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jun 22, 2019, at 8:14 PM, akash jayaweera
> w
Hello All,
I'm trying to identify stopwords for a non-English corpus using TF-IDF
scores. I calculated the score for each unique term in the corpus, but my
question is how I can select stopwords using that score.
For example, if we have a corpus about football, the term "football" gets the
lowest TF-IDF score