Lukas Vlcek wrote:
Hi,

I haven't heard of a multilingual stop word list before. What would be the
purpose of it? This seems odd to me :-)

That's because a multilingual stopword list doesn't make sense ;)

One example that I'm familiar with: the words "is" and "by" in English and in Swedish. Both are stopwords in English, but they are content words in Swedish ("ice" and "village", respectively). Similarly, "till" is a stopword in Swedish ("to", "towards"), but it's a content word in English.

So, as Lukas correctly suggested, you should first perform language identification, and then apply the correct stopword list.
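
To make that concrete, here is a minimal Python sketch of the two-step approach. Everything in it is an assumption for illustration: the tiny STOPWORDS lists, the function names, and especially guess_language(), which is only a toy stand-in that counts function-word hits. In practice you would plug in a real language identifier and full per-language stopword lists.

```python
# Hypothetical, deliberately tiny stopword lists; real lists are much longer.
STOPWORDS = {
    "en": {"is", "by", "the", "a", "of", "to", "and", "in"},
    "sv": {"till", "och", "att", "det", "som", "en", "på", "är", "i"},
}


def guess_language(tokens):
    """Toy language guesser (assumption, not a real identifier).

    Scores each language by how many tokens appear in its function-word
    list and returns the best one. Ties go to the first language listed.
    """
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    return max(scores, key=scores.get)


def remove_stopwords(text):
    """Identify the language first, then apply only that language's list."""
    tokens = text.lower().split()
    lang = guess_language(tokens)
    kept = [token for token in tokens if token not in STOPWORDS[lang]]
    return lang, kept


if __name__ == "__main__":
    # English: "is" and "by" are dropped as stopwords.
    print(remove_stopwords("the village is by the river"))
    # Swedish: "is" (ice) and "by" (village) are kept as content words.
    print(remove_stopwords("det är is och snö i en by"))
```

With a single merged list, the Swedish sentence would lose "is" and "by" even though they carry the meaning there, which is exactly the problem described above.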


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
