Lukas Vlcek wrote:
> Hi,
> I haven't heard of a multilingual stop word list before. What would be the
> purpose of it? This seems too odd to me :-)
That's because a multilingual stopword list doesn't make sense ;)
One example that I'm familiar with: the words "is" and "by" in English and
in Swedish. Both words are stopwords in English, but they are content
words in Swedish (ice and village, respectively). Similarly, "till" in
Swedish is a stopword (to, towards), but it's a content word in English.
So, as Lukas correctly suggested, you should first perform language
identification, and then apply the correct stopword list.
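
To make the two-step approach concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny stopword sets, the names guess_language and remove_stopwords, and especially the language guesser, which just counts stopword hits and stands in for a real language identifier.

```python
# Tiny illustrative stopword sets (assumed for this sketch; real lists are far longer).
STOPWORDS = {
    "en": {"is", "by", "the", "a", "of", "to", "and", "in"},
    "sv": {"till", "och", "att", "det", "en", "som", "på", "är"},
}


def guess_language(tokens):
    """Naive placeholder for language identification.

    Picks the language whose stopwords occur most often in the tokens.
    A real system would use a proper language identifier instead.
    """
    hits = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    return max(hits, key=hits.get)


def remove_stopwords(text, lang=None):
    """Identify the language first (unless given), then apply only that list."""
    tokens = text.lower().split()
    if lang is None:
        lang = guess_language(tokens)
    return [token for token in tokens if token not in STOPWORDS[lang]]


if __name__ == "__main__":
    # English: "is" and "by" are stopwords and get dropped.
    print(remove_stopwords("the ice is by the village"))
    # Swedish: "is" (ice) and "by" (village) are content words and survive.
    print(remove_stopwords("det är is på sjön och en by"))
```

With a single merged list, the Swedish sentence would lose "is" and "by" as well, which is exactly the problem described above.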
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com