Re: multilingual list of stopwords

2007-10-25 Thread Maria Mosolova
ageIdentifier.html . Peter -Original Message- From: Maria Mosolova [mailto:[EMAIL PROTECTED] Sent: Thursday, October 18, 2007 8:48 AM To: solr-user@lucene.apache.org Subject: Re: multilingual list of stopwords Thanks a lot to everyone who responded. Yes, I agree that eventually we need to use

Re: multilingual list of stopwords

2007-10-24 Thread Daniel Alheiros
> >> Peter >> >> -Original Message- >> From: Maria Mosolova [mailto:[EMAIL PROTECTED] >> Sent: Thursday, October 18, 2007 8:48 AM >> To: solr-user@lucene.apache.org >> Subject: Re: multilingual list of stopwords >> >> Thanks a lot to e

Re: multilingual list of stopwords

2007-10-18 Thread Maria Mosolova
Thank you very much for the references, Gordon! Looks like that is exactly what I need. Maria On 10/18/07, Gordon <[EMAIL PROTECTED]> wrote: > Maria, > > It's perfectly reasonable to build a single list, sort it, and scan it for > especially bad cases. See for example, > http://members.unine.ch/jacqu

Re: multilingual list of stopwords

2007-10-18 Thread Gordon
Maria, It's perfectly reasonable to build a single list, sort it, and scan it for especially bad cases. See for example, http://members.unine.ch/jacques.savoy/clef/index.html for stopwords for several languages or check in some standard programming modules like: http://search.cpan.org/~fabpot/Ling
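
A minimal sketch of the "build a single list, sort it, and scan it" approach described above, in Python. The tiny per-language lists are illustrative assumptions, not real stopword files, and `merge_stopwords` is a hypothetical helper name. Recording which language each word came from makes the later manual scan easier.

```python
from collections import defaultdict

# Illustrative samples only; real lists would be loaded from
# per-language stopword files.
STOPWORDS_BY_LANGUAGE = {
    "en": ["the", "is", "by", "and"],
    "de": ["der", "die", "das", "und"],
    "sv": ["och", "det", "att"],
}

def merge_stopwords(lists_by_language):
    """Merge per-language lists into one sorted list of (word, languages)."""
    sources = defaultdict(set)
    for language, words in lists_by_language.items():
        for word in words:
            # Normalize so that "The" and "the" end up as one entry.
            sources[word.strip().lower()].add(language)
    return [(word, sorted(langs)) for word, langs in sorted(sources.items())]

merged = merge_stopwords(STOPWORDS_BY_LANGUAGE)
for word, langs in merged:
    # One line per word, followed by the languages that list it.
    print(word, ",".join(langs))
```

The printed list is what a person would then scan by eye for especially bad cases.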

Re: multilingual list of stopwords

2007-10-18 Thread Maria Mosolova
Original Message- > From: Maria Mosolova [mailto:[EMAIL PROTECTED] > Sent: Thursday, October 18, 2007 8:48 AM > To: solr-user@lucene.apache.org > Subject: Re: multilingual list of stopwords > > Thanks a lot to everyone who responded. Yes, I agree that eventually we

RE: multilingual list of stopwords

2007-10-18 Thread Binkley, Peter
solr-user@lucene.apache.org Subject: Re: multilingual list of stopwords Thanks a lot to everyone who responded. Yes, I agree that eventually we need to use separate stopword lists for different languages. Unfortunately the data we are trying to index at the moment does not contain any direct co

Re: multilingual list of stopwords

2007-10-18 Thread Maria Mosolova
Thanks a lot to everyone who responded. Yes, I agree that eventually we need to use separate stopword lists for different languages. Unfortunately the data we are trying to index at the moment does not contain any direct country/language information and we need to create the first version of the in

Re: multilingual list of stopwords

2007-10-18 Thread Walter Underwood
Also "die" in German and English. --wunder On 10/18/07 4:16 AM, "Andrzej Bialecki" <[EMAIL PROTECTED]> wrote: > One example that I'm familiar with: words "is" and "by" in English and > in Swedish. Both words are stopwords in English, but they are content > words in Swedish (ice and village, respe
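
The collisions mentioned in this message ("die" in German and English, "is" and "by" in English and Swedish) can be found mechanically if content-word data is available for each language. The following is only a sketch under that assumption: the `CONTENT_WORDS` sets and the `find_collisions` helper are hypothetical, and in practice the content words would have to come from a dictionary or corpus.

```python
# Hypothetical sample data: stopwords per language, plus words known to
# carry meaning in each language.
STOPWORDS = {
    "en": {"is", "by", "the"},
    "de": {"die", "der", "und"},
}
CONTENT_WORDS = {
    "sv": {"is", "by"},  # "ice" and "village" in Swedish
    "en": {"die"},       # a content word in English
}

def find_collisions(stopwords, content_words):
    """Map each colliding word to (stopword_language, content_language) pairs."""
    collisions = {}
    for stop_lang, stops in stopwords.items():
        for content_lang, contents in content_words.items():
            if stop_lang == content_lang:
                continue  # a language cannot collide with itself
            for word in sorted(stops & contents):
                collisions.setdefault(word, []).append((stop_lang, content_lang))
    return collisions

collisions = find_collisions(STOPWORDS, CONTENT_WORDS)
for word, pairs in sorted(collisions.items()):
    print(word, pairs)
```

Words reported here are the ones that a merged list would wrongly remove from documents in the second language of each pair.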

Re: multilingual list of stopwords

2007-10-18 Thread Grant Ingersoll
Are you sure they don't just mean they want separate stopword lists for various different indexes in different languages? Otherwise, I agree, it doesn't make much sense for a single mixed language index (unless you had an intelligent filter that could select based on language.) Maria, pe
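
A filter that selects its stopword list based on language, as suggested above, might look roughly like the following. This is a plain Python sketch and not Solr or Lucene code; the `STOPWORDS` table and the `remove_stopwords` function are hypothetical, and it assumes the language of each document is already known or has been detected upstream.

```python
# Hypothetical per-language stopword sets; real ones would be loaded from files.
STOPWORDS = {
    "en": {"is", "by", "the"},
    "sv": {"och", "det", "att"},
}

def remove_stopwords(tokens, language):
    """Drop stopwords using the list for `language`.

    If the language is unknown, nothing is removed, so content words are
    never lost because of another language's list.
    """
    stops = STOPWORDS.get(language, frozenset())
    return [token for token in tokens if token.lower() not in stops]

tokens = ["is", "by", "och"]
print(remove_stopwords(tokens, "en"))  # English list applied
print(remove_stopwords(tokens, "sv"))  # Swedish list applied
print(remove_stopwords(tokens, "xx"))  # unknown language: no filtering
```

Falling back to no filtering for an unknown language is a design choice made here for safety; a merged list could be used as the fallback instead, at the cost of the collisions discussed elsewhere in the thread.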

Re: multilingual list of stopwords

2007-10-18 Thread Andrzej Bialecki
Lukas Vlcek wrote: > Hi, > I haven't heard of a multilingual stop words list before. What would be the purpose of it? > This seems too odd to me :-) That's because a multilingual stopword list doesn't make sense ;) One example that I'm familiar with: words "is" and "by" in English and in Swedish. Both

Re: multilingual list of stopwords

2007-10-18 Thread Lukas Vlcek
to merge the various language stopword > files I need into one and use it. But the main problem in this case is > having collisions with words which are stopwords in one language but not in > the other. > > Cheers, > Joe > > > Maria Mosolova wrote: > >

Re: multilingual list of stopwords

2007-10-18 Thread Joseph Doehr
Cheers, Joe Maria Mosolova wrote: > I am looking for a multilingual list of stopwords to use with > Solr/Lucene and would greatly appreciate any advice on where I could > find it.

multilingual list of stopwords

2007-10-17 Thread Maria Mosolova
Hi, I am looking for a multilingual list of stopwords to use with Solr/Lucene and would greatly appreciate any advice on where I could find it. Thanks, Maria