That very txt said "A Spanish stop word list. Comments begin with vertical bar. Each stop word is at the start of a line."
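If it helps, here is a rough Python sketch (nothing Solr ships) for turning a file in that format into plain one-word-per-line stopwords. The function name `snowball_to_plain` and the sample lines are made up for illustration. It assumes everything from the first vertical bar onward is a comment and that the stop word is the first token on the line.

```python
def snowball_to_plain(lines):
    """Return the stop words found in Snowball-style lines, in order."""
    words = []
    for line in lines:
        # Everything from the first vertical bar onward is treated as a comment.
        text = line.split("|", 1)[0].strip()
        if text:
            # Assumption: the stop word is the first token on the line.
            words.append(text.split()[0])
    return words


# Hypothetical sample in the same shape as the Spanish list.
sample = [
    "de             |  from, of",
    "la             |  the, her",
    "               |  es         from SER",
    "lo             |  him",
    "",
    "qué            |  what",
]

plain = snowball_to_plain(sample)
print("\n".join(plain))
```

For a real file, reading and writing with `open(path, encoding="utf-8")` should keep the Spanish accents intact.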
Solr's comments are #s not pipes. Brazilian stopwords file is kinda raw...
http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/analysis/common/src/resources/org/apache/lucene/analysis/br/stopwords.txt

2011/8/22 Alexei Martchenko <ale...@superdownloads.com.br>

> Funny thing is that stopwords files in the examples shown in
> http://wiki.apache.org/solr/LanguageAnalysis#Spanish are actually using
> pipe and other terms. See the spanish one in
> http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/analysis/common/src/resources/org/apache/lucene/analysis/snowball/spanish_stop.txt
>
> I never saw this format before.
>
> Lucas, try to use only one word per line, no pipes, no trailing spaces. and
> you can use all spanish accents too. Don't forget to save encoded as
> UTF-8... u can do that in Eclipse or even Windows Word can open and save
> txts in UTF-8.
>
>
>
> 2011/8/22 Erick Erickson <erickerick...@gmail.com>
>
>> What does the admin/analysis page show? And if you're really
>> putting the pipe symbol (|) in you stopwords file, I have no clue what
>> Solr will make of it. The stopwords file format is usually just one
>> word per line.....
>>
>> I'm assuming your name of "string" for the field type is just a placeholder
>> or you've replaced the example "string" fieldType, right?
>>
>>
>> Best
>> Erick
>>
>> On Mon, Aug 22, 2011 at 6:24 AM, Lucas Miguez <lucas.mig...@gmail.com> wrote:
>> > Hi,
>> >
>> > I am trying to use spanish stop words, but the stop words are not working:
>> >
>> > Part of the schema.xml file:
>> >
>> > <fieldtype name="string" class="solr.TextField"
>> >            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>> >   <analyzer type="index">
>> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> >     <filter class="solr.LowerCaseFilterFactory" />
>> >     <filter class="solr.SnowballPorterFilterFactory" language="Spanish" />
>> >     <filter class="solr.StopFilterFactory" words="spanish_stop.txt"
>> >             enablePositionIncrements="true" ignoreCase="true" />
>> >   </analyzer>
>> >   <analyzer type="query">
>> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> >     <filter class="solr.LowerCaseFilterFactory" />
>> >     <filter class="solr.SnowballPorterFilterFactory" language="Spanish" />
>> >     <filter class="solr.StopFilterFactory" words="spanish_stop.txt"
>> >             enablePositionIncrements="true" ignoreCase="true" />
>> >   </analyzer>
>> > </fieldtype>
>> >
>> > ___________________________________________________________________________
>> >
>> > A piece of the stopwords file:
>> >
>> > de   | from, of
>> > la   | the, her
>> > que  | who, that
>> > el   | the
>> > en   | in
>> > y    | and
>> > a    | to
>> > los  | the, them
>> > del  | de + el
>> > se   | himself, from him etc
>> > las  | the, them
>> > por  | for, by, etc
>> > un   | a
>> > para | for
>> > con  | with
>> > no   | no
>> > una  | a
>> > su   | his, her
>> > al   | a + el
>> >      | es from SER
>> > lo   | him
>> >
>> >
>> > Any idea? Thanks!
>> >
>
>
> --
>
> *Alexei Martchenko* | *CEO* | Superdownloads
> ale...@superdownloads.com.br | ale...@martchenko.com.br | (11)
> 5083.1018/5080.3535/5080.3533

--

*Alexei Martchenko* | *CEO* | Superdownloads
ale...@superdownloads.com.br | ale...@martchenko.com.br | (11)
5083.1018/5080.3535/5080.3533