Funny thing is that the stopwords files in the examples shown at
http://wiki.apache.org/solr/LanguageAnalysis#Spanish actually do use pipes
followed by other terms. See the Spanish one at
http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/analysis/common/src/resources/org/apache/lucene/analysis/snowball/spanish_stop.txt

I had never seen this format before.

Lucas, try using only one word per line, with no pipes and no trailing
spaces. You can use all the Spanish accents too. Don't forget to save the
file encoded as UTF-8. You can do that in Eclipse, and even Word on Windows
can open and save text files in UTF-8.
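
If it helps, here is a minimal Python sketch of that cleanup. It assumes the
file looks like the piece you pasted (anything after a pipe is a comment).
The names clean_stopwords and convert_file and the output file name in the
usage note are made up for illustration, not anything Solr ships.

```python
from pathlib import Path


def clean_stopwords(text):
    """Return one stopword per line, without pipe comments or stray spaces."""
    words = []
    for line in text.splitlines():
        # Keep only what precedes the first pipe; the rest is treated as a comment.
        before_pipe = line.split("|", 1)[0]
        # split() drops leading/trailing whitespace and also separates
        # several words that happen to share one line.
        words.extend(before_pipe.split())
    return "".join(word + "\n" for word in words)


def convert_file(src, dst):
    """Read src as UTF-8 and write the cleaned list to dst as UTF-8."""
    raw = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(clean_stopwords(raw), encoding="utf-8")
```

Something like convert_file("spanish_stop.txt", "spanish_stop_plain.txt")
would then give you a file you can point the words attribute at. Accented
words such as "más" pass through unchanged because both the read and the
write are explicitly UTF-8.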



2011/8/22 Erick Erickson <erickerick...@gmail.com>

> What does the admin/analysis page show? And if you're really
> putting the pipe symbol (|) in your stopwords file, I have no clue what
> Solr will make of it. The stopwords file format is usually just one
> word per line.
>
> I'm assuming your name of "string" for the field type is just a placeholder
> or you've replaced the example "string" fieldType, right?
>
>
> Best
> Erick
>
> On Mon, Aug 22, 2011 at 6:24 AM, Lucas Miguez <lucas.mig...@gmail.com>
> wrote:
> > Hi,
> >
> > I am trying to use Spanish stop words, but the stop words are not
> > working:
> >
> > Part of the schema.xml file:
> >
> > <fieldtype name="string" class="solr.TextField"
> >            positionIncrementGap="100" autoGeneratePhraseQueries="true">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
> >     <filter class="solr.StopFilterFactory" words="spanish_stop.txt"
> >             enablePositionIncrements="true" ignoreCase="true"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
> >     <filter class="solr.StopFilterFactory" words="spanish_stop.txt"
> >             enablePositionIncrements="true" ignoreCase="true"/>
> >   </analyzer>
> > </fieldtype>
> >
> > ___________________________________________________________________________
> >
> > A piece of the stopwords file:
> >
> > de             |  from, of
> > la             |  the, her
> > que            |  who, that
> > el             |  the
> > en             |  in
> > y              |  and
> > a              |  to
> > los            |  the, them
> > del            |  de + el
> > se             |  himself, from him etc
> > las            |  the, them
> > por            |  for, by, etc
> > un             |  a
> > para           |  for
> > con            |  with
> > no             |  no
> > una            |  a
> > su             |  his, her
> > al             |  a + el
> >  | es         from SER
> > lo             |  him
> >
> >
> > Any idea? Thanks!
> >
>



-- 

*Alexei Martchenko* | *CEO* | Superdownloads
ale...@superdownloads.com.br | ale...@martchenko.com.br | (11)
5083.1018/5080.3535/5080.3533
