Thanks for your answer. I have just put the filter in my schema.xml, but it
doesn't work. I am using Solr 1.4 and my configuration is:

<code>
<analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
 </analyzer>
</code>


In the Tomcat 6 logs I get this error:

java.lang.ClassCastException: org.apache.solr.analysis.HTMLStripCharFilterFactory cannot be cast to org.apache.solr.analysis.TokenFilterFactory
    at org.apache.solr.schema.IndexSchema$6.init(IndexSchema.java:831)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:149)
    at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:835)
    at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58)
    at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:424)
    at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:447)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:141)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:456)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:95)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:426)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:278)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
...

Any idea? How can I solve this problem?

Regards
Ariel



On Thu, Jun 16, 2011 at 6:24 PM, Steven A Rowe <sar...@syr.edu> wrote:

> Hi Ariel,
>
> On 6/16/2011 at 10:45 AM, Ariel wrote:
> > I have the following problem: I am using the Spanish analyzer to index
> > and query, but because I am using TinyMCE, some characters of the text
> > are encoded as HTML entities. For example, the text "En españa ..." is
> > changed to "En espa&ntilde;a", so I need a way to decode that text to
> > make queries work correctly.
>
> HTMLStripCharFilterFactory, which strips out HTML tags, also converts named
> character entities like &ntilde; to their equivalent character.
>
> Steve
>
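
For reference, the ClassCastException suggests that HTMLStripCharFilterFactory
is a char filter rather than a token filter, so it probably cannot be declared
with a <filter> element. A minimal sketch of the same analyzer, assuming Solr
1.4 accepts a <charFilter> element placed before the tokenizer (not verified
against this exact setup):

```xml
<analyzer type="index">
    <!-- Assumption: char filters are declared with <charFilter> and run on the
         raw text before tokenizing, so tags and entities such as &ntilde;
         are handled first. -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
```

If the query analyzer is defined separately, it would likely need the same
char filter so that queries and indexed text are normalized the same way.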
