It worked by replacing < with &lt; and > with &gt;.

Thank you for your support,
ahmd

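For readers hitting the same parser error, here is a minimal Java sketch (not from the thread, and not Solr's own code) of why the escaped form works. An XML parser rejects a literal '<' inside an attribute value, while &lt; and &gt; are decoded back to '<' and '>' before the application reads the attribute. The class and method names are hypothetical, and the <field> element is only a stand-in for a line of the data import configuration; only the JDK's standard XML and regex APIs are used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class; it does not use or reproduce any Solr code.
class RegexAttributeDemo {

    // Parses a one-element XML snippet and returns the decoded value of its
    // "regex" attribute, or null if the snippet is not well-formed XML.
    static String decodeRegexAttribute(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("regex");
        } catch (SAXException e) {
            // Raised for a literal '<' inside an attribute value, for example.
            return null;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Removes every match of the given pattern from the text.
    static String stripTags(String regex, String text) {
        return text.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String literal = "<field column=\"content\" regex=\"<(.|\\n)*?>\"/>";
        String escaped = "<field column=\"content\" regex=\"&lt;(.|\\n)*?&gt;\"/>";

        // The parser rejects the literal form (the JDK's default error handler
        // may also log a "[Fatal Error]" line to stderr).
        System.out.println(decodeRegexAttribute(literal)); // prints null

        // The escaped form parses, and the entities are decoded for us.
        String regex = decodeRegexAttribute(escaped);
        System.out.println(regex); // prints <(.|\n)*?>

        System.out.println(stripTags(regex, "<p>Hello <b>Solr</b></p>")); // prints Hello Solr
    }
}
```

The escaping is only needed at the XML layer; the regular expression that reaches the application is the same one you would have written unescaped.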
On Thu, Nov 6, 2008 at 2:39 AM, Norskog, Lance <[EMAIL PROTECTED]> wrote:

> There is a nice HTML stripper inside Solr:
> "solr.HTMLStripStandardTokenizerFactory"
>
> -----Original Message-----
> From: Ahmed Hammad [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, November 05, 2008 10:43 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Regex Transformer Error
>
> Hi,
>
> It works with the attribute regex="&lt;(.|\n)*?&gt;"
>
> Sorry for the disturbance.
>
> Regards,
> ahmd
>
> On Wed, Nov 5, 2008 at 8:18 PM, Ahmed Hammad <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I am using the Solr 1.3 data import handler. One of my table fields
> > contains HTML tags, and I want to strip them from the field text. So
> > obviously I need the Regex Transformer.
> >
> > I added the transformer="RegexTransformer" attribute to my entity and
> > a new field with:
> >
> > <field sourceColName="content" column="content" regex="English"
> > replaceWith="XXXXX"/>
> >
> > Everything works fine. The text is replaced without any problem. The
> > problem happened with my regular expression to strip HTML tags. So I
> > use regex="<(.|\n)*?>". Of course the characters '<' and '>' are not
> > allowed in XML. I tried the following regex="<(.|\n)*?>" and
> > regex="C;(.|\n)*?E;" but I get the following error:
> >
> > The value of attribute "regex" associated with an element type "field"
> > must not contain the '<' character. at
> > com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source) ...
> >
> > The full stack trace is following:
> >
> > *FATAL: Could not create importer. DataImporter config invalid
> > org.apache.solr.common.SolrException: FATAL: Could not create importer.
> > DataImporter config invalid
> >     at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:114)
> >     at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:206)
> >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
> >     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
> >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
> >     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> >     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> >     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> >     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> >     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> >     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> >     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> >     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> >     at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
> >     at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
> >     at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
> >     at java.lang.Thread.run(Unknown Source)
> > Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context Processing Document #
> >     at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:176)
> >     at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:93)
> >     at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:106)
> >     ... 17 more
> > Caused by: org.xml.sax.SAXParseException: The value of attribute "regex" associated with an element type "field" must not contain the '<' character.
> >     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
> >     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
> >     at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:166)
> >     ... 19 more*
> >
> > *description* *The server encountered an internal error (FATAL: Could
> > not create importer. DataImporter config invalid
> > org.apache.solr.common.SolrException: FATAL: Could not create importer.
> > DataImporter config invalid
> >     at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:114)
> >     at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:206)
> >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
> >     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
> >     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
> >     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> >     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> >     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> >     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> >     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> >     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> >     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> >     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> >     at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
> >     at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
> >     at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
> >     at java.lang.Thread.run(Unknown Source)
> > Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context Processing Document #
> >     at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:176)
> >     at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:93)
> >     at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:106)
> >     ... 17 more
> > Caused by: org.xml.sax.SAXParseException: The value of attribute "regex" associated with an element type "field" must not contain the '<' character.
> >     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
> >     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
> >     at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:166)
> >     ... 19 more
> > ) that prevented it from fulfilling this request.*
> >
> > I appreciate your help.
> >
> > Regards,
> > ahmd