solr break up word
Hi,

I have Solr running on a CentOS server and it works OK, but sometimes my application needs to index parts of a word. For example, searching for 'dislike' works fine, but searching for 'disl' returns zero results. Also, searching for 'disl*' returns some results (the same ones as searching for 'dislike'), but searching for 'dislike*' returns zero too.

So, I have two questions:

1. How exactly does the asterisk work as a wildcard?

2. What can I do to properly index parts of a word? I added these lines to my schema.xml:

But I can't get it to work. Is what I did OK, or am I wrong?

Thanks.

--
Boris Quiroz
boris.qui...@menco.it
Re: solr break up word
Hi Erick,

I'll try without the type="index" on the analyzer tag and then I'll re-index some files.

Thanks for your answer.

On Thu, Oct 27, 2011 at 6:54 PM, Erick Erickson wrote:
> Hmmm, I'm not sure what happens when you specify
> (without type="index" and
> . I have no clue which one
> is used.
>
> Look at the admin/analysis page to understand how things are
> broken up.
>
> Did you re-index after you added the ngram filter?
>
> You'll get better help if you include example queries with
> &debugQuery=on appended, it'll give us a lot more to
> work with.
>
> Best
> Erick
>
> On Wed, Oct 26, 2011 at 4:14 PM, Boris Quiroz wrote:
>> Hi,
>>
>> I've solr running on a CentOS server working OK, but sometimes my
>> application needs to index some parts of a word. For example, if I search
>> 'dislike' word fine but if I search 'disl' it returns zero. Also, if I
>> search 'disl*' returns some values (the same if I search for 'dislike') but
>> if I search 'dislike*' it returns zero too.
>>
>> So, I've two questions:
>>
>> 1. How exactly the asterisk works as a wildcard?
>>
>> 2. What can I do to index properly parts of a word? I added this lines to my
>> schema.xml:
>>
>> maxGramSize="15"/>
>>
>> But I can't get it to work. Is OK what I did or I'm wrong?
>>
>> Thanks.
>>
>> --
>> Boris Quiroz
>> boris.qui...@menco.it

--
Boris Quiroz
boris.qui...@menco.it
Re: solr break up word
Hi,

I solved the issue. I added the following lines to my schema.xml:

... ...

Then I re-indexed and everything is working great :-)

Thanks for your help.

On Fri, Oct 28, 2011 at 10:08 AM, Boris Quiroz wrote:
> Hi Erick,
>
> I'll try without the type="index" on analyzer tag and then I'll
> re-index some files.
>
> Thanks for you answer.

--
Boris Quiroz
boris.qui...@menco.it
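The schema lines themselves did not survive in this archive, so the following is only a rough illustration of why an n-gram filter can make a partial term such as 'disl' searchable. It is a sketch, not Solr's implementation: the function name `ngrams` is hypothetical, the minimum gram size of 3 is an assumption, and the maximum of 15 is taken from the maxGramSize="15" fragment quoted earlier in the thread.

```python
def ngrams(term, min_gram=3, max_gram=15):
    """Return every substring of `term` with a length between min_gram and max_gram.

    Hypothetical helper for illustration only; min_gram=3 is an assumed value.
    """
    grams = []
    # Gram sizes cannot exceed the length of the term itself.
    for size in range(min_gram, min(max_gram, len(term)) + 1):
        # Slide a window of the current size across the term.
        for start in range(len(term) - size + 1):
            grams.append(term[start:start + size])
    return grams


# Without n-grams, only the whole token would be indexed.
plain_index = {"dislike"}
# With n-grams, every fragment of the token is indexed as its own term.
ngram_index = set(ngrams("dislike"))

print("disl" in plain_index)   # False
print("disl" in ngram_index)   # True
print("dislike" in ngram_index)  # True
```

Because the fragments are produced at index time, documents indexed before the filter was added would not contain them, which matches the advice in the thread to re-index after changing the schema.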
weird issue with solr and CentOS 5.7
Hi all,

I'm facing a really weird issue here with solr (lucene 3.3) and CentOS 5.7. I have two servers, one running CentOS 5.5 and the other running CentOS 5.7. Both servers have the same solr, java and tomcat versions; the only difference between them is the OS version.

I added a custom field to schema.xml: . When that type is boolean, indexing Chinese characters works OK on CentOS 5.5, but on CentOS 5.7 I get this exception:

Nov 22, 2011 11:27:11 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select/ params={indent=on&start=0&q=我们从右上角讲起&rows=10&version=2.2} hits=1 status=0 QTime=8
Nov 22, 2011 11:27:11 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.StringIndexOutOfBoundsException: String index out of range: 0
    at java.lang.String.charAt(String.java:694)
    at org.apache.solr.schema.BoolField.write(BoolField.java:129)
    at org.apache.solr.schema.SchemaField.write(SchemaField.java:124)
    at org.apache.solr.response.XMLWriter.writeDoc(XMLWriter.java:369)
    at org.apache.solr.response.XMLWriter$3.writeDocs(XMLWriter.java:545)
    at org.apache.solr.response.XMLWriter.writeDocuments(XMLWriter.java:482)
    at org.apache.solr.response.XMLWriter.writeDocList(XMLWriter.java:519)
    at org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:582)
    at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)
    at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)
    at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:343)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:265)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
    at java.lang.Thread.run(Thread.java:636)

That only happens on CentOS 5.7. I also tested on Ubuntu Server, and it works OK there too. solrconfig.xml and everything else is the same on both servers.

Any idea what could be happening? Could it be a CentOS bug?

Regards.

--
Boris Quiroz
boris.qui...@menco.it
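One thing the trace itself shows: the top frame is String.charAt failing with index 0, which in Java happens when the string is empty. The sketch below is a Python analogue of that failure mode, not Solr code. The function `write_bool` is hypothetical, and the idea that the writer inspects the first character of the stored value is an assumption inferred from the BoolField.write frame, not verified against the Solr source.

```python
def write_bool(stored_value):
    """Interpret a stored boolean by looking at its first character.

    Hypothetical stand-in for illustration; the 'T' convention is an assumption.
    """
    # Indexing position 0 fails when the stored value is an empty string,
    # much like charAt(0) on an empty Java String.
    return stored_value[0] == "T"


for value in ["T", "F", ""]:
    try:
        print(repr(value), "->", write_bool(value))
    except IndexError as exc:
        print(repr(value), "-> failed:", exc)
```

If that reading is right, a useful check would be whether the document returned by the query has an empty value stored in the boolean field on the CentOS 5.7 server.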