solr break up word

2011-10-26 Thread Boris Quiroz
Hi,

I have Solr running on a CentOS server and it works OK, but sometimes my 
application needs to match parts of a word. For example, searching for 'dislike' 
works fine, but searching for 'disl' returns zero results. Also, searching for 
'disl*' returns some results (the same ones as for 'dislike'), but searching for 
'dislike*' returns zero results too.

So, I have two questions:

1. How exactly does the asterisk work as a wildcard?

2. What can I do to properly index parts of a word? I added these lines to my 
schema.xml:

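
[schema.xml snippet stripped by the list archive; a quoted copy later in the
thread retains only the fragment maxGramSize="15"/>]
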
But I can't get it to work. Is what I did OK, or am I doing something wrong?

Thanks.

--
Boris Quiroz
boris.qui...@menco.it
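
On question 1, one common explanation for the four results described above can be sketched in Python. This is only a toy model under two assumptions that the thread does not confirm, since the schema is not visible: the index analyzer applies a stemmer that stores 'dislike' as 'dislik', and wildcard terms are compared against indexed terms as typed, without going through the analyzer. `toy_stem`, `INDEXED_TERMS`, `term_search`, and `prefix_search` are hypothetical names, not Solr APIs.

```python
# Toy model of term lookup; not Solr code.

def toy_stem(word):
    """Hypothetical stand-in for a stemmer: lowercase and drop one trailing 'e'."""
    word = word.lower()
    return word[:-1] if word.endswith("e") else word

# Terms as they would sit in the index after analysis (assumption).
INDEXED_TERMS = {toy_stem("dislike")}  # {'dislik'}

def term_search(query):
    """Plain query: the query text is analyzed (stemmed) before lookup."""
    return toy_stem(query) in INDEXED_TERMS

def prefix_search(query):
    """Wildcard query 'abc*': the prefix is compared as typed, unanalyzed."""
    prefix = query.rstrip("*")
    return any(term.startswith(prefix) for term in INDEXED_TERMS)

print(term_search("dislike"))     # True: 'dislike' is stemmed to 'dislik'
print(term_search("disl"))        # False: no indexed term equals 'disl'
print(prefix_search("disl*"))     # True: 'dislik' starts with 'disl'
print(prefix_search("dislike*"))  # False: 'dislik' does not start with 'dislike'
```

If the real index analyzer has no stemmer, this model does not apply, and the output of &debugQuery=on is the place to check what the query was actually parsed into.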



Re: solr break up word

2011-10-28 Thread Boris Quiroz
Hi Erick,

I'll try without type="index" on the analyzer tag and then I'll
re-index some files.

Thanks for your answer.

On Thu, Oct 27, 2011 at 6:54 PM, Erick Erickson wrote:
> Hmmm, I'm not sure what happens when you specify
>  (without type="index" and
> . I have no clue which one
> is used.
>
> Look at the admin/analysis page to understand how things are
> broken up.
>
> Did you re-index after you added the ngram filter?
>
> You'll get better help if you include example queries with
> &debugQuery=on appended, it'll give us a lot more to
> work with.
>
> Best
> Erick
>
> On Wed, Oct 26, 2011 at 4:14 PM, Boris Quiroz  wrote:
>> Hi,
>>
>> I've solr running on a CentOS server working OK, but sometimes my 
>> application needs to index some parts of a word. For example, if I search 
>> 'dislike' word fine but if I search 'disl' it returns zero. Also, if I 
>> search 'disl*' returns some values (the same if I search for 'dislike') but 
>> if I search 'dislike*' it returns zero too.
>>
>> So, I've two questions:
>>
>> 1. How exactly the asterisk works as a wildcard?
>>
>> 2. What can I do to index properly parts of a word? I added this lines to my 
>> schema.xml:
>>
>> 
>>      
>>        
>>        
>>        
>>        > maxGramSize="15"/>
>>      
>>
>>      
>>        
>>        
>>        
>>      
>> 
>>
>> But I can't get it to work. Is OK what I did or I'm wrong?
>>
>> Thanks.
>>
>> --
>> Boris Quiroz
>> boris.qui...@menco.it
>>
>>
>



-- 
Boris Quiroz
boris.qui...@menco.it
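
Erick's suggestion to append &debugQuery=on can be scripted for the four queries from the original post. The sketch below only builds the request URLs. The host and port in `BASE_URL` are assumptions (nothing in the thread names them), and `debug_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Hypothetical host and port; adjust to the actual Solr instance.
BASE_URL = "http://localhost:8080/solr/select"

def debug_url(query, rows=10):
    """Build a select URL for `query` with debugQuery=on appended."""
    params = {"q": query, "rows": rows, "debugQuery": "on"}
    # urlencode percent-escapes characters such as '*' in the query text.
    return BASE_URL + "?" + urlencode(params)

# The four searches described in the original post.
for q in ("dislike", "disl", "disl*", "dislike*"):
    print(debug_url(q))
```

Opening each URL and comparing the parsed query in the debug section of the responses should show whether the wildcard terms reach the index unanalyzed.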


Re: solr break up word

2011-10-28 Thread Boris Quiroz
Hi,

I solved the issue. I added the following lines to my schema.xml:

[XML lines stripped by the list archive]
...
[XML lines stripped by the list archive]
...

Then I re-indexed, and everything is working great :-)

Thanks for your help.
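
The effect of an n-gram filter on the index side, and the reason a re-index is needed, can be sketched as follows. This is a toy model, not Solr code: `ngrams` is a hypothetical helper, `MAX_GRAM` is taken from the maxGramSize="15" fragment that survives in the quoted schema earlier in the thread, and `MIN_GRAM` is an assumption because the value actually used is not shown. The sketch also assumes the query side leaves the search term whole.

```python
def ngrams(term, min_size, max_size):
    """Return every substring of `term` whose length is in [min_size, max_size]."""
    grams = set()
    for size in range(min_size, min(max_size, len(term)) + 1):
        for start in range(len(term) - size + 1):
            grams.add(term[start:start + size])
    return grams

MIN_GRAM = 3   # assumption: the value used in the thread is not shown
MAX_GRAM = 15  # from the maxGramSize="15" fragment in the quoted schema

# What gets written to the index for the word 'dislike'.
index_terms = ngrams("dislike", MIN_GRAM, MAX_GRAM)

print("disl" in index_terms)     # True: the fragment is now an indexed term
print("dislike" in index_terms)  # True: the whole word is the longest gram
print("xyz" in index_terms)      # False
```

Documents indexed before the filter was added only contain the old terms, which is why the change has no visible effect until they are re-indexed.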

-- 
Boris Quiroz
boris.qui...@menco.it


weird issue with solr and CentOS 5.7

2011-11-22 Thread Boris Quiroz
Hi all,

I'm facing a really weird issue with Solr (Lucene 3.3) and CentOS
5.7. I have two servers, one running CentOS 5.5 and the other running
CentOS 5.7. Both servers have the same Solr, Java, and Tomcat versions;
the only difference between them is the OS version.
I added a custom field to schema.xml: [field definition stripped by the
list archive]. When that field's type is boolean, indexing Chinese
characters works OK on CentOS 5.5, but on CentOS 5.7 I get this exception:

Nov 22, 2011 11:27:11 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/select/
params={indent=on&start=0&q=我们从右上角讲起&rows=10&version=2.2} hits=1
status=0 QTime=8
Nov 22, 2011 11:27:11 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.String.charAt(String.java:694)
at org.apache.solr.schema.BoolField.write(BoolField.java:129)
at org.apache.solr.schema.SchemaField.write(SchemaField.java:124)
at org.apache.solr.response.XMLWriter.writeDoc(XMLWriter.java:369)
at org.apache.solr.response.XMLWriter$3.writeDocs(XMLWriter.java:545)
at org.apache.solr.response.XMLWriter.writeDocuments(XMLWriter.java:482)
at org.apache.solr.response.XMLWriter.writeDocList(XMLWriter.java:519)
at org.apache.solr.response.XMLWriter.writeVal(XMLWriter.java:582)
at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:131)
at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:35)
at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:343)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:265)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
at java.lang.Thread.run(Thread.java:636)

This only happens on CentOS 5.7. I also tested on Ubuntu Server, and
it works OK there too.

solrconfig.xml and everything else are the same on both servers. Any
idea what could be happening? Could it be a CentOS bug?

Regards.
-- 
Boris Quiroz
boris.qui...@menco.it
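
One thing the trace itself shows: String.charAt fails with "index out of range: 0" inside BoolField.write, and in Java that can only happen when charAt(0) is called on an empty string. So the value being written for the boolean field appears to be empty for the matching document. The Python analogue below illustrates that failure mode. It is a sketch, not Solr code: `write_bool` and `safe_write_bool` are hypothetical, and the convention that a true value starts with 'T' is an assumption made for the example.

```python
def write_bool(stored_value):
    """Hypothetical reader: a stored value starting with 'T' means true.

    Indexing position 0 of an empty string raises IndexError, the Python
    counterpart of Java's StringIndexOutOfBoundsException from charAt(0).
    """
    return stored_value[0] == "T"

def safe_write_bool(stored_value):
    """Defensive variant: report an empty stored value as None."""
    if not stored_value:
        return None
    return stored_value[0] == "T"

print(write_bool("T"))      # True
print(write_bool("F"))      # False
print(safe_write_bool(""))  # None

try:
    write_bool("")
except IndexError as exc:
    print("empty stored value:", exc)
```

Under that reading, a useful comparison between the two servers is what each one actually holds for that field in the document the query matches, rather than the OS version as such.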