Language detection for solr 3.6.1

2014-07-01 Thread Poornima Jay
Hi,

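Can anyone please tell me how to integrate 
http://code.google.com/p/language-detection/ with Solr 3.6.1? I want four 
languages (English, Chinese in both Simplified and Traditional forms, Japanese, 
and Korean) handled in one schema, i.e. multilingual search from a single 
schema file.

I tried adding solr-langdetect-3.5.0.jar to my /solr/contrib/langid/lib/ 
location and to /webapps/solr/WEB-INF/contrib/langid/lib/, and changed 
solrconfig.xml as below:

[solrconfig.xml excerpt: the XML markup was lost in the archive. What survives 
is a directoryFactory element with 
class="${solr.directoryFactory:solr.StandardDirectoryFactory}", a processor of 
class org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory 
with the values below (parameter names not preserved), and a later block 
containing the value "langid".]

    content_eng
    true
    content_eng,content_ja
    en,ja
    en:english ja:japanese
    en

Please suggest a solution.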

Thanks,
Poornima

Re: Language detection for solr 3.6.1

2014-07-07 Thread Poornima Jay
Hi,

Please let me know if anyone has used Google language detection to 
implement multilingual search in one schema.

Thanks,
Poornima





Re: Fwd: Language detection for solr 3.6.1

2014-07-08 Thread Poornima Jay
When I use the solr-langid-3.5.0.jar file, I get the error below after 
reloading the core:

SEVERE: java.lang.NoClassDefFoundError: net/arnx/jsonic/JSONException


This happens even after adding the solr-jsonic-3.5.0.jar file to the webapps folder.

Thanks,
Poornima



On Tuesday, 8 July 2014 3:36 PM, Alexandre Rafalovitch  
wrote:
 


-- Forwarded message --

From: Poornima Jay 
Date: Tue, Jul 8, 2014 at 5:03 PM
Subject: Re: Language detection for solr 3.6.1


When I try to use the solr-langid-3.6.1.jar file in my path
/apache-tomcat-5.5.25/webapps/solr_multilangue_3.6_jar/WEB-INF/lib/
and define the path in solrconfig.xml as below

<lib dir="/home/searchuser/apache-tomcat-5.5.25/webapps/solr_multilangue_3.6_jar/WEB-INF/lib/"
regex="solr-langid-.*\.jar" />

I get the error below while reloading the core.

SEVERE: java.lang.NoClassDefFoundError:
com/cybozu/labs/langdetect/DetectorFactory

Please advise.

Thanks,
Poornima


On Tuesday, 8 July 2014 9:58 AM, Alexandre Rafalovitch
 wrote:


If you are having trouble with the jar location, just use an absolute path
in your lib statement and use path, not dir/regex. That will complain
louder. You should be using the latest jar matching your Solr version; the
jars should be shipped with Solr itself.
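
The advice above can be sketched as follows for solrconfig.xml. The install directory and jar file names are assumptions based on a stock Solr 3.6.1 download and should be adjusted to the actual installation; the sketch only illustrates the difference between the path form and the dir/regex form.

```xml
<!-- Assumed install location: replace /opt/solr-3.6.1 with the real absolute path. -->

<!-- path form: names exactly one jar (the form recommended above for troubleshooting). -->
<lib path="/opt/solr-3.6.1/dist/apache-solr-langid-3.6.1.jar" />

<!-- dir + regex form: adds every jar in the directory whose name matches the regex. -->
<lib dir="/opt/solr-3.6.1/contrib/langid/lib/" regex=".*\.jar" />
```

The second directive matters because the langid contrib depends on further jars in contrib/langid/lib; the two NoClassDefFoundError messages in this thread (com/cybozu/labs/langdetect and net/arnx/jsonic) name classes that would be expected to come from those dependency jars rather than from the solr-langid jar itself.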

Regards,
  Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency


On Tue, Jul 8, 2014 at 11:14 AM, Poornima Jay
 wrote:
> I am having trouble with the jar file location. Where should I place
> solr-langid-3.6.1.jar? If I place it in the instance folder as
> /lib/solr-langid-3.6.1.jar, the language detection classes are not loaded.
> Should I use solr-langid-3.5.1.jar with Solr 3.6.1?
>
> Can you please also attach the schema file for reference?
>
> Where exactly should the jar file be placed: /dist/ or /contrib/langid/lib/?
>
> Thanks for your time.
>
> Regards,
> Poornima
>
>
>
> On Monday, 7 July 2014 2:42 PM, Alexandre Rafalovitch 
> wrote:
>
>
> I've had an example in my book:
> https://github.com/arafalov/solr-indexing-book/blob/master/published/languages/conf/solrconfig.xml
> , though it was for Solr 4.2+. Solr in Action also has a section on
> multilingual indexing. There is no generic advice, as everybody seems
> to have slightly different multilingual requirements, but the books
> will at least discuss the main issues.
>
> Regarding your specific email from a week ago, you haven't actually
> said what the problem was, just what you did. So, we don't know
> where you are stuck and what, specifically, you need help with.
>
> Regards,
>  Alex.
> Personal website: http://www.outerthoughts.com/
> Current project: http://www.solr-start.com/ - Accelerating your Solr
> proficiency
>
>

Re: Fwd: Language detection for solr 3.6.1

2014-07-08 Thread Poornima Jay
I'm using the Google library, which I mentioned in my first mail: 
http://code.google.com/p/language-detection/. I downloaded the jar 
file from the URL below:

https://www.versioneye.com/java/org.apache.solr:solr-langid/3.6.1


Please let me know where I should download the correct jar file from.

Regards,
Poornima


On Tuesday, 8 July 2014 3:42 PM, Alexandre Rafalovitch  
wrote:
 


I just realized you are not using Solr language detect libraries. You
are using third party one. You did mention that in your first message.

I don't see that library integrated with Solr though, just as a
standalone library. So, you can't just plug it in.

Is there any reason you cannot use one of the two libraries Solr does
already have (Tika's and Google's)? What's so special about that one?
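
A minimal sketch of what using Solr's bundled language identification might look like in a 3.6-era solrconfig.xml. The langid.* parameter names are those of Solr's langid contrib; the field name content_eng and the language codes come from the first message, while the language field name is a hypothetical placeholder that would have to exist in schema.xml. The com.cybozu.labs.langdetect package seen in the error above belongs to the same Google Code language-detection library, which the LangDetect processor wraps.

```xml
<updateRequestProcessorChain name="langid">
  <!-- LangDetect-based detector; the Tika-based alternative is
       org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory -->
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <!-- Field(s) whose text is inspected -->
    <str name="langid.fl">content_eng</str>
    <!-- Field that receives the detected language code (hypothetical name) -->
    <str name="langid.langField">language</str>
    <!-- Only these languages are accepted as detection results -->
    <str name="langid.whitelist">en,ja</str>
    <!-- Used when detection fails or the result is not whitelisted -->
    <str name="langid.fallback">en</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

<!-- Route XML updates through the chain defined above -->
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">langid</str>
  </lst>
</requestHandler>
```

The chain only takes effect for update requests that go through a handler naming it, and the contrib jars still have to be loadable through lib directives.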

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency




Korean Tokenizer in solr

2014-07-10 Thread Poornima Jay
Hi,

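Has anyone tried to implement Korean in Solr 3.6.1? I defined the field type as 
below in my schema file, but the field type is not working.

[schema.xml excerpt: the XML markup was lost in the archive. The surviving 
fragments show two analyzers: the first has a filter with hasCNoun="true" 
bigrammable="true", the second a filter with hasCNoun="false" 
bigrammable="false", and each has a stop filter with words="stopwords_kr.txt".]

Error: Caused by: org.apache.solr.common.SolrException: Unknown fieldtype 
'text_kr' specified on field product_name_kr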

Regards,
Poornima


Re: Korean Tokenizer in solr

2014-07-10 Thread Poornima Jay
I have defined the fieldtype inside the fields section. When I checked the 
error log I found the errors below:

Caused by: java.lang.ClassNotFoundException: solr.KoreanTokenizerFactory

SEVERE: org.apache.solr.common.SolrException: analyzer without class or 
tokenizer & filter list


Do I need to add any libraries for KoreanTokenizer?

Regards,
Poornima


On Thursday, 10 July 2014 1:03 PM, Alexandre Rafalovitch  
wrote:
 


Double check your XML file to make sure you don't, for example, define your
fieldType outside of the fields section. Or maybe there is an earlier exception
about some component in the type definition.

This does not seem to be about the Korean language but about something more
fundamental in the XML config.
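
For reference, a sketch of how a Solr 3.x schema.xml is usually laid out: fieldType definitions sit under the types element and field definitions under the fields element. The names text_kr and product_name_kr come from the error message in this thread; the tokenizer is only a placeholder and the attribute values are assumptions.

```xml
<schema name="example" version="1.5">
  <types>
    <!-- fieldType definitions go here in the 3.x layout -->
    <fieldType name="text_kr" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <!-- Placeholder tokenizer; substitute the Korean tokenizer once its jar loads -->
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <!-- A field refers to its type by name; if the type is missing or failed
         to load, Solr reports "Unknown fieldtype" for the field -->
    <field name="product_name_kr" type="text_kr" indexed="true" stored="true"/>
  </fields>
</schema>
```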

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency




Re: Korean Tokenizer in solr

2014-07-10 Thread Poornima Jay
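Until now I thought Solr would support KoreanTokenizer. I haven't used any 
third-party one. 
The actual issue I am facing is that I need to integrate English, Chinese, 
Japanese and Korean search in a single site. Based on the language the user 
selects for searching, the appropriate fields will be queried. 

I tried using CJK for all three languages as below, but only a few search terms 
work for Chinese and Japanese, and nothing works for Korean.

[schema.xml excerpt: the XML markup was lost in the archive. Surviving 
fragments: positionIncrementGap="1" autoGeneratePhraseQueries="false"; 
class="edu.stanford.lucene.analysis.CJKFoldingFilterFactory"; 
id="Traditional-Simplified"; id="Katakana-Hiragana"; 
hiragana="true" katakana="true" hangul="true" outputUnigrams="true".]

So I tried to implement an individual field type for each language as below.

Chinese
[excerpt lost; surviving fragments: positionIncrementGap="1000" 
autoGeneratePhraseQueries="false"]

Japanese
[excerpt lost; surviving fragments: autoGeneratePhraseQueries="false"; 
tags="stoptags_ja.txt"; words="stopwords_ja.txt"; minimumLength="4"]

Korean
[excerpt lost; surviving fragments: autoGeneratePhraseQueries="false"; 
hasCNoun="true" bigrammable="true" and words="stopwords_kr.txt" in the first 
analyzer; hasCNoun="false" bigrammable="false" and words="stopwords_kr.txt" 
in the second]

I am really stuck on how to implement this. Please help me.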

Thanks,
Poornima



On Thursday, 10 July 2014 2:22 PM, Alexandre Rafalovitch  
wrote:
 


I don't think Solr ships with Korean Tokenizer, does it?

If you are using a 3rd party one, you need to give full class name,
not just solr.Korean... And you need the library added in the lib
statement in solrconfig.xml (at least in Solr 4).
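
A sketch of the two pieces described above, assuming a hypothetical third-party jar and factory class. Neither the jar path nor com.example.analysis.KoreanTokenizerFactory is real; they stand in for whatever the chosen Korean analyzer actually provides.

```xml
<!-- solrconfig.xml: make the third-party analyzer jar visible (hypothetical path) -->
<lib path="/path/to/korean-analyzer.jar" />

<!-- schema.xml: the solr. shorthand is only expanded against Solr's own packages,
     so a third-party factory is referenced by its full class name.
     com.example.analysis.KoreanTokenizerFactory is a placeholder, not a real class. -->
<fieldType name="text_kr" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="com.example.analysis.KoreanTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```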

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency




Re: Korean Tokenizer in solr

2014-07-13 Thread Poornima Jay
I have upgraded Solr to version 4.8.1, but after making changes in the 
schema file I get the error below:
Error instantiating class: 
'org.apache.lucene.analysis.cjk.CJKBigramFilterFactory'
I assume CJKBigramFilterFactory and CJKFoldingFilterFactory are supported in 
4.8.1. Do I need to make any configuration changes to get this working?

Please advise.

Regards,
Poornima


On Thursday, 10 July 2014 2:45 PM, Alexandre Rafalovitch  
wrote:
 


I would suggest you read through all 12 (?) articles in this series:
http://discovery-grindstone.blogspot.com/2013/10/cjk-with-solr-for-libraries-part-1.html
. It will probably lay out most of the issues for you.

And if you are starting, I would really suggest using the latest Solr
(4.9). A lot more people remember what the latest version has than
what was in 3.6. And, as the series above will tell you, some relevant
issues have been fixed in more recent Solr versions.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency




Re: Korean Tokenizer in solr

2014-07-14 Thread Poornima Jay
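Yes. Below is the field type I defined:

[schema.xml excerpt: the XML markup was lost in the archive and no attribute 
values survive.]

Please correct me if I am doing anything wrong here.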

Regards,
Poornima


On Monday, 14 July 2014 12:33 PM, Alexandre Rafalovitch  
wrote:
 


You sure it's not a spelling error or something else weird like
that? Because Solr ships with that filter in its example schema
(the quoted schema line did not survive in the archive).

So, you can compare what you are doing differently with that.
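
For comparison, a sketch along the lines of the text_cjk type found in the example schema of Solr 4.x downloads. The attribute values are assumptions and should be checked against the schema.xml that ships with the installed version.

```xml
<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Normalizes full-width and half-width character forms -->
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Forms bigrams from adjacent CJK characters -->
    <filter class="solr.CJKBigramFilterFactory"/>
  </analyzer>
</fieldType>
```

Starting from a known-good definition like this and adding custom filters one at a time is one way to narrow down which element triggers the instantiation error.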

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853




Re: Korean Tokenizer in solr

2014-07-14 Thread Poornima Jay
When I try to index, the error below appears:

java.io.FileNotFoundException: 
/home/searchuser/multicore/apac_content/data/tlog/tlog.000 (No 
such file or directory)






Re: Perm Gen issues in SolrCloud

2014-07-28 Thread Poornima Jay
Hi Nitin,

Not sure if you have tried these steps.

1. Stop the Tomcat server.
2. Find catalina.bat.
3. Add the following line to catalina.bat to set the JAVA_OPTS variable:
   set JAVA_OPTS=-server -Xms512M -Xmx768M -XX:MaxPermSize=256m
4. Restart Tomcat.



On Saturday, 1 March 2014 6:02 AM, KNitin  wrote:
 


Hi Furkan

I have read that before but I haven't added any new classes or changed
anything with my setup. I just created more collections in solr. How will
that increase perm gen space ? Doesn't solr intern strings at all ?
Interned strings also go to the perm gen space right?

- Nitin



On Fri, Feb 28, 2014 at 3:11 PM, Furkan KAMACI wrote:

> Hi;
>
> Jack has an answer about PermGen usage:
>
> "PermGen memory has to do with number of classes loaded, rather than
> documents.
>
> Here are a couple of pages that help explain Java PermGen issues. The
> bottom line is that you can increase the PermGen space, or enable unloading
> of classes, or at least trace class loading to see why the problem occurs.
>
> http://stackoverflow.com/questions/88235/how-to-deal-with-java-lang-outofmemoryerror-permgen-space-error
>
> http://www.brokenbuild.com/blog/2006/08/04/java-jvm-gc-permgen-and-memory-options/
> "
>
> You can see the conversation from here:
> http://search-lucene.com/m/iMaR11lgj3Q1/permgen&subj=PermGen+OOM+Error
>
> Thanks;
> Furkan KAMACI
>
>
> 2014-02-28 21:37 GMT+02:00 KNitin :
>
> > Hi
> >
> > I am seeing the PermGen usage increase as I keep adding more collections.
> > What kind of strings get interned in Solr? (Only schema, fields,
> > collection metadata, or the data itself?)
> >
> > Will PermGen space (at least interned strings) increase in proportion to
> > the size of the data in the collections or with the number of collections
> > themselves?
> >
> > I have temporarily increased the size of PermGen to deal with this but
> > would love to understand what goes on behind the scenes
> >
> > Thanks
> > Nitin
> >
>

Range field for integer

2014-08-08 Thread Poornima Jay
Hi,

I am using Solr 3.6.1 and trying to run a range query on a field that is defined 
as an integer, but I am not getting accurate results. Below is my schema.

The input will be of the form [-1 TO 0] or [2 TO 5].

[schema.xml excerpt: the XML markup was lost in the archive.]

My query string is interestlevel:[-1 TO 0]. This returns only 2 records from 
Solr, whereas the DB has 21 matching records.
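
One common cause of this symptom, offered here as an assumption because the schema lines did not survive, is an integer type backed by solr.IntField. That class indexes the value as plain text, so range bounds are compared lexicographically rather than numerically. A minimal sketch of a Trie-based alternative for schema.xml follows; the field name comes from the query above, and the other attribute values are assumptions.

```xml
<!-- Trie-encoded integer type: values are ordered numerically, so ranges such as
     [-1 TO 0] or [2 TO 5] compare as numbers -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8"
           omitNorms="true" positionIncrementGap="0"/>

<!-- Field name taken from the query; indexed/stored flags are assumptions -->
<field name="interestlevel" type="tint" indexed="true" stored="true"/>
```

Changing a field's type requires reindexing the documents before range queries reflect the new encoding.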

Please advise.

Thanks,
Poornima


Chinese language search in SOLR 3.6.1

2013-10-22 Thread Poornima Jay
Hi,

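Has anyone faced a problem with Chinese in Solr 3.6.1? Below is the 
analyzer in the schema.xml file.

[schema.xml excerpt: the XML markup was lost in the archive. Surviving 
fragments: positionIncrementGap="100"; words="stopwords.txt" 
enablePositionIncrements="true" in the first analyzer; words="stopwords.txt" 
in the second.]

It works fine with Chinese strings but does not work with product codes or 
ISBNs, even though those fields are defined as string.

Please let me know how the Chinese schema should be configured.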

Thanks.
Poornima


Re: Chinese language search in SOLR 3.6.1

2013-10-22 Thread Poornima Jay
Hi Rajani,

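Below is what is configured in my schema.

[schema.xml excerpt: the XML markup, including the field definitions, was lost 
in the archive. Surviving fragments of the field type: 
positionIncrementGap="100"; a stop filter with words="stopwords.txt" 
enablePositionIncrements="true".]

If I search with the query q=simple:总评价 it works, but it doesn't work if I 
search with q=simple:676767667. If the field is defined as string the Chinese 
characters work, but they don't work if it is defined as text_chinese.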

Regards,
Poornima





On Tuesday, 22 October 2013 7:52 PM, Rajani Maski  wrote:
 
Hi Poornima,

Your statement, "It works fine with the chinese strings but not working 
with product code or ISBN even though the fields are defined as string", is 
confusing. 

Did you mean that the product code and ISBN fields are of type text_Chinese?

Is it first or second:


or 




What do you mean when you say that it's not working? Are you unable to search?


Re: Chinese language search in SOLR 3.6.1

2013-10-23 Thread Poornima Jay
Hi Rajani,

The string field type is not analyzed. That is not the case for the 
text_chinese field type, for which ChineseTokenizerFactory and 
ChineseFilterFactory are added for index and query analysis. Please check the 
schema and the field definitions in my earlier mail, quoted below.

Thanks,
Poornima



On Wednesday, 23 October 2013 7:21 AM, Rajani Maski  
wrote:
 
A string field will work in any case where you do an exact key search.
text_chinese should also work if you are simply searching with the exact
string "676767667".

Well, the best way to answer this question is to use the Solr
analysis tool: http://localhost:8983/solr/#/collection1/analysis
Enter your field type and the index-time input you supplied, together with
the query value that you are searching for.

You should be able to find your answers.
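
The same check can be scripted instead of done through the admin page. A minimal Python sketch, assuming the core registers Solr's field analysis handler at /analysis/field (as the stock example solrconfig.xml does) and that the base URL below is a placeholder for your own host, port, and core path. It only builds the request URL; fetching it and reading the token output is left to you.

```python
from urllib.parse import urlencode

# Placeholder base URL; adjust host, port, and core path to your setup.
SOLR_BASE = "http://localhost:8983/solr"

def field_analysis_url(field_type, index_text, query_text, base=SOLR_BASE):
    """Build a field-analysis request URL for one field type.

    Assumes a FieldAnalysisRequestHandler is mapped to /analysis/field.
    """
    params = {
        "analysis.fieldtype": field_type,    # e.g. text_chinese or string
        "analysis.fieldvalue": index_text,   # text as it was indexed
        "analysis.query": query_text,        # text as it is searched
        "analysis.showmatch": "true",        # flag index tokens matching the query
    }
    return base + "/analysis/field?" + urlencode(params)

url = field_analysis_url("text_chinese", "676767667", "676767667")
print(url)
```

Comparing the index-side and query-side tokens for "676767667" under text_chinese should show whether the number survives analysis on both sides.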






On Tue, Oct 22, 2013 at 8:06 PM, Poornima Jay wrote:

> Hi Rajani,
>
> Below is the configured in my schema.
>  positionIncrementGap="100">
>       
>         
>           words="stopwords.txt"   enablePositionIncrements="true" />
>         
>         
>       
>       
>         
>         
>          words="stopwords.txt"/>
>         
>       
>     
>
>  multiValued="true" />
>  stored="false" multiValued="true"/>
>  stored="false" multiValued="true" />
>  multiValued="true" />
> 
> 
>
> if I search with the query q=simple:总评价 it works but doesn't work if I
> search with q=simple:676767667. If the field is defined as string the
> chinese character works but doesn't work if it is defined as text_chinese.
>
> Regards,
> Poornima
>
>
>
>
>   On Tuesday, 22 October 2013 7:52 PM, Rajani Maski 
> wrote:
>  Hi Poornima,
>
>   Your statement :   "It works fine with the chinese strings but not
> working with product code or ISBN even though the fields are defined as
> string" is confusing.
>
> Did you mean that the product code and ISBN fields are of type
> text_Chinese?
>
> Is it first or second:
> 
> or
>  stored="false"/>
>
>
> What do you refer to when you tell that it's not working? Unable to search?
>
> On Tue, Oct 22, 2013 at 6:09 PM, Poornima Jay 
> wrote:
>
> Hi,
>
> Did any one face a problem for chinese language in SOLR 3.6.1. Below is
> the analyzer in the schema.xml file.
>
>  positionIncrementGap="100">
>       
>           
>             words="stopwords.txt" enablePositionIncrements="true"/>
>            
>           
>
>       
>       
>         
>           
>            words="stopwords.txt"/>
>           
>       
>  
>
> It works fine with the chinese strings but not working with product code
> or ISBN even though the fields are defined as string.
>
> Please let me know how should the chinese schema be configured.
>
> Thanks.
> Poornima
>
>
>
>
>

Spell check SOLR 3.6.1 not working for numbers

2013-07-25 Thread Poornima Jay
Hi,

I am using Solr 3.6.1 and have implemented spellcheck. I found that numbers in 
the spellcheck query do not return any results. Below are my solrconfig.xml and 
schema.xml details. Could anyone please let me know what needs to be done to 
get spellcheck working for numbers?

solrConfig

     
    default   
    solr.IndexBasedSpellChecker
    spell  
    ./spellchecker   
    0.7    
    true
    .0001
   
  textSpell



  
    
    default   
    
    false
    
    false
    
    10
  
      
      spellcheck
        
  

Schema

         
            
            
            
            
            
            
         
        
         
        
        
        
      
      




   
   
 
   

Thanks,
Poornima

Re: Spell check SOLR 3.6.1 not working for numbers

2013-07-26 Thread Poornima Jay
Hi James,

Thanks for your reply. I got it working. Below was my old query.
 
http://localhost:8080/solr_3.6.1_spellcheck/test_spellcheck/spellcheck?q=8956632541&spellcheck=true


I have now changed the q parameter to spellcheck.q, and it started working. This is the response:
0210108956632541589566325415 


Regards,
Poornima
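
For anyone hitting the same issue, the working request can be sketched as follows. This is a minimal Python example that only builds the URL; the handler path is the one from the message above, and the helper name is made up for illustration. As I understand it, a term passed in spellcheck.q is not run through the query converter that James mentions, which would explain why the numeric term starts producing suggestions.

```python
from urllib.parse import urlencode

# Handler URL copied from the message above.
SPELLCHECK_URL = (
    "http://localhost:8080/solr_3.6.1_spellcheck/test_spellcheck/spellcheck"
)

def spellcheck_url(term, base=SPELLCHECK_URL):
    """Build a spellcheck request that passes the term via spellcheck.q."""
    params = {
        "spellcheck": "true",
        "spellcheck.q": term,  # used instead of q for the spelling lookup
    }
    return base + "?" + urlencode(params)

print(spellcheck_url("8956632541"))
```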



 From: "Dyer, James" 
To: "solr-user@lucene.apache.org"  
Sent: Thursday, 25 July 2013 9:03 PM
Subject: RE: Spell check SOLR 3.6.1 not working for numbers
 

I think the default SpellingQueryConverter has a hard time with terms that 
contain numbers.  Can you provide a failing case...the query you're executing 
(with all the spellcheck.xxx params) and the spellcheck response (or lack 
thereof).  Is it producing any hits?

James Dyer
Ingram Content Group
(615) 213-4311


-Original Message-
From: Poornima Jay [mailto:poornima...@rocketmail.com] 
Sent: Thursday, July 25, 2013 5:00 AM
To: solr-user
Subject: Spell check SOLR 3.6.1 not working for numbers


SOLR 3.6.1 auto complete sorting

2013-09-06 Thread Poornima Jay
Hi, 

We have implemented an autocomplete feature on our site. Below are the Solr 
config details.

schema.xml

 
         
            
            
            
            
            
         
         
            
            
            
            
         
      



 




 
   
   
   
  
The Solr query is
q=ph_su%3Aepub+&start=0&rows=10&fl=dams_id&wt=json&indent=on&hl=true&hl.fl=ph_su&hl.simple.pre=&hl.simple.post=

The requirement is to sort the results for the search term by relevance and 
then by the most recently published products.

I tried the parameters below, but neither worked:

sort = dams_id desc,published_date desc
order_by = dams_id desc,published_date desc

Please let me know how to sort the results by relevance and by published date, 
descending.

Thanks,
Poornima
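
A possible approach, sketched under assumptions: Solr's sort parameter accepts score as a sort key, so relevance followed by date can be written as sort=score desc,published_date desc. As far as I know, order_by is not a Solr request parameter, and putting dams_id first orders by ID before anything else. The Python sketch below only builds the request URL; the base URL is a placeholder, and it assumes published_date is an indexed, single-valued field.

```python
from urllib.parse import urlencode

# Placeholder select URL; adjust host, port, and core path to your setup.
SELECT_URL = "http://localhost:8983/solr/select"

def autocomplete_url(term, base=SELECT_URL):
    """Build an autocomplete query sorted by relevance, then newest first."""
    params = {
        "q": "ph_su:" + term,
        "start": 0,
        "rows": 10,
        "fl": "dams_id",
        "wt": "json",
        # score first; published_date only breaks ties between equal scores
        "sort": "score desc,published_date desc",
    }
    return base + "?" + urlencode(params)

print(autocomplete_url("epub"))
```

Because published_date only breaks ties here, newer products will not outrank older ones with a higher score. If recency should influence the ranking itself, a date-based boost is the usual alternative to a pure sort.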