Language detection for solr 3.6.1
Hi,

Can anyone please let me know how to integrate http://code.google.com/p/language-detection/ with Solr 3.6.1? I want English, Chinese (Simplified and Traditional), Japanese, and Korean in one schema, i.e. multilingual search from a single schema file.

I tried adding solr-langdetect-3.5.0.jar in /solr/contrib/langid/lib/ and in /webapps/solr/WEB-INF/contrib/langid/lib/, and made changes in solrconfig.xml to set up a language-identification update chain (fields content_eng and content_ja, language whitelist en,ja, mapping en:english ja:japanese, fallback en, chain name langid).

Please suggest a solution.

Thanks,
Poornima
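For reference, a rough sketch of the kind of langid chain being described, with the field and language values from the message above slotted into the standard Solr 3.x langid contrib parameters. The original XML is not shown here, so treat parameter placement and the language field name as assumptions:

<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <!-- fields to run detection on; values taken from the message above -->
    <str name="langid.fl">content_eng,content_ja</str>
    <!-- field to store the detected language code; "language" is an assumed name -->
    <str name="langid.langField">language</str>
    <bool name="langid.map">true</bool>
    <str name="langid.whitelist">en,ja</str>
    <!-- maps detected codes to field-name suffixes, e.g. content_english / content_japanese -->
    <str name="langid.map.lcmap">en:english ja:japanese</str>
    <str name="langid.fallback">en</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

The chain also has to be referenced from the update handler, for example:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">langid</str>
  </lst>
</requestHandler>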
Re: Language detection for solr 3.6.1
Hi,

Please let me know if anyone has used Google language detection for implementing multilanguage search in one schema.

Thanks,
Poornima
Re: Fwd: Language detection for solr 3.6.1
When I use the solr-langid-3.5.0.jar file, after reloading the core I get the error below:

SEVERE: java.lang.NoClassDefFoundError: net/arnx/jsonic/JSONException

even after adding the solr-jsonic-3.5.0.jar file in the webapps folder.

Thanks,
Poornima

On Tuesday, 8 July 2014 3:36 PM, Alexandre Rafalovitch wrote:

-- Forwarded message --
From: Poornima Jay
Date: Tue, Jul 8, 2014 at 5:03 PM
Subject: Re: Language detection for solr 3.6.1

When I try to use the solr-langid-3.6.1.jar file in /apache-tomcat-5.5.25/webapps/solr_multilangue_3.6_jar/WEB-INF/lib/ and define the path in solrconfig.xml with a lib directive (dir="/home/searchuser/apache-tomcat-5.5.25/webapps/solr_multilangue_3.6_jar/WEB-INF/lib/" regex="solr-langid-.*\.jar"), I get the error below while reloading the core:

SEVERE: java.lang.NoClassDefFoundError: com/cybozu/labs/langdetect/DetectorFactory

Please advise.

Thanks,
Poornima

On Tuesday, 8 July 2014 9:58 AM, Alexandre Rafalovitch wrote:

If you are having trouble with the jar location, just use an absolute path in your lib statement and use path, not dir/regex. That will complain louder. You should be using the latest jar matching the version; they should be shipped with Solr itself.

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

On Tue, Jul 8, 2014 at 11:14 AM, Poornima Jay wrote:
> I am facing an issue with the jar file location. Where should I place
> solr-langid-3.6.1.jar? If I place it in the instance folder inside
> /lib/solr-langid-3.6.1.jar, the language detection classes are not loaded.
> Should I use solr-langid-3.5.1.jar with Solr version 3.6.1?
>
> Can you please attach the schema file as well for reference?
>
> Where exactly should the jar file be placed - /dist/ or /contrib/langid/lib/?
>
> Thanks for your time.
>
> Regards,
> Poornima
>
> On Monday, 7 July 2014 2:42 PM, Alexandre Rafalovitch wrote:
>
> I've had an example in my book:
> https://github.com/arafalov/solr-indexing-book/blob/master/published/languages/conf/solrconfig.xml
> though it was for Solr 4.2+. Solr in Action also has a section on
> multilingual indexing. There is no generic advice, as everybody seems
> to have slightly different multilingual requirements, but the books
> will at least discuss the main issues.
>
> Regarding your specific email from a week ago, you haven't actually
> said what the problem was, just what you did. So we don't know
> where you are stuck and what - specifically - you need help with.
>
> Regards,
> Alex.
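As an illustration of Alexandre's suggestion, a lib entry pointing at one specific jar by absolute path would look roughly like the following in solrconfig.xml (the path is the one from the messages above and should be adjusted to the actual install):

<!-- per the advice above: one explicit absolute path complains more loudly when it is wrong than dir/regex -->
<lib path="/home/searchuser/apache-tomcat-5.5.25/webapps/solr_multilangue_3.6_jar/WEB-INF/lib/solr-langid-3.6.1.jar" />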
Re: Fwd: Language detection for solr 3.6.1
I'm using the Google library, which I mentioned in my first mail: http://code.google.com/p/language-detection/. I downloaded the jar file from the URL below:

https://www.versioneye.com/java/org.apache.solr:solr-langid/3.6.1

Please let me know where I should download the correct jar file from.

Regards,
Poornima

On Tuesday, 8 July 2014 3:42 PM, Alexandre Rafalovitch wrote:

I just realized you are not using Solr's language detection libraries; you are using a third-party one. You did mention that in your first message. I don't see that library integrated with Solr, though, just as a standalone library. So you can't just plug it in. Is there any reason you cannot use one of the two libraries Solr already has (Tika's and Google's)? What's so special about that one?

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
Korean Tokenizer in solr
Hi,

Has anyone tried to implement Korean language support in Solr 3.6.1? I defined the field in my schema file with a text_kr fieldType (a Korean tokenizer and filter with stopwords_kr.txt) on the field product_name_kr, but the fieldType is not working.

Error:
Caused by: org.apache.solr.common.SolrException: Unknown fieldtype 'text_kr' specified on field product_name_kr

Regards,
Poornima
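For context, a hedged sketch of roughly what such a text_kr definition would look like, reconstructed from the attribute fragments quoted later in this thread. The class names and the extra attributes are assumptions, and, as discussed below, solr.KoreanTokenizerFactory is not shipped with Solr 3.6, which is what ultimately fails:

<fieldType name="text_kr" class="solr.TextField" autoGeneratePhraseQueries="false">
  <analyzer type="index">
    <!-- not bundled with Solr 3.6; requires a third-party Korean analyzer jar -->
    <tokenizer class="solr.KoreanTokenizerFactory"/>
    <filter class="solr.KoreanFilterFactory" hasCNoun="true" bigrammable="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_kr.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KoreanTokenizerFactory"/>
    <filter class="solr.KoreanFilterFactory" hasCNoun="false" bigrammable="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_kr.txt"/>
  </analyzer>
</fieldType>
<field name="product_name_kr" type="text_kr" indexed="true" stored="true"/>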
Re: Korean Tokenizer in solr
I have defined the fieldType inside the fields section. When I checked the error log I found the errors below:

Caused by: java.lang.ClassNotFoundException: solr.KoreanTokenizerFactory

SEVERE: org.apache.solr.common.SolrException: analyzer without class or tokenizer & filter list

Do I need to add any libraries for the Korean tokenizer?

Regards,
Poornima

On Thursday, 10 July 2014 1:03 PM, Alexandre Rafalovitch wrote:

Double-check your XML file that you don't - for example - define your fieldType outside of the fields section. Or maybe you have an exception earlier about some component in the type definition. This is not about the Korean language, it seems; it is something more fundamental about the XML config.

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
Re: Korean Tokenizer in solr
Until now I was thinking Solr would support a Korean tokenizer out of the box; I haven't used any third-party one.

The actual issue I am facing is that I need to integrate English, Chinese, Japanese, and Korean search in a single site. Based on the language the user selects, the appropriate fields will be queried.

I tried using a single CJK fieldType for all three languages (an analyzer with CJKFoldingFilter, Traditional-Simplified and Katakana-Hiragana transforms, and CJKBigramFilter with hiragana, katakana, and hangul enabled), but only a few search terms work for Chinese and Japanese, and nothing works for Korean. So I tried to implement an individual fieldType for each language instead: Chinese, Japanese (with stoptags_ja.txt and stopwords_ja.txt), and Korean (the text_kr type from my earlier mail).

I am really stuck on how to implement this. Please help me.

Thanks,
Poornima

On Thursday, 10 July 2014 2:22 PM, Alexandre Rafalovitch wrote:

I don't think Solr ships with a Korean tokenizer, does it?

If you are using a third-party one, you need to give the full class name, not just solr.Korean... And you need the library added with a lib statement in solrconfig.xml (at least in Solr 4).

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
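For context, a hedged reconstruction of roughly what the combined CJK fieldType described above looks like. The filter classes and attribute values come from fragments quoted later in the thread; the tokenizer choice, the folding filter, and the type name are assumptions (this mirrors the Stanford CJK setup referenced later in the thread):

<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="1" autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.ICUTokenizerFactory"/>
    <!-- third-party filter from the Stanford CJKFoldingFilter project -->
    <filter class="edu.stanford.lucene.analysis.CJKFoldingFilterFactory"/>
    <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>
    <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory" han="true" hiragana="true" katakana="true" hangul="true" outputUnigrams="true"/>
  </analyzer>
</fieldType>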
Re: Korean Tokenizer in solr
I have upgraded to Solr 4.8.1, but after making the changes in the schema file I am getting the error below:

Error instantiating class: 'org.apache.lucene.analysis.cjk.CJKBigramFilterFactory'

I assume CJKBigramFilterFactory and CJKFoldingFilterFactory are supported in 4.8.1. Do I need to make any configuration changes to get this working?

Please advise.

Regards,
Poornima

On Thursday, 10 July 2014 2:45 PM, Alexandre Rafalovitch wrote:

I would suggest you read through all 12 (?) articles in this series: http://discovery-grindstone.blogspot.com/2013/10/cjk-with-solr-for-libraries-part-1.html . It will probably lay out most of the issues for you.

And if you are starting, I would really suggest using the latest Solr (4.9). A lot more people remember what the latest version has than what was in 3.6. And, as the series above will tell you, some relevant issues have been fixed in more recent Solr versions.

Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
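A hedged note on the configuration side: CJKBigramFilterFactory ships with Solr 4.8.1 as part of lucene-analyzers-common, so no extra jar is needed for it. CJKFoldingFilterFactory, on the other hand, is the third-party Stanford filter, and its jar has to be made visible to the core, for example with a lib entry in solrconfig.xml (the path below is illustrative):

<!-- illustrative path; point this at wherever the Stanford CJKFoldingFilter jar actually lives -->
<lib path="/path/to/CJKFoldingFilter.jar" />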
Re: Korean Tokenizer in solr
Yes. Below is my defined fieldType (the CJK definition quoted earlier in this thread). Please correct me if I am doing anything wrong here.

Regards,
Poornima

On Monday, 14 July 2014 12:33 PM, Alexandre Rafalovitch wrote:

Are you sure it's not a spelling error or something else weird like that? Because Solr ships with that filter in its example schema, so you can compare what you are doing differently with that.

Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
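For comparison, the stock text_cjk definition in the Solr 4.x example schema looks roughly like the following (a reconstruction from the example schema; verify against the actual distribution):

<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- normalize width differences (full-width/half-width forms) before bigramming -->
    <filter class="solr.CJKWidthFilterFactory"/>
    <!-- for any non-CJK text -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory"/>
  </analyzer>
</fieldType>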
Re: Korean Tokenizer in solr
When I am trying to index, the error below comes up:

java.io.FileNotFoundException: /home/searchuser/multicore/apac_content/data/tlog/tlog.000 (No such file or directory)
Re: Perm Gen issues in SolrCloud
Hi Nitin,

Not sure if you have tried these steps:

1. Stop the Tomcat server.
2. Find catalina.bat.
3. Assign the following line to the JAVA_OPTS variable and add it to catalina.bat:
   set JAVA_OPTS=-server -Xms512M -Xmx768M -XX:MaxPermSize=256m
4. Restart.

On Saturday, 1 March 2014 6:02 AM, KNitin wrote:

Hi Furkan

I have read that before, but I haven't added any new classes or changed anything with my setup. I just created more collections in Solr. How will that increase PermGen space? Doesn't Solr intern strings at all? Interned strings also go to the PermGen space, right?

- Nitin

On Fri, Feb 28, 2014 at 3:11 PM, Furkan KAMACI wrote:
> Hi;
>
> Jack has an answer for PermGen usage:
>
> "PermGen memory has to do with the number of classes loaded, rather than
> documents.
>
> Here are a couple of pages that help explain Java PermGen issues. The bottom
> line is that you can increase the PermGen space, or enable unloading of
> classes, or at least trace class loading to see why the problem occurs.
>
> http://stackoverflow.com/questions/88235/how-to-deal-with-java-lang-outofmemoryerror-permgen-space-error
> http://www.brokenbuild.com/blog/2006/08/04/java-jvm-gc-permgen-and-memory-options/
> "
>
> You can see the conversation here:
> http://search-lucene.com/m/iMaR11lgj3Q1/permgen&subj=PermGen+OOM+Error
>
> Thanks;
> Furkan KAMACI
>
> 2014-02-28 21:37 GMT+02:00 KNitin:
>
> > Hi
> >
> > I am seeing the PermGen usage increase as I keep adding more collections.
> > What kind of strings get interned in Solr? (Only schema, fields,
> > collection metadata, or the data itself?)
> >
> > Will PermGen space (at least interned strings) increase proportionally to
> > the size of the data in the collections or with the number of collections
> > themselves?
> >
> > I have temporarily increased the size of PermGen to deal with this but
> > would love to understand what goes on behind the scenes.
> >
> > Thanks
> > Nitin
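For completeness, a sketch of what those Tomcat options could look like if you also want class unloading, which the quoted advice from Jack mentions as an alternative to simply growing PermGen. The sizes are illustrative, and the unloading flag only applies when the CMS collector is in use:

rem catalina.bat (Windows); use JAVA_OPTS="..." in catalina.sh on Linux
set JAVA_OPTS=-server -Xms512M -Xmx768M -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled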
Range field for integer
Hi,

I am using Solr 3.6.1 and trying to run a range query on a field defined as an integer, but I'm not getting accurate results. My schema defines the field interestlevel with the integer field type. The input will be something like [-1 TO 0] or [2 TO 5], and my query string is:

interestlevel:[-1 TO 0]

This returns only 2 records from Solr, whereas there are 21 matching records in the DB.

Please advise.

Thanks,
Poornima
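Since the original schema snippet is not shown, the following is only a guess at a likely cause: the legacy solr.IntField type indexes numbers as plain text, so range queries compare lexically and a range like [-1 TO 0] can miss documents, while a Trie-based type handles numeric ranges correctly. A hedged sketch of the Trie-based definition (the type name and attributes follow the stock 3.6 example schema, and the data must be reindexed after changing the type):

<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<field name="interestlevel" type="tint" indexed="true" stored="true"/>

The query itself stays the same: interestlevel:[-1 TO 0]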
Chinese language search in SOLR 3.6.1
Hi,

Did anyone face a problem with the Chinese language in Solr 3.6.1? The analyzer in my schema.xml uses ChineseTokenizerFactory and ChineseFilterFactory with stopwords.txt for both index and query analysis. It works fine with Chinese strings but does not work with product codes or ISBNs, even though those fields are defined as string.

Please let me know how the Chinese schema should be configured.

Thanks,
Poornima
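A hedged reconstruction of roughly what that text_chinese type looks like, pieced together from fragments quoted later in the thread; the exact filter chain and attributes are assumptions:

<fieldType name="text_chinese" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.ChineseTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.ChineseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.ChineseTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.ChineseFilterFactory"/>
  </analyzer>
</fieldType>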
Re: Chinese language search in SOLR 3.6.1
Hi Rajani,

Below is what is configured in my schema (the text_chinese type and the fields that use it). If I search with the query q=simple:总评价 it works, but it doesn't work if I search with q=simple:676767667. If the field is defined as string the Chinese characters work, but it doesn't work if the field is defined as text_chinese.

Regards,
Poornima

On Tuesday, 22 October 2013 7:52 PM, Rajani Maski wrote:

Hi Poornima,

Your statement "It works fine with the chinese strings but not working with product code or ISBN even though the fields are defined as string" is confusing.

Did you mean that the product code and ISBN fields are of type text_chinese? Is it the first or the second (the field defined as string, or as text_chinese)?

What do you mean when you say it's not working? Unable to search?
Re: Chinese language search in SOLR 3.6.1
Hi Rajani,

The string field type is not analyzed, but that is not the case for the text_chinese field type, where ChineseTokenizerFactory and ChineseFilterFactory are added for index and query analysis. Please check the schema and the field definitions in my mail above.

Thanks,
Poornima

On Wednesday, 23 October 2013 7:21 AM, Rajani Maski wrote:

A string field will work in any case when you do an exact-key search. text_chinese should also work if you are simply searching with the exact string "676767667".

Well, the best way to find an answer to this is the Solr analysis tool: http://localhost:8983/solr/#/collection1/analysis

Enter your field type and the index-time input you had given, along with the query value you are searching for. You should be able to find your answers.
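One common way to handle the mixed case described in this thread, Chinese text in an analyzed field plus exact-match product codes or ISBNs, is to keep the codes in a separate unanalyzed string field via copyField and query both. This is only a sketch, not the poster's actual schema; the field name simple comes from the queries above, while simple_exact is an assumed name:

<field name="simple" type="text_chinese" indexed="true" stored="false" multiValued="true"/>
<field name="simple_exact" type="string" indexed="true" stored="false" multiValued="true"/>
<copyField source="simple" dest="simple_exact"/>

A query can then cover both, e.g. q=simple:总评价 OR simple_exact:676767667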
Spell check SOLR 3.6.1 not working for numbers
Hi,

I am using Solr 3.6.1 and have implemented spellcheck. I found that numbers in the spellcheck query do not return any results. Below are my solrconfig.xml and schema.xml details. Can anyone let me know what needs to be done in order to get spellcheck working for numbers?

solrconfig.xml: a spellcheck search component with an IndexBasedSpellChecker named "default" over the field spell, index directory ./spellchecker, accuracy 0.7, buildOnCommit true, thresholdTokenFrequency .0001, queryAnalyzerFieldType textSpell; the request handler defaults use the "default" dictionary, onlyMorePopular false, extendedResults false, count 10, with spellcheck as a last-component.

Schema: the spell field and its textSpell field type.

Thanks,
Poornima
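A hedged reconstruction of the solrconfig.xml side of that setup, with the values above slotted into the standard spellcheck component and handler parameters; the mapping of values to parameter names is an assumption:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <float name="accuracy">0.7</float>
    <str name="buildOnCommit">true</str>
    <float name="thresholdTokenFrequency">.0001</float>
  </lst>
</searchComponent>

<requestHandler name="/spellcheck" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>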
Re: Spell check SOLR 3.6.1 not working for numbers
Hi James,

Thanks for your reply. I got it working. Below was my old query:

http://localhost:8080/solr_3.6.1_spellcheck/test_spellcheck/spellcheck?q=8956632541&spellcheck=true

I changed q to spellcheck.q and it started working; the response now includes a suggestion for 8956632541.

Regards,
Poornima

From: "Dyer, James"
To: "solr-user@lucene.apache.org"
Sent: Thursday, 25 July 2013 9:03 PM
Subject: RE: Spell check SOLR 3.6.1 not working for numbers

I think the default SpellingQueryConverter has a hard time with terms that contain numbers. Can you provide a failing case: the query you're executing (with all the spellcheck.xxx params) and the spellcheck response (or lack thereof)? Is it producing any hits?

James Dyer
Ingram Content Group
(615) 213-4311
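For reference, the working form of the request differs from the old one only in the parameter name (URL taken from the message above):

http://localhost:8080/solr_3.6.1_spellcheck/test_spellcheck/spellcheck?spellcheck.q=8956632541&spellcheck=true

Passing the term via spellcheck.q hands it to the spellchecker directly, sidestepping the default SpellingQueryConverter that James notes has trouble with terms containing numbers.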
SOLR 3.6.1 auto complete sorting
Hi,

We have implemented an autocomplete feature on our site. The Solr query is:

q=ph_su%3Aepub+&start=0&rows=10&fl=dams_id&wt=json&indent=on&hl=true&hl.fl=ph_su&hl.simple.pre=&hl.simple.post=

The requirement is to sort the results based on relevance and the latest published products for the search term. I tried the parameters below, but nothing worked:

sort = dams_id desc,published_date desc
order_by = dams_id desc,published_date desc

Please let me know how to sort the results by relevance and published date descending.

Thanks,
Poornima
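A sketch of the usual way to express "relevance first, then newest" in Solr, assuming published_date is a date (or trie date) field. Relevance is exposed as the pseudo-field score, so it can appear directly in the sort list:

sort=score desc,published_date desc

URL-encoded inside the query above this becomes &sort=score+desc,published_date+desc (there is no order_by parameter in Solr).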