questions on Solr WordBreakSolrSpellChecker and WordDelimiterFilterFactory
Hello everyone :)

I have a product called "xbox" indexed, and when the user searches for either "x-box" or "x box" I want the "xbox" product to be returned. I'm new to Solr, and from reading online, I thought I need to use WordDelimiterFilterFactory for the "x-box" case and WordBreakSolrSpellChecker for the "x box" case. Is this correct?

(1) In my schema file, this is what I changed:

generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="0" preserveOriginal="1"/>

But I don't see the xbox product returned when the search term is "x-box", so I must have missed something.

(2) I tried to use WordBreakSolrSpellChecker together with DirectSolrSpellChecker as shown below, but the WordBreakSolrSpellChecker never got used:

  wc_textSpell
  default spellCheck solr.DirectSolrSpellChecker internal 0.3 2 1 5 3 0.01 0.004
  wordbreak solr.WordBreakSolrSpellChecker spellCheck true true 10

  SpellCheck true default wordbreak true false 10 true false
  wc_spellcheck

I tried to build the dictionary this way:

http://localhost/solr/coreName/select?spellcheck=true&spellcheck.build=true

but the response returned is this:

  0 0 true true build

What's the correct way to build the dictionary? Even though my requestHandler's name="/spellcheck", I wasn't able to use

http://localhost/solr/coreName/spellcheck?spellcheck=true&spellcheck.build=true

Is there something wrong with my definition above?

(3) I also tried to use WordBreakSolrSpellChecker without the DirectSolrSpellChecker as shown below:

  wc_textSpell
  default solr.WordBreakSolrSpellChecker spellCheck true true 10

  SpellCheck true default true false 10 true false
  wc_spellcheck

And I am still unable to see WordBreakSolrSpellChecker being called anywhere.

Would someone kindly help me?

Many thanks,
Jia
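To make the "x-box" case in (1) concrete, here is a minimal Python sketch of what a word-delimiter step with catenateWords and preserveOriginal enabled is expected to emit for a single token. It is a toy model, not Solr's implementation: the function name `word_delimiter_tokens` and its parameters are made up for illustration, and it ignores case changes, digit transitions, and token positions. It also assumes the tokenizer hands "x-box" to the filter as one token (a tokenizer that already splits on the hyphen would leave nothing for the filter to catenate).

```python
import re


def word_delimiter_tokens(token, catenate_words=True, preserve_original=True):
    """Toy model: split one token on non-alphanumeric delimiters.

    Hypothetical helper for illustration only; it does not reproduce
    WordDelimiterFilterFactory's full behaviour.
    """
    # Split on runs of non-alphanumeric characters and drop empty pieces.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    tokens = []
    if preserve_original:
        tokens.append(token)
    tokens.extend(parts)
    # Catenation only adds something when there is more than one part.
    if catenate_words and len(parts) > 1:
        tokens.append("".join(parts))
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(tokens))


print(word_delimiter_tokens("x-box"))  # ['x-box', 'x', 'box', 'xbox']
print(word_delimiter_tokens("xbox"))   # ['xbox']
```

Under these assumptions the catenated token "xbox" is what would let a query for "x-box" match the indexed "xbox" term, which is why the filter has to be present in the query-time analyzer as well.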
Re: questions on Solr WordBreakSolrSpellChecker and WordDelimiterFilterFactory
Hi Ahmet,

Using either one didn't make any difference. I'm still running into the same issues mentioned before :(

Thanks,
Jia

On 7/16/2014, "Ahmet Arslan" wrote:

>Hi Jia,
>
>What happens when you use
>
>
>
>instead of
>
>
>
>Ahmet
>
>
>On Wednesday, July 16, 2014 3:07 AM, "j...@ece.ubc.ca" wrote:
>
>
>
>Hello everyone :)
>
>I have a product called "xbox" indexed, and when the user search for
>either "x-box" or "x box" i want the "xbox" product to be
>returned. I'm new to Solr, and from reading online, I thought I need
>to use WordDelimiterFilterFactory for "x-box" case, and
>WordBreakSolrSpellChecker for "x box" case. Is this correct?
>
>(1) In my schema file, this is what I changed:
>generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>catenateAll="1" splitOnCaseChange="0" preserveOriginal="1"/>
>
>But I don't see the xbox product returned when the search term is
>"x-box", so I must have missed something
>
>(2) I tried to use WordBreakSolrSpellChecker together with
>DirectSolrSpellChecker as shown below, but the WordBreakSolrSpellChecker
>never got used:
>
>class="solr.SpellCheckComponent">
> wc_textSpell
>
>
> default
> spellCheck
> solr.DirectSolrSpellChecker
> internal
> 0.3
> 2
> 1
> 5
> 3
> 0.01
> 0.004
>
>
> wordbreak
> solr.WordBreakSolrSpellChecker
> spellCheck
> true
> true
> 10
>
>
>
> class="org.apache.solr.handler.component.SearchHandler">
>
> SpellCheck
> true
> default
> wordbreak
> true
> false
> 10
> true
> false
>
>
> wc_spellcheck
>
>
>
>I tried to build the dictionary this way:
>http://localhost/solr/coreName/select?spellcheck=true&spellcheck.build=true,
>but the response returned is this:
>
>
>0
>0
>
>true
>true
>
>
>build
>
>
>
>What's the correct way to build the dictionary?
>Even though my requestHandler's name="/spellcheck", i wasn't able to
>use
>http://localhost/solr/coreName/spellcheck?spellcheck=true&spellcheck.build=true
>.. is there something wrong with my definition above?
>
>(3) I also tried to use WordBreakSolrSpellChecker without the
>DirectSolrSpellChecker as shown below:
>class="solr.SpellCheckComponent">
>
> wc_textSpell
>
> default
> solr.WordBreakSolrSpellChecker
> spellCheck
> true
> true
> 10
>
>
>
> class="org.apache.solr.handler.component.SearchHandler">
>
> SpellCheck
> true
> default
>
> true
> false
> 10
> true
> false
>
>
> wc_spellcheck
>
>
>
>And still unable to see WordBreakSolrSpellChecker being called anywhere.
>
>Would someone kindly help me?
>
>Many thanks,
>Jia
>
how to combine solr join with boost in Edismax query?
Hello everyone :)

I have an index for groupId and one for product. For an input search keyword, I only want to boost the result if the keyword appears in both the groupId and product indices.

I was able to get a Solr join with fq to work with the following syntax, for example:

q=searchTerm&fq={!join from=id_1 to=id_2 fromIndex=groupId}searchTerm

But I want to use a Solr join with bf or bq. Does anyone have suggestions on how to make it work? (I also use qf, pf, and ps.)

I tried the following, but both failed:

q=searchTerm&bf=({!join from=id_1 to=id_2 fromIndex=groupId}searchTerm)^100
q=searchTerm&bq=({!join from=id_1 to=id_2 fromIndex=groupId}searchTerm)^100

Many thanks,
jia
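One thing that may be worth trying for the bq case is Solr's nested-query syntax, `_query_:"..."`, which lets a local-params query such as `{!join ...}` appear somewhere other than the very start of the parameter value. The Python sketch below only builds and URL-encodes such a request; it has not been run against any Solr version, the helper name `boosted_join_params` is hypothetical, and whether the join clause scores the way you want is an open question.

```python
from urllib.parse import urlencode, parse_qs


def boosted_join_params(term, boost=100):
    """Build edismax parameters with the join wrapped as a nested boost query.

    Hypothetical helper; the field and index names (id_1, id_2, groupId)
    are the ones from the post. The term must not contain double quotes,
    since it is embedded in a quoted nested query.
    """
    # Doubled braces in the f-string produce literal { and }.
    join = f"{{!join from=id_1 to=id_2 fromIndex=groupId}}{term}"
    return {
        "defType": "edismax",
        "q": term,
        # Nested query, so the boost applies to the join clause as a whole.
        "bq": f'_query_:"{join}"^{boost}',
    }


params = boosted_join_params("searchTerm")
query_string = urlencode(params)
print(query_string)

# Round-trip check: decoding gives back the intended bq value.
print(parse_qs(query_string)["bq"][0])
```

Building the parameters in code rather than by hand avoids a second source of failure, namely braces, quotes, and `^` not being URL-encoded when the query is pasted into a browser.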
UUIDUpdateProcessorFactory causes repeated documents when uploading csv files?
Happy New Year Everyone :)

I am trying to automatically generate a document Id when indexing a csv file that contains multiple lines of documents.

The desired case: if the csv file contains 2 lines (each line is a document), then the index should contain 2 documents.

What I observed: if the csv file contains 2 lines, then the index contains 3 documents, because the 1st document is repeated once. An example output:

doc1 rank1 randomlyGeneratedId1
doc1 rank1 randomlyGeneratedId2
doc2 rank2 randomlyGeneratedId3

And if the csv file contains 3 lines, then the index contains 6 documents, because document 1 is repeated 3 times and document 2 is repeated twice, as follows:

doc1 rank1 randomlyGeneratedId1
doc1 rank1 randomlyGeneratedId2
doc2 rank2 randomlyGeneratedId3
doc1 rank1 randomlyGeneratedId4
doc2 rank2 randomlyGeneratedId5
doc3 rank3 randomlyGeneratedId6

Here's what I have done:

1. In my solrConfig: doc_key autoGenId
2. In schema.xml: id

This problem doesn't exist when I assign an Id field instead of using the UUIDUpdateProcessorFactory, so I assumed the problem is there? It looks like the csv file is processed one line at a time, and the index shows the entire process, so we see each previous line repeated in the output.

Is there a way to not show the 'appending of previous lines', and rather just the 'final results', so that the total number of indexed documents matches the number of input documents in the csv file?

Many thanks,
Jia
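For reference, the desired outcome (exactly one generated id per csv line) can be sketched on the client side in Python. This is only an illustration of the expected result and a possible workaround when ids are assigned before upload; it is not how UUIDUpdateProcessorFactory works internally, and the helper name `assign_ids` and the column names `name` and `rank` are assumptions made for the example.

```python
import csv
import io
import uuid


def assign_ids(csv_text, id_field="id"):
    """Return one dict per csv data row, each with a freshly generated UUID.

    Hypothetical helper: the first csv line is treated as the header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        doc = dict(row)
        # One new random UUID per row, so N rows give N distinct ids.
        doc[id_field] = str(uuid.uuid4())
        docs.append(doc)
    return docs


sample = "name,rank\ndoc1,rank1\ndoc2,rank2\n"
docs = assign_ids(sample)
print(len(docs))  # 2
```

With ids assigned this way, a 2-line file yields 2 documents and a 3-line file yields 3, which is the count the index would be expected to show.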