Search speed issue on new core creation
Hello All,

I am using a master-slave setup with hundreds of cores replicated between the master and slave servers. I am facing a very strange issue when creating a new core: whenever a new core is created (using CoreAdminRequest.createCore(coreName, instanceDir, serverObj)), all searches issued to the other cores get blocked.

Any help or thoughts would be highly appreciated.

Regards,
Dhaivat
wildcard queries with custom analyzer
Hello everyone,

I have written a custom analyzer for indexing and querying data in Solr indexes. Now I would like to enable wildcard search using only this custom analyzer. Could you please guide me on how to enable this feature?

Many Thanks,
Dhaivat
Indexing and Query time boosting together
Hello All,

I want to boost certain products for particular keywords. For this I am using Solr's index-time boosting feature. I have given every document in my Solr indices an index-time boost of "1.0". When a user wants to promote a certain product, I increase the index-time boost of that product alone to 10.0.

The problem is that I also use query-time boosting (to boost documents in which the search term is found directly in the title field), so even after I increase the index-time boost of a particular product, it still appears after the query-time boosted product.

Consider the following example:

- I have indexed a couple of documents related to mobile phones (Nokia, Samsung and so on).
- All the documents contain the title field, with the following values:

  Doc1:
  ==
  122
  Nokia Phone 2610
  Suprb phone
  ..

  Doc2:
  ==
  123
  Samsung smwer233
  Samsung phone
  ..

- Now if someone searches for "Phone", it displays "Nokia Phone" first and "Samsung Phone" second (by searching in and field).
- To display "Samsung" before "Nokia", I have raised the index-time boost value, something like below:

  123
  Samsung smwer233
  Samsung phone
  ..

- I am also using boosting at query time to boost documents in which the term is found in a field: "titleName:phone^4"

Now, even though the Samsung mobile has the higher index-time boost, the Nokia mobile is displayed first and the Samsung mobile second.

Can anyone please guide me on how to boost a particular document using index-time boosting, so that it appears first even though I am applying query-time boosting?

Many Thanks,
Dhaivat Dave
Re: Indexing and Query time boosting together
Hi Erick,

Many thanks for your reply. I got your point. One question on this: is it possible to give documents with a higher index-time boost priority over query-time boosting? I am trying to implement product promotions this way. Can you please guide me on how I should implement this feature?

Many Thanks,
Dhaivat Dave

On Fri, Aug 2, 2013 at 5:34 PM, Erick Erickson wrote:
> Add &debug=all to your query, that'll show you exactly how the scores
> are calculated. But the most obvious thing is that you're boosting
> on the titleName field in your query, which for doc 123 does NOT
> contain "phone" so I suspect the fact that "phone" is in the titleName
> field for 122 is overriding the index-time boost, especially since "phone"
> appears in both title and description for 122.
>
> Best
> Erick
>
> On Fri, Aug 2, 2013 at 7:53 AM, dhaivat dave wrote:
> > [snip]

--
Regards
Dhaivat
Re: Indexing and Query time boosting together
Hey Jack,

Thank you so much for your reply. This is very useful.

Thanks again,
Dhaivat Dave

On Fri, Aug 2, 2013 at 8:04 PM, Jack Krupansky wrote:
> "product promotions" = "query elevation"
>
> See:
> http://wiki.apache.org/solr/QueryElevationComponent
> https://cwiki.apache.org/confluence/display/solr/The+Query+Elevation+Component
>
> Or, boost the query using a function query referencing an external file
> field that gets updated for promotions.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: dhaivat dave
> Sent: Friday, August 02, 2013 9:17 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Indexing and Query time boosting together
>
> [snip]

--
Regards
Dhaivat
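Read loosely, the "query elevation" suggestion quoted above means pinning the promoted documents ahead of the normally scored results, instead of trying to make an index-time boost outweigh the query-time boost. The following is only a toy, stdlib-only illustration of that ordering idea. The class name, method name, and scores are hypothetical, and this is not how Solr's QueryElevationComponent is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: promoted ids first, everything else by score. */
public class PromotionOrderSketch {

    /**
     * Orders document ids so that promoted ids come first; within each
     * group, higher relevance scores come first.
     */
    public static List<String> order(Map<String, Double> scores, Set<String> promoted) {
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort((a, b) -> {
            boolean pa = promoted.contains(a);
            boolean pb = promoted.contains(b);
            if (pa != pb) {
                return pa ? -1 : 1;                           // promoted wins regardless of score
            }
            return Double.compare(scores.get(b), scores.get(a)); // then descending score
        });
        return ids;
    }

    public static void main(String[] args) {
        // Made-up scores: doc 122 scores higher because of the title boost.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("122", 4.2);
        scores.put("123", 1.3);

        Set<String> promoted = new LinkedHashSet<>();
        promoted.add("123"); // the product being promoted

        System.out.println(order(scores, promoted)); // prints [123, 122]
    }
}
```

The point of the sketch is that the promoted document's position no longer depends on how large the query-time boost is, which is the property an index-time boost alone cannot guarantee.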
developing custom tokenizer
Hello All,

I want to create a custom tokeniser in Solr 4.4. It would be very helpful if someone could share any tutorials or information on this.

Many Thanks,
Dhaivat Dave
Re: developing custom tokenizer
Hi Alex,

Thanks for your reply. I looked into the core analysers and created a custom tokeniser based on them; the code is shared below. When I tested it on the Solr analysis page, the analyser works fine, but when I submitted 100 documents together I found in the logs (with custom messages printed) that for some of the documents it is not calling the "create" method of SampleTokeniserFactory (please see the code below). Can you please help me find out what is wrong in the following code? Am I missing something?

Here is the class which extends the TokenizerFactory class:

=== SampleTokeniserFactory.java

    public class SampleTokeniserFactory extends TokenizerFactory {

        public SampleTokeniserFactory(Map<String, String> args) {
            super(args);
        }

        public SampleTokeniser create(AttributeFactory factory, Reader reader) {
            return new SampleTokeniser(factory, reader);
        }
    }

Here is the class which extends the Tokenizer class:

    package ns.solr.analyser;

    import java.io.IOException;
    import java.io.Reader;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
    import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

    public class SampleTokeniser extends Tokenizer {

        private List<Token> tokenList = new ArrayList<Token>();

        int tokenCounter = -1;

        private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

        /**
         * Object that defines the offset attribute
         */
        private final OffsetAttribute offsetAttribute =
                (OffsetAttribute) addAttribute(OffsetAttribute.class);

        /**
         * Object that defines the position attribute
         */
        private final PositionIncrementAttribute position =
                (PositionIncrementAttribute) addAttribute(PositionIncrementAttribute.class);

        public SampleTokeniser(AttributeFactory factory, Reader reader) {
            super(factory, reader);
            String textToProcess = null;
            try {
                textToProcess = readFully(reader);
                processText(textToProcess);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        public String readFully(Reader reader) throws IOException {
            char[] arr = new char[8 * 1024]; // 8K at a time
            StringBuffer buf = new StringBuffer();
            int numChars;
            while ((numChars = reader.read(arr, 0, arr.length)) > 0) {
                buf.append(arr, 0, numChars);
            }
            return buf.toString();
        }

        public void processText(String textToProcess) {
            String wordsList[] = textToProcess.split(" ");
            int startOffset = 0, endOffset = 0;
            for (String word : wordsList) {
                endOffset = word.length();
                Token aToken = new Token("Token." + word, startOffset, endOffset);
                aToken.setPositionIncrement(1);
                tokenList.add(aToken);
                startOffset = endOffset + 1;
            }
        }

        @Override
        public boolean incrementToken() throws IOException {
            clearAttributes();
            tokenCounter++;
            if (tokenCounter < tokenList.size()) {
                Token aToken = tokenList.get(tokenCounter);
                termAtt.append(aToken);
                termAtt.setLength(aToken.length());
                offsetAttribute.setOffset(correctOffset(aToken.startOffset()),
                        correctOffset(aToken.endOffset()));
                position.setPositionIncrement(aToken.getPositionIncrement());
                return true;
            }
            return false;
        }

        /**
         * close object
         *
         * @throws IOException
         */
        public void close() throws IOException {
            super.close();
            System.out.println("Close method called");
        }

        /**
         * called when end method gets called
         *
         * @throws IOException
         */
        public void end() throws IOException {
            super.end();
            // setting final offset
            System.out.println("end called with final offset");
        }

        /**
         * method reset the record
         *
         * @throws IOException
         */
        public void reset() throws IOException {
            super.reset();
            System.out.println("Reset Called");
            tokenCounter = -1;
        }
    }

Many Thanks,
Dhaivat

On Mon, Aug 12, 2013 at 7:03 PM, Alexandre Rafalovitch wrote:
> Have you tried looking at source code itself? Between simple organizer like
> keyword and complex language ones, you should be able to get an idea. Then
> ask specific follow up questions.
>
> Regards,
> Alex
>
> On 12 Aug 2013 09:29, "dhaivat dave" wrote:
> > Hello All,
> >
> > I want to create custom tokeniser in solr 4.4. it will be very helpful if
> > some one share any tutorials or information on this.
> >
> > Many Thanks,
> > Dhaivat Dave

--
Regards
Dhaivat
issue with custom tokenizer
Hello All,

I am trying to develop a custom tokeniser (please find the code below) and have found an issue when adding multiple documents one after another. It works fine when I add the first document, but when I add another document it does not call the "create" method of SampleTokeniserFactory.java; instead it calls the reset method directly and then calls incrementToken(). Does anyone have an idea what is wrong in the code below? Please share your thoughts on this.

Here is the class which extends the TokenizerFactory class:

=== SampleTokeniserFactory.java

    public class SampleTokeniserFactory extends TokenizerFactory {

        public SampleTokeniserFactory(Map<String, String> args) {
            super(args);
        }

        public SampleTokeniser create(AttributeFactory factory, Reader reader) {
            return new SampleTokeniser(factory, reader);
        }
    }

Here is the class which extends the Tokenizer class:

    package ns.solr.analyser;

    import java.io.IOException;
    import java.io.Reader;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
    import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
    import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

    public class SampleTokeniser extends Tokenizer {

        private List<Token> tokenList = new ArrayList<Token>();

        int tokenCounter = -1;

        private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

        /**
         * Object that defines the offset attribute
         */
        private final OffsetAttribute offsetAttribute =
                (OffsetAttribute) addAttribute(OffsetAttribute.class);

        /**
         * Object that defines the position attribute
         */
        private final PositionIncrementAttribute position =
                (PositionIncrementAttribute) addAttribute(PositionIncrementAttribute.class);

        public SampleTokeniser(AttributeFactory factory, Reader reader) {
            super(factory, reader);
            String textToProcess = null;
            try {
                textToProcess = readFully(reader);
                processText(textToProcess);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        public String readFully(Reader reader) throws IOException {
            char[] arr = new char[8 * 1024]; // 8K at a time
            StringBuffer buf = new StringBuffer();
            int numChars;
            while ((numChars = reader.read(arr, 0, arr.length)) > 0) {
                buf.append(arr, 0, numChars);
            }
            return buf.toString();
        }

        public void processText(String textToProcess) {
            String wordsList[] = textToProcess.split(" ");
            int startOffset = 0, endOffset = 0;
            for (String word : wordsList) {
                endOffset = word.length();
                Token aToken = new Token("Token." + word, startOffset, endOffset);
                aToken.setPositionIncrement(1);
                tokenList.add(aToken);
                startOffset = endOffset + 1;
            }
        }

        @Override
        public boolean incrementToken() throws IOException {
            clearAttributes();
            tokenCounter++;
            if (tokenCounter < tokenList.size()) {
                Token aToken = tokenList.get(tokenCounter);
                termAtt.append(aToken);
                termAtt.setLength(aToken.length());
                offsetAttribute.setOffset(correctOffset(aToken.startOffset()),
                        correctOffset(aToken.endOffset()));
                position.setPositionIncrement(aToken.getPositionIncrement());
                return true;
            }
            return false;
        }

        /**
         * close object
         *
         * @throws IOException
         */
        public void close() throws IOException {
            super.close();
            System.out.println("Close method called");
        }

        /**
         * called when end method gets called
         *
         * @throws IOException
         */
        public void end() throws IOException {
            super.end();
            // setting final offset
            System.out.println("end called with final offset");
        }

        /**
         * method reset the record
         *
         * @throws IOException
         */
        public void reset() throws IOException {
            super.reset();
            System.out.println("Reset Called");
            tokenCounter = -1;
        }
    }
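A likely explanation, offered as a sketch rather than a confirmed diagnosis: Lucene 4.x analyzers reuse a Tokenizer instance across documents, handing it a new Reader and then calling reset(), so "create" and the constructor run only for the first use. Text read in the constructor is therefore never refreshed for later documents. The stdlib-only simulation below moves the reading into reset() and also computes each end offset as start plus word length. The class ReusableTokenizerSketch and its Tok type are hypothetical stand-ins, not Lucene classes, and the setReader/reset/next calls only mimic the lifecycle described in the post.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a reusable tokenizer; NOT a Lucene class. */
public class ReusableTokenizerSketch {

    /** One token with its character offsets in the input text. */
    public static final class Tok {
        public final String text;
        public final int start;
        public final int end;

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    private Reader input;
    private final List<Tok> tokens = new ArrayList<>();
    private int cursor = 0;

    /** Simulates the framework handing a new document to a reused instance. */
    public void setReader(Reader reader) {
        this.input = reader;
    }

    /** Re-reads the current input so per-document state is rebuilt on every reuse. */
    public void reset() {
        tokens.clear();
        cursor = 0;
        StringBuilder buf = new StringBuilder();
        char[] chunk = new char[8 * 1024];
        try {
            int n;
            while ((n = input.read(chunk, 0, chunk.length)) > 0) {
                buf.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String text = buf.toString();
        int pos = 0;
        while (pos < text.length()) {
            while (pos < text.length() && text.charAt(pos) == ' ') {
                pos++;                               // skip separators
            }
            int start = pos;
            while (pos < text.length() && text.charAt(pos) != ' ') {
                pos++;                               // consume one word
            }
            if (pos > start) {
                tokens.add(new Tok(text.substring(start, pos), start, pos));
            }
        }
    }

    /** Returns the next token, or null when the document is exhausted. */
    public Tok next() {
        return cursor < tokens.size() ? tokens.get(cursor++) : null;
    }

    public static void main(String[] args) {
        ReusableTokenizerSketch t = new ReusableTokenizerSketch(); // created once
        for (String doc : new String[] {"Nokia Phone 2610", "Samsung phone"}) {
            t.setReader(new StringReader(doc));
            t.reset();
            for (Tok tok = t.next(); tok != null; tok = t.next()) {
                System.out.println(tok.text + " [" + tok.start + "," + tok.end + ")");
            }
        }
    }
}
```

In the real tokenizer the equivalent change would be to read from the Tokenizer's current input inside reset() (after super.reset()) and clear the token list there, leaving the constructor free of any reading.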
Boosting Original Indexed Terms
Hello All,

I need help with boosting the original indexed terms. I am storing multiple terms at the same position and I want to boost the original term. Consider the following scenario.

I am indexing a document which contains the following text: "baby t-shirts"

I am storing the terms as follows:

    term text   position   startOffset   endOffset
    baby        1          0             4
    t-shirts    2          5             13
    babe        1          0             4
    t-shirt     2          5             13
    infant      1          0             4
    child       1          0             4
    kid         1          0             4

Now I want to boost results on the original terms, i.e. if a user searches for "baby", it should first return the results which contain the original term "baby", and then the others.

Please let me know how to achieve this.

Thanks
Dhaivat
Error while indexing data using Solr (Unexpected character 'F' (code 70) in prolog; expected '<')
Hello Everyone,

I am getting an error while indexing data into Solr. I am using the SolrJ APIs and the XML request handler to index the documents. The error is:

org.apache.solr.common.SolrException: Unexpected character 'F' (code 70) in prolog; expected '<' at [row,col {unknown-source}]: [1,1]

I have also escaped the content before sending it to Solr. Can anyone please tell me the reason behind this error?

Regards
Dhaivat
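One way to read this message, offered as a guess since the request body is not shown: "in prolog" at [1,1] means the XML parser met the character 'F' as the very first character of the body, so whatever reached the XML handler did not start with '<' at all. A quick local check is to run the exact payload through an XML parser before sending it. The sketch below uses the JDK's built-in StAX parser. The class and method names are hypothetical, and the JDK parser words its error differently from the parser Solr uses.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical pre-flight check for an XML update payload. */
public class PrologCheckSketch {

    /** Returns true if the payload parses as well-formed XML from start to end. */
    public static boolean isWellFormedXml(String payload) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(payload));
            while (reader.hasNext()) {
                reader.next();   // walk every event so any error surfaces
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            return false;        // e.g. text before the root element
        }
    }

    public static void main(String[] args) {
        // Made-up payloads for illustration only.
        String good = "<add><doc><field name=\"id\">123</field></doc></add>";
        String bad = "Field value sent without any XML wrapper";

        System.out.println(isWellFormedXml(good));
        System.out.println(isWellFormedXml(bad));
    }
}
```

If the payload passes this check locally but Solr still reports the error, the body may be getting changed on the way, for example by escaping the whole document (tags included) rather than only the field values.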
Re: Load Testing in Solr
Thanks Pravedsh for your reply. I'll use the JMeter tool.

On Thu, Aug 30, 2012 at 11:10 PM, pravesh wrote:
> Hi Dhaivat,
> JMeter is a nice tool. But it all depends what sort of load are you
> expecting, how complex queries are you expecting(sorting/filtering/textual
> searches). You need to consider all these to benchmark.
>
> Thanx
> Pravedsh
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Load-Testing-in-Solr-tp4004117p4004428.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Regards
Dhaivat