Search speed issue on new core creation

2015-04-08 Thread dhaivat dave
Hello All,

I am using a master-slave setup with hundreds of cores replicated between
the master and slave servers. I am facing a very odd issue when creating a
new core.

Whenever a new core is created (using
CoreAdminRequest.createCore(coreName, instanceDir, serverObj)), all the
searches issued to the other cores are blocked.

Any help or thoughts would be highly appreciated.

Regards,
Dhaivat


wildcard queries with custom analyzer

2013-12-28 Thread dhaivat dave
Hello everyone,

I have written a custom analyzer for indexing and querying data in Solr
indexes.

Now I would like to enable wildcard search with this custom analyzer only.

Could you please guide me on how to enable this feature?

Many Thanks,
Dhaivat


Indexing and Query time boosting together

2013-08-02 Thread dhaivat dave
Hello All,

I want to boost certain products for particular keywords. For this I am
using Solr's index-time boosting feature. I have given an index-time boost
of "1.0" to all documents in my Solr indices. When a user wants to boost a
certain product, I increase the index-time boost of that product alone to
10.0. The problem is that I also use query-time boosting (to boost
documents in which the search term is found directly in the title field),
so even though I have increased the index-time boost of a particular
product, it still appears after the query-time boosted product.

Consider the following example:

- I have indexed a couple of documents related to mobile phones (Nokia,
Samsung and so on).
- All the documents contain a title field with the following values:
   *Doc1:*
   *==*

   122
   Nokia Phone 2610
   Suprb phone
   ..

   *Doc2:*
   *==*

   123
   Samsung smwer233
   Samsung phone
   ..

- Now if someone searches for "Phone", it displays "Nokia Phone" first and
"Samsung Phone" second (by searching in the title and description fields).
- To display "Samsung" before "Nokia", I have raised the index-time boost,
something like below:

   123
   Samsung smwer233
   Samsung phone
   ..

- I am also using boosting at query time to boost documents in which the
term is found in the titleName field:
   *"titleName:phone^4"*

Now even though the Samsung mobile has the higher index-time boost, the
Nokia mobile is displayed first and then the Samsung mobile.

Can anyone please guide me on how I can boost a particular document using
index-time boosting (it should appear first even though I am applying
query-time boosting)?

Many Thanks,
Dhaivat Dave


Re: Indexing and Query time boosting together

2013-08-02 Thread dhaivat dave
Hi Erick

Many thanks for your reply. I got your point. One question on this: is it
possible to give more priority to docs that have a higher index-time boost
over the query-time boost? I am trying to implement product promotions
this way. Could you please guide me on how I should implement this
feature?

Many Thanks,
Dhaivat Dave

On Fri, Aug 2, 2013 at 5:34 PM, Erick Erickson wrote:

> Add &debug=all to your query, that'll show you exactly how the scores
> are calculated. But the most obvious thing is that you're boosting
> on the titleName field in your query, which for doc 123 does NOT
> contain "phone" so I suspect the fact that "phone" is in the titleName
> field for 122 is overriding the index-time boost, especially since "phone"
> appears in both title and description for 122.
>
> Best
> Erick



-- 
Regards
Dhaivat


Re: Indexing and Query time boosting together

2013-08-04 Thread dhaivat dave
Hey Jack,

Thank you so much for your reply. This is very useful.

Thanks again,
Dhaivat Dave

On Fri, Aug 2, 2013 at 8:04 PM, Jack Krupansky wrote:

> "product promotions" = "query elevation"
>
> See:
> http://wiki.apache.org/solr/QueryElevationComponent
> https://cwiki.apache.org/confluence/display/solr/The+Query+Elevation+Component
>
> Or, boost the query using a function query referencing an external file
> field that gets updated for promotions.
>
> -- Jack Krupansky



-- 
Regards
Dhaivat
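
Jack's point that product promotions amount to query elevation can be
pictured as a two-level ordering: promoted documents are placed first, and
the relevance score only orders documents within each group. The sketch
below is a plain-Java illustration of that ordering, not Solr's
QueryElevationComponent; the class names, the promoted id set, and the
scores are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: none of these types belong to Solr or SolrJ.
final class PromotionDemo {

    // A search hit with the relevance score the engine computed for it.
    static final class ScoredDoc {
        final String id;
        final double score;

        ScoredDoc(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    // Returns a new list: promoted ids first, then by descending score within each group.
    static List<ScoredDoc> rank(List<ScoredDoc> hits, final Set<String> promotedIds) {
        List<ScoredDoc> ranked = new ArrayList<ScoredDoc>(hits);
        ranked.sort(new Comparator<ScoredDoc>() {
            @Override
            public int compare(ScoredDoc a, ScoredDoc b) {
                boolean pa = promotedIds.contains(a.id);
                boolean pb = promotedIds.contains(b.id);
                if (pa != pb) {
                    return pa ? -1 : 1; // a promoted document always sorts first
                }
                return Double.compare(b.score, a.score); // otherwise higher score first
            }
        });
        return ranked;
    }

    public static void main(String[] args) {
        // Scores are invented: 122 wins on relevance because of the title boost.
        List<ScoredDoc> hits = Arrays.asList(
                new ScoredDoc("122", 4.2),
                new ScoredDoc("123", 1.3));
        Set<String> promoted = new HashSet<String>(Arrays.asList("123"));
        for (ScoredDoc d : rank(hits, promoted)) {
            System.out.println(d.id + " " + d.score);
        }
    }
}
```

Keeping promotion out of the score means a change to the query-time boost
(such as titleName:phone^4) cannot push a promoted product back down.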


developing custom tokenizer

2013-08-12 Thread dhaivat dave
Hello All,

I want to create a custom tokenizer in Solr 4.4. It would be very helpful
if someone could share any tutorials or information on this.


Many Thanks,
Dhaivat Dave


Re: developing custom tokenizer

2013-08-13 Thread dhaivat dave
Hi Alex,

Thanks for your reply. I looked into the core analyzers and created a
custom tokenizer based on them; the code is shared below. When I check it
on the Solr analysis page, the analyzer works fine, but when I submit 100
docs together, I can see in the logs (using custom message printing) that
for some of the documents the "create" method of SampleTokeniserFactory is
not called (please see the code below).

Can you please help me find out what is wrong in the following code? Am I
missing something?

Here is the class which extends the TokenizerFactory class:

=== SampleTokeniserFactory.java

public class SampleTokeniserFactory extends TokenizerFactory {

    public SampleTokeniserFactory(Map<String, String> args) {
        super(args);
    }

    public SampleTokeniser create(AttributeFactory factory, Reader reader) {
        return new SampleTokeniser(factory, reader);
    }

}

here is the class which extends Tokenizer class


package ns.solr.analyser;

import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

public class SampleTokeniser extends Tokenizer {

    private List<Token> tokenList = new ArrayList<Token>();

    int tokenCounter = -1;

    private final CharTermAttribute termAtt =
            addAttribute(CharTermAttribute.class);

    /**
     * Object that defines the offset attribute
     */
    private final OffsetAttribute offsetAttribute =
            (OffsetAttribute) addAttribute(OffsetAttribute.class);

    /**
     * Object that defines the position attribute
     */
    private final PositionIncrementAttribute position =
            (PositionIncrementAttribute) addAttribute(PositionIncrementAttribute.class);

    public SampleTokeniser(AttributeFactory factory, Reader reader) {
        super(factory, reader);
        String textToProcess = null;
        try {
            textToProcess = readFully(reader);
            processText(textToProcess);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public String readFully(Reader reader) throws IOException {
        char[] arr = new char[8 * 1024]; // 8K at a time
        StringBuffer buf = new StringBuffer();
        int numChars;
        while ((numChars = reader.read(arr, 0, arr.length)) > 0) {
            buf.append(arr, 0, numChars);
        }
        return buf.toString();
    }

    public void processText(String textToProcess) {
        String wordsList[] = textToProcess.split(" ");

        int startOffset = 0, endOffset = 0;

        for (String word : wordsList) {
            endOffset = word.length();

            Token aToken = new Token("Token." + word, startOffset, endOffset);
            aToken.setPositionIncrement(1);
            tokenList.add(aToken);

            startOffset = endOffset + 1;
        }
    }

    @Override
    public boolean incrementToken() throws IOException {
        clearAttributes();
        tokenCounter++;

        if (tokenCounter < tokenList.size()) {
            Token aToken = tokenList.get(tokenCounter);

            termAtt.append(aToken);
            termAtt.setLength(aToken.length());
            offsetAttribute.setOffset(correctOffset(aToken.startOffset()),
                    correctOffset(aToken.endOffset()));
            position.setPositionIncrement(aToken.getPositionIncrement());
            return true;
        }

        return false;
    }

    /**
     * close object
     *
     * @throws IOException
     */
    public void close() throws IOException {
        super.close();
        System.out.println("Close method called");
    }

    /**
     * called when end method gets called
     *
     * @throws IOException
     */
    public void end() throws IOException {
        super.end();
        // setting final offset
        System.out.println("end called with final offset");
    }

    /**
     * method reset the record
     *
     * @throws IOException
     */
    public void reset() throws IOException {
        super.reset();
        System.out.println("Reset Called");
        tokenCounter = -1;
    }
}


Many Thanks,
Dhaivat


On Mon, Aug 12, 2013 at 7:03 PM, Alexandre Rafalovitch
wrote:

> Have you tried looking at the source code itself? Between a simple
> tokenizer like keyword and the complex language ones, you should be able
> to get an idea. Then ask specific follow-up questions.
>
> Regards,
>  Alex



-- 
Regards
Dhaivat
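
A side note on processText in the SampleTokeniser code earlier in this
message: endOffset is set to word.length() rather than to
startOffset + word.length(), so the offsets after the first word no longer
point into the original text. A minimal sketch of offset tracking in plain
Java follows; OffsetDemo, Span, and split are hypothetical names used only
for this illustration and are not Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene or Solr.
final class OffsetDemo {

    // A word together with its position in the original input.
    static final class Span {
        final String text;
        final int start; // inclusive character offset
        final int end;   // exclusive character offset

        Span(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // Splits on spaces and records offsets that point into the original string.
    static List<Span> split(String input) {
        List<Span> spans = new ArrayList<Span>();
        int i = 0;
        int n = input.length();
        while (i < n) {
            // Skip the run of spaces before the next word.
            while (i < n && input.charAt(i) == ' ') {
                i++;
            }
            int start = i;
            // Advance to the end of the word.
            while (i < n && input.charAt(i) != ' ') {
                i++;
            }
            if (i > start) {
                spans.add(new Span(input.substring(start, i), start, i));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        for (Span s : split("Nokia Phone 2610")) {
            System.out.println(s.text + " [" + s.start + "," + s.end + ")");
        }
    }
}
```

Walking the string by index, instead of calling String.split, keeps the
offsets correct even when words are separated by more than one space.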


issue with custom tokenizer

2013-08-13 Thread dhaivat dave
Hello All,

I am trying to develop a custom tokenizer (please find the code below) and
have found an issue when adding multiple documents one after another.

It works fine when I add the first document, but when I add another
document, the "create" method of SampleTokeniserFactory.java is not
called; instead the reset method is called directly, followed by
incrementToken(). Does anyone have an idea what is wrong in the code
below? Please share your thoughts on this.

Here is the class which extends the TokenizerFactory class:

=== SampleTokeniserFactory.java

public class SampleTokeniserFactory extends TokenizerFactory {

    public SampleTokeniserFactory(Map<String, String> args) {
        super(args);
    }

    public SampleTokeniser create(AttributeFactory factory, Reader reader) {
        return new SampleTokeniser(factory, reader);
    }

}

here is the class which extends Tokenizer class


package ns.solr.analyser;

import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

public class SampleTokeniser extends Tokenizer {

    private List<Token> tokenList = new ArrayList<Token>();

    int tokenCounter = -1;

    private final CharTermAttribute termAtt =
            addAttribute(CharTermAttribute.class);

    /**
     * Object that defines the offset attribute
     */
    private final OffsetAttribute offsetAttribute =
            (OffsetAttribute) addAttribute(OffsetAttribute.class);

    /**
     * Object that defines the position attribute
     */
    private final PositionIncrementAttribute position =
            (PositionIncrementAttribute) addAttribute(PositionIncrementAttribute.class);

    public SampleTokeniser(AttributeFactory factory, Reader reader) {
        super(factory, reader);
        String textToProcess = null;
        try {
            textToProcess = readFully(reader);
            processText(textToProcess);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public String readFully(Reader reader) throws IOException {
        char[] arr = new char[8 * 1024]; // 8K at a time
        StringBuffer buf = new StringBuffer();
        int numChars;
        while ((numChars = reader.read(arr, 0, arr.length)) > 0) {
            buf.append(arr, 0, numChars);
        }
        return buf.toString();
    }

    public void processText(String textToProcess) {
        String wordsList[] = textToProcess.split(" ");

        int startOffset = 0, endOffset = 0;

        for (String word : wordsList) {
            endOffset = word.length();

            Token aToken = new Token("Token." + word, startOffset, endOffset);
            aToken.setPositionIncrement(1);
            tokenList.add(aToken);

            startOffset = endOffset + 1;
        }
    }

    @Override
    public boolean incrementToken() throws IOException {
        clearAttributes();
        tokenCounter++;

        if (tokenCounter < tokenList.size()) {
            Token aToken = tokenList.get(tokenCounter);

            termAtt.append(aToken);
            termAtt.setLength(aToken.length());
            offsetAttribute.setOffset(correctOffset(aToken.startOffset()),
                    correctOffset(aToken.endOffset()));
            position.setPositionIncrement(aToken.getPositionIncrement());
            return true;
        }

        return false;
    }

    /**
     * close object
     *
     * @throws IOException
     */
    public void close() throws IOException {
        super.close();
        System.out.println("Close method called");
    }

    /**
     * called when end method gets called
     *
     * @throws IOException
     */
    public void end() throws IOException {
        super.end();
        // setting final offset
        System.out.println("end called with final offset");
    }

    /**
     * method reset the record
     *
     * @throws IOException
     */
    public void reset() throws IOException {
        super.reset();
        System.out.println("Reset Called");
        tokenCounter = -1;
    }
}
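
A likely explanation, offered as an assumption rather than a confirmed
diagnosis: Lucene generally reuses a Tokenizer instance across documents,
handing it each new Reader and then calling reset(), so create() and the
constructor run far less often than once per document. Because
SampleTokeniser reads its input only in the constructor, a reused instance
would keep replaying the first document's tokens. The plain-Java model
below mimics that lifecycle and rebuilds the per-document state in reset();
ReuseDemo, ReusableTokenizer, setReader, and nextToken are illustrative
names here, not the Lucene API.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a "create once, reset per document" tokenizer.
// It only imitates the lifecycle; it does not use or extend Lucene classes.
final class ReuseDemo {

    static final class ReusableTokenizer {
        private Reader input;
        private final List<String> tokens = new ArrayList<String>();
        private int next;

        // The (hypothetical) framework hands over each new document here.
        void setReader(Reader reader) {
            this.input = reader;
        }

        // Per-document state is rebuilt here, not in the constructor.
        void reset() throws IOException {
            tokens.clear();
            next = 0;
            StringBuilder text = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = input.read(buf, 0, buf.length)) > 0) {
                text.append(buf, 0, n);
            }
            for (String word : text.toString().split(" ")) {
                if (!word.isEmpty()) {
                    tokens.add(word);
                }
            }
        }

        // Returns the next token, or null when the document is exhausted.
        String nextToken() {
            return next < tokens.size() ? tokens.get(next++) : null;
        }
    }

    // Feeds one document through an existing instance and collects its tokens.
    static List<String> tokenize(ReusableTokenizer t, String doc) {
        try {
            t.setReader(new StringReader(doc));
            t.reset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> out = new ArrayList<String>();
        for (String tok = t.nextToken(); tok != null; tok = t.nextToken()) {
            out.add(tok);
        }
        return out;
    }

    public static void main(String[] args) {
        ReusableTokenizer t = new ReusableTokenizer(); // created once
        System.out.println(tokenize(t, "Nokia Phone 2610"));
        System.out.println(tokenize(t, "Samsung smwer233")); // same instance, new document
    }
}
```

In the real tokenizer the equivalent change would be to move the
readFully/processText calls out of the constructor and into reset() (or to
read lazily in incrementToken()), and to clear tokenList there as well.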


Boosting Original Indexed Terms

2013-02-27 Thread dhaivat dave
Hello All,

I need help boosting the original indexed terms.

I am storing multiple terms at the same position, and I want to boost the
original term.

Consider the following scenario. I am indexing a document that contains
the text "*baby t-shirts*", and I store the terms as follows:

term text    position    startOffset    endOffset
baby         1           0              4
t-shirts     2           5              13
babe         1           0              4
t-shirt      2           5              13
infant       1           0              4
child        1           0              4
kid          1           0              4

So I want to boost results on the original terms, i.e. if a user searches
for "baby", the documents that contain the original term "baby" should be
returned first, followed by the others.

Please let me know how to achieve this.

Thanks
Dhaivat


Error while indexing data using Solr (Unexpected character 'F' (code 70) in prolog; expected '<')

2012-08-27 Thread dhaivat dave
Hello Everyone ,


I am getting an error while indexing data into Solr. I am using the SolrJ
APIs and the XML request handler to index documents. The error is
*org.apache.solr.common.SolrException: Unexpected character 'F' (code 70)
in prolog; expected '<' at [row,col {unknown-source}]: [1,1]*. I have also
escaped the content before sending it to Solr. Can anyone please tell me
the reason behind this error?

Regards
Dhaivat
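
The error text says the parser found 'F' at row 1, column 1, where an XML
document must begin with '<'. That suggests the body that reached the XML
handler was not XML at all (for example a plain-text string or a file
name), in which case escaping the field content would not change the
outcome. A minimal pre-flight check in plain Java is sketched below;
XmlPayloadCheck, startsLikeXml, and escapeXml are hypothetical helper
names and are not part of SolrJ.

```java
// Hypothetical helpers for illustration; not part of Solr or SolrJ.
final class XmlPayloadCheck {

    // True when the first significant character of the payload is '<'.
    static boolean startsLikeXml(String payload) {
        int i = 0;
        if (!payload.isEmpty() && payload.charAt(0) == '\uFEFF') {
            i = 1; // skip a leading byte-order mark
        }
        while (i < payload.length() && Character.isWhitespace(payload.charAt(i))) {
            i++;
        }
        return i < payload.length() && payload.charAt(i) == '<';
    }

    // Escapes the five XML special characters in text content.
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String good = "<add><doc><field name=\"id\">" + escapeXml("A & B") + "</field></doc></add>";
        String bad = "File not found"; // starts with 'F', like the reported error
        System.out.println(startsLikeXml(good));
        System.out.println(startsLikeXml(bad));
    }
}
```

Logging the first few characters of the request body just before it is
sent is usually enough to show what the 'F' belongs to.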


Re: Load Testing in Solr

2012-08-30 Thread dhaivat dave
Thanks, Pravedsh, for your reply. I'll use the JMeter tool.

On Thu, Aug 30, 2012 at 11:10 PM, pravesh  wrote:

> Hi Dhaivat,
> JMeter is a nice tool, but it all depends on what sort of load you are
> expecting and how complex you expect the queries to be
> (sorting/filtering/textual searches). You need to consider all of these
> when benchmarking.
>
> Thanx
> Pravedsh



-- 
Regards
Dhaivat