Solr 4.2.x replication events on slaves

2013-04-11 Thread Stephane Bailliez
In Solr 3.x, I was relying on a postCommit call to a listener in the update
handler to update cached data; this data was used to perform 'realtime'
filtering on the documents.

So something like:


...

  

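(The configuration snippet above was stripped by the archive; in Solr 3.x a postCommit listener was registered inside the updateHandler section of solrconfig.xml, roughly along these lines, with a placeholder class name standing in for the stripped original:)

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hypothetical listener class; the original was lost in the archive -->
  <listener event="postCommit" class="com.example.CacheRefreshListener"/>
</updateHandler>
```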

In Solr 4.2.x, postCommit is no longer called on the slaves during
updates. (I agree the concept outlined above is kind of blurry when
thinking in terms of NRT updates, but here these are more batch-style updates.)

What would be a suggested alternative to hook up something similar and
basically receive events related to replication?

Thanks

- stephane


Re: Re: how to monitor solr in newrelic

2012-02-13 Thread Stephane Bailliez
Just run the java agent as indicated; Solr will be detected and 3 new menu
items will automatically be available on the app: Solr caches, Solr
updates, Solr requests.



On Mon, Feb 13, 2012 at 10:56 AM, solr  wrote:

> My question is how to check Solr performance in New Relic. Basically we have
> the javaagent, but those install into Tomcat, Jetty, WebSphere, etc.
> I have installed the standard Solr distribution, so Jetty is the default. But
> while installing New Relic in Solr it does not find the Jetty script, because
> the jetty *.jar files only exist in the standard distribution.
>
> Can anybody suggest how to handle this?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/how-to-monitor-solr-in-newrelic-tp3739567p3740649.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Post Sorting hook before the doc slicing.

2012-03-29 Thread Stephane Bailliez
I'm currently looking to see what would be a decent way to implement a
scrolling window in the result set when looking for an item.

Basically, I need to find item X in the result set and return say N items
before and N items after.

< - N items -- Item X --- N items >

So I was thinking a post filter could do the work, where I basically look
for the id of the document and select it plus the N documents after.
This is easy to do, however in the end it cannot work, since this doc
selection needs to happen after sorting while post filtering happens before
sorting.
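For what it's worth, the windowing itself is trivial once the sorted result ids are in hand, which is exactly what a post filter cannot guarantee; a sketch of just that slicing logic (class and method names are mine, not Solr API):

```java
import java.util.List;

public class ResultWindow {
    // Given the already-sorted result ids, return item X plus the N items
    // before and after it, clamped at both ends of the result set.
    static List<String> window(List<String> sortedIds, String target, int n) {
        int i = sortedIds.indexOf(target);
        if (i < 0) return List.of();                    // item X not in the results
        int from = Math.max(0, i - n);
        int to = Math.min(sortedIds.size(), i + n + 1);
        return sortedIds.subList(from, to);
    }

    public static void main(String[] args) {
        System.out.println(window(List.of("a", "b", "c", "d", "e"), "c", 1)); // [b, c, d]
    }
}
```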

So I might be wrong, but it looks like the only way would be to create a
custom SolrIndexSearcher which will find the offset and create the related
DocSlice. That slicing part doesn't seem to be well factored as far as I can
see, so it seems to imply copy/pasting a significant chunk of the code. Am
I looking at the wrong place?

-- stephane


Re: Per-User Sorting on an ExternalFileField

2012-04-26 Thread Stephane Bailliez
On Fri, Apr 27, 2012 at 12:07 AM, Phill Tornroth wrote:

> I'm using Solr 3.5. Does anyone have a suggestion as to how to end up
> adding this extra dimension so that I can do per-user relevance? It seems
> like an oft-asked, rarely-answered question.
>

Use a function that makes use of your ExternalFileField and alters the score,
so that you can sort on the score?
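Concretely, one way to alter the score in Solr 3.x is the {!boost} query parser, which multiplies the score by a function of a per-user field (such as an ExternalFileField); the field name user42_rank below is a made-up example, not something from the thread:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PerUserBoost {
    // Wraps the user's query in {!boost} so the score is multiplied by the
    // per-user field value; "user42_rank" is a hypothetical field name.
    static String boostedQueryParam(String userQuery, String boostField) {
        String q = "{!boost b=" + boostField + "}" + userQuery;
        return "q=" + URLEncoder.encode(q, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(boostedQueryParam("ipod", "user42_rank"));
        // q=%7B%21boost+b%3Duser42_rank%7Dipod
    }
}
```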


getLuceneVersion parsing xml node on every request

2011-05-03 Thread Stephane Bailliez
I'm using Solr 3.1 right now.

I was looking at a thread dump trying to figure out why queries were not
exactly fast, and noticed that Solr keeps parsing XML over and over from
the schema to get the Lucene version.

A SolrQueryParser is created for each request, and in the constructor
there is a call similar to:

getSchema().getSolrConfig().getLuceneVersion("luceneMatchVersion",
Version.LUCENE_24)

which calls getVal(), which calls getNode(), which creates a new
XPath object, which ends up creating a new object factory, which ends up
loading a class...

I cannot find a reference to this issue anywhere in JIRA or on Google.
Hard to tell right now how much effect this has, but it seems far from
optimal to do this for every request.

Am I missing something obvious here ?

The stack looks like:

  java.lang.Thread.State: BLOCKED (on object monitor)
	at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:369)
	- waiting to lock <0x2aaab3bb43b0> (a org.mortbay.jetty.webapp.WebAppClassLoader)
	at org.mortbay.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:363)
	at com.sun.org.apache.xml.internal.dtm.ObjectFactory.findProviderClass(ObjectFactory.java:506)
	at com.sun.org.apache.xml.internal.dtm.ObjectFactory.lookUpFactoryClass(ObjectFactory.java:217)
	at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:131)
	at com.sun.org.apache.xml.internal.dtm.ObjectFactory.createObject(ObjectFactory.java:101)
	at com.sun.org.apache.xml.internal.dtm.DTMManager.newInstance(DTMManager.java:135)
	at com.sun.org.apache.xpath.internal.XPathContext.<init>(XPathContext.java:100)
	at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.eval(XPathImpl.java:201)
	at com.sun.org.apache.xpath.internal.jaxp.XPathImpl.evaluate(XPathImpl.java:275)
	at org.apache.solr.core.Config.getNode(Config.java:230)
	at org.apache.solr.core.Config.getVal(Config.java:256)
	at org.apache.solr.core.Config.getLuceneVersion(Config.java:325)
	at org.apache.solr.search.SolrQueryParser.<init>(SolrQueryParser.java:76)
	at org.apache.solr.schema.IndexSchema.getSolrQueryParser(IndexSchema.java:277)
	at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:76)
	at org.apache.solr.search.QParser.getQuery(QParser.java:142)

Cheers,

-- stephane


Re: getLuceneVersion parsing xml node on every request

2011-05-03 Thread Stephane Bailliez
I went ahead and patched SolrQueryParser locally in the current 3_x branch.
Doing a quick test, barring any obvious mistake due to sleep
deprivation, I get close to a 10x performance boost, from 200 qps to
2000 qps.

I opened https://issues.apache.org/jira/browse/SOLR-2493
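For reference, the general shape of the fix is simply to evaluate the XPath once and cache the result, instead of rebuilding the XPath machinery on every request; a self-contained sketch of that pattern (not the actual SOLR-2493 patch, and with a toy config string):

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class CachedConfigValue {
    private static final String CONF =
            "<config><luceneMatchVersion>LUCENE_31</luceneMatchVersion></config>";
    private static volatile String cachedVersion;   // filled on first read, then reused

    // What the thread dump shows happening per request: a fresh XPathFactory,
    // XPath and evaluation, pulling in the whole object-factory machinery.
    static String readVersionUncached() throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/config/luceneMatchVersion",
                new InputSource(new StringReader(CONF)));
    }

    // The shape of the fix: evaluate once, cache the result.
    static String readVersionCached() throws Exception {
        String v = cachedVersion;
        if (v == null) {
            v = readVersionUncached();
            cachedVersion = v;
        }
        return v;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readVersionCached()); // LUCENE_31
    }
}
```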

cheers,

-- stephane


On Tue, May 3, 2011 at 1:45 PM, Stephane Bailliez  wrote:
> I'm using Solr 3.1 right now.
> [...]


Dealing with field values as key/value pairs

2008-12-01 Thread Stephane Bailliez

Hi all,


I'm looking for ideas about how to best deal with a situation where I
need to store key/value pairs in the index for consumption in the client.



A typical example would be a document with multiple genres where, for
simplicity reasons, I'd like to send both the 'id' and the 'human
readable label' (might not be the best example, since one would
immediately say 'what about localization', but in that case assume it's
an entity such as a company name or a person name).


So say I have

field1 = { 'key1':'this is value1', 'key2':'this is value2' }


I was thinking the easiest (not the prettiest) solution would be to
store it as effectively a string 'key:this is the value', and then have
the client deal with this 'format' and parse it on the ':' delimiter.
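A minimal sketch of that encode/parse round trip, splitting on the first ':' only so the value may itself contain colons (helper names are mine):

```java
import java.util.Map;

public class KeyValueField {
    // Encode as the "key:value" string suggested above. Assumes the key
    // itself never contains ':'; the value may, since we split on the first ':'.
    static String encode(String key, String value) {
        return key + ":" + value;
    }

    static Map.Entry<String, String> decode(String stored) {
        int i = stored.indexOf(':');                 // first ':' only
        return Map.entry(stored.substring(0, i), stored.substring(i + 1));
    }

    public static void main(String[] args) {
        String s = encode("key1", "this is value1");
        System.out.println(decode(s)); // key1=this is value1
    }
}
```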


Another alternative I was thinking of may have been to use a custom field
that would effectively expose the field value as a key/value map to the
writer, but I'm not so sure it can really be done; I haven't investigated
that one deeply.


Any feedback would be welcome; the solution might even be simpler and
cleaner than what I'm mentioning above, but my brain has been mushy
these last couple of weeks.


-- stephane



Re: Dealing with field values as key/value pairs

2008-12-02 Thread Stephane Bailliez


Yeah, sorry, I was not clear in my question. Storage would end up being done
the same way, of course.


I guess I'm more looking for feedback about what people have used as a 
strategy to handle this type of situation. This goes for faceting as well.


Assuming I facet by author and there are two authors with the same
name: that does not work right.


So, discovering hot water here: the facet value is best expressed with
identifiers which uniquely identify your author. But then you lose the
'name' and you need to effectively get it back.


But if you also want to offer the name of the author in your Solr
response in a 'standalone' way (ie: without relying on another source of
data, like the db where that mapping is stored), then you need to store
this data in a convenient form in the index to be able to access it later.


So I'm basically looking for a design pattern/best practice for that
scenario based on people's experience.



I was also thinking about storing each value in dynamic fields such
as 'metadata_<field>_<id>' and then, assuming I have a facet
'facet_<field>' which stores identifiers, use a search component to
provide the mapping as an 'extra' and return it in another section of
the response (similar to the debug, facets, etc)


ie: something like:

mapping: {
  '<facet_field_1>': { '<id>': '<label>', '<id>': '<label>' },
  '<facet_field_2>': { '<id>': '<label>', '<id>': '<label>' }
}
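A sketch of how a search component might accumulate that mapping section before serializing it into the response; the field names, ids and labels are invented for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetLabelMapping {
    // One entry per facet field, mapping each facet id to its human-readable
    // label. In a real search component this would be filled from stored
    // metadata_* fields and appended to the SolrQueryResponse.
    static final Map<String, Map<String, String>> mapping = new LinkedHashMap<>();

    static void addLabel(String facetField, String id, String label) {
        mapping.computeIfAbsent(facetField, k -> new LinkedHashMap<>()).put(id, label);
    }

    public static void main(String[] args) {
        addLabel("facet_author", "a17", "Iain Banks");
        addLabel("facet_author", "a42", "Iain M. Banks");
        System.out.println(mapping); // {facet_author={a17=Iain Banks, a42=Iain M. Banks}}
    }
}
```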

does that make sense ?

-- stephane

Noble Paul നോബിള്‍ नोब्ळ् wrote:

In the end Lucene stores stuff as strings.

Even if you do store your data as a map FieldType, Solr may not be able
to treat it like a map.
So it is fine to put the map in as one single string.
On Mon, Dec 1, 2008 at 10:07 PM, Stephane Bailliez <[EMAIL PROTECTED]> wrote:

[...]

Filter Query with multiple values

2009-05-06 Thread Stephane Bailliez
Hello,

I cannot seem to find a way with the syntax to express multiple values for a
filter query.

I have documents with field 'type'  : a, b, c, d and I'd like to only search
within documents a and b.

One way to do it would be to work with exclusion fq's like
fq=-type:c&fq=-type:d, but then all hell breaks loose if I introduce a
document of type 'e'.

Is there something extremely obvious that I'm missing ?

Cheers,

-- stephane


Re: Filter Query with multiple values

2009-05-06 Thread Stephane Bailliez
Sweet, thanks Erik !


On Wed, May 6, 2009 at 6:01 PM, Erik Hatcher wrote:

>
> On May 6, 2009, at 11:38 AM, Stephane Bailliez wrote:
>
>  Hello,
>>
>> I cannot seem to find a way with the syntax to express multiple values for
>> a
>> filter query.
>>
>> I have documents with field 'type'  : a, b, c, d and I'd like to only
>> search
>> within documents a and b.
>>
>> One way to do it would be to work on exclusion fq like:
>> fq=-type:c&fq=-type:d but then all hell breaks loose if I introduce a
>> document
>> of type 'e'.
>>
>> Is there something extremely obvious that I'm missing ?
>>
>
> fq=type:(a OR b)
>
>:)
>
>Erik
>
>
>
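Sketching Erik's inclusive filter programmatically, to make the point that a later-added type 'e' is then excluded automatically; the helper is mine, not SolrJ:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class InclusiveFilter {
    // Build an inclusive fq like fq=type:(a OR b) instead of stacking
    // exclusions, so documents of any new type are filtered out by default.
    static String fqParam(String field, String... values) {
        return "fq=" + URLEncoder.encode(
                field + ":(" + String.join(" OR ", values) + ")",
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(fqParam("type", "a", "b")); // fq=type%3A%28a+OR+b%29
    }
}
```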