Re: UpdateProcessor and copyField

2011-02-23 Thread Markus Jelsma
You are right, I misread. Fields are copied first, then analyzed, and then they 
behave like other fields and pass the same way through the update processor.

Cheers,

> Markus,
> 
> I searched but I couldn't find a definite answer, so I posted this
> question.
> The article you quoted talks about implementing a copyField-like operation
> using UpdateProcessor.  It doesn't talk about relationship between
> the copyField operation proper and UpdateProcessors.
> 
> Kuro
> 
> On 2/22/11 3:00 PM, "Markus Jelsma"  wrote:
> >Yes. But did you actually search the mailing list or Solr's wiki? I guess
> >not.
> >
> >Here it is:
> >http://wiki.apache.org/solr/UpdateRequestProcessor
> >
> >> Can fields created by copyField instructions be processed by
> >> UpdateProcessors?
> >> Or only raw input fields can?
> >> 
> >> So far my experiment is suggesting the latter.
> >> 
> >> 
> >> T. "Kuro" Kurosaka


How to support fault tolerant indexing?

2011-02-23 Thread kristofd
Hi,

I am working on a setup where we will need fault-tolerant indexing. This does not 
seem to be supported by Solr by default, and I wonder what the options are.

My plan is to:
* Use 2 separate, self-contained Solr nodes (no master-slave config in Solr)
* Use a hot standby failover setup in front of the nodes
* Put the index on a file system (Oracle DBFS) shared between the nodes
* Let a single node perform both indexing and searching at any given time

The idea is that if the active node goes down, the standby node will take over 
and receive both search and indexing traffic. (I will need to ensure that the 
failover solution allows only one node to read and write to the index at any 
given time.)

This way, the active node, when it recovers, will have access to all index 
updates that have taken place while it was down. (I assume that Solr on the 
active node will get a new Reader when it starts - so any updates since last 
commit from that node will be available.)

A "classic" Solr master-slave setup with local indexes on the nodes will, AFAIK, 
not be sufficient in this case, since the master (when it restarts after 
downtime) will not be able to replicate from the slave, and thus any index 
updates sent to the slave (while the master was down) will be lost.

This could be solved if the roles of the master and the slave were switched 
when the master goes down. AFAIK this is not easily supported.
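On the "only one node can read and write" requirement: one common building block is an exclusive lock file on the shared index volume (Lucene itself keeps a write.lock for a similar reason). Below is a minimal standalone sketch with hypothetical names, not anything Solr provides; note that advisory flock() semantics on shared/network filesystems such as DBFS vary and must be verified before relying on this.

```python
import fcntl
import os

class IndexWriteLock:
    """Advisory exclusive lock on a file living on the shared index volume.

    A second process (or second instance) that tries to acquire it fails
    instead of blocking, so a standby node can detect that the active node
    still holds the index."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def acquire(self):
        fd = os.open(self.path, os.O_CREAT | os.O_RDWR)
        try:
            # LOCK_NB: fail immediately instead of waiting for the lock.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            os.close(fd)
            return False
        self.fd = fd
        return True

    def release(self):
        if self.fd is not None:
            fcntl.flock(self.fd, fcntl.LOCK_UN)
            os.close(self.fd)
            self.fd = None
```

The failover component would acquire the lock before enabling indexing and release it on clean shutdown; a crashed holder's lock is released by the OS when its file descriptor goes away, which is exactly the semantics that may differ on a network filesystem.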

Any suggestions are very welcome!


Kristoffer


Re: Configure 2 or more Tomcat instances.

2011-02-23 Thread rajini maski
  I created 2 Tomcat instances, with respective folders tomcat0 and
tomcat1, and edited each server.xml with different port numbers (all 3 ports).
Now when I try to connect to http://localhost:8090/ or
http://localhost:8091/, the webpage fails to open in both cases. Is there
something else that I need to do?

  While I am trying to run bootstrap.jar (present in
//tomcat/bin/) through the command prompt, I am getting an error:

Run command:
C:\Program Files\Apache Software Foundation\tomcat6.0\bin>java -jar bootstrap.jar
Exception in thread "main" java.lang.UnsupportedClassVersionError:
org/apache/catalina/startup/Bootstrap (Unsupported major.minor version 49.0)
at java.lang.ClassLoader.defineClass0(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
at java.security.SecureClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.defineClass(Unknown Source)
at java.net.URLClassLoader.access$100(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)

Any idea why this error occurs? I have jdk1.6.0_02 and Tomcat 6 set up.
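For what it's worth, "major.minor version 49.0" is the class-file version of Java 5, so this error usually means the java binary actually being invoked is older than Java 5 (e.g. a stray 1.4 JRE on the PATH), even if JDK 1.6 is installed; `java -version` is worth checking. A small sketch of how that version sits in the class-file header (the version-to-JDK table is the well-known partial mapping):

```python
import struct

# Partial map of class-file major versions to JDK releases.
CLASS_MAJOR_TO_JDK = {45: "1.1", 46: "1.2", 47: "1.3", 48: "1.4", 49: "5", 50: "6"}

def class_file_jdk(header_bytes):
    """header_bytes: the first 8 bytes of a .class file
    (magic 0xCAFEBABE, minor version, major version, all big-endian)."""
    magic, minor, major = struct.unpack(">IHH", header_bytes[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file")
    return major, CLASS_MAJOR_TO_JDK.get(major, "unknown")

# Tomcat 6's Bootstrap class is compiled with major version 49 (Java 5),
# so any JVM older than Java 5 refuses to load it:
header = struct.pack(">IHH", 0xCAFEBABE, 0, 49)
print(class_file_jdk(header))  # (49, '5')
```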


Regards
Rajani Maski




On Tue, Feb 22, 2011 at 7:53 PM, Paul Libbrecht  wrote:

> Rajini,
>
> you need to make the (~3) ports defined in conf/server.xml different.
>
> paul
>
>
> Le 22 févr. 2011 à 12:15, rajini maski a écrit :
>
> >   I have a tomcat6.0 instance running in my system, with
> > connector port-8090, shutdown port -8005 ,AJP/1.3  port-8009 and redirect
> > port-8443  in server.xml (path = C:\Program Files\Apache Software
> > Foundation\Tomcat 6.0\conf\server.xml)
> >
> >   How do I configure one more independent tomcat instance
> > in the same system..? I went through many sites.. but couldn't fix
> > this. If anyone one know the proper configuration steps please reply..
> >
> > Regards,
> > Rajani Maski
>
>


Re: UpdateProcessor and copyField

2011-02-23 Thread Jan Høydahl
This is how I've got it:

A document first passes through the UpdateChain (processors), which is document 
centric.
Then copyFields are processed.
And finally the analysis in fieldTypes is applied.

So you cannot use copyField before UpdateProcessors, nor after Analysis :(

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
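Jan's ordering can be sketched as a toy pipeline. The field names and the processor below are illustrative Python, not Solr API, but they show why an update processor never sees the fields that copyField creates:

```python
# Toy model of the indexing order described above:
# 1. update processors, 2. copyField, 3. per-field analysis.

def update_processor(doc):
    # Runs before copyField: only raw input fields are visible here.
    doc["seen_by_processor"] = sorted(doc.keys())
    return doc

def apply_copy_fields(doc, copy_rules):
    # copy_rules maps source field -> destination field.
    for src, dst in copy_rules.items():
        if src in doc:
            doc[dst] = doc[src]
    return doc

def analyze(doc):
    # Stand-in for per-fieldType analysis (here: lowercasing strings).
    return {k: (v.lower() if isinstance(v, str) else v) for k, v in doc.items()}

doc = {"title": "Hello World"}
doc = update_processor(doc)                          # 1. UpdateChain
doc = apply_copy_fields(doc, {"title": "title_s"})   # 2. copyField
doc = analyze(doc)                                   # 3. analysis

print(doc["seen_by_processor"])  # ['title']
print("title_s" in doc)          # True, but created after the processor ran
```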

On 23. feb. 2011, at 09.45, Markus Jelsma wrote:

> You are right, i misread. Fields a copied first, then analyzed and then 
> behave 
> like other fields and pass the same way through the update processor.
> 
> Cheers,
> 
>> Markus,
>> 
>> I searched but I couldn't find a definite answer, so I posted this
>> question.
>> The article you quoted talks about implementing a copyField-like operation
>> using UpdateProcessor.  It doesn't talk about relationship between
>> the copyField operation proper and UpdateProcessors.
>> 
>> Kuro
>> 
>> On 2/22/11 3:00 PM, "Markus Jelsma"  wrote:
>>> Yes. But did you actually search the mailing list or Solr's wiki? I guess
>>> not.
>>> 
>>> Here it is:
>>> http://wiki.apache.org/solr/UpdateRequestProcessor
>>> 
 Can fields created by copyField instructions be processed by
 UpdateProcessors?
 Or only raw input fields can?
 
 So far my experiment is suggesting the latter.
 
 
 T. "Kuro" Kurosaka



Re: How to support fault tolerant indexing?

2011-02-23 Thread Jan Høydahl
Hi,

This is what we're aiming for SolrCloud and ZooKeeper to handle for us.
It does not currently do that, but the vision is that ZK will keep track of the 
state of each node, and do master election and everything. Contributions 
welcome :)

http://wiki.apache.org/solr/SolrCloud

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 23. feb. 2011, at 10.02,   wrote:

> Hi,
> 
> I am working on a setup where we will need fault tolerant indexing. This 
> seems not to be supported by Solr per default, and I wonder what the options 
> are.
> 
> My plan is to:
> * Use 2 separate, self-contained Solr nodes (no master-slave config in Solr)
> * Use a hot standby failover setup in front of the nodes
> * Put the index on a file system (Oracle DBFS) shared between the nodes
> * Let a single node perform both indexing and searching at any given time
> 
> The idea is that if the active node goes down, the standby node will take 
> over and receive both search and indexing traffic. (I will need to ensure 
> that the failover soluition ensures that only one node can read and write to 
> the index at any given time.)
> 
> This way, the active node, when it recovers, will have access to all index 
> updates that have taken place while it was down. (I assume that Solr on the 
> active node will get a new Reader when it starts - so any updates since last 
> commit from that node will be available.)
> 
> A "classic" Solr master-slave setup with local indexes on the nodes will 
> AFAIK in this case not be sufficient since the master (when it starts again 
> after downtime) will not be able to replicate from the slave and thus any 
> index updates sent to the slave (while the master was down) will be lost.
> 
> This could be solved if the roles of the master and the slave were switched 
> when the master goes down. AFAIK this is not easily supported.
> 
> Any suggestions are very welcome!
> 
> 
> Kristoffer



Re: UpdateProcessor and copyField

2011-02-23 Thread Erik Hatcher
Maybe copy fields should be refactored to happen in a new core update 
processor, so there is nothing special/awkward about them? It seems they fit 
as part of what an update processor is all about: augmenting/modifying incoming 
documents.

Erik

On Feb 23, 2011, at 04:40 , Jan Høydahl wrote:

> This is how I've got it:
> 
> A document first passes through the UpdateChain (processors), which is 
> document centric.
> Then copyFields are processed
> And finally the analysis in fieldTypes are processed
> 
> So you cannot use copyField before UpdateProcessors, nor after Analysis :(
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
> On 23. feb. 2011, at 09.45, Markus Jelsma wrote:
> 
>> You are right, i misread. Fields a copied first, then analyzed and then 
>> behave 
>> like other fields and pass the same way through the update processor.
>> 
>> Cheers,
>> 
>>> Markus,
>>> 
>>> I searched but I couldn't find a definite answer, so I posted this
>>> question.
>>> The article you quoted talks about implementing a copyField-like operation
>>> using UpdateProcessor.  It doesn't talk about relationship between
>>> the copyField operation proper and UpdateProcessors.
>>> 
>>> Kuro
>>> 
>>> On 2/22/11 3:00 PM, "Markus Jelsma"  wrote:
 Yes. But did you actually search the mailing list or Solr's wiki? I guess
 not.
 
 Here it is:
 http://wiki.apache.org/solr/UpdateRequestProcessor
 
> Can fields created by copyField instructions be processed by
> UpdateProcessors?
> Or only raw input fields can?
> 
> So far my experiment is suggesting the latter.
> 
> 
> T. "Kuro" Kurosaka
> 



fq field with facets

2011-02-23 Thread Rosa (Anuncios)

Hi,

I'm trying to narrow down results from facets (by category, with my schema).

My category field is String type in my schema.xml.

The problem I've got is that when the category value has a space or special 
character, it doesn't work.

Example:

?q=home&fq=category:Appartement  ---> works fine

?q=home&fq=category:Appartement for rent  --> doesn't work?

?q=home&fq=category:Appartement > Sale  --> doesn't work?

I guess there is a workaround for this? Sorry if it's obvious... I'm a 
newbie with Solr


thanks for your help

rosa


Re: fq field with facets

2011-02-23 Thread Savvas-Andreas Moysidis
Hello,

you could try wrapping your fq terms in double quotes as in:

?q=home&fq=category:"Appartement > Sale"


On 23 February 2011 13:25, Rosa (Anuncios) wrote:

> Hi,
>
> I'm trying to reduce results from facets. (by category with my schema)
>
> My category field is String type in my schema.xml.
>
> The problem i've got is when the category value has space or special
> caracter it doen't work?
>
> Example:
>
> ?q=home&fq=category:Appartement  ---> works fine
>
> ?q=home&fq=category:Appartement for rent--> doesn't work?
>
> ?q=home&fq=category:Appartement > Sale--> doesn't work?
>
> I guess there is a workaround this? Sorry if it's obvious... i'm a newbie
> with Solr
>
> thanks for your help
>
> rosa
>


Re: fq field with facets

2011-02-23 Thread Erik Hatcher
Try -

  fq={!field f=category}

You can also try surrounding with quotes, but that gets tricky and you'll need 
to escape things possibly.  Or you could simply backslash escape the whitespace 
(and colon, etc) characters.

Erik
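Erik's three options can be sketched as client-side query construction. This is illustrative Python (any client language works the same way); the host/path in the final comment is an assumption:

```python
from urllib.parse import urlencode, quote

category = "Appartement > Sale"

# Option 1: the field query parser via local params. No escaping needed;
# everything after the closing } is treated as one term.
params_field = urlencode({"q": "home", "fq": "{!field f=category}" + category})

# Option 2: phrase-style double quotes.
params_quoted = urlencode({"q": "home", "fq": 'category:"%s"' % category})

# Option 3: backslash-escape whitespace and query-syntax characters.
def escape_term(value):
    special = '\\+-&|!(){}[]^"~*?:/ '
    return "".join("\\" + c if c in special else c for c in value)

params_escaped = urlencode({"q": "home", "fq": "category:" + escape_term(category)})

# Any of these strings can be appended to the select URL, e.g.
# http://localhost:8983/solr/select?<params>  (host/port are an assumption)
print(params_field)
```

Note that urlencode also handles the percent-encoding of the request itself, which is a separate issue from Lucene query-syntax escaping; both have to be right.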

On Feb 23, 2011, at 08:25 , Rosa (Anuncios) wrote:

> Hi,
> 
> I'm trying to reduce results from facets. (by category with my schema)
> 
> My category field is String type in my schema.xml.
> 
> The problem i've got is when the category value has space or special caracter 
> it doen't work?
> 
> Example:
> 
> ?q=home&fq=category:Appartement  ---> works fine
> 
> ?q=home&fq=category:Appartement for rent--> doesn't work?
> 
> ?q=home&fq=category:Appartement > Sale--> doesn't work?
> 
> I guess there is a workaround this? Sorry if it's obvious... i'm a newbie 
> with Solr
> 
> thanks for your help
> 
> rosa



Re: fq field with facets

2011-02-23 Thread Savvas-Andreas Moysidis
Hi Eric,

could you please let us know where can we find more info about this notation
( fq={!field f=category})? What is it called, how to use it etc? Is there a
wiki page?

Thanks,
- Savvas

On 23 February 2011 14:17, Erik Hatcher  wrote:

> Try -
>
>  fq={!field f=category}
>
> You can also try surrounding with quotes, but that gets tricky and you'll
> need to escape things possibly.  Or you could simply backslash escape the
> whitespace (and colon, etc) characters.
>
>Erik
>
> On Feb 23, 2011, at 08:25 , Rosa (Anuncios) wrote:
>
> > Hi,
> >
> > I'm trying to reduce results from facets. (by category with my schema)
> >
> > My category field is String type in my schema.xml.
> >
> > The problem i've got is when the category value has space or special
> caracter it doen't work?
> >
> > Example:
> >
> > ?q=home&fq=category:Appartement  ---> works fine
> >
> > ?q=home&fq=category:Appartement for rent--> doesn't work?
> >
> > ?q=home&fq=category:Appartement > Sale--> doesn't work?
> >
> > I guess there is a workaround this? Sorry if it's obvious... i'm a newbie
> with Solr
> >
> > thanks for your help
> >
> > rosa
>
>


Re: fq field with facets

2011-02-23 Thread Stefan Matheis
Savvas, have a look here: http://wiki.apache.org/solr/FunctionQuery

On Wed, Feb 23, 2011 at 3:25 PM, Savvas-Andreas Moysidis
 wrote:
> Hi Eric,
>
> could you please let us know where can we find more info about this notation
> ( fq={!field f=category})? What is it called, how to use it etc? Is there a
> wiki page?
>
> Thanks,
> - Savvas
>
> On 23 February 2011 14:17, Erik Hatcher  wrote:
>
>> Try -
>>
>>  fq={!field f=category}
>>
>> You can also try surrounding with quotes, but that gets tricky and you'll
>> need to escape things possibly.  Or you could simply backslash escape the
>> whitespace (and colon, etc) characters.
>>
>>        Erik
>>
>> On Feb 23, 2011, at 08:25 , Rosa (Anuncios) wrote:
>>
>> > Hi,
>> >
>> > I'm trying to reduce results from facets. (by category with my schema)
>> >
>> > My category field is String type in my schema.xml.
>> >
>> > The problem i've got is when the category value has space or special
>> caracter it doen't work?
>> >
>> > Example:
>> >
>> > ?q=home&fq=category:Appartement  ---> works fine
>> >
>> > ?q=home&fq=category:Appartement for rent    --> doesn't work?
>> >
>> > ?q=home&fq=category:Appartement > Sale    --> doesn't work?
>> >
>> > I guess there is a workaround this? Sorry if it's obvious... i'm a newbie
>> with Solr
>> >
>> > thanks for your help
>> >
>> > rosa
>>
>>
>


Re: fq field with facets

2011-02-23 Thread Erik Hatcher

On Feb 23, 2011, at 09:25 , Savvas-Andreas Moysidis wrote:

> Hi Eric,
> 
> could you please let us know where can we find more info about this notation
> ( fq={!field f=category})? What is it called, how to use it etc? Is there a
> wiki page?

There are some details on this here: 

The {!...} syntax is for specifying what we call "local parameters" (local to the 
context in which it is used) and, in this case, allows the specification of the 
query parser to use. 

Erik



Re: Problem with XML encoding UTF-8

2011-02-23 Thread jayronsoares

Hi Jan,

I appreciate your attention.
I've tried to answer your questions to the best of my knowledge.

2011/2/22 Jan Høydahl / Cominvent [via Lucene] <
ml-node+2551500-1071759141-363...@n3.nabble.com>

> Hi,
>
> Please explain some more.
> a) What version of Solr?
>
  Solr version 1.4



> b) Are you trying to feed XML or PDF?
>
   XML via solrpy


> c) What request handler are you feeding to? /update or /update/extract ?
>
   I don't know, see the example attached

> d) Can you copy/paste some more lines from the error log?
>

   I'm attaching one example, so you can test for yourself.


Thanks for your help.
Cheers
jayron


> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 21. feb. 2011, at 15.02, jayronsoares wrote:
>
> >
> > Hi I'm using solr py to stored files in pdf, however at moment of run
> script,
> > shows me that issue:
> >
> > An invalid XML character (Unicode: 0xc) was found in the element content
> of
> > the document.
> >
> > Someone could give some help?
> >
> > cheers
> > jayron
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/Any-new-python-libraries-tp493419p2545020.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>
> --
>  If you reply to this email, your message will be added to the discussion
> below:
>
> http://lucene.472066.n3.nabble.com/Any-new-python-libraries-tp493419p2551500.html
>  To unsubscribe from Any new python libraries?, click 
> here.
>
>



-- 
" A Vida é arte do Saber...Quem quiser saber tem que viver!" ("Life is the art of knowing... whoever wants to know must live!")

http://bucolick.tumblr.com
http://artecultural.wordpress.com/

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Any-new-python-libraries-tp493419p2559636.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: fq field with facets

2011-02-23 Thread Stefan Matheis
On Wed, Feb 23, 2011 at 3:57 PM, Erik Hatcher  wrote:
> There's some details of this here: 
> 

Oh yes, I'm wrong - just ignore my post. I got confused by the
similar syntax, I'm sorry.


Re: fq field with facets

2011-02-23 Thread Rosa (Anuncios)

Thanks Erik,

this works well.

The only remaining thing, though I'm not sure this is where it comes from, is 
with the accents:

q=memoire+sd&fq={!field f=category}Electronique+>+Cartes+mémoires

Any tricks for that?

I've noticed that I've got the same issue in a normal simple query.

Le 23/02/2011 15:17, Erik Hatcher a écrit :

Try -

   fq={!field f=category}

You can also try surrounding with quotes, but that gets tricky and you'll need 
to escape things possibly.  Or you could simply backslash escape the whitespace 
(and colon, etc) characters.

Erik

On Feb 23, 2011, at 08:25 , Rosa (Anuncios) wrote:


Hi,

I'm trying to reduce results from facets. (by category with my schema)

My category field is String type in my schema.xml.

The problem i've got is when the category value has space or special caracter 
it doen't work?

Example:

?q=home&fq=category:Appartement  --->  works fine

?q=home&fq=category:Appartement for rent-->  doesn't work?

?q=home&fq=category:Appartement>  Sale-->  doesn't work?

I guess there is a workaround this? Sorry if it's obvious... i'm a newbie with 
Solr

thanks for your help

rosa






Re: fq field with facets

2011-02-23 Thread Erik Hatcher

On Feb 23, 2011, at 10:06 , Rosa (Anuncios) wrote:

> Thanks Erik,
> 
> this works well.
> 
> the only thing, but i'm not sure is comes from there is with the accents:
> 
> q=memoire+sd&fq={!field f=category}Electronique+>+Cartes+mémoires
> 
> any tricks for that?

Hard to say what the problem is.  Maybe it needs to be URL encoded?  Maybe your 
container isn't accepting those characters successfully?  (see the Tomcat and 
Jetty pages on the Solr wiki for details).  Maybe that's not quite how you 
indexed the string?

What's the problem exactly?

Try moving it to a solo q parameter and doing &debugQuery=true:

  q={!field f=category}Electronique+>+Cartes+mémoires

and see what it parses to.

Erik



Re: fq field with facets

2011-02-23 Thread Rosa (Anuncios)
I've got it: the problem was in php-solr-client... for some reason the 
PHP urlencode function does not work in this script, so the accents are 
not converted.
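For reference, a correctly behaving client must emit accented characters as percent-encoded UTF-8 bytes. A quick check of what the encoding should look like (Python here, as an illustration of what the PHP client ought to have produced):

```python
from urllib.parse import quote, unquote

value = "Electronique > Cartes mémoires"

encoded = quote(value, safe="")
print(encoded)
# 'é' must arrive as the two UTF-8 bytes %C3%A9; a single %E9 (Latin-1)
# is what a misconfigured client typically sends instead.
assert "%C3%A9" in encoded
assert unquote(encoded) == value
```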


Thanks anyway

Le 23/02/2011 16:11, Erik Hatcher a écrit :

On Feb 23, 2011, at 10:06 , Rosa (Anuncios) wrote:


Thanks Erik,

this works well.

the only thing, but i'm not sure is comes from there is with the accents:

q=memoire+sd&fq={!field f=category}Electronique+>+Cartes+mémoires

any tricks for that?

Hard to say what the problem is.  Maybe it needs to be URL encoded?  Maybe your 
container isn't accepting those characters successfully?  (see the Tomcat and 
Jetty pages on the Solr wiki for details).  Maybe that's not quite how you 
indexed the string?

What's the problem exactly?

Try moving it to a solo q parameter and doing&debugQuery=true:

   q={!field f=category}Electronique+>+Cartes+mémoires

and see what it parses to.

Erik






Re: fq field with facets

2011-02-23 Thread Bill Bell
Double quotes should work as well.

Bill Bell
Sent from mobile


On Feb 23, 2011, at 7:17 AM, Erik Hatcher  wrote:

> Try -
> 
>  fq={!field f=category}
> 
> You can also try surrounding with quotes, but that gets tricky and you'll 
> need to escape things possibly.  Or you could simply backslash escape the 
> whitespace (and colon, etc) characters.
> 
>Erik
> 
> On Feb 23, 2011, at 08:25 , Rosa (Anuncios) wrote:
> 
>> Hi,
>> 
>> I'm trying to reduce results from facets. (by category with my schema)
>> 
>> My category field is String type in my schema.xml.
>> 
>> The problem i've got is when the category value has space or special 
>> caracter it doen't work?
>> 
>> Example:
>> 
>> ?q=home&fq=category:Appartement  ---> works fine
>> 
>> ?q=home&fq=category:Appartement for rent--> doesn't work?
>> 
>> ?q=home&fq=category:Appartement > Sale--> doesn't work?
>> 
>> I guess there is a workaround this? Sorry if it's obvious... i'm a newbie 
>> with Solr
>> 
>> thanks for your help
>> 
>> rosa
> 


Re: Problem with XML encoding UTF-8

2011-02-23 Thread Bill Bell
Certain UTF-8 characters are not valid in XML and need to be stripped: control 
characters (like the 0x0C here), BOM, etc.
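A sketch of such a scrubber, based on the XML 1.0 valid-character ranges (the function name is illustrative; it would run over field values before posting to Solr's XML update handler):

```python
import re

# XML 1.0 valid characters: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD,
# #x10000-#x10FFFF. Everything else must be stripped (or the document
# rejected) before it is serialized into the update XML.
_XML_INVALID = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def strip_invalid_xml_chars(text):
    """Remove characters that are illegal in XML 1.0 documents."""
    return _XML_INVALID.sub("", text)

dirty = "page one\x0cpage two"   # 0x0C: the form feed from the error above
print(strip_invalid_xml_chars(dirty))  # page onepage two
```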

Bill Bell
Sent from mobile


On Feb 23, 2011, at 5:29 AM, jayronsoares  wrote:

> 
> Hi Jan,
> 
> I appreciate you attention.
> I've tried to answer your questions to the best of my knowledge.
> 
> 2011/2/22 Jan Høydahl / Cominvent [via Lucene] <
> ml-node+2551500-1071759141-363...@n3.nabble.com>
> 
>> Hi,
>> 
>> Please explain some more.
>> a) What version of Solr?
>> 
>  Solr version 1.4
> 
> 
> 
>> b) Are you trying to feed XML or PDF?
>> 
>   XML via solrpy
> 
> 
>> c) What request handler are you feeding to? /update or /update/extract ?
>> 
>   I don't know, see the example attached
> 
>> d) Can you copy/paste some more lines from the error log?
>> 
> 
>   I'm attaching one example, so you can test for yourself.
> 
> 
> Thanks for your help.
> Cheers
> jayron
> 
> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> On 21. feb. 2011, at 15.02, jayronsoares wrote:
>> 
>>> 
>>> Hi I'm using solr py to stored files in pdf, however at moment of run
>> script,
>>> shows me that issue:
>>> 
>>> An invalid XML character (Unicode: 0xc) was found in the element content
>> of
>>> the document.
>>> 
>>> Someone could give some help?
>>> 
>>> cheers
>>> jayron
>>> --
>>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Any-new-python-libraries-tp493419p2545020.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>> 
>> 
>> 
>> --
>> If you reply to this email, your message will be added to the discussion
>> below:
>> 
>> http://lucene.472066.n3.nabble.com/Any-new-python-libraries-tp493419p2551500.html
>> To unsubscribe from Any new python libraries?, click 
>> here.
>> 
>> 
> 
> 
> 
> -- 
> " A Vida é arte do Saber...Quem quiser saber tem que viver!"
> 
> http://bucolick.tumblr.com
> http://artecultural.wordpress.com/
> 
> -- 
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Any-new-python-libraries-tp493419p2559636.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to support fault tolerant indexing?

2011-02-23 Thread Bill Bell
Would love to help.

Bill Bell
Sent from mobile


On Feb 23, 2011, at 2:42 AM, Jan Høydahl  wrote:

> Hi,
> 
> This is what we're aiming for SolrCloud and ZooKeeper to handle for us.
> It does not currently do that, but the vision is that ZK will keep track of 
> the state of each node, and do master election and everything. Contributions 
> welcome :)
> 
> http://wiki.apache.org/solr/SolrCloud
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
> On 23. feb. 2011, at 10.02,   wrote:
> 
>> Hi,
>> 
>> I am working on a setup where we will need fault tolerant indexing. This 
>> seems not to be supported by Solr per default, and I wonder what the options 
>> are.
>> 
>> My plan is to:
>> * Use 2 separate, self-contained Solr nodes (no master-slave config in Solr)
>> * Use a hot standby failover setup in front of the nodes
>> * Put the index on a file system (Oracle DBFS) shared between the nodes
>> * Let a single node perform both indexing and searching at any given time
>> 
>> The idea is that if the active node goes down, the standby node will take 
>> over and receive both search and indexing traffic. (I will need to ensure 
>> that the failover soluition ensures that only one node can read and write to 
>> the index at any given time.)
>> 
>> This way, the active node, when it recovers, will have access to all index 
>> updates that have taken place while it was down. (I assume that Solr on the 
>> active node will get a new Reader when it starts - so any updates since last 
>> commit from that node will be available.)
>> 
>> A "classic" Solr master-slave setup with local indexes on the nodes will 
>> AFAIK in this case not be sufficient since the master (when it starts again 
>> after downtime) will not be able to replicate from the slave and thus any 
>> index updates sent to the slave (while the master was down) will be lost.
>> 
>> This could be solved if the roles of the master and the slave were switched 
>> when the master goes down. AFAIK this is not easily supported.
>> 
>> Any suggestions are very welcome!
>> 
>> 
>> Kristoffer
> 


Re: Sort Stability With Date Boosting and Rounding

2011-02-23 Thread Stephen Duncan Jr
That would improve things for recent documents, but documents that were
close to each other, but a long time from NOW, would still have very small
differences that would be susceptible to rounding errors that can cause
results to get shuffled.

Stephen Duncan Jr
www.stephenduncanjr.com


On Tue, Feb 22, 2011 at 6:07 PM, David Yang  wrote:

> One suggestion: use logarithms to compress the large time range into
> something easier to compare: 1/log(ms(NOW,date))
>
> -Original Message-
> From: Stephen Duncan Jr [mailto:stephen.dun...@gmail.com]
> Sent: Tuesday, February 22, 2011 6:03 PM
> To: solr-user@lucene.apache.org
> Subject: Sort Stability With Date Boosting and Rounding
>
> I'm trying to use
>
> http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
> as
> a bf parameter to my dismax handler.  The problem is, the value of NOW can
> cause documents in a similar range (date value within a few seconds of each
> other) to sometimes round to be equal, and sometimes not, changing their
> sort order (when equal, falling back to a secondary sort).  This, in turn,
> screws up paging.
>
> The problem is that score is rounded to a lower level of precision than
> what
> the suggested formula produces as a difference between two values within
> seconds of each other.  It seems to me if I could round the value to
> minutes
> or hours, where the difference will be large enough to not be rounded-out,
> then I wouldn't have problems with order changing on me.  But it's not
> legal
> syntax to specify something like:
> recip(ms(NOW,manufacturedate_dt/HOUR),3.16e-11,1,1)
>
> Is this a problem anyone has faced and solved?  Anyone have suggested
> solutions, other than indexing a copy of the date field that's rounded to
> the hour?
>
> --
> Stephen Duncan Jr
> www.stephenduncanjr.com
>
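One possible workaround, not from this thread and worth verifying against your Solr version: Solr date math can round NOW itself, e.g. recip(ms(NOW/HOUR,manufacturedate_dt),3.16e-11,1,1), so every query issued within the same hour feeds identical inputs to the function and the sort stays stable between page requests. A toy model of the effect (dates and field names are illustrative):

```python
from datetime import datetime

def recip(x_ms, m=3.16e-11, a=1.0, b=1.0):
    # Solr's recip(x, m, a, b) = a / (m*x + b)
    return a / (m * x_ms + b)

def boost(now, doc_date, round_now_to_hour=False):
    if round_now_to_hour:
        # Analogue of NOW/HOUR: truncate NOW to the start of the hour.
        now = now.replace(minute=0, second=0, microsecond=0)
    return recip((now - doc_date).total_seconds() * 1000.0)

doc = datetime(2011, 2, 1, 12, 0, 0)

# Two page requests of the same query, a few seconds apart:
t1 = datetime(2011, 2, 23, 10, 15, 0)
t2 = datetime(2011, 2, 23, 10, 15, 7)

print(boost(t1, doc) == boost(t2, doc))              # False: NOW moved
print(boost(t1, doc, True) == boost(t2, doc, True))  # True: same hour bucket
```

The trade-off is a score discontinuity at each hour boundary, which is usually acceptable compared to unstable paging.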


Re: Date Math

2011-02-23 Thread Andreas Kemkes
Thank you, that clarifies it.  Good catch on "-DAY".  I had noticed it after 
submitting but as "-1DAY" causes the same ParseException, I didn't amend the 
question.

Andreas




From: Chris Hostetter 
To: solr-user@lucene.apache.org
Sent: Tue, February 22, 2011 6:18:56 PM
Subject: Re: Date Math


: org.apache.lucene.queryParser.ParseException: Cannot parse 
'last_modified:-DAY': 

...
: Are they not supported as a short-cut for "NOW-1DAY"?  I'm using Solr 1.4.

No, "-1DAY" is a valid DateMath string (to the DateMathParser) but as a 
field value you must specify a valid date string, which can *end* with a 
DateMath string.  so "NOW-1DAY" is legal, as is 
"2011-02-22T12:34:56Z-1DAY"

Note also: you didn't do "-1DAY" you tried "-DAY" which isn't valid 
anywhere.


-Hoss



  

Re: hierarchical faceting, SOLR-792 - confused on config

2011-02-23 Thread kmf

I'm really confused now.  Is this page completely out of date -
http://wiki.apache.org/solr/HierarchicalFaceting - as it seems to imply that
solr-792 is a form of hierarchical faceting. "There are currently two
similar, non-competing, approaches to generating tree/hierarchical facets
from Solr: SOLR-64 and SOLR-792"

To achieve hierarchical faceting, is the rule then that you form the
hierarchical facets using a transformer in the DIH and do nothing in
schema.xml or solrconfig.xml? I seem to recall reading somewhere that
creating a copyField is needed. Sorry for the entry-level question, but I'm
still trying to understand how to configure Solr to do hierarchical
faceting.

Thanks,
kmf
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/hierarchical-faceting-SOLR-792-confused-on-config-tp2556394p2561445.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Sorting - bad performance

2011-02-23 Thread Ahmet Arslan

--- On Tue, 2/22/11, Jon Drukman  wrote:

> From: Jon Drukman 
> Subject: Sorting - bad performance
> To: solr-user@lucene.apache.org
> Date: Tuesday, February 22, 2011, 9:44 PM
> The performance factors wiki says:
> "If you do a lot of field based sorting, it is advantageous
> to add explicitly
> warming queries to the "newSearcher" and "firstSearcher"
> event listeners in your
> solrconfig which sort on those fields, so the FieldCache is
> populated prior to
> any queries being executed by your users."
> 
> I've got an index with 24+ million docs of forum posts from
> users.  I want to be
> able to get a given user's posts sorted by date.  It's
> taking 20 seconds right
> now.  What would I put in the newSearch/firstSearcher
> to make that quicker?  Is
> there any other general approach I can use to speed up
> sorting?
> 
> The schema looks like
> 
>   <fields>
>     ... (five fields, all indexed="true" stored="true"; one is also
>     required="true"; the author field is type="cistring") ...
>   </fields>
> 
> cistring is a case-insensitive string type i created:
> 
>   <fieldType name="cistring" class="solr.StrField" sortMissingLast="true"
>              omitNorms="true">
>     <analyzer type="index">
>       <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>     </analyzer>
>   </fieldType>
> 

It is not directly related to sorting performance, but this will reduce the number 
of unique terms:

If you define a type with class="solr.StrField", then the analyzer definition is 
ignored, although analysis.jsp displays as if it is not ignored.

If you want to activate a tokenizer etc., you need to use class="solr.TextField".

And for your author field, depending on your domain you may want to use 
KeywordTokenizerFactory instead of LowerCaseTokenizerFactory.
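The distinction matters for sorting because Lucene sorting wants at most one indexed term per document in the sort field. A rough illustration (these functions only approximate what the real tokenizers do):

```python
import re

def lowercase_tokenize(value):
    # Rough analogue of LowerCaseTokenizer: lowercase, split on non-letters.
    return [t for t in re.split(r"[^a-z]+", value.lower()) if t]

def keyword_tokenize(value):
    # Rough analogue of KeywordTokenizer + a lowercase filter:
    # the whole input becomes a single token.
    return [value.lower()]

author = "Jon Drukman"
print(lowercase_tokenize(author))  # ['jon', 'drukman'] -> two terms, bad for sorting
print(keyword_tokenize(author))    # ['jon drukman']    -> one term per document
```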
 





DataImportHandler in Solr 4.0

2011-02-23 Thread Alexandre Rocco
Hi guys,

I'm having some issues when trying to use the DataImportHandler on Solr
4.0.
I've downloaded the latest nightly build of Solr 4.0 and configured the
solrconfig.xml file (in the example folder) normally, like this:

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

At this point I noticed that the DIH jar was not being loaded correctly
causing exceptions like:
Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
and
java.lang.ClassNotFoundException:
org.apache.solr.handler.dataimport.DataImportHandler

Do I need to build it myself to get DIH running on Solr 4.0?

Thanks!
Alexandre


Getting lucene index differences

2011-02-23 Thread raimon.bosch


When you are working with full imports in Solr, you have to send your entire 
index over the network to your slaves, with the corresponding loss of time and 
resources in some cases.

Is there a way to get the differences between two indexes and send only the
differences?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Getting-lucene-index-differences-tp2561994p2561994.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Getting lucene index differences

2011-02-23 Thread Otis Gospodnetic
Raimon,

You want to use incremental indexing instead of full reindexing.  Look for 
deltaQuery mentions on the DIH wiki page.
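For reference, a delta setup on a DIH entity looks roughly like this (table and column names here are placeholders, not from the thread):

```xml
<entity name="item" pk="id"
        query="SELECT * FROM item"
        deltaQuery="SELECT id FROM item
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT * FROM item
                          WHERE id = '${dataimporter.delta.item.id}'"/>
```

A delta-import then fetches only rows changed since the last run, so replication ships only the changed segments rather than a rebuilt index.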

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: raimon.bosch 
> To: solr-user@lucene.apache.org
> Sent: Wed, February 23, 2011 1:34:39 PM
> Subject: Getting lucene index differences
> 
> 
> 
> When you are working with full imports in Solr you have to send your whole
> index over the net to your slaves, with the corresponding loss of time and
> resources in some cases.
> 
> Is there a way to get the differences between two indexes and send only
> those differences?
> -- 
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Getting-lucene-index-differences-tp2561994p2561994.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 


Re: Indexing languages, dataimporthandler

2011-02-23 Thread Otis Gospodnetic
Hi Greg,

I think you simply need to ID the language (e.g. using a language identifier 
such as http://sematext.com/products/language-identifier/index.html) and then 
analyze/index it appropriately.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Greg Georges 
> To: "solr-user@lucene.apache.org" 
> Sent: Tue, February 22, 2011 2:50:23 PM
> Subject: Indexing languages, dataimporthandler
> 
> Hello all,
> 
> I have just gone through the mailing list and have set up my different
> field type analysers for my 6 different languages in my schema.xml. Here is
> my question. I am using the dataimporthandler to import data from my
> database into my index. In my table, the documentname column's data can be
> in any of the 6 languages. Let's say I want to index this data and apply
> the different language analysers for certain cases; what would be the best
> way in my case? The real problem is that I do not know the language of the
> string in the documentname column once I create my index, therefore I
> cannot apply the correct field type. Should I create a custom transformer?
> 
> Thanks
> 
> Greg
> 


Re: Question about Nested Span Near Query

2011-02-23 Thread Otis Gospodnetic
Hi,

What do you mean by "this doesn't work fine"?  Does it not work correctly or is 
it slow or ...

I was going to suggest you look at Surround QP, but it looks like you already 
did that.  Wouldn't it be better to get Surround QP to work?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Ahsan |qbal 
> To: solr-user@lucene.apache.org
> Sent: Tue, February 22, 2011 10:59:26 AM
> Subject: Question about Nested Span Near Query
> 
> Hi All
> 
> I had a requirement to implement queries that involve phrase proximity,
> e.g. the user should be able to search "ab cd" w/5 "de fg", where both
> phrases as a whole should be within 5 words of each other. For this I
> implemented a query parser that makes use of nested span queries, so the
> above query would be parsed as
> 
> spanNear([spanNear([Contents:ab, Contents:cd], 0, true),
> spanNear([Contents:de, Contents:fg], 0, true)], 5, false)
> 
> Queries like this seem to work really well when phrases are small, but when
> phrases are large this doesn't work fine. Now my question: is there any
> limitation of SpanNearQuery such that we cannot handle large phrases in
> this way?
> 
> please help
> 
> Regards
> Ahsan
> 


Spellcheck Phrases

2011-02-23 Thread Tanner Postert
right now when I search for 'brake a leg', solr returns valid results with
no indication of misspelling, which is understandable since all of those
terms are valid words and are probably found in a few pieces of our content.
My question is:

is there any way for it to recognize that the phrase should be "break a leg"
and not "brake a leg" and suggest the proper phrase?


Re: AlternateDistributedMLT.patch not working

2011-02-23 Thread Otis Gospodnetic
Hi Isha,

The patch is out of date.  You need to look at the patch and rejection and 
update your local copy of the code to match the logic from the patch, if it's 
still applicable to the version of Solr source code you have.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Isha Garg 
> To: solr-user@lucene.apache.org
> Sent: Tue, February 22, 2011 2:13:23 AM
> Subject: AlternateDistributedMLT.patch not working
> 
> Hello,
> 
> I tried to use SOLR-788 with solr1.4 so that distributed MLT works well.
> While working with this patch I got an error message like:
> 
> 1 out of 1 hunk FAILED -- saving rejects to file
> src/java/org/apache/solr/handler/component/MoreLikeThisComponent.java.rej
> 
> Can  anybody help me out?
> 
> Thanks!
> Isha Garg
> 
> 


Re: Faceting

2011-02-23 Thread Otis Gospodnetic
Hi Praveen,

I know this is not your original question, but if you have a product search 
system, you should really look into adding auto-completion of queries to it.  
Concretely, you probably want to start with auto-suggesting product names.  
People like this, know how to use it, and it allows *them* to choose precisely 
what they want.  To use your example, if they typed "Sony L" and auto-complete 
shows them Sony LED ... and Sony LCD... which one will they choose?  Well, they 
know what they are looking for, so they'll know exactly what to pick.  See URLs 
below my name for some auto-complete examples.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Praveen Parameswaran 
> To: solr-user@lucene.apache.org
> Sent: Tue, February 22, 2011 1:23:52 AM
> Subject: Re: Faceting
> 
> Hi ,
> @Tommaso @Jan Høydahl Thanks for the response :)
> 
> I've done it almost exactly as Tommaso suggested and yes, it's about
> 70-80% accurate.
> I understand the contradiction in search - customers find stuff without
> the exact right wording (recall) at the same time as you want the query to
> be precise (precision).
> 
> In my scenario both cases are there as well, but mostly a customer would
> know which product name he is searching for and he will be interested in
> comparing the prices that different merchants offer. What I feel is that
> maybe the "Search" itself has to be classified based on the context.
> 
> Would it be possible in solr to have both of the below:
> 1. A customer uses the correct product name to search and gets accurate
> results.
> 2. A customer searches with a keyword, or without the exact name, and gets
> the most relevant results.
> 
> The 2nd part is fine as it's working well. The 1st part is where I'm
> struggling.
> 
> thanks
> Praveen
> 
> On Mon,  Feb 21, 2011 at 5:23 PM, Tommaso Teofili
> wrote:
> 
> > Hi Praveen,
> > as far as I understand you have to set the type of the field(s) you are
> > searching over to be conservative.
> > So for example you won't include stemmer and lowercase filters and use
> > only a whitespace tokenizer; moreover, you should search with the default
> > operator set to AND.
> > Then faceting over those field(s) will depend on those type settings.
> > You may find the following wiki page useful:
> > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
> > My 2 cents,
> >
> >
> > 2011/2/21 Praveen Parameswaran 
> >
> > > Hi,
> > >
> > > Is it possible to have 100% accuracy for facet counts using solr? Since
> > > this is for a product price comparison site I would need the search to
> > > return accurate results. For example, if I search "sony lcd Tv" I do
> > > not want "sony Led Tv" to be returned in the results. Please let me
> > > know if this is possible and how?
> > >
> > > Thanks
> > >
> > > Prav
> > >
> >
>


More Date Math: NOW/WEEK

2011-02-23 Thread Andreas Kemkes
Date Math is great.
NOW/MONTH and NOW/DAY are working and very useful, so naively I tried 
NOW/WEEK, which failed.
Digging into the source code of DateMathParser.java, I found the following 
comment:
  // NOTE: consciously choosing not to support WEEK at this time,
  // because of complexity in rounding down to the nearest week
  // arround a month/year boundry.
  // (Not to mention: it's not clear what people would *expect*)

I was able to implement a work-around in my ruby client using the following 
pseudo code:
  wd=NOW.wday; "NOW-#{wd}DAY/DAY"
This could be extended and integrated into DateMathParser.java directly 
using something like the following mapping:
  valWEEKS --> (val*7)DAYS
  date/WEEK --> (date-(date.DAY_OF_WEEK)DAYS)/DAY
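A runnable version of that work-around might look like this in Python (a sketch, assuming, like Ruby's Time#wday, that the week starts on Sunday):

```python
from datetime import datetime, timezone

def week_rounded_expr(now=None):
    """Build a Solr date-math expression equivalent to the unsupported
    NOW/WEEK: subtract the day-of-week, then round down to the day.
    Assumes weeks start on Sunday, matching Ruby's Time#wday."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Python's weekday(): Monday=0 .. Sunday=6; shift so Sunday=0
    wday = (now.weekday() + 1) % 7
    return "NOW-%dDAYS/DAY" % wday
```

The resulting expression can then be used wherever date math is accepted, e.g. a range query such as timestamp:[NOW-3DAYS/DAY TO NOW].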
What other concerns are there to consider?
Andreas



  

Re: Getting lucene index differences

2011-02-23 Thread raimon.bosch


Hi Otis,

The problem is that we are using hadoop for batch index building, so in this
case we are not able to do incremental indexing right now. It would be nice
if we could simulate incremental indexing just for the index uploads.
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Getting-lucene-index-differences-tp2561994p2562395.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Spellcheck Phrases

2011-02-23 Thread Dyer, James
Tanner,

Currently Solr will only make suggestions for words that are not in the 
dictionary, unless you specify "spellcheck.onlyMorePopular=true".  However, if 
you do that, then it will try to "improve" every word in your query, even the 
ones that are spelled correctly (so while it might change "brake" to "break" it 
might also change "leg" to "log").

You might be able to alleviate some of the pain by setting the 
"thresholdTokenFrequency" so as to remove misspelled and rarely-used words from 
your dictionary, although I personally haven't been able to get this parameter 
to work.  It also doesn't seem to be documented on the wiki, but it is in the 
1.4.1 source code, in the class IndexBasedSpellChecker.  It's also mentioned in 
Smiley & Pugh's book.  I tried setting it like this, but got a 
ClassCastException on the float value:


<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
 <str name="queryAnalyzerFieldType">text_spelling</str>
 <lst name="spellchecker">
  <str name="...">spellchecker</str>
  <str name="name">Spelling_Dictionary</str>
  <str name="field">text_spelling</str>
  <str name="...">true</str>
  <float name="thresholdTokenFrequency">.001</float>
 </lst>
</searchComponent>


I have it on my to-do list to look into this further but haven't yet.  If you 
decide to try it and can get it to work, please let me know how you do it.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-Original Message-
From: Tanner Postert [mailto:tanner.post...@gmail.com] 
Sent: Wednesday, February 23, 2011 12:53 PM
To: solr-user@lucene.apache.org
Subject: Spellcheck Phrases

right now when I search for 'brake a leg', solr returns valid results with
no indication of misspelling, which is understandable since all of those
terms are valid words and are probably found in a few pieces of our content.
My question is:

is there any way for it to recognize that the phrase should be "break a leg"
and not "brake a leg" and suggest the proper phrase?


1.4.1 replication failure is still 200 OK

2011-02-23 Thread Jonathan Rochkind

Should this be considered a bug, or is there something I'm missing?

Let's say I have a replication slave set up, but without polling. So I'm 
going to manually trigger replication.


So I do that:  http://example.org/solr/core/replication?command=fetchIndex

I get a 200 OK _even if_ the masterUrl configured is wrong, has a typo, 
is unreachable, doesn't point at Solr, whatever.  No replication 
actually happened.


So is it a bug that I get a 200 OK anyway?

Alternately, where should I look to see if a replication actually 
succeeded?  Just the main log file?


Re: Sort Stability With Date Boosting and Rounding

2011-02-23 Thread Markus Jelsma
Hi,

This seems to be a tricky issue judging from the other replies. I'm just 
thinking out of the box now and the following options come to mind:

1) can you store the timestamp in the session in your middleware for each 
user? This way it stays fixed and doesn't change the order between requests. Of 
course, the order can still change when new documents are committed but this 
cannot be avoided. 

2) if you have frequent commits, you might find a way to modify Solr's 
RandomSortField to create a NOW for each commit. The timestamp remains fixed 
for all consecutive requests if you use the same field for the timestamp 
every time. So instead of generating a random value, you'd just compute the 
current timestamp, and the behavior would otherwise stay the same as 
RandomSortField.
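Option 1 could be sketched client-side like this (a hypothetical helper; the field name and recip constants are taken from the quoted thread below, and ms() is assumed to accept a fixed ISO date constant):

```python
from datetime import datetime, timezone

def session_fixed_boost(session_start=None, field="manufacturedate_dt"):
    """Render the recip/ms boost with a reference time frozen once per
    session, so every page request of that session sorts identically.
    Store session_start in the user's session on the first request."""
    if session_start is None:
        session_start = datetime.now(timezone.utc)
    iso = session_start.strftime("%Y-%m-%dT%H:%M:%SZ")
    return "recip(ms(%s,%s),3.16e-11,1,1)" % (iso, field)
```

The string would then be passed as the bf parameter instead of the NOW-based variant.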

Cheers

> The problem comes when you have results that are all the same natural score
> (because you've filtered them, with no primary search, for instance), and
> are very close together in time.  Then, as you page through, the order
> changes.  So the user experience is that they see duplicate documents, and
> miss out on some of the docs in the overall set.  It's not something
> negligible that I can ignore.  I either have to come up with a fix for
> this, or get rid of the boost function altogether.
> 
> Stephen Duncan Jr
> www.stephenduncanjr.com
> 
> 
> On Tue, Feb 22, 2011 at 6:09 PM, Markus Jelsma
> 
> wrote:
> > Hi,
> > 
> > You're right, it's illegal syntax to use other functions in the ms
> > function,
> > which is a pity indeed.
> > 
> > However, you reduce the score by 50% for each year. Therefore paging
> > through
> > the results shouldn't make that much of a difference because the
> > difference in
> > score with NOW+2 minutes has a negligible impact on the total score.
> > 
> > I had some thoughts on this issue as well but i decided the impact was
> > too little to bother about.
> > 
> > Cheers,
> > 
> > > I'm trying to use
> > > http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
> > > as a bf parameter to my dismax handler.  The problem is, the value of
> > > NOW can cause documents in a similar range (date value within a few
> > > seconds of each other) to sometimes round to be equal, and sometimes
> > > not, changing their sort order (when equal, falling back to a secondary
> > > sort).  This, in turn, screws up paging.
> > > 
> > > The problem is that score is rounded to a lower level of precision than
> > > what the suggested formula produces as a difference between two values
> > > within seconds of each other.  It seems to me if I could round the
> > > value to minutes or hours, where the difference will be large enough
> > > to not be rounded-out, then I wouldn't have problems with order
> > > changing on me.  But it's not legal syntax to specify something like:
> > > recip(ms(NOW,manufacturedate_dt/HOUR),3.16e-11,1,1)
> > > 
> > > Is this a problem anyone has faced and solved?  Anyone have suggested
> > > solutions, other than indexing a copy of the date field that's rounded
> > > to the hour?
> > > 
> > > --
> > > Stephen Duncan Jr
> > > www.stephenduncanjr.com


Re: UpdateProcessor and copyField

2011-02-23 Thread Teruhiko Kurosaka
Jan,
So you are implying that the fields made by copyField are not processed by
UpdateProcessors, right?

Erik,
Logically this makes sense but then copyField operations must move to
solrconfig.xml?

Editing solrconfig.xml is more challenging than schema.xml, I feel.

Kuro

On 2/23/11 2:09 AM, "Erik Hatcher"  wrote:

>Maybe copy fields should be refactored to happen in a new, core, update
>processor, so there is nothing special/awkward about them?  It seems they
>fit as part of what an update processor is all about,
>augmenting/modifying incoming documents.
>
>Erik
>
>On Feb 23, 2011, at 04:40 , Jan Høydahl wrote:
>
>> This is how I've got it:
>> 
>> A document first passes through the UpdateChain (processors), which is
>> document centric.
>> Then copyFields are processed.
>> And finally the analysis in fieldTypes is processed.
>> 
>> So you cannot use <copyField/> before UpdateProcessors nor after
>> Analysis :(
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> On 23. feb. 2011, at 09.45, Markus Jelsma wrote:
>> 
>>> You are right, i misread. Fields are copied first, then analyzed, and
>>> then behave like other fields and pass the same way through the update
>>> processor.
>>> 
>>> Cheers,
>>> 
 Markus,
 
 I searched but I couldn't find a definite answer, so I posted this
 question.
 The article you quoted talks about implementing a copyField-like
operation
 using UpdateProcessor.  It doesn't talk about relationship between
 the copyField operation proper and UpdateProcessors.
 
 Kuro
 
 On 2/22/11 3:00 PM, "Markus Jelsma" 
wrote:
> Yes. But did you actually search the mailing list or Solr's wiki? I
>guess
> not.
> 
> Here it is:
> http://wiki.apache.org/solr/UpdateRequestProcessor
> 
>> Can fields created by copyField instructions be processed by
>> UpdateProcessors?
>> Or only raw input fields can?
>> 
>> So far my experiment is suggesting the latter.
>> 
>> 
>> T. "Kuro" Kurosaka
>> 
>



Re: Solr Ajax

2011-02-23 Thread Markus Jelsma
Hi,

I may have misread it all but SolrJ is the Java client and you don't need it 
for a pretty AJAX interface.

Cheers,

> Hello list,
> 
> I'm in the process of trying to implement Ajax within my Solr-backed webapp
> I have been reading both the Solrj wiki as well as the tutorial provided
> via the google group and various info from the wiki page
> https://github.com/evolvingweb/ajax-solr/wiki
> 
> I have all solrj jar libraries available in my webapp /lib but I am
> unsure as to what steps I take to configure the Solrj client. What do I
> need to configure to begin working with Solrj? I am unsure as to where to
> go and finding information on the wiki seems to be a non trivial task.
> 
> Any help would be great. Thanks
> 
> Lewis
> 



Re: Help with parsing configuration using SolrParams/NamedList

2011-02-23 Thread Markus Jelsma
Hi,

The params you have suggest you're planning to use SweetSpotSimilarity. There 
already is a factory you can use in Jira.

https://issues.apache.org/jira/browse/SOLR-1365

Cheers,
> Hi,
> 
> I'm trying to use a CustomSimilarityFactory and pass in per-field
> options from the schema.xml, like so:
> 
>  <similarity class="CustomSimilarityFactory">
>   <lst name="field_a">
>    <int name="min">500</int>
>    <int name="max">1</int>
>    <float name="steepness">0.5</float>
>   </lst>
>   <lst name="field_b">
>    <int name="min">500</int>
>    <int name="max">2</int>
>    <float name="steepness">0.5</float>
>   </lst>
>  </similarity>
> 
> My problem is I am utterly failing to figure out how to parse this
> nested option structure within my CustomSimilarityFactory class. I
> know that the settings are available as a SolrParams object within the
> getSimilarity() method. I'm convinced I need to convert to a NamedList
> using params.toNamedList(), but my java fu is too feeble to code the
> dang thing. The closest I seem to get is the top level as a NamedList
> where the keys are "field_a" and "field_b", but then my values are
> strings, e.g., "{min=500,max=1,steepness=0.5}".
> 
> Anyone who could dash off a quick example of how to do this?
> 
> Thanks,
> --jay


Re: Detailed Steps for Scaling Solr

2011-02-23 Thread Markus Jelsma
Hi,

Scaling might be required. How large is the index going to be in number of 
documents, fields and bytes and what hardware do you have? Powerful CPU's and a 
lot of RAM will help. And, how many queries per second do you expect? And how 
many updates per minute?

Depending on average document size you can have up to tens of millions 
documents on a single box. If read load is high, you can then easily replicate 
the data to slaves to balance load.

If the data outgrows a single box then sharding is the way to go. But first I'd 
try to see if a single box will do the trick, and perhaps replace spinning 
disks with SSDs and stick more RAM in it.

Cheers,
> Dear all,
> 
> I need to construct a site which supports searching a large index. I
> think scaling Solr is required. However, I haven't found a tutorial which
> helps me do that step by step. I only have two resources as references,
> but neither of them tells me the exact operations.
> 
> 1)
> http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Sc
> aling-Lucene-and-Solr
> 
> 2) David Smiley, Eric Pugh; Solr 1.4 Enterprise Search Server
> 
> If you have experiences to scale Solr, could you give me such tutorials?
> 
> Thanks so much!
> LB


Re: UpdateProcessor and copyField

2011-02-23 Thread Yonik Seeley
On Wed, Feb 23, 2011 at 5:09 AM, Erik Hatcher  wrote:
> Maybe copy fields should be refactored to happen in a new, core, update 
> processor, so there is nothing special/awkward about them?  It seems they fit 
> as part of what an update processor is all about, augmenting/modifying 
> incoming documents.

Seems reasonable.
By default, the copyFields could be read from the schema for back
compat (and the fact that copyField does feel more natural in the
schema)

-Yonik
http://lucidimagination.com


Re: 1.4.1 replication failure is still 200 OK

2011-02-23 Thread Markus Jelsma
Hi,

I'd guess a non-200 HTTP response code would be more appropriate indeed but 
it's just a detail.

A successful replication will change a few things on the slave:
- increment of generation value
- updated indexVersion value
- lastReplication will have a new timestamp

You can also check for a replication.properties in your slave's data 
directory.
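A small helper illustrating that check (the dict keys mirror the values listed above; fetching and parsing the /replication?command=details response is left out and is an assumption about how you collect them):

```python
def replication_succeeded(before, after):
    """Decide whether a fetchIndex actually replicated anything, given
    the slave's index details captured before and after the command.
    A bumped generation or a changed indexVersion means new data arrived."""
    return (after["generation"] > before["generation"]
            or after["indexVersion"] != before["indexVersion"])
```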

Cheers,

> Should this be considered a bug, or is there something I'm missing?
> 
> Let's say I have a replication slave set up, but without polling. So I'm
> going to manually trigger replication.
> 
> So I do that:  http://example.org/solr/core/replication?command=fetchIndex
> 
> I get a 200 OK _even if_ the masterUrl configured is wrong, has a typo,
> is unreachable, doesn't point at Solr, whatever.  No replication
> actually happened.
> 
> So is it a bug that I get a 200 OK anyway?
> 
> Alternately, where should I look to see if a replication actually
> succeeded?  Just the main log file?


MailEntityProcessor

2011-02-23 Thread Husrev Yilmaz
Hi,

I am new to Solr, without any Java knowledge.

I downloaded and ran Solr under Tomcat. I also have a working IMAP server on 
the same machine. I want to index Date, From, To, Cc, Bcc, Subject, and Body.

How can I set up Solr to do this? Could you write a small guide to help me 
(which xml file goes where, with what content)? There is enough documentation 
about DBs, but I couldn't get it working for IMAP.

Regards.

-- 
Husrev Yilmaz
+90 554 3304911


[ANN] new SolrMeter release

2011-02-23 Thread Tomás Fernández Löbbe
Hi All, I'm happy to announce a new release of SolrMeter, an open source
stress test tool for Solr.

You can obtain the code or executable jar from the google code page at:

http://code.google.com/p/solrmeter

There have been a lot of improvements since the last release, you can see
what's new by checking the "issues" tool or entering here:

http://code.google.com/p/solrmeter/issues/list?can=1&q=Milestone%3DRelease-0.2.0+&colspec=ID+Type+Status+Priority+Milestone+Owner+Summary&cells=tiles


Best Regards,

Tomás


Re: [ANN] new SolrMeter release

2011-02-23 Thread Lance Norskog
Cool!

On 2/23/11, Tomás Fernández Löbbe  wrote:
> Hi All, I'm happy to announce a new release of SolrMeter, an open source
> stress test tool for Solr.
>
> You can obtain the code or executable jar from the google code page at:
>
> http://code.google.com/p/solrmeter
>
> There have been a lot of improvements since the last release, you can see
> what's new by checking the "issues" tool or entering here:
>
> http://code.google.com/p/solrmeter/issues/list?can=1&q=Milestone%3DRelease-0.2.0+&colspec=ID+Type+Status+Priority+Milestone+Owner+Summary&cells=tiles
>
>
> Best Regards,
>
> Tomás
>


-- 
Lance Norskog
goks...@gmail.com


Re: MailEntityProcessor

2011-02-23 Thread Smiley, David W.
I assume you found this?: http://wiki.apache.org/solr/MailEntityProcessor
You don't provide enough information to get assistance when you simply say "I 
couldn't get it working".

(disclaimer: I haven't used DIH's mail feature)
~ David
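For what it's worth, that wiki page sketches a data-config.xml along these lines (host, credentials, folder, and the cutoff date here are placeholders):

```xml
<dataConfig>
  <document>
    <entity processor="MailEntityProcessor"
            user="someone@example.com"
            password="secret"
            host="imap.example.com"
            protocol="imaps"
            folders="inbox"
            fetchMailsSince="2011-01-01 00:00:00"/>
  </document>
</dataConfig>
```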

On Feb 23, 2011, at 5:15 PM, Husrev Yilmaz wrote:

> Hi,
> 
> I am new to Solr, without any Java knowledge.
> 
> I downloaded and ran Solr under Tomcat. I also have a working IMAP server
> on the same machine. I want to index Date, From, To, Cc, Bcc,
> Subject, Body.
> 
> How can I set up Solr to do this? Could you write a small guide to help me?
> (where to put which xml by which content). There is enough documentation
> about DBs, but I couldn't get it working for IMAP?
> 
> Regards.
> 
> -- 
> Husrev Yilmaz
> +90 554 3304911



Re: DataImportHandler in Solr 4.0

2011-02-23 Thread Smiley, David W.
The DIH is no longer supplied embedded in the Solr war file.  You need to get 
it on the classpath somehow. You could add another <lib .../> directive in 
your solrconfig.xml to resolve this.

~ David Smiley
Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/
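For example, lib directives along the lines of the stock example config might look like this (paths are relative to the core's instance directory and depend on your layout):

```xml
<lib dir="../../contrib/dataimporthandler/lib/" regex=".*\.jar" />
<lib dir="../../dist/" regex="apache-solr-dataimporthandler-.*\.jar" />
```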

On Feb 23, 2011, at 4:11 PM, Alexandre Rocco wrote:

> Hi guys,
> 
> I'm having some issues when trying to use the DataImportHandler on Solr 4.0.
> 
> I've downloaded the latest nightly build of Solr 4.0 and configured normally
> (on the example folder) solrconfig.xml file like this:
> <requestHandler name="/dataimport"
> class="org.apache.solr.handler.dataimport.DataImportHandler">
> <lst name="defaults">
> <str name="config">data-config.xml</str>
> </lst>
> </requestHandler>
> 
> At this point I noticed that the DIH jar was not being loaded correctly
> causing exceptions like:
> Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
> and
> java.lang.ClassNotFoundException:
> org.apache.solr.handler.dataimport.DataImportHandler
> 
> Do I need to build to get DIH running on Solr 4.0?
> 
> Thanks!
> Alexandre










Re: fq field with facets

2011-02-23 Thread Chris Hostetter


:   fq={!field f=category}

There are subtle nuanced cases where "field" won't work properly, which is 
why i usually recommend "raw", but there are also subtle nuanced cases 
where raw won't work either (although in my opinion those cases are much 
less likely in typical faceting)

this is why the "term" parser was added.

I cover this fairly in depth in the notes attached to the relevant slides 
here...

http://people.apache.org/~hossman/apachecon2010/facets/

--
raw QParser
  * Default Query Parser does special things with whitespace and punctuation
  * Problematic when "filtering" on Facet Field Constraints that contain 
whitespace or punctuation
  * Use the raw parser to filter on an exact Term

fq = {!raw f=category}Books & Magazines

This is utilizing Solr's LocalParams Syntax to embed metadata directly 
into the fq param value. {!raw} is the short form of {!type=raw}

You could also alter the default parser, but it's unlikely you would want 
all of your query params parsed with the raw parser by default.

One potential pitfall with using the raw QParser is if you facet on 
numeric fields that utilize an encoded representation (ie: the "TrieFoo" 
or "SortableFoo" Field Types). The RawQParser expects truly "Raw" Terms, 
but for encoded numeric types the term you get in the facet response is 
the "external" value, and RawQParser won't convert that to the internal 
value. The field QParser may be a better choice in those situations (it's 
the one Yonik recommends) -- However if you have a Query Analyzer that is 
not idempotent (a situation that's easy to get in w/o realizing it) it's 
very possible to get incorrect results. The term QParser discussed later 
will be the best of both worlds.
--

...

--
term QParser

 * All of the advantages of the raw QParser
 * Will also work on encoded numeric fields

fq = {!term f=category}Books & Magazines
fq = {!term f=weight}1.56

SOLR-2113 Tracks this functionality. It has already been committed to the 
trunk, so it should certainly be included in Solr 4.0, and it will likely 
be back ported and included in Solr 3.1 as well.
--
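When sending such a filter over HTTP, the local-params prefix is just part of the parameter value and must be URL-encoded with everything else; a quick sketch:

```python
from urllib.parse import urlencode

def term_filter_params(field, value):
    """Build query parameters for a term-QParser filter; urlencode takes
    care of the braces, '!', '=' and '&' inside the fq value."""
    return urlencode({"q": "*:*", "fq": "{!term f=%s}%s" % (field, value)})
```

For example, term_filter_params("category", "Books & Magazines") yields an fq value with the braces, bang, and ampersand percent-encoded.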


-Hoss


Re: DataImportHandler in Solr 4.0

2011-02-23 Thread Estrada Groups
Curious...why was this feature removed?

Adam

On Feb 23, 2011, at 6:55 PM, "Smiley, David W."  wrote:

> The DIH is no longer supplied embedded in the Solr war file.  You need to get 
> it on the classpath somehow. You could add another <lib .../> directive in your solrconfig.xml to resolve this.
> 
> ~ David Smiley
> Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/
> 
> On Feb 23, 2011, at 4:11 PM, Alexandre Rocco wrote:
> 
>> Hi guys,
>> 
>> I'm having some issues when trying to use the DataImportHandler on Solr 4.0.
>> 
>> I've downloaded the latest nightly build of Solr 4.0 and configured normally
>> (on the example folder) solrconfig.xml file like this:
>> 
>> <requestHandler name="/dataimport"
>> class="org.apache.solr.handler.dataimport.DataImportHandler">
>> <lst name="defaults">
>> <str name="config">data-config.xml</str>
>> </lst>
>> </requestHandler>
>> 
>> At this point I noticed that the DIH jar was not being loaded correctly
>> causing exceptions like:
>> Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
>> and
>> java.lang.ClassNotFoundException:
>> org.apache.solr.handler.dataimport.DataImportHandler
>> 
>> Do I need to build to get DIH running on Solr 4.0?
>> 
>> Thanks!
>> Alexandre
> 
> 
> 
> 
> 
> 
> 
> 


Re: [ANN] new SolrMeter release

2011-02-23 Thread Savvas-Andreas Moysidis
Nice! will definitely give it a try! :)

On 23 February 2011 22:55, Lance Norskog  wrote:

> Cool!
>
> On 2/23/11, Tomás Fernández Löbbe  wrote:
> > Hi All, I'm happy to announce a new release of SolrMeter, an open source
> > stress test tool for Solr.
> >
> > You can obtain the code or executable jar from the google code page at:
> >
> > http://code.google.com/p/solrmeter
> >
> > There have been a lot of improvements since the last release, you can see
> > what's new by checking the "issues" tool or entering here:
> >
> >
> http://code.google.com/p/solrmeter/issues/list?can=1&q=Milestone%3DRelease-0.2.0+&colspec=ID+Type+Status+Priority+Milestone+Owner+Summary&cells=tiles
> >
> >
> > Best Regards,
> >
> > Tomás
> >
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


Re: DataImportHandler in Solr 4.0

2011-02-23 Thread Bill Bell
It is a contrib module. But the example solrconfig should have the lib set.

Bill Bell
Sent from mobile


On Feb 23, 2011, at 5:13 PM, Estrada Groups  
wrote:

> Curious...why was this feature removed?
> 
> Adam
> 
> On Feb 23, 2011, at 6:55 PM, "Smiley, David W."  wrote:
> 
>> The DIH is no longer supplied embedded in the Solr war file.  You need to 
>> get it on the classpath somehow. You could add another <lib .../> directive in your solrconfig.xml to resolve this.
>> 
>> ~ David Smiley
>> Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/
>> 
>> On Feb 23, 2011, at 4:11 PM, Alexandre Rocco wrote:
>> 
>>> Hi guys,
>>> 
>>> I'm having some issues when trying to use the DataImportHandler on Solr 4.0.
>>> 
>>> I've downloaded the latest nightly build of Solr 4.0 and configured the
>>> solrconfig.xml file (in the example folder) as usual, like this:
>>> 
>>> <requestHandler name="/dataimport"
>>>     class="org.apache.solr.handler.dataimport.DataImportHandler">
>>>   <lst name="defaults">
>>>     <str name="config">data-config.xml</str>
>>>   </lst>
>>> </requestHandler>
>>> 
>>> At this point I noticed that the DIH jar was not being loaded correctly
>>> causing exceptions like:
>>> Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
>>> and
>>> java.lang.ClassNotFoundException:
>>> org.apache.solr.handler.dataimport.DataImportHandler
>>> 
>>> Do I need to build to get DIH running on Solr 4.0?
>>> 
>>> Thanks!
>>> Alexandre


Re: MailEntityProcessor

2011-02-23 Thread Lance Norskog
The DIH mail config does not let you specify port numbers or security
options. I recently wrote a custom app to download and index mail - there
were several complexities (the problems above, plus attachments and
calculating mail threads).

I think your best bet is to find a utility that downloads mail into
mbox files. Tika has a parser for these files. The
ExtractingRequestHandler will parse them, but I don't know if it makes
separate documents for each email.

Solr 3.x (unreleased) has Tika in the DataImportHandler. This might be
more flexible.
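For anyone who still wants to try it, a minimal MailEntityProcessor
data-config is sketched below (host, credentials, and folder name are
placeholders; note, per the above, that there is no attribute for a
non-standard port):

```xml
<dataConfig>
  <document>
    <!-- fetches mail over IMAP-over-SSL; all connection values are placeholders -->
    <entity processor="MailEntityProcessor"
            user="someone@example.com"
            password="secret"
            host="imap.example.com"
            protocol="imaps"
            folders="inbox"
            fetchMailsSince="2011-01-01 00:00:00"/>
  </document>
</dataConfig>
```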

On Wed, Feb 23, 2011 at 3:52 PM, Smiley, David W.  wrote:
> I assume you found this?: http://wiki.apache.org/solr/MailEntityProcessor
> You don't provide enough information to get assistance when you simply say "I 
> couldn't get it working".
>
> (disclaimer: I haven't used DIH's mail feature)
> ~ David
>
> On Feb 23, 2011, at 5:15 PM, Husrev Yilmaz wrote:
>
>> Hi,
>>
>> I am new to Solr, without any Java knowledge.
>>
>> I downloaded and run Solr under Tomcat. At the other hand I have a working
>> IMAP server on the same machine. I want to index Date, From, To, Cc, Bcc,
>> Subject, Body.
>>
>> How can I set up Solr to do this? Could you write a small guide to help me?
>> (where to put which xml by which content). There is enough documentation
>> about DBs, but I couldn't get it working for IMAP?
>>
>> Regards.
>>
>> --
>> Husrev Yilmaz
>> +90 554 3304911
>
>



-- 
Lance Norskog
goks...@gmail.com


Re: DataImportHandler in Solr 4.0

2011-02-23 Thread Chris Hostetter

: Curious...why was this feature removed?

it hasn't been removed, it still ships with the solr releases as a contrib 
jar the same way it always did.

what changed is that in the past it was mistakenly/foolishly/inexplicably 
also included in the solr.war -- even though it didn't need to be.

This is loudly noted in the upgrading section of CHANGES.txt, so people who 
were using it in previous releases and expected it to be embedded in the 
solr.war are on notice that they should update their configuration to 
include it as a plugin... 

https://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/CHANGES.txt?r1=1071480&r2=1072131

...for people starting with the 3.x or 4.x example-DIH config files, they 
already include the necessary <lib/> declaration.
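For example, the stock example solrconfig.xml loads the contrib jars with
declarations along these lines (the dir paths are relative to the core's
instance dir, so adjust them to your layout):

```xml
<!-- pick up the DIH contrib jar plus its dependencies -->
<lib dir="../../dist/" regex="apache-solr-dataimporthandler-.*\.jar" />
<lib dir="../../contrib/dataimporthandler/lib/" regex=".*\.jar" />
```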


-Hoss


Re: DataImportHandler in Solr 4.0

2011-02-23 Thread Alexandre Rocco
I got it working by building DIH from the contrib folder and changing the
<lib> statements to point to the folder that contains the .jar files.

Thanks!
Alexandre

On Wed, Feb 23, 2011 at 8:55 PM, Smiley, David W.  wrote:

> The DIH is no longer supplied embedded in the Solr war file.  You need to
> get it on the classpath somehow. You could add another <lib/> directive in
> solrconfig.xml to resolve this.
>
> ~ David Smiley
> Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/
>
> On Feb 23, 2011, at 4:11 PM, Alexandre Rocco wrote:
>
> > Hi guys,
> >
> > I'm having some issues when trying to use the DataImportHandler on Solr
> 4.0.
> >
> > I've downloaded the latest nightly build of Solr 4.0 and configured the
> > solrconfig.xml file (in the example folder) as usual, like this:
> >
> > <requestHandler name="/dataimport"
> >     class="org.apache.solr.handler.dataimport.DataImportHandler">
> >   <lst name="defaults">
> >     <str name="config">data-config.xml</str>
> >   </lst>
> > </requestHandler>
> >
> > At this point I noticed that the DIH jar was not being loaded correctly
> > causing exceptions like:
> > Error loading class
> 'org.apache.solr.handler.dataimport.DataImportHandler'
> > and
> > java.lang.ClassNotFoundException:
> > org.apache.solr.handler.dataimport.DataImportHandler
> >
> > Do I need to build to get DIH running on Solr 4.0?
> >
> > Thanks!
> > Alexandre


Re: UpdateProcessor and copyField

2011-02-23 Thread Chris Hostetter

: > Maybe copy fields should be refactored to happen in a new, core, 
: update processor, so there is nothing special/awkward about them?  It 
: seems they fit as part of what an update processor is all about, 
: augmenting/modifying incoming documents.
: 
: Seems reasonable.
: By default, the copyFields could be read from the schema for back
: compat (and the fact that copyField does feel more natural in the
: schema)

As someone who has written special-case UpdateProcessors that clone field 
values, I agree that it would be handy to have a new generic 
"CopyFieldUpdateProcessor", but i'm not really on board with the idea of it 
reading <copyField/> declarations by default.  the two ideas really serve 
different purposes...

* as an UpdateProcessor it's something that can be 
adjusted/configured/overridden on a use-case basis -- some request 
handlers could be configured to use a processor chain that includes the 
CopyFieldUpdateProcessor and some could be configured not to.

* schema copyField declarations are things that happen to *every* document, 
regardless of where it comes from.

the use cases would be very different: consider a schema with many 
different fields specific to certain types of documents, as well as a few 
required fields that every type of document must have: "title", 
"description", "body", and "maintext" fields.  it might make sense to use 
different processor chains along with a CopyFieldUpdateProcessor to clone 
some other fields (say: a "dust_jacket_text" field for books, and a 
"plot_summary" field for movies) into the "description" field when those 
docs are indexed -- but if you absolutely positively *always* wanted the 
contents of title, description, and body to be copied into the "maintext" 
field, that would make more sense as a schema.xml declaration.

likewise: it would be handy to have an UpdateProcessor that rejected 
documents that were missing some fields -- but that would not be a true 
substitute for using required="true" on a field in the schema.xml.

a single index may have multiple valid processor chains for different 
indexing situations -- but "rules" declared in the schema.xml are absolute 
and cannot be circumvented.
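As a sketch of what the per-handler wiring could look like --
CopyFieldUpdateProcessorFactory here is the hypothetical plugin discussed
above, not something Solr ships, and update.processor is the chain-selection
parameter as of Solr 1.4/3.x:

```xml
<updateRequestProcessorChain name="book-indexing">
  <!-- hypothetical processor cloning one field into another -->
  <processor class="com.example.CopyFieldUpdateProcessorFactory">
    <str name="source">dust_jacket_text</str>
    <str name="dest">description</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

<!-- a handler that opts in to the chain; other handlers keep the default -->
<requestHandler name="/update/books" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.processor">book-indexing</str>
  </lst>
</requestHandler>
```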


-Hoss

Re: taxonomy faceting

2011-02-23 Thread Chris Hostetter

: I have many taxonomies and each document can apply to some of them. I dont
: know how many taxonomies they are, so i cant define a field in the schema
: for each taxonomy (one field per each taxonomy).
: 
: I want to use these feature but i need to know if i can handle the context
: where each document apply few taxonomies and i cant define a field for each
: taxonomy on the schema because they are dinamyc. Can Solr handle these
: situation?

Well, i'm not sure that i really understand your question...

you could easily use a dynamic field to declare a taxonomy_* naming pattern 
for all of your taxonomy fields.  so then as long as you know what 
taxonomies each doc is in (and which branches it is in in each of those 
taxonomies) when you index the doc you'd be fine.  but if you don't 
actually know the list of all taxonomies, what would you do with those 
fields once you indexed them?

alternately you could model your data so that you only had one "taxonomy" 
field, and the root-level nodes of that taxonomy would be the names of 
each of the multitudes of taxonomies you have -- then the same faceting 
tricks i described in that webinar would work (but again: you'd have to 
know what taxonomies each doc is in, and which branches it is in in 
each of those taxonomies, when you index each doc).
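A rough sketch of the single-field approach -- the field name and the path
encoding are illustrative only; the idea is that facet.prefix then drills
into one branch of one taxonomy:

```xml
<!-- schema.xml -->
<field name="taxonomy" type="string" indexed="true" stored="false"
       multiValued="true"/>

<!-- example values indexed for one document:
       animals/
       animals/mammals/
       colors/
       colors/red/

     query-time drill-down into one branch of one taxonomy:
       facet.field=taxonomy&facet.prefix=animals/mammals/ -->
```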





-Hoss


Re: disable replication in a persistent way

2011-02-23 Thread Otis Gospodnetic
Hi,


- Original Message 
> From: Ahmet Arslan 
> Subject: disable replication in a persistent way
> 
> Hello,
> 
> solr/replication?command=disablepoll disables replication on slave(s).
> However, it is not persistent. After a solr/tomcat restart, slave(s) will
> continue polling.
>
> Is there a built-in way to disable replication on the slave side in a
> persistent manner?

Not that I know of.

Hoss or somebody else will correct me if I'm wrong :)

> Currently I am using system property substitution along with a
> solrcore.properties file to simulate this:
>
> <str name="enable">${enable.slave:false}</str>
>
> # solrcore.properties on the slave
> enable.slave=true
>
> And I modify solrcore.properties with a custom solr request handler after
> the disablepoll command, to make it persistent. It seems that there is no
> existing mechanism to write the solrcore.properties file, am I correct?
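For context, that property substitution presumably sits inside the slave
section of the replication handler, roughly like this (masterUrl and
pollInterval are placeholder values):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- polling stays off unless enable.slave=true is supplied via
         -Denable.slave=true or solrcore.properties -->
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```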

What about modifying the existing classes (the one/ones that handle the 
disablepoll command) to take another param: persist=true|false ?
Would that be better than a custom Solr request handler that requires a 
separate 
call?

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



custom query parameters

2011-02-23 Thread Michael Moores
I'm required to provide a handler with some specialized query string inputs.

I'd like to translate the query inputs to a lucene/solr query and delegate the 
request to the existing lucene/dismax handler.

What's the best way to do this?
Do I implement SolrRequestHandler, or a QParser?  Do I extend the existing 
StandardRequestHandler?

thanks,
--Michael








Re: Question about Nested Span Near Query

2011-02-23 Thread Ahsan |qbal
Hi

It returns no results, even though matching documents exist. One
observation: it works well even with long phrases, but when a long phrase
contains stop words and the same stop word occurs two or more times in the
phrase, solr can't find matches with the query parsed this way.


On Wed, Feb 23, 2011 at 11:49 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

> Hi,
>
> What do you mean by "this doesn't work fine"?  Does it not work correctly
> or is
> it slow or ...
>
> I was going to suggest you look at Surround QP, but it looks like you
> already
> did that.  Wouldn't it be better to get Surround QP to work?
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Ahsan |qbal 
> > To: solr-user@lucene.apache.org
> > Sent: Tue, February 22, 2011 10:59:26 AM
> > Subject: Question about Nested Span Near Query
> >
> > Hi All
> >
> > I had a requirement to implement queries that involves phrase  proximity.
> > like user should be able to search "ab cd" w/5 "de fg", both  phrases as
> > whole should be with in 5 words of each other. For this I  implement a
> query
> > parser that make use of nested span queries, so above query  would be
> parsed
> > as
> >
> > spanNear([spanNear([Contents:ab, Contents:cd], 0,  true),
> > spanNear([Contents:de, Contents:fg], 0, true)], 5,  false)
> >
> > Queries like this seems to work really good when phrases are small  but
> when
> > phrases are large this doesn't work fine. Now my question, Is there  any
> > limitation of SpanNearQuery. that we cannot handle large phrases in  this
> > way?
> >
> > please help
> >
> > Regards
> > Ahsan
> >
>


Re: custom query parameters

2011-02-23 Thread Michael Moores
Trying to answer my own question.. it seems like it would be a good idea to create 
a SearchComponent and add it to the list of existing components.
My component would just convert the query parameters to something that the solr 
QueryComponent understands.
Is that a good way of doing it?
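That wiring might look roughly like this (the component and class names are
made up for illustration; the component would rewrite the request's params
in its prepare() method before QueryComponent runs):

```xml
<searchComponent name="queryTranslator"
                 class="com.example.QueryTranslationComponent" />

<requestHandler name="/translated" class="solr.SearchHandler">
  <!-- run the translator before the standard component chain -->
  <arr name="first-components">
    <str>queryTranslator</str>
  </arr>
</requestHandler>
```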



On Feb 23, 2011, at 8:12 PM, Michael Moores wrote:

> I'm required to provide a handler with some specialized query string inputs.
> 
> I'd like to translate the query inputs to a lucene/solr query and delegate 
> the request to the existing lucene/dismax handler.
> 
> What's the best way to do this?
> Do I implement SolrRequestHandler, or a QParser?  Do I extend the existing 
> StandardRequestHandler?
> 
> thanks,
> --Michael



Re: fq field with facets

2011-02-23 Thread dhanesh

Hi
I have faced the same problem and I solved it by adding double quotes
dhanesh s.r
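Concretely, for the example from the original question below, any of these
forms should produce the same filter (URL-encoding omitted for readability):

```
fq=category:"Appartement for rent"
fq={!field f=category}Appartement for rent
fq=category:Appartement\ for\ rent
```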

On 2/23/2011 7:47 PM, Erik Hatcher wrote:

Try -

   fq={!field f=category}Appartement for rent

You can also try surrounding the value with quotes, but that gets tricky and you'll 
need to escape things, possibly.  Or you could simply backslash-escape the whitespace 
(and colon, etc.) characters.

Erik

On Feb 23, 2011, at 08:25 , Rosa (Anuncios) wrote:


Hi,

I'm trying to reduce results from facets. (by category with my schema)

My category field is String type in my schema.xml.

The problem i've got is when the category value has a space or special character, 
it doesn't work.

Example:

?q=home&fq=category:Appartement           -->  works fine

?q=home&fq=category:Appartement for rent  -->  doesn't work

?q=home&fq=category:Appartement > Sale    -->  doesn't work

I guess there is a workaround for this? Sorry if it's obvious... i'm a newbie with 
Solr.

thanks for your help

rosa