Re: schema.xml configuration for file names?

2011-02-15 Thread Stefan Matheis
Alan,

If you want to search on the filename, it has to be part of the
file content itself. Solr doesn't care about the filename; only
the content of the given file will be indexed.
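One way to follow that advice is to parse the tilde-separated filename into the fields Alan listed and post them as a regular Solr document. A minimal sketch follows; note that apart from "stbmodel", "result", "filedate" and "receiver" (named in Alan's mail), the field names here are placeholders, not his real schema:

```python
# Sketch: turn a tilde-separated filename into a Solr add-XML document,
# so the filename fields themselves become searchable. Field names other
# than receiver/result/stbmodel/filedate are guesses, not the real schema.
import os
from xml.sax.saxutils import escape

FIELD_NAMES = [
    "site", "stage", "step", "order", "receiver", "line",
    "result", "stbmodel", "serial", "filedate", "filetime",
]

def filename_to_doc(path):
    """Map each tilde-separated token of the filename to a schema field."""
    stem = os.path.splitext(os.path.basename(path))[0]
    tokens = stem.split("~")
    if len(tokens) != len(FIELD_NAMES):
        raise ValueError("expected 11 fields, got %d" % len(tokens))
    return dict(zip(FIELD_NAMES, tokens))

def to_add_xml(doc):
    """Render the field dict as the XML body posted to /update."""
    fields = "".join(
        '<field name="%s">%s</field>' % (name, escape(value))
        for name, value in doc.items()
    )
    return "<add><doc>%s</doc></add>" % fields

doc = filename_to_doc(
    "CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML"
)
```

With the fields indexed this way, queries like `stbmodel:R16-500 OR result:P` work without ever indexing the file contents.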

HTH
Stefan

On Tue, Feb 15, 2011 at 1:15 AM, alan bonnemaison  wrote:
> Hello!
>
> We receive from our suppliers hardware manufacturing data in XML files. On a
> typical day, we got 25,000 files. That is why I chose to implement Solr.
>
> The file names are made of eleven fields separated by tildes, like so:
>
> CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML
>
> Our R&D guys want to be able to search each field of the XML file names
> (OR operation) but they don't care to search the file contents. Ideally,
> they would like to query all files where "stbmodel" equals "R16-500"
> or "result" is "P" or "filedate" is "20110125"...you get the idea.
>
> I defined in schema.xml each data field like so (from left to right -- sorry
> for the long list):
>
>    stored="true"   multiValued="false"/>
>    stored="true"   multiValued="false"/>
>    stored="true"   multiValued="false"/>
>    stored="false"  multiValued="false"/>
>    stored="false"  multiValued="false"/>
>    stored="true"    multiValued="false"/>
>    stored="true"   multiValued="false"/>
>    stored="true"    multiValued="false"/>
>    stored="true"    multiValued="false"/>
>    stored="true"   multiValued="false"/>
>    stored="true"   multiValued="false"/>
>
> Also, I defined as unique key the field "receiver". But no results are
> returned by my queries. I made sure to update my index like so: "java -jar
> apache-solr-1.4.1/example/exampledocs/post.jar *XML".
>
> I am obviously missing something. Is there a way to configure schema.xml to
> search for file names? I welcome your input.
>
> Al.
>


Re: SolrCloud - Example C not working

2011-02-15 Thread Thorsten Scherler
Hmm, nobody has an idea? Is Example C working fine for everybody else?

salu2

On Mon, 2011-02-14 at 14:08 +0100, Thorsten Scherler wrote:
> Hi all,
> 
> I followed http://wiki.apache.org/solr/SolrCloud and everything worked
> fine till I tried "Example C:".
> 
> I start all 4 server but all of them keep looping through:
> 
> "java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:16 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/127.0.0.1:9983
> Feb 14, 2011 1:31:16 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:16 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:9900
> Feb 14, 2011 1:31:16 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:17 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:9983
> Feb 14, 2011 1:31:17 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:19 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:8574
> Feb 14, 2011 1:31:19 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:20 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/127.0.0.1:8574
> Feb 14, 2011 1:31:20 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> 
> The problem seems to be that the ZooKeeper instances cannot connect to the
> different nodes and so never come up at all.
> 
> I am using revision 1070473 for the tests. Does anybody have an idea?
> 
> salu2

-- 
Thorsten Scherler 
codeBusters S.L. - web based systems

http://www.codebusters.es/




Re: Solr 1.4 requestHandler "update" Runtime disable/enable

2011-02-15 Thread Jan Høydahl
Sure.

  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" enable="${solr.enable.master:true}" />

Then set Java System Property -Dsolr.enable.master=false and restart.

But I don't see why you need to disable it. You will in any case need to stop 
sending updates to the old master yourself. Disabling the handler like this 
will cause an exception if you try to call it, because it will not be registered.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 14. feb. 2011, at 17.42, gzyl wrote:

> 
> Is there a possibility at the runtime to disable or enable of update handler?
> 
> I have two servers and would like to turn off the update handler on master.
> Then replicate master to slave and switch slave to master.
> 
> 
> -- 
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-1-4-requestHandler-update-Runtime-disable-enable-tp2493745p2493745.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Guidance for event-driven indexing

2011-02-15 Thread Jan Høydahl
Solr is multi-threaded, so you are free to send as many parallel update 
requests as needed to utilize your HW. Each request will get its own thread. 
Simply configure StreamingUpdateSolrServer from your client.
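For readers not on SolrJ, a rough stdlib sketch of the same idea (several workers POSTing update batches in parallel) might look like this; the endpoint URL and field names are assumptions for the sketch:

```python
# Rough stdlib analogue of what StreamingUpdateSolrServer does in SolrJ:
# worker threads each POSTing batches of documents to Solr's update
# handler. The URL here is an assumed default, not from the thread.
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from xml.sax.saxutils import escape

SOLR_UPDATE_URL = "http://localhost:8983/solr/update"  # assumed endpoint

def batch_to_xml(batch):
    """Render a batch of {field: value} dicts as a Solr <add> message."""
    docs = "".join(
        "<doc>%s</doc>" % "".join(
            '<field name="%s">%s</field>' % (k, escape(str(v)))
            for k, v in doc.items()
        )
        for doc in batch
    )
    return "<add>%s</add>" % docs

def post_batch(batch):
    req = urllib.request.Request(
        SOLR_UPDATE_URL,
        data=batch_to_xml(batch).encode("utf-8"),
        headers={"Content-Type": "text/xml"},
    )
    return urllib.request.urlopen(req).read()

def index_parallel(batches, workers=4):
    # Each request gets its own thread on the Solr side too,
    # so parallel senders keep the hardware busy.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(post_batch, batches))
```

The batch size and worker count are the two knobs to tune against your hardware.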

If there is some lengthy work to be done, it needs to be done in SOME thread, 
and I guess you just have to choose where :)

A JMSUpdateHandler sounds heavyweight, but it does not need to be, and might be 
the logically best place for such a feature, IMO. 

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 14. feb. 2011, at 17.42, Rich Cariens wrote:

> Thanks Jan,
> 
> I don't think I want to tie up a thread on two boxes waiting for an
> UpdateRequestProcessor to finish. I'd prefer to offload it all to the target
> shards. And a special JMSUpdateHandler feels like overkill. I *think* I'm
> really just looking for a simple API that allows me to add a
> SolrInputDocument to the index in-process.
> 
> Perhaps I just need to use the EmbeddedSolrServer in the Solrj packages? I'm
> worried that this will break all the nice stuff one gets with the standard
> SOLR webapp (stats, admin, etc).
> 
> Best,
> Rich
> 
> 
> On Mon, Feb 14, 2011 at 11:18 AM, Jan Høydahl  wrote:
> 
>> Hi,
>> 
>> One option would be to keep the JMS listener as today but move the
>> downloading
>> and transforming part to a SolrUpdateRequestProcessor on each shard. The
>> benefit
>> is that you ship only a tiny little SolrInputDocument over the wire with a
>> reference to the doc to download, and do the heavy lifting on Solr side.
>> 
>> If each JMS topic/channel corresponds to a particular shard, you could
>> move the whole thing to Solr. If so, a new JMSUpdateHandler could perhaps
>> be a way to go?
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> On 14. feb. 2011, at 16.53, Rich Cariens wrote:
>> 
>>> Hello,
>>> 
>>> I've built a system that receives JMS events containing links to docs
>> that I
>>> must download and index. Right now the JMS receiving, downloading, and
>>> transformation into SolrInputDoc's happens in a separate JVM that then
>> uses
>>> Solrj javabin HTTP POSTs to distribute these docs across many index
>> shards.
>>> 
>>> For various reasons I won't go into here, I'd like to relocate/deploy
>> most
>>> of my processing (JMS receiving, downloading, and transformation into
>>> SolrInputDoc's) into the SOLR webapps running on each distributed shard
>>> host. I might be wrong, but I don't think the request-driven idiom of the
> DataImportHandler is a good fit for me as I'm not kicking off full or
>>> delta imports. If that's true, what's the correct way to hook my
>> components
>>> into SOLR's update facilities? Should I try to get a reference a
>> configured
>>> DirectUpdateHandler?
>>> 
>>> I don't know if this information helps, but I'll put it out there
>> anyways:
>>> I'm using Spring 3 components to receive JMS events, wired up via webapp
>>> context hooks. My plan would be to add all that to my SOLR shard webapp.
>>> 
>>> Best,
>>> Rich
>> 
>> 



Re: rollback to other versions of index

2011-02-15 Thread Jan Høydahl
Yes and no. The index grows like an onion adding new segments for each commit.
There is no API to remove the newly added segments, but I guess you could hack 
something.
The other problem is that as soon as you trigger an optimize() all history is 
gone as the segments are merged into one. Optimize normally happens 
automatically behind the scenes. You could turn off merging but that will badly 
hurt your performance after some time and ultimately crash your OS.

Since you only need a few versions back, you COULD write your own custom 
mergePolicy, always preserving at least N versions. But beware that a "version" 
may be ONE document or many documents, depending on how you commit or if 
autoCommit is active. So if you go this route you also need strict control over 
your commits.

Perhaps the best option is to handle this on the feeding client side, where you keep a 
buffer of N last docs. Then you can freely roll back or re-index as you choose, 
based on time, number of docs etc.
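That client-side buffer Jan describes can be sketched in a few lines; the `send` callable here is an assumption standing in for whatever posts your documents to Solr:

```python
# Sketch of the client-side suggestion: keep the last N update batches
# so any of them can be replayed (re-indexed) after rolling the index
# back. send() is a stand-in for your actual feeder, e.g. an HTTP POST.
from collections import deque

class ReindexBuffer:
    def __init__(self, send, keep_versions=3):
        self.send = send                      # callable taking a batch
        self.history = deque(maxlen=keep_versions)

    def index(self, batch):
        """Send a batch and remember it as one rollback 'version'."""
        self.history.append(list(batch))
        self.send(batch)

    def replay(self, last_n=1):
        """Re-send the last_n remembered batches, oldest first."""
        batches = list(self.history)[-last_n:]
        for batch in batches:
            self.send(batch)
        return len(batches)
```

Whether a "version" means one commit's worth of docs or a time window is then entirely under your control, which is exactly the point.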

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 15. feb. 2011, at 01.21, Tri Nguyen wrote:

> Hi,
> 
> Does solr version each index build?  
> 
> We'd like to be able to rollback to not just a previous version but maybe a 
> few 
> version before the current one.
> 
> Thanks,
> 
> Tri



Re: carrot2 clustering component error

2011-02-15 Thread Markus Jelsma
I've seen that before on a 3.1 checkout, after I compiled the clustering 
component, copied the jars and started Solr. For some reason, recompiling 
didn't work, and doing an ant clean first didn't fix it either. Updating to a 
revision I knew did work also failed.

I just removed the entire checkout, checked it out again, repeated my steps, 
and it works fine now.

> help me out of this error:
> 
>   java.lang.NoClassDefFoundError: org/apache/solr/util/plugin/SolrCoreAware


Solr not Available with Ping when DocBuilder is running

2011-02-15 Thread stockii

Hello.

I run a delta import every 2 minutes, and while one core (of 7) is running a delta,
Solr isn't available. When I look in the log file, the ping comes in during the
time the DocBuilder is running ... 

Feb 15, 2011 11:49:20 AM org.apache.solr.handler.dataimport.DocBuilder
doDelta
INFO: Delta Import completed successfully
Feb 15, 2011 11:49:20 AM org.apache.solr.handler.dataimport.DocBuilder
execute
INFO: Time taken = 0:0:0.15
Feb 15, 2011 11:50:28 AM org.apache.solr.core.SolrCore execute


PHP Error at 11:50:12 Error: ...

So I get errors, but nothing is actually wrong on my side ... !?!?

thx

-
--- System


One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 
1 Core with 31 Million Documents other Cores < 100.000

- Solr1 for Search-Requests - commit every Minute  - 4GB Xmx
- Solr2 for Update-Request  - delta every 2 Minutes - 4GB Xmx
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-not-Available-with-Ping-when-DocBuilder-is-running-tp2500214p2500214.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr not Available with Ping when DocBuilder is running

2011-02-15 Thread Stefan Matheis
And what exactly is your error? And what is the response to your
ping request?

On Tue, Feb 15, 2011 at 12:02 PM, stockii  wrote:
>
> Hello.
>
> I do every 2 Minutes a Delta and if one Core (of 7) is running a delta, solr
> isnt available. when i look in the logFile the ping comes in this time, when
> DocBuilder is running ...
>
> Feb 15, 2011 11:49:20 AM org.apache.solr.handler.dataimport.DocBuilder
> doDelta
> INFO: Delta Import completed successfully
> Feb 15, 2011 11:49:20 AM org.apache.solr.handler.dataimport.DocBuilder
> execute
> INFO: Time taken = 0:0:0.15
> Feb 15, 2011 11:50:28 AM org.apache.solr.core.SolrCore execute
>
>
> PHP Error at 11:50:12 Error: ...
>
> so i get errors, but nothing is wrong for me ... !?!?
>
> thx
>
> -
> --- System
> 
>
> One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
> 1 Core with 31 Million Documents other Cores < 100.000
>
> - Solr1 for Search-Requests - commit every Minute  - 4GB Xmx
> - Solr2 for Update-Request  - delta every 2 Minutes - 4GB Xmx
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-not-Available-with-Ping-when-DocBuilder-is-running-tp2500214p2500214.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Deploying Solr CORES on OVH Cloud

2011-02-15 Thread Rosa (Anuncios)

Thanks for your response, but it doesn't help me a whole lot!

Jetty vs. Tomcat?
Ubuntu or Debian?

What are the pros of each for running Solr?



On 14/02/2011 23:12, William Bell wrote:

The first two questions are almost like religion. I am not sure we
want to start a debate.

Core setup is fairly easy. Add a solr.xml file and one subdirectory per
core (see the example/ directory). Make sure you use the right URL for the
admin console.
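For that era of Solr (1.4/3.x), a minimal multi-core solr.xml of the kind described above might look like this; the core names and instance directories are placeholders:

```xml
<!-- Minimal multi-core solr.xml (Solr 1.4-era); names/paths are examples -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
```

Each instanceDir then needs its own conf/ with schema.xml and solrconfig.xml, and the per-core admin console lives under a URL like http://host:8983/solr/core0/admin/.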

On Mon, Feb 14, 2011 at 3:38 AM, Rosa (Anuncios)
  wrote:

Hi,

I'm a bit new in Solr. I'm trying to set up a bunch of server (just for
solr) on OVH cloud (http://www.ovh.co.uk/cloud/) and create new cores as
needed on each server.

First question:

What do you recommend: Ubuntu or Debian? I mean in terms of performance?

Second question:

Jetty or Tomcat? Again, in terms of performance and security?

Third question:

I've followed the wiki but I can't get the cores working...
I can't create a core or access my cores. Does anyone have a working
config to share?

Thanks a lot for your help

Regards,





Re: Which version of Solr?

2011-02-15 Thread Jeff Schmidt
I guess hijacking my own thread is still hijacking. :)  I'll avoid that in the 
future.

It is great for SolrJ and Solr to be working as expected and to be making 
forward progress!

Jeff

On Feb 14, 2011, at 11:01 PM, David Smiley (@MITRE.org) wrote:

> 
> Wow; I'm glad you figured it out -- sort of.
> 
> FYI, in the future, don't hijack email threads to talk about a new subject.
> Start a new thread.
> 
> ~ David
> p.s. yes, I'm working on the 2nd edition.
> 
> -
> Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
> -- 
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Which-version-of-Solr-tp2482468p2498641.html
> Sent from the Solr - User mailing list archive at Nabble.com.



--
Jeff Schmidt
535 Consulting
j...@535consulting.com
(650) 423-1068
http://www.535consulting.com









Re: Which version of Solr?

2011-02-15 Thread Jeff Schmidt
Hi Otis:

I guess I got so obsessed with trying to resolve my SolrJ/Solr interaction problem 
that I missed your reply...  I've heard using 3.1 is the best approach, and now 
4.0/trunk.  Will trunk be undergoing a release in the next few months then?  It 
seems so soon after 3.x.

Fortunately, I have both branch_3x and trunk checked out and I can generate 
Maven artifacts for each one. That makes it easy for me to use one or the 
other, at least until I get set on some feature only available in one of them.  
Is trunk currently a superset of branch_3x, or are there some 3.x features that 
won't be merged into trunk for quite some time?

Cheers,

Jeff


On Feb 13, 2011, at 6:49 PM, Otis Gospodnetic wrote:

> Hi Jeff,
> 
> For projects that are going live in 6 months I would use trunk.
> 
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
> 
> 
> 
> - Original Message 
>> From: Jeff Schmidt 
>> To: solr-user@lucene.apache.org
>> Sent: Sat, February 12, 2011 4:37:37 PM
>> Subject: Which version of Solr?
>> 
>> Hello:
>> 
>> I'm working on incorporating Solr into a SaaS based life sciences semantic 
>> search project. This will be released in about six months. I'm trying to 
>> determine which version of Solr makes the most sense. When going to the Solr 
>> download page, there are 1.3.0, 1.4.0, and 1.4.1. I've been using 1.4.1 while 
>> going through some examples in my Packt book ("Solr 1.4 Enterprise Search 
>> Server").
>> 
>> But, I also see that Solr 3.1 and 4.0 are in the works.   According to:
>> 
>> 
>> https://issues.apache.org/jira/browse/#selectedTab=com.atlassian.jira.plugin.system.project%3Aroadmap-panel
>> 
>> 
>> there is a high degree of progress on both of those releases; including a slew 
>> of bug fixes, new features, performance enhancements etc. Should I be making 
>> use of one of the newer versions?  The hierarchical faceting seems like it 
>> could be quite useful.  Are there any guesses on when either 3.1 or 4.0 will be 
>> officially released?
>> 
>> So far, 1.4.1 has been good. But I'm unable to get SolrJ to work due to the 
>> 'javabin' version mismatch. I'm using the 1.4.1 version of SolrJ, but I always 
>> get an HTTP response code of 200, and the return entity is simply a null byte, 
>> which does not match the version number of 1 defined in Solr common.  Anyway, I 
>> can follow up on that issue if 1.4.1 is still the most appropriate version to 
>> use these days. Otherwise, I'll try again with whatever version you suggest.
>> 
>> Thanks a lot!
>> 
>> Jeff
>> --
>> Jeff  Schmidt
>> 535 Consulting
>> j...@535consulting.com
>> (650)  423-1068
>> 
>> 
>> 
>> 
>> 
>> 



--
Jeff Schmidt
535 Consulting
j...@535consulting.com
(650) 423-1068
http://www.535consulting.com









Re: Guidance for event-driven indexing

2011-02-15 Thread Rich Cariens
Thanks Jan.

For the JMSUpdateHandler option, how does one plug in a custom UpdateHandler?
I want to make sure I'm not missing any important pieces of Solr processing
pipeline.

Best,
Rich

On Tue, Feb 15, 2011 at 4:36 AM, Jan Høydahl  wrote:

> Solr is multi threaded, so you are free to send as many parallel update
> requests needed to utilize your HW. Each request will get its own thread.
> Simply configure StreamingUpdateSolrServer from your client.
>
> If there is some lengthy work to be done, it needs to be done in SOME
> thread, and I guess you just have to choose where :)
>
> A JMSUpdateHandler sounds heavy weight, but does not need to be, and might
> be the logically best place for such a feature imo.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 14. feb. 2011, at 17.42, Rich Cariens wrote:
>
> > Thanks Jan,
> >
> > I don't think I want to tie up a thread on two boxes waiting for an
> > UpdateRequestProcessor to finish. I'd prefer to offload it all to the
> target
> > shards. And a special JMSUpdateHandler feels like overkill. I *think* I'm
> > really just looking for a simple API that allows me to add a
> > SolrInputDocument to the index in-process.
> >
> > Perhaps I just need to use the EmbeddedSolrServer in the Solrj packages?
> I'm
> > worried that this will break all the nice stuff one gets with the
> standard
> > SOLR webapp (stats, admin, etc).
> >
> > Best,
> > Rich
> >
> >
> > On Mon, Feb 14, 2011 at 11:18 AM, Jan Høydahl 
> wrote:
> >
> >> Hi,
> >>
> >> One option would be to keep the JMS listener as today but move the
> >> downloading
> >> and transforming part to a SolrUpdateRequestProcessor on each shard. The
> >> benefit
> >> is that you ship only a tiny little SolrInputDocument over the wire with
> a
> >> reference to the doc to download, and do the heavy lifting on Solr side.
> >>
> >> If each JMS topic/channel corresponds to a particular shard, you could
> >> move the whole thing to Solr. If so, a new JMSUpdateHandler could
> perhaps
> >> be a way to go?
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >>
> >> On 14. feb. 2011, at 16.53, Rich Cariens wrote:
> >>
> >>> Hello,
> >>>
> >>> I've built a system that receives JMS events containing links to docs
> >> that I
> >>> must download and index. Right now the JMS receiving, downloading, and
> >>> transformation into SolrInputDoc's happens in a separate JVM that then
> >> uses
> >>> Solrj javabin HTTP POSTs to distribute these docs across many index
> >> shards.
> >>>
> >>> For various reasons I won't go into here, I'd like to relocate/deploy
> >> most
> >>> of my processing (JMS receiving, downloading, and transformation into
> >>> SolrInputDoc's) into the SOLR webapps running on each distributed shard
> >>> host. I might be wrong, but I don't think the request-driven idiom of
> the
> >>> DataImportHandler is a good fit for me as I'm not kicking off full
> or
> >>> delta imports. If that's true, what's the correct way to hook my
> >> components
> >>> into SOLR's update facilities? Should I try to get a reference a
> >> configured
> >>> DirectUpdateHandler?
> >>>
> >>> I don't know if this information helps, but I'll put it out there
> >> anyways:
> >>> I'm using Spring 3 components to receive JMS events, wired up via
> webapp
> >>> context hooks. My plan would be to add all that to my SOLR shard
> webapp.
> >>>
> >>> Best,
> >>> Rich
> >>
> >>
>
>


Dismax problem

2011-02-15 Thread Ezequiel Calderara
Hi, I'm having a problem while trying to do a dismax search.
For example, I have the standard query URL like this:
It returns 1 result.
But when i try to use the dismax query type i have the following error:

> 15/02/2011 10:27:07 org.apache.solr.common.SolrException log
> GRAVE: java.lang.ArrayIndexOutOfBoundsException: 28
> at
> org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721)
> at
> org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
> at
> org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
> at
> org.apache.solr.search.function.StringIndexDocValues.(StringIndexDocValues.java:35)
> at
> org.apache.solr.search.function.OrdFieldSource$1.(OrdFieldSource.java:84)
> at
> org.apache.solr.search.function.OrdFieldSource.getValues(OrdFieldSource.java:58)
> at
> org.apache.solr.search.function.FunctionQuery$AllScorer.(FunctionQuery.java:123)
> at
> org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
> at
> org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
> at
> org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:268)
> at
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:258)
> at org.apache.lucene.search.Searcher.search(Searcher.java:171)
> at
> org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
> at
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
> at
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
> at
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
> at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:203)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:242)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:243)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:201)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:163)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
> at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:556)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:401)
> at
> org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:281)
> at
> org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
> at
> org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1568)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
> Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.lang.Thread.run(Unknown Source)
>

The Solr instance is running as a replication slave.
This is the solrconfig.xml: http://pastebin.com/GSv2wBB4
This is the schema.xml: http://pastebin.com/5VpRT5Jj

Any help? How can I find what is causing this exception? I thought that
dismax didn't throw exceptions...
-- 
__
Ezequiel.

Http://www.ironicnet.com


Re: Solr 1.4 requestHandler "update" Runtime disable/enable

2011-02-15 Thread gzyl

Ok.
I can set it so:
 <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" enable="${solr.enable.master:true}" />

But how can I change solr.enable.master from true to false without
restarting Tomcat??

>But I don't see why you need to disable it. 
>You will anyway need to stop sending updates to the old master yourself. 
>Disabling the handler like this will cause an exception if you try to call
it because it will not be registered. 

I have a buffer in the application, so that side is covered. 
I have to switch automatically, without administrator intervention.

The application switches from one location to the other, and Solr must
automatically follow the application.



-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-1-4-requestHandler-update-Runtime-disable-enable-tp2493745p2500603.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: rollback to other versions of index

2011-02-15 Thread Michael McCandless
Lucene is able to do this, if you make a custom DeletionPolicy (which
controls when commit points are deleted).

By default Lucene only saves the most recent commit
(KeepOnlyLastCommitDeletionPolicy), but if your policy keeps more
around, then you can open an IndexReader or IndexWriter on any
IndexCommit.

Any changes (including optimize, and even opening a new IW with
create=true) are safe within a commit; Lucene is fully transactional.

For example, I use this for benchmarking: I save 4 commit points in a
single index.  First is a multi-segment index, second is the same
index with 5% deletions, third is an optimized index, and fourth is
the optimized index with 5% deletions.  This gives me a single index
w/ 4 different commit points, so I can then benchmark searching
against any of those 4.
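The keep-last-N decision such a custom DeletionPolicy makes can be sketched language-neutrally; the snippet below models it in plain Python rather than Lucene's Java IndexDeletionPolicy API, purely to show the logic onCommit() would apply:

```python
# Conceptual sketch (Python, not Lucene's Java API) of the decision a
# "keep last N" deletion policy makes in onCommit(): given the commit
# points ordered oldest-to-newest, delete all but the newest N.
def commits_to_delete(commits, keep_last=4):
    """Return the commit points a keep-last-N policy would delete."""
    if keep_last < 1:
        raise ValueError("must keep at least the latest commit")
    return commits[:-keep_last] if len(commits) > keep_last else []
```

In real Lucene code this logic lives in a class implementing IndexDeletionPolicy, and "delete" means calling delete() on the IndexCommit objects the policy chooses not to keep.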

Mike

On Tue, Feb 15, 2011 at 4:43 AM, Jan Høydahl  wrote:
> Yes and no. The index grows like an onion adding new segments for each commit.
> There is no API to remove the newly added segments, but I guess you could 
> hack something.
> The other problem is that as soon as you trigger an optimize() all history is 
> gone as the segments are merged into one. Optimize normally happens 
> automatically behind the scenes. You could turn off merging but that will 
> badly hurt your performance after some time and ultimately crash your OS.
>
> Since you only need a few versions back, you COULD write your own custom 
> mergePolicy, always preserving at least N versions. But beware that a 
> "version" may be ONE document or many documents, depending on how you commit 
> or if autoCommit is active. so if you go this route you also need strict 
> control over your commits.
>
> Perhaps the best option is to handle this on the feeding client side, where you keep 
> a buffer of N last docs. Then you can freely roll back or re-index as you 
> choose, based on time, number of docs etc.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 15. feb. 2011, at 01.21, Tri Nguyen wrote:
>
>> Hi,
>>
>> Does solr version each index build?
>>
>> We'd like to be able to rollback to not just a previous version but maybe a 
>> few
>> version before the current one.
>>
>> Thanks,
>>
>> Tri
>
>


Re: Guidance for event-driven indexing

2011-02-15 Thread Jan Høydahl
Hi,

You would wire your JMSUpdateRequestHandler into solrconfig.xml as normal, and 
if you want to apply an UpdateChain, that would look like this:

  <requestHandler name="/update/jms" class="com.example.JMSUpdateRequestHandler">
    <lst name="defaults">
      <str name="update.processor">myPipeline</str>
    </lst>
  </requestHandler>

See http://wiki.apache.org/solr/SolrRequestHandler for details

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 15. feb. 2011, at 14.30, Rich Cariens wrote:

> Thanks Jan.
> 
> For the JMSUpdateHandler option, how does one plugin a custom UpdateHandler?
> I want to make sure I'm not missing any important pieces of Solr processing
> pipeline.
> 
> Best,
> Rich
> 
> On Tue, Feb 15, 2011 at 4:36 AM, Jan Høydahl  wrote:
> 
>> Solr is multi threaded, so you are free to send as many parallel update
>> requests needed to utilize your HW. Each request will get its own thread.
>> Simply configure StreamingUpdateSolrServer from your client.
>> 
>> If there is some lengthy work to be done, it needs to be done in SOME
>> thread, and I guess you just have to choose where :)
>> 
>> A JMSUpdateHandler sounds heavy weight, but does not need to be, and might
>> be the logically best place for such a feature imo.
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> On 14. feb. 2011, at 17.42, Rich Cariens wrote:
>> 
>>> Thanks Jan,
>>> 
>>> I don't think I want to tie up a thread on two boxes waiting for an
>>> UpdateRequestProcessor to finish. I'd prefer to offload it all to the
>> target
>>> shards. And a special JMSUpdateHandler feels like overkill. I *think* I'm
>>> really just looking for a simple API that allows me to add a
>>> SolrInputDocument to the index in-process.
>>> 
>>> Perhaps I just need to use the EmbeddedSolrServer in the Solrj packages?
>> I'm
>>> worried that this will break all the nice stuff one gets with the
>> standard
>>> SOLR webapp (stats, admin, etc).
>>> 
>>> Best,
>>> Rich
>>> 
>>> 
>>> On Mon, Feb 14, 2011 at 11:18 AM, Jan Høydahl 
>> wrote:
>>> 
 Hi,
 
 One option would be to keep the JMS listener as today but move the
 downloading
 and transforming part to a SolrUpdateRequestProcessor on each shard. The
 benefit
 is that you ship only a tiny little SolrInputDocument over the wire with
>> a
 reference to the doc to download, and do the heavy lifting on Solr side.
 
 If each JMS topic/channel corresponds to a particular shard, you could
 move the whole thing to Solr. If so, a new JMSUpdateHandler could
>> perhaps
 be a way to go?
 
 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com
 
 On 14. feb. 2011, at 16.53, Rich Cariens wrote:
 
> Hello,
> 
> I've built a system that receives JMS events containing links to docs
 that I
> must download and index. Right now the JMS receiving, downloading, and
> transformation into SolrInputDoc's happens in a separate JVM that then
 uses
> Solrj javabin HTTP POSTs to distribute these docs across many index
 shards.
> 
> For various reasons I won't go into here, I'd like to relocate/deploy
 most
> of my processing (JMS receiving, downloading, and transformation into
> SolrInputDoc's) into the SOLR webapps running on each distributed shard
> host. I might be wrong, but I don't think the request-driven idiom of
>> the
> DataImportHandler is not a good fit for me as I'm not kicking off full
>> or
> delta imports. If that's true, what's the correct way to hook my
 components
> into SOLR's update facilities? Should I try to get a reference a
 configured
> DirectUpdateHandler?
> 
> I don't know if this information helps, but I'll put it out there
 anyways:
> I'm using Spring 3 components to receive JMS events, wired up via
>> webapp
> context hooks. My plan would be to add all that to my SOLR shard
>> webapp.
> 
> Best,
> Rich
 
 
>> 
>> 
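A hedged sketch of the client-side parallelism Jan describes (Python for illustration only; `send_batch` is a placeholder, not a real Solr call, and the batch size and worker count are arbitrary):

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(docs, size):
    """Split a list of documents into fixed-size batches."""
    return [docs[i:i + size] for i in range(0, len(docs), size)]

def send_batch(batch):
    # Stand-in for an HTTP POST of one batch to a shard's /update
    # handler; here it just reports how many docs it "indexed".
    return len(batch)

def parallel_update(docs, batch_size=2, workers=4):
    """Send batches concurrently; each request occupies its own thread,
    mirroring how Solr gives each incoming update request its own thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send_batch, chunk(docs, batch_size)))
```

With a real `send_batch`, this is roughly the pattern that StreamingUpdateSolrServer automates on the SolrJ side.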



very quick question that will help me greatly... OR query syntax when using fields for solr dataset....

2011-02-15 Thread Ravish Bhagdev
Hi Guys,

I've been trying various combinations but have been unable to perform an "OR"
query for a specific field in my Solr schema.

I have a string field called myfield, and I want to return all documents
where this field matches either "abc" or "xyz".

So all records that have myfield=abc and all records that have myfield=xyz
should be returned (union)

What should my query be?  I have tried (myfield=abc OR myfield=xyz), which
works, but it only returns the documents that contain xyz in that field,
which I find quite weird. I have tried running this as an fq query as well,
but with the same result!

It is such a simple thing but I can't find right syntax after going through
a lot of documentation and searching.

Will appreciate any quick reply or examples, thanks very much.

Ravish


Re: Solr 1.4 requestHandler "update" Runtime disable/enable

2011-02-15 Thread Jan Høydahl
> Ok.
> I can set it so:
> <requestHandler name="/update" ... enable="${solr.enable.master:true}"/>
> 
> But how can I change solr.enable.master from true to false without
> restarting Tomcat?

That's an exercise left to the reader :)

Honestly, I don't think you need to. Why would you? The handler does not do 
anything if never called.

>> But I don't see why you need to disable it. 
>> You will anyway need to stop sending updates to the old master yourself. 
>> Disabling the handler like this will cause an exception if you try to call
> it because it will not be registered. 
> 
> I have a buffer in the application. So I do not have it off. 
> I have to switch automatically, without administrator intervention.
> 
> The application is switching from one location to the other, and Solr must
> automatically follow the application.

One Solr server does not know about the other, so you do not need to "switch" 
anything
on the Solr side. You simply need to design your client in such a way that it 
handles
the operations in correct order and timing, i.e. that it pauses all feeding 
until replication
is done, then feeds to the new master instead of the old.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
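Jan's prescription (pause feeding, let replication finish, then feed the new master) is really an ordering constraint on the client, which can be sketched as below. Everything here is hypothetical scaffolding, not a Solr API; in particular, `FakeClient` only simulates replication progress so the ordering can be exercised:

```python
def switch_master(client, old_master, new_master):
    """Client-side master switch, following Jan's ordering: pause all
    feeding, wait until replication has caught up, then redirect
    updates.  `client` is a hypothetical application-side wrapper."""
    client.pause_feeding()
    # How "caught up" is detected is an assumption; a real client might
    # poll each server's /replication?command=details handler instead.
    while client.generation(new_master) < client.generation(old_master):
        client.wait_for_replication()
    client.set_update_target(new_master)
    client.resume_feeding()

class FakeClient:
    """Stand-in used only to exercise the ordering logic above."""
    def __init__(self, generations):
        self.generations = dict(generations)
        self.target = None
        self.events = []
    def pause_feeding(self):
        self.events.append("pause")
    def resume_feeding(self):
        self.events.append("resume")
    def generation(self, host):
        return self.generations[host]
    def wait_for_replication(self):
        self.generations["new"] += 1   # simulate the slave catching up
    def set_update_target(self, host):
        self.target = host
        self.events.append("switch")

client = FakeClient({"old": 5, "new": 3})
switch_master(client, "old", "new")
# client.target == "new"; client.events == ["pause", "switch", "resume"]
```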



Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....

2011-02-15 Thread Jan Høydahl
http://wiki.apache.org/solr/SolrQuerySyntax

Examples:
q=myfield:(xyz OR abc)

q={!lucene q.op=OR df=myfield}xyz abc

q=xyz OR abc&defType=edismax&qf=myfield

PS: If using type="string", you will not match individual words inside the 
field, only an exact case sensitive match of whole field. Use some variant of 
"text" if this is not what you want.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote:

> Hi Guys,
> 
> I've been trying various combinations but unable to perform a "OR" query for
> a specific field in my solr schema.
> 
> I have a string field called myfield and I want to return all documents that
> have this field which either matches "abc" or  "xyz"
> 
> So all records that have myfield=abc and all records that have myfield=xyz
> should be returned (union)
> 
> What should my query be?  I have tried (myfield=abc OR myfield=xyz) which
> works, but only returns all the documents that contain xyz in that field,
> which I find quite weird. I have tried running this as fq query as well but
> same result!
> 
> It is such a simple thing but I can't find right syntax after going through
> a lot of documentation and searching.
> 
> Will appreciate any quick reply or examples, thanks very much.
> 
> Ravish
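Jan's first variant, shown as a client might actually build and URL-encode it (a Python sketch; `myfield` and the values are just the ones from the example, and the encoded parameters would be appended to a select URL such as /solr/select?...):

```python
from urllib.parse import urlencode

def field_or_query(field, values):
    """Build a fielded OR query such as myfield:(xyz OR abc)."""
    return "%s:(%s)" % (field, " OR ".join(values))

q = field_or_query("myfield", ["xyz", "abc"])
params = urlencode({"q": q, "wt": "json"})
# q == 'myfield:(xyz OR abc)'
# params == 'q=myfield%3A%28xyz+OR+abc%29&wt=json'
```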



Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....

2011-02-15 Thread Ravish Bhagdev
Hi Jan,

Thanks for reply.

I have tried the first variation in your example (and again after reading
your reply).

It returns no results!

Note: it is not a multivalued field. I think when you use example 1 below,
it looks for both xyz and abc in the same field for the same document; what
I'm trying to get is all records that match either of the two.

I hope I am making sense.

Thanks,
Ravish

On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl  wrote:

> http://wiki.apache.org/solr/SolrQuerySyntax
>
> Examples:
> q=myfield:(xyz OR abc)
>
> q={!lucene q.op=OR df=myfield}xyz abc
>
> q=xyz OR abc&defType=edismax&qf=myfield
>
> PS: If using type="string", you will not match individual words inside the
> field, only an exact case sensitive match of whole field. Use some variant
> of "text" if this is not what you want.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote:
>
> > Hi Guys,
> >
> > I've been trying various combinations but unable to perform a "OR" query
> for
> > a specific field in my solr schema.
> >
> > I have a string field called myfield and I want to return all documents
> that
> > have this field which either matches "abc" or  "xyz"
> >
> > So all records that have myfield=abc and all records that have
> myfield=xyz
> > should be returned (union)
> >
> > What should my query be?  I have tried (myfield=abc OR myfield=xyz) which
> > works, but only returns all the documents that contain xyz in that field,
> > which I find quite weird. I have tried running this as fq query as well
> but
> > same result!
> >
> > It is such a simple thing but I can't find right syntax after going
> through
> > a lot of documentation and searching.
> >
> > Will appreciate any quick reply or examples, thanks very much.
> >
> > Ravish
>
>


Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....

2011-02-15 Thread Jan Høydahl
The OR implies that all documents matching either one of the two terms should
be returned.

Are you sure you are searching with correct uppercase/lowercase, as string 
fields are case sensitive?

To further help you, we need copies of relevant sections of your schema and an 
exact copy of the query string you attempt to run, as well as proof that the 
documents exist.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 15. feb. 2011, at 14.54, Ravish Bhagdev wrote:

> Hi Jan,
> 
> Thanks for reply.
> 
> I have tried the first variation in your example (and again after reading
> your reply).
> 
> It returns no results!
> 
> Note: it is not a multivalued field, I think when you use example 1 below,
> it looks for both xyz and abc in same field for same document, what i'm
> trying to get are all records that match either of the two.
> 
> I hope I am making sense.
> 
> Thanks,
> Ravish
> 
> On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl  wrote:
> 
>> http://wiki.apache.org/solr/SolrQuerySyntax
>> 
>> Examples:
>> q=myfield:(xyz OR abc)
>> 
>> q={!lucene q.op=OR df=myfield}xyz abc
>> 
>> q=xyz OR abc&defType=edismax&qf=myfield
>> 
>> PS: If using type="string", you will not match individual words inside the
>> field, only an exact case sensitive match of whole field. Use some variant
>> of "text" if this is not what you want.
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote:
>> 
>>> Hi Guys,
>>> 
>>> I've been trying various combinations but unable to perform a "OR" query
>> for
>>> a specific field in my solr schema.
>>> 
>>> I have a string field called myfield and I want to return all documents
>> that
>>> have this field which either matches "abc" or  "xyz"
>>> 
>>> So all records that have myfield=abc and all records that have
>> myfield=xyz
>>> should be returned (union)
>>> 
>>> What should my query be?  I have tried (myfield=abc OR myfield=xyz) which
>>> works, but only returns all the documents that contain xyz in that field,
>>> which I find quite weird. I have tried running this as fq query as well
>> but
>>> same result!
>>> 
>>> It is such a simple thing but I can't find right syntax after going
>> through
>>> a lot of documentation and searching.
>>> 
>>> Will appreciate any quick reply or examples, thanks very much.
>>> 
>>> Ravish
>> 
>> 



Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....

2011-02-15 Thread Ravish Bhagdev
Arghhh..

I think it's the regexp parser messing things up (I just looked at the
debugQuery output, and it's incorrectly parsing some "/"-like characters I had).

I think I can clean these characters out of the data, or maybe there is a
way to escape them...

Ravish

On Tue, Feb 15, 2011 at 1:54 PM, Ravish Bhagdev wrote:

> Hi Jan,
>
> Thanks for reply.
>
> I have tried the first variation in your example (and again after reading
> your reply).
>
> It returns no results!
>
> Note: it is not a multivalued field, I think when you use example 1 below,
> it looks for both xyz and abc in same field for same document, what i'm
> trying to get are all records that match either of the two.
>
> I hope I am making sense.
>
> Thanks,
> Ravish
>
>
> On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl wrote:
>
>> http://wiki.apache.org/solr/SolrQuerySyntax
>>
>> Examples:
>> q=myfield:(xyz OR abc)
>>
>> q={!lucene q.op=OR df=myfield}xyz abc
>>
>> q=xyz OR abc&defType=edismax&qf=myfield
>>
>> PS: If using type="string", you will not match individual words inside the
>> field, only an exact case sensitive match of whole field. Use some variant
>> of "text" if this is not what you want.
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>>
>> On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote:
>>
>> > Hi Guys,
>> >
>> > I've been trying various combinations but unable to perform a "OR" query
>> for
>> > a specific field in my solr schema.
>> >
>> > I have a string field called myfield and I want to return all documents
>> that
>> > have this field which either matches "abc" or  "xyz"
>> >
>> > So all records that have myfield=abc and all records that have
>> myfield=xyz
>> > should be returned (union)
>> >
>> > What should my query be?  I have tried (myfield=abc OR myfield=xyz)
>> which
>> > works, but only returns all the documents that contain xyz in that
>> field,
>> > which I find quite weird. I have tried running this as fq query as well
>> but
>> > same result!
>> >
>> > It is such a simple thing but I can't find right syntax after going
>> through
>> > a lot of documentation and searching.
>> >
>> > Will appreciate any quick reply or examples, thanks very much.
>> >
>> > Ravish
>>
>>
>
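The escaping fix Ravish arrived at can be generalized. The sketch below is modeled loosely on SolrJ's ClientUtils.escapeQueryChars; the exact set of query-parser metacharacters varies across Lucene/Solr versions, so treat the character list as an assumption:

```python
# Characters treated specially by the Lucene query parser; this list is
# modeled on SolrJ's ClientUtils.escapeQueryChars (the exact set varies
# by version).
SPECIAL = set('\\+-!():^[]"{}~*?|&;/')

def escape_query_chars(s):
    """Backslash-escape query-parser metacharacters in a term."""
    return "".join("\\" + c if c in SPECIAL or c.isspace() else c
                   for c in s)

escape_query_chars("R16-500")   # escapes the hyphen
escape_query_chars("a/b (c)")   # escapes slash, space, and parentheses
```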


Re: schema.xml configuration for file names?

2011-02-15 Thread Erick Erickson
Can we see a small sample of an xml file you're posting? Because it should
look something like

<add>
  <doc>
    <field name="stbmodel">R16-500</field>
    <!-- more fields here -->
  </doc>
</add>


Take a look at the Solr admin page after you've indexed data to see what's
actually in your index, I suspect what's in there isn't what you
expect.

Try querying q=*:* just for yucks to see what the documents returned look like.

I suspect your index doesn't contain anything like what you think, but
that's only
a guess...

Best
Erick

On Mon, Feb 14, 2011 at 7:15 PM, alan bonnemaison  wrote:
> Hello!
>
> We receive from our suppliers hardware manufacturing data in XML files. On a
> typical day, we got 25,000 files. That is why I chose to implement Solr.
>
> The file names are made of eleven fields separated by tildas like so
>
> CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML
>
> Our R&D guys want to be able to search each field of the XML file names
> (OR operation) but they don't care to search the file contents. Ideally,
> they would like to do a query all files where "stbmodel" equal to "R16-500"
> or "result" is "P" or "filedate" is "20110125"...you get the idea.
>
> I defined in schema.xml each data field like so (from left to right -- sorry
> for the long list):
>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="false"  multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="fase"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>    <field name="..." type="..." indexed="..." stored="true"   multiValued="false"/>
>
> Also, I defined as unique key the field "receiver". But no results are
> returned by my queries. I made sure to update my index like so: "java -jar
> apache-solr-1.4.1/example/exampledocs/post.jar *XML".
>
> I am obviously missing something. Is there a way to configure schema.xml to
> search for file names? I welcome your input.
>
> Al.
>
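The thread's conclusion is that the file name has to be decomposed into fields before indexing, because Solr does not index the name by itself. A sketch of that preprocessing step (Python for illustration; only "result", "stbmodel", "receiver", and "filedate" are field names mentioned in the thread, the remaining seven are hypothetical placeholders to be replaced with the real schema names):

```python
from xml.sax.saxutils import escape

# Positions 7-10 use names from the thread; the rest are hypothetical
# placeholders -- substitute the actual schema field names.
FIELD_NAMES = ["site", "phase", "step", "lot", "line", "station",
               "result", "stbmodel", "receiver", "filedate", "filetime"]

def filename_to_doc(filename):
    """Split an 11-field tilde-separated filename into a Solr <doc>."""
    stem = filename.rsplit(".", 1)[0]          # drop the .XML extension
    values = stem.split("~")
    if len(values) != len(FIELD_NAMES):
        raise ValueError("expected 11 fields, got %d" % len(values))
    fields = "".join('  <field name="%s">%s</field>\n' % (n, escape(v))
                     for n, v in zip(FIELD_NAMES, values))
    return "<doc>\n%s</doc>" % fields

doc = filename_to_doc(
    "CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500"
    "~000912239878~20110125~212321.XML")
```

Wrapped in an `<add>` element and posted with post.jar, each piece of the name then becomes a searchable field.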


Re: Deploying Solr CORES on OVH Cloud

2011-02-15 Thread Erick Erickson
The usual answer is "whatever you're most comfortable/experienced with".
From my perspective, there's enough to learn getting Solr running and
understanding how search works without throwing new environments into
the mix...

So, I'd pick the one you're most familiar with and use that. If you're
not familiar
with either, flip a coin ...

This isn't all that helpful either, but all that means is that this is a
question that doesn't have a one-recommendation-fits-all answer.

Best
Erick

On Tue, Feb 15, 2011 at 6:08 AM, Rosa (Anuncios)
 wrote:
> Thanks for your response, but it doesn't help me a whole lot!
>
> Jetty VS Tomcat?
> Ubuntu or Debian?
>
> What are the pros of each for using Solr?
>
>
>
> Le 14/02/2011 23:12, William Bell a écrit :
>>
>> The first two questions are almost like religion. I am not sure we
>> want to start a debate.
>>
>> Core setup is fairly easy. Add a solr.xml file and subdirs one per
>> core (see example/) directory. Make sure you use the right URL for the
>> admin console.
>>
>> On Mon, Feb 14, 2011 at 3:38 AM, Rosa (Anuncios)
>>   wrote:
>>>
>>> Hi,
>>>
>>> I'm a bit new in Solr. I'm trying to set up a bunch of server (just for
>>> solr) on OVH cloud (http://www.ovh.co.uk/cloud/) and create new cores as
>>> needed on each server.
>>>
>>> First question:
>>>
>>> What do you recommend: Ubuntu or Debian? I mean in terms of performance?
>>>
>>> Second question:
>>>
>>> Jetty or Tomcat? Again, in terms of performance and security?
>>>
>>> Third question:
>>>
>>> I've followed all the wiki but i can't get it working the CORES...
>>> Impossible to create CORE or access my cores? Does anyone have a working
>>> config to share?
>>>
>>> Thanks a lot for your help
>>>
>>> Regards,
>>>
>
>


Re: Which version of Solr?

2011-02-15 Thread Erick Erickson
Let's see if I have this right. 3x is "1.4.1 with selected features from trunk
backported". Which translates as "lots of cool new stuff is in in 3x
(geospatial comes to mind, eDismax, etc) but the more fluid changes
are not being backported".

I guess it depends on how risk-averse you are. There are people using both
trunk and 3x in production.

Personally, though, I'd go with 3x for a project 6 months out unless there's a
feature of trunk that would make your life a whole lot easier. Trunk is well-
tested, but why take any risk unless there are measurable benefits? You
might read through the changes.txt file to see if there's anything in trunk
you can't live without

Best
Erick

On Tue, Feb 15, 2011 at 7:30 AM, Jeff Schmidt  wrote:
> Hi Otis:
>
> I guess I got so obsessed trying to resolve my SolrJ/Solr interaction
> problem that I missed your reply... I've heard using 3.1 is the best approach,
> and now 4.0/trunk. Will trunk be undergoing a release in the next few months
> then? It seems so soon after 3.x.
>
> Fortunately, I have both branch_3x and trunk checked out and I can generate 
> Maven artifacts for each one. That makes it easy for me to use one or the 
> other, at least until I get set on some feature only available in one of 
> them.  Is trunk currently a superset of branch_3x, or are there some 3.x 
> features that won't be merged into trunk for quite some time?
>
> Cheers,
>
> Jeff
>
>
> On Feb 13, 2011, at 6:49 PM, Otis Gospodnetic wrote:
>
>> Hi Jeff,
>>
>> For projects that are going live in 6 months I would use trunk.
>>
>> Otis
>> 
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>> Lucene ecosystem search :: http://search-lucene.com/
>>
>>
>>
>> - Original Message 
>>> From: Jeff Schmidt 
>>> To: solr-user@lucene.apache.org
>>> Sent: Sat, February 12, 2011 4:37:37 PM
>>> Subject: Which version of Solr?
>>>
>>> Hello:
>>>
>>> I'm working on incorporating Solr into a SaaS based life sciences  semantic
>>> search project. This will be released in about six months. I'm trying  to
>>> determine which version of Solr makes the most sense. When going to the Solr
>>> download page, there are 1.3.0, 1.4.0, and 1.4.1. I've been using 1.4.1 
>>> while
>>> going through some examples in my Packt book ("Solr 1.4 Enterprise Search
>>> Server").
>>>
>>> But, I also see that Solr 3.1 and 4.0 are in the works.   According to:
>>>
>>>
>>> https://issues.apache.org/jira/browse/#selectedTab=com.atlassian.jira.plugin.system.project%3Aroadmap-panel
>>>
>>>
>>> there  is a high degree of progress on both of those releases; including a 
>>> slew
>>> of bug  fixes, new features, performance enhancements etc. Should I be 
>>> making
>>> use of one  of the newer versions?  The hierarchical faceting seems like it
>>> could be  quite useful.  Are there any guesses on when either 3.1 or 4.0 
>>> will be
>>> officially released?
>>>
>>> So far, 1.4.1 has been good. But I'm unable to get  SolrJ to work due to the
>>> 'javabin' version mismatch. I'm using the 1.4.1 version  of SolrJ, but I 
>>> always
>>> get an HTTP response code of 200, but the return entity  is simply a null 
>>> byte,
>>> which does not match the version number of 1 defined in  Solr common.  
>>> Anyway, I
>>> can follow up on that issue if 1.4.1 is still the  most appropriate version 
>>> to
>>> use these days. Otherwise, I'll try again with  whatever version you 
>>> suggest.
>>>
>>> Thanks a lot!
>>>
>>> Jeff
>>> --
>>> Jeff  Schmidt
>>> 535 Consulting
>>> j...@535consulting.com
>>> (650)  423-1068
>>>
>>>
>>>
>>>
>>>
>>>
>
>
>
> --
> Jeff Schmidt
> 535 Consulting
> j...@535consulting.com
> (650) 423-1068
> http://www.535consulting.com
>
>
>
>
>
>
>
>


Re: Which version of Solr?

2011-02-15 Thread Yonik Seeley
On Tue, Feb 15, 2011 at 9:18 AM, Erick Erickson  wrote:
> I guess it depends on how risk-averse you are. There are people using both
> trunk and 3x in production.

Right.  It also depends on how easy it is to re-index your data.  If
it's hard/impossible, IMO that's the single biggest argument for going
with 3x (soon 3.1) instead of trunk.  All of the new coolness in trunk
has come with index format changes along the way.

-Yonik
http://lucidimagination.com


Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....

2011-02-15 Thread Erick Erickson
You might look at the analysis page from the admin console for the
field in question, it'll show you what various parts of the analysis chain
do.

But I agree with Jan, having your field as a "string" type is a red flag. This
field is NOT analyzed, parsed, or filtered. For instance, if a doc has
a value for the field of: [My life], only [My life] will match. Not [my], not
[life], not even [my life] (ignore all brackets, but quotes are often confused
with phrases).

It may well be that this is the exact behavior you want, but this is often
a point of confusion.

Best
Erick

On Tue, Feb 15, 2011 at 9:00 AM, Ravish Bhagdev
 wrote:
> Arghhh..
>
> I think its the regexp parser messing things up (just looked at the
> debugQuery ouput and its parsing incorrectly some "/" kind of letters I had.
>
> I think I can clean up the data off these characters or maybe there is  a
> way to escape them...
>
> Ravish
>
> On Tue, Feb 15, 2011 at 1:54 PM, Ravish Bhagdev 
> wrote:
>
>> Hi Jan,
>>
>> Thanks for reply.
>>
>> I have tried the first variation in your example (and again after reading
>> your reply).
>>
>> It returns no results!
>>
>> Note: it is not a multivalued field, I think when you use example 1 below,
>> it looks for both xyz and abc in same field for same document, what i'm
>> trying to get are all records that match either of the two.
>>
>> I hope I am making sense.
>>
>> Thanks,
>> Ravish
>>
>>
>> On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl wrote:
>>
>>> http://wiki.apache.org/solr/SolrQuerySyntax
>>>
>>> Examples:
>>> q=myfield:(xyz OR abc)
>>>
>>> q={!lucene q.op=OR df=myfield}xyz abc
>>>
>>> q=xyz OR abc&defType=edismax&qf=myfield
>>>
>>> PS: If using type="string", you will not match individual words inside the
>>> field, only an exact case sensitive match of whole field. Use some variant
>>> of "text" if this is not what you want.
>>>
>>> --
>>> Jan Høydahl, search solution architect
>>> Cominvent AS - www.cominvent.com
>>>
>>> On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote:
>>>
>>> > Hi Guys,
>>> >
>>> > I've been trying various combinations but unable to perform a "OR" query
>>> for
>>> > a specific field in my solr schema.
>>> >
>>> > I have a string field called myfield and I want to return all documents
>>> that
>>> > have this field which either matches "abc" or  "xyz"
>>> >
>>> > So all records that have myfield=abc and all records that have
>>> myfield=xyz
>>> > should be returned (union)
>>> >
>>> > What should my query be?  I have tried (myfield=abc OR myfield=xyz)
>>> which
>>> > works, but only returns all the documents that contain xyz in that
>>> field,
>>> > which I find quite weird. I have tried running this as fq query as well
>>> but
>>> > same result!
>>> >
>>> > It is such a simple thing but I can't find right syntax after going
>>> through
>>> > a lot of documentation and searching.
>>> >
>>> > Will appreciate any quick reply or examples, thanks very much.
>>> >
>>> > Ravish
>>>
>>>
>>
>
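Erick's bracket example can be made concrete with a toy simulation. This is only a sketch: real Solr analysis chains are configurable (tokenizers, filters) and far richer than lowercase-plus-whitespace, but the contrast between an analyzed text field and a raw string field is the same:

```python
def text_match(field_value, term):
    """Analyzed 'text'-style match: lowercase, split on whitespace."""
    return term.lower() in field_value.lower().split()

def string_match(field_value, term):
    """'string'-style match: exact, case-sensitive, whole value."""
    return field_value == term

string_match("My life", "My life")  # matches: the whole value, verbatim
string_match("My life", "life")     # no match: no analysis at all
text_match("My life", "life")       # matches: token hit
text_match("My life", "my")         # matches: lowercased token hit
```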


Re: Dismax problem

2011-02-15 Thread Tomás Fernández Löbbe
Hi Ezequiel,
The standard query parser works with all the fields you are using with
dismax? Did you change the schema in some way? What version of Solr are you
on?

Tomás
On Tue, Feb 15, 2011 at 10:34 AM, Ezequiel Calderara wrote:

> Hi, I'm having a problem while trying to do a dismax search.
> For example, I have a standard query URL like this:
> It returns 1 result.
> But when I try to use the dismax query type, I get the following error:
>
> > 15/02/2011 10:27:07 org.apache.solr.common.SolrException log
> > GRAVE: java.lang.ArrayIndexOutOfBoundsException: 28
> > at
> >
> org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721)
> > at
> >
> org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
> > at
> >
> org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
> > at
> >
> org.apache.solr.search.function.StringIndexDocValues.<init>(StringIndexDocValues.java:35)
> > at
> >
> org.apache.solr.search.function.OrdFieldSource$1.<init>(OrdFieldSource.java:84)
> > at
> >
> org.apache.solr.search.function.OrdFieldSource.getValues(OrdFieldSource.java:58)
> > at
> >
> org.apache.solr.search.function.FunctionQuery$AllScorer.<init>(FunctionQuery.java:123)
> > at
> >
> org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
> > at
> >
> org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
> > at
> >
> org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:268)
> > at
> > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:258)
> > at org.apache.lucene.search.Searcher.search(Searcher.java:171)
> > at
> >
> org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
> > at
> >
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
> > at
> >
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
> > at
> >
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
> > at
> >
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:203)
> > at
> >
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
> > at
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
> > at
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:242)
> > at
> >
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> > at
> >
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:243)
> > at
> >
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:201)
> > at
> >
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:163)
> > at
> >
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
> > at
> > org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:556)
> > at
> >
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> > at
> >
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:401)
> > at
> >
> org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:281)
> > at
> >
> org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
> > at
> >
> org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1568)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
> > Source)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> > at java.lang.Thread.run(Unknown Source)
> >
>
> The Solr instance is running as a replication slave.
> This is the solrconfig.xml: http://pastebin.com/GSv2wBB4
> This is the schema.xml: http://pastebin.com/5VpRT5Jj
>
> Any help? How can I find what is causing this exception? I thought that the
> dismax didn't throw exceptions...
> --
> __
> Ezequiel.
>
> Http://www.ironicnet.com
>


Re: SolrCloud - Example C not working

2011-02-15 Thread Yonik Seeley
On Mon, Feb 14, 2011 at 8:08 AM, Thorsten Scherler  wrote:
> Hi all,
>
> I followed http://wiki.apache.org/solr/SolrCloud and everything worked
> fine till I tried "Example C:".

Verified.  I just tried and it failed for me too.

-Yonik
http://lucidimagination.com


> I start all 4 server but all of them keep looping through:
>
> "java.net.ConnectException: Connection refused
>        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>        at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:16 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/127.0.0.1:9983
> Feb 14, 2011 1:31:16 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>        at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:16 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:9900
> Feb 14, 2011 1:31:16 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>        at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:17 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:9983
> Feb 14, 2011 1:31:17 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>        at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:19 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:8574
> Feb 14, 2011 1:31:19 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>        at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
> Feb 14, 2011 1:31:20 PM org.apache.log4j.Category info
> INFO: Opening socket connection to server localhost/127.0.0.1:8574
> Feb 14, 2011 1:31:20 PM org.apache.log4j.Category warn
> WARNING: Session 0x0 for server null, unexpected error, closing socket
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>        at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>        at org.apache.zookeeper.ClientCnxn
> $SendThread.run(ClientCnxn.java:1078)
>
> The problem seems to be that the ZooKeeper instances cannot connect to the
> different nodes and so do not come up at all.
>
> I am using revision 1070473 for the tests. Anybody has an idea?
>
> salu2
> --
> Thorsten Scherler 
> codeBusters S.L. - web based systems
> 
> http://www.codebusters.es/
>


Re: very quick question that will help me greatly... OR query syntax when using fields for solr dataset....

2011-02-15 Thread Ravish Bhagdev
Hi Erick,

I've managed to fix the problem; it was down to not escaping certain
characters. Escaped with \ and it all works fine now :). Sorry, I was just
being insane; looking at the debugQuery output helped.

I know about the string field; this is a kind of UUID field that I am
storing, so it is desired that it always be an exact match. I was being
deliberate in choosing that type.

I am going to start looking at all the available Analyzers soon;
something that does string-distance matching would be cool.

Ravish

On Tue, Feb 15, 2011 at 2:30 PM, Erick Erickson wrote:

> You might look at the analysis page from the admin console for the
> field in question, it'll show you what various parts of the analysis chain
> do.
>
> But I agree with Jan, having your field as a "string" type is a red flag.
> This
> field is NOT analyzed, parsed, or filtered. For instance, if a doc has
> a value for the field of: [My life], only [My life] will match. Not [my],
> not
> [life], not even [my life] (ignore all brackets, but quotes are often
> confused
> with phrases).
>
> It may well be that this is the exact behavior you want, but this is often
> a point of confusion.
>
> Best
> Erick
>
> On Tue, Feb 15, 2011 at 9:00 AM, Ravish Bhagdev
>  wrote:
> > Arghhh..
> >
> > I think its the regexp parser messing things up (just looked at the
> > debugQuery ouput and its parsing incorrectly some "/" kind of letters I
> had.
> >
> > I think I can clean up the data off these characters or maybe there is  a
> > way to escape them...
> >
> > Ravish
> >
> > On Tue, Feb 15, 2011 at 1:54 PM, Ravish Bhagdev <
> ravish.bhag...@gmail.com>wrote:
> >
> >> Hi Jan,
> >>
> >> Thanks for reply.
> >>
> >> I have tried the first variation in your example (and again after
> reading
> >> your reply).
> >>
> >> It returns no results!
> >>
> >> Note: it is not a multivalued field, I think when you use example 1
> below,
> >> it looks for both xyz and abc in same field for same document, what i'm
> >> trying to get are all records that match either of the two.
> >>
> >> I hope I am making sense.
> >>
> >> Thanks,
> >> Ravish
> >>
> >>
> >> On Tue, Feb 15, 2011 at 1:47 PM, Jan Høydahl  >wrote:
> >>
> >>> http://wiki.apache.org/solr/SolrQuerySyntax
> >>>
> >>> Examples:
> >>> q=myfield:(xyz OR abc)
> >>>
> >>> q={!lucene q.op=OR df=myfield}xyz abc
> >>>
> >>> q=xyz OR abc&defType=edismax&qf=myfield
> >>>
> >>> PS: If using type="string", you will not match individual words inside
> the
> >>> field, only an exact case sensitive match of whole field. Use some
> variant
> >>> of "text" if this is not what you want.
> >>>
> >>> --
> >>> Jan Høydahl, search solution architect
> >>> Cominvent AS - www.cominvent.com
> >>>
> >>> On 15. feb. 2011, at 14.39, Ravish Bhagdev wrote:
> >>>
> >>> > Hi Guys,
> >>> >
> >>> > I've been trying various combinations but unable to perform a "OR"
> query
> >>> for
> >>> > a specific field in my solr schema.
> >>> >
> >>> > I have a string field called myfield and I want to return all
> documents
> >>> that
> >>> > have this field which either matches "abc" or  "xyz"
> >>> >
> >>> > So all records that have myfield=abc and all records that have
> >>> myfield=xyz
> >>> > should be returned (union)
> >>> >
> >>> > What should my query be?  I have tried (myfield=abc OR myfield=xyz)
> >>> which
> >>> > works, but only returns all the documents that contain xyz in that
> >>> field,
> >>> > which I find quite weird. I have tried running this as fq query as
> well
> >>> but
> >>> > same result!
> >>> >
> >>> > It is such a simple thing but I can't find right syntax after going
> >>> through
> >>> > a lot of documentation and searching.
> >>> >
> >>> > Will appreciate any quick reply or examples, thanks very much.
> >>> >
> >>> > Ravish
> >>>
> >>>
> >>
> >
>
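For anyone hitting the same problem: the fix is to backslash-escape the characters the Lucene/Solr query parser treats specially before building the query. SolrJ ships this as ClientUtils.escapeQueryChars; below is a minimal Python sketch of the same idea. The character set is an assumption modeled on SolrJ's behavior, and the helper name is mine:

```python
# Characters the Lucene/Solr query parser treats specially; each needs a
# preceding backslash to be matched literally (set is an assumption
# modeled on SolrJ's ClientUtils.escapeQueryChars).
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/ ')

def escape_query_chars(value: str) -> str:
    """Backslash-escape query-parser metacharacters in a raw term."""
    return ''.join('\\' + ch if ch in SPECIAL_CHARS else ch for ch in value)

print(escape_query_chars('R16-500'))  # prints R16\-500
```

Note that for a query like q=myfield:(xyz OR abc) you escape each term, not the whole query string, or the parentheses and the OR operator stop working.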


Re: Which version of Solr?

2011-02-15 Thread Jeff Schmidt
I guess I'll work with 3.x for now until some 4.0 feature makes me move to 
trunk.  For the next few months, re-indexing is not a problem, but once in 
production one index directly under my control will be updated quarterly (maybe 
monthly) with new content, while other indexes will be updated by 3rd parties 
at arbitrary times. Those indexes will maintain cumulative results of those 
updates and it will be more of an issue to require a 3rd party to provide the 
totality of documents to re-index from scratch. Not impossible, just not 
desirable.

Once I get more comfortable with Solr as a solution, I need to look more into 
index replication, backup etc. :)

Thanks for your suggestions on the versions.

Cheers,

Jeff

On Feb 15, 2011, at 7:23 AM, Yonik Seeley wrote:

> On Tue, Feb 15, 2011 at 9:18 AM, Erick Erickson  
> wrote:
>> I guess it depends on how risk-averse you are. There are people using both
>> trunk and 3x in production.
> 
> Right.  It also depends on how easy it is to re-index your data.  If
> it's hard/impossible, IMO that's the single biggest argument for going
> with 3x (soon 3.1) instead of trunk.  All of the new coolness in trunk
> has come with index format changes along the way.
> 
> -Yonik
> http://lucidimagination.com



--
Jeff Schmidt
535 Consulting
j...@535consulting.com
(650) 423-1068
http://www.535consulting.com









Re: Guidance for event-driven indexing

2011-02-15 Thread Rich Cariens
Thanks Jan,

I was referring to a custom UpdateHandler, not a RequestHandler. You know,
the one that the Solr wiki discourages :).

Best,
Rich

On Tue, Feb 15, 2011 at 8:37 AM, Jan Høydahl  wrote:

> Hi,
>
> You would wire your JMSUpdateRequestHandler into solrconfig.xml as normal,
> and if you want to apply an UpdateChain, that would look like this:
>
>  
>
>  myPipeline
>
>  
>
> See http://wiki.apache.org/solr/SolrRequestHandler for details
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 15. feb. 2011, at 14.30, Rich Cariens wrote:
>
> > Thanks Jan.
> >
> > For the JMSUpdateHandler option, how does one plugin a custom
> UpdateHandler?
> > I want to make sure I'm not missing any important pieces of Solr
> processing
> > pipeline.
> >
> > Best,
> > Rich
> >
> > On Tue, Feb 15, 2011 at 4:36 AM, Jan Høydahl 
> wrote:
> >
> >> Solr is multi threaded, so you are free to send as many parallel update
> >> requests needed to utilize your HW. Each request will get its own
> thread.
> >> Simply configure StreamingUpdateSolrServer from your client.
> >>
> >> If there is some lengthy work to be done, it needs to be done in SOME
> >> thread, and I guess you just have to choose where :)
> >>
> >> A JMSUpdateHandler sounds heavy weight, but does not need to be, and
> might
> >> be the logically best place for such a feature imo.
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >>
> >> On 14. feb. 2011, at 17.42, Rich Cariens wrote:
> >>
> >>> Thanks Jan,
> >>>
> >>> I don't think I want to tie up a thread on two boxes waiting for an
> >>> UpdateRequestProcessor to finish. I'd prefer to offload it all to the
> >> target
> >>> shards. And a special JMSUpdateHandler feels like overkill. I *think*
> I'm
> >>> really just looking for a simple API that allows me to add a
> >>> SolrInputDocument to the index in-process.
> >>>
> >>> Perhaps I just need to use the EmbeddedSolrServer in the Solrj
> packages?
> >> I'm
> >>> worried that this will break all the nice stuff one gets with the
> >> standard
> >>> SOLR webapp (stats, admin, etc).
> >>>
> >>> Best,
> >>> Rich
> >>>
> >>>
> >>> On Mon, Feb 14, 2011 at 11:18 AM, Jan Høydahl 
> >> wrote:
> >>>
>  Hi,
> 
>  One option would be to keep the JMS listener as today but move the
>  downloading
>  and transforming part to a SolrUpdateRequestProcessor on each shard.
> The
>  benefit
>  is that you ship only a tiny little SolrInputDocument over the wire
> with
> >> a
>  reference to the doc to download, and do the heavy lifting on Solr
> side.
> 
>  If each JMS topic/channel corresponds to a particular shard, you could
>  move the whole thing to Solr. If so, a new JMSUpdateHandler could
> >> perhaps
>  be a way to go?
> 
>  --
>  Jan Høydahl, search solution architect
>  Cominvent AS - www.cominvent.com
> 
>  On 14. feb. 2011, at 16.53, Rich Cariens wrote:
> 
> > Hello,
> >
> > I've built a system that receives JMS events containing links to docs
>  that I
> > must download and index. Right now the JMS receiving, downloading,
> and
> > transformation into SolrInputDoc's happens in a separate JVM that
> then
>  uses
> > Solrj javabin HTTP POSTs to distribute these docs across many index
>  shards.
> >
> > For various reasons I won't go into here, I'd like to relocate/deploy
>  most
> > of my processing (JMS receiving, downloading, and transformation into
> > SolrInputDoc's) into the SOLR webapps running on each distributed
> shard
> > host. I might be wrong, but I don't think the request-driven idiom of
> >> the
> > DataImportHandler is not a good fit for me as I'm not kicking off
> full
> >> or
> > delta imports. If that's true, what's the correct way to hook my
>  components
> > into SOLR's update facilities? Should I try to get a reference a
>  configured
> > DirectUpdateHandler?
> >
> > I don't know if this information helps, but I'll put it out there
>  anyways:
> > I'm using Spring 3 components to receive JMS events, wired up via
> >> webapp
> > context hooks. My plan would be to add all that to my SOLR shard
> >> webapp.
> >
> > Best,
> > Rich
> 
> 
> >>
> >>
>
>
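The archive has stripped the XML out of Jan's solrconfig.xml snippet above (only the chain name "myPipeline" survived). For reference, wiring a named update chain into a handler typically looks something like the sketch below; the handler and processor classes are stock Solr ones, everything else is illustrative. Note also that Solr 1.4 used the parameter name update.processor where later releases use update.chain:

```xml
<!-- Sketch only: the chain name "myPipeline" is from the thread;
     processor classes shown are standard Solr factories. -->
<updateRequestProcessorChain name="myPipeline">
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">myPipeline</str>
  </lst>
</requestHandler>
```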


Re: Any contribs available for Range field type?

2011-02-15 Thread kenf_nc

I've tried several times to get an active account on
solr-...@lucene.apache.org and the mailing list won't send me a confirmation
email, and therefore won't let me post because I'm not confirmed. Could I
get someone that is a member of Solr-Dev to post either my original request
in this thread, or a link to this thread on the Dev mailing list? I really
was hoping for more response than this to this question. This would be a
terrifically useful field type to just about any solr index.

Thanks,
Ken
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Any-contribs-available-for-Range-field-type-tp2473601p2502203.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Are there any restrictions on what kind of how many fields you can use in Pivot Query? I get ClassCastException when I use some of my string fields, and don't when I use some other sting fields

2011-02-15 Thread Ravish Bhagdev
Looks like it's a bug, is it not?

Ravish

On Tue, Feb 15, 2011 at 4:03 PM, Ravish Bhagdev wrote:

> When I include some of the fields in my search query:
>
> SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
> [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
>  at
> org.apache.solr.common.util.ConcurrentLRUCache$PQueue.myInsertWithOverflow(ConcurrentLRUCache.java:377)
> at
> org.apache.solr.common.util.ConcurrentLRUCache.markAndSweep(ConcurrentLRUCache.java:329)
>  at
> org.apache.solr.common.util.ConcurrentLRUCache.put(ConcurrentLRUCache.java:144)
> at org.apache.solr.search.FastLRUCache.put(FastLRUCache.java:131)
>  at
> org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:904)
> at
> org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:121)
>  at
> org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:126)
> at
> org.apache.solr.handler.component.PivotFacetHelper.process(PivotFacetHelper.java:85)
>  at
> org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:84)
> at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:231)
>  at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1298)
>  at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:340)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
>  at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>  at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>  at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>  at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
>  at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
> at
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
>  at
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
> at java.lang.Thread.run(Thread.java:662)
>
> Works with some fields not with others...
>
> What could be the problem?  It is hard to know with just that exception as
> it refers to solr's internal files...any indicators will help me debug.
>
> Thanks,
> Ravish
>


Re: Any contribs available for Range field type?

2011-02-15 Thread Erick Erickson
d...@lucene.apache.org

Solr-dev is deprecated since Lucene and Solr converged. Can you subscribe
to the above list instead?

Best
Erick

On Tue, Feb 15, 2011 at 10:49 AM, kenf_nc  wrote:
>
> I've tried several times to get an active account on
> solr-...@lucene.apache.org and the mailing list won't send me a confirmation
> email, and therefore won't let me post because I'm not confirmed. Could I
> get someone that is a member of Solr-Dev to post either my original request
> in this thread, or a link to this thread on the Dev mailing list? I really
> was hoping for more response than this to this question. This would be a
> terrifically useful field type to just about any solr index.
>
> Thanks,
> Ken
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Any-contribs-available-for-Range-field-type-tp2473601p2502203.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


RE: Errors when implementing VelocityResponseWriter

2011-02-15 Thread McGibbney, Lewis John
Hi Erik thank you for the reply

I have placed all velocity jar files in my /lib directory. As explained below, 
I have added relevant configuration to solrconfig.xml, I am just wondering if 
the config instructions in the wiki are missing something? Can anyone advise on 
this.

As you mentioned, my terminal output suggests that the VelocityResponseWriter 
class is not present and therefore the velocity jar is not present... however 
this is not the case.

I have specified  in solrconfig.xml, is this enough or do I 
need to use an exact path. I have already tried specifying an exact path and it 
does not seem to work either.

Thank you

Lewis

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 15 February 2011 06:48
To: solr-user@lucene.apache.org
Subject: Re: Errors when implementing VelocityResponseWriter

looks like you're missing the Velocity JAR.  It needs to be in some Solr 
visible lib directory.  With 1.4.1 you'll need to put it in /lib.  
In later versions, you can use the  elements in solrconfig.xml to point to 
other directories.

Erik

On Feb 14, 2011, at 10:41 , McGibbney, Lewis John wrote:

> Hello List,
>
> I am currently trying to implement the above in Solr 1.4.1. Having moved 
> velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my 
> webapp /lib directory, then adding queryResponseWriter name="blah" and 
> class="blah" followed by the responseHandler specifics I am shown the 
> following terminal output. I also added  in solrconfig. 
> Can anyone suggest what I have not included in the config that is still 
> required?
>
> Thanks Lewis
>
> SEVERE: org.apache.solr.common.SolrException: Error loading class 
> 'org.apache.solr.response.VelocityResponseWriter'
>at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
>at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
>at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
>at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
>at org.apache.solr.core.SolrCore.(SolrCore.java:547)
>at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
>at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
>at 
> org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
>at 
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
>at 
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
>at 
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:98)
>at 
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
>at 
> org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
>at 
> org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.solr.response.VelocityResponseWriter
>at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>at java.security.AccessController.doPrivileged(Native Method)
>at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>at java.lang.Class.forName0(Native Method)
>at java.lang.Class.forName(Class.java:247)
>at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
>... 21 more
>
> Glasgow Caledonian University is a registered Scottish charity, number 
> SC021474
>
> Winner: Times Higher Education’s Widening Participation Initiative of the 
> Year 2009 and Herald Society’s Education Initiative of the Year 2009.
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html
>
> Winner: Times Higher Education’s Outstanding Support for Early Career 
> Researchers of the Year 2010, GCU as a lead with Universities Scotland 
> partners.
> http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html
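For reference, the registration the wiki describes (the archive has stripped the XML from the messages above) amounts to declaring the response writer and making the Velocity jars visible to Solr. Names and the lib path here are illustrative; on 1.4.1 the jars must sit in <solr-home>/lib rather than be pulled in via a <lib> element:

```xml
<!-- <lib> works in later Solr releases only; on 1.4.1 copy the jars
     into <solr-home>/lib instead. Path is illustrative. -->
<lib dir="../../contrib/velocity/lib" />

<queryResponseWriter name="velocity"
                     class="org.apache.solr.response.VelocityResponseWriter"/>
```

Requests then select it with wt=velocity. If the ClassNotFoundException persists, the directory holding the Velocity and response-writer jars is not actually on the webapp's classpath; this is worth double-checking under Tomcat, which resolves relative paths against its own working directory.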


Re: Are there any restrictions on what kind of how many fields you can use in Pivot Query? I get ClassCastException when I use some of my string fields, and don't when I use some other sting fields

2011-02-15 Thread Erick Erickson
To get meaningful help, you have to post a minimum of:
1> the relevant schema definitions for the field that makes it blow
up. include the  and  tags.
2> the query you used, with some indication of the field that makes it blow up.
3> What version you're using
4> any changes you've made to the standard configurations.
5> whether you've recently installed a new version.

It might help if you reviewed: http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Tue, Feb 15, 2011 at 11:27 AM, Ravish Bhagdev
 wrote:
> Looks like its a bug?  Is it not?
>
> Ravish
>
> On Tue, Feb 15, 2011 at 4:03 PM, Ravish Bhagdev 
> wrote:
>
>> When I include some of the fields in my search query:
>>
>> SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
>> [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
>>  at
>> org.apache.solr.common.util.ConcurrentLRUCache$PQueue.myInsertWithOverflow(ConcurrentLRUCache.java:377)
>> at
>> org.apache.solr.common.util.ConcurrentLRUCache.markAndSweep(ConcurrentLRUCache.java:329)
>>  at
>> org.apache.solr.common.util.ConcurrentLRUCache.put(ConcurrentLRUCache.java:144)
>> at org.apache.solr.search.FastLRUCache.put(FastLRUCache.java:131)
>>  at
>> org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:904)
>> at
>> org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:121)
>>  at
>> org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:126)
>> at
>> org.apache.solr.handler.component.PivotFacetHelper.process(PivotFacetHelper.java:85)
>>  at
>> org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:84)
>> at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:231)
>>  at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1298)
>>  at
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:340)
>> at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
>>  at
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>> at
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>  at
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>> at
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>  at
>> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>> at
>> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>>  at
>> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>> at
>> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
>>  at
>> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
>> at
>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
>>  at
>> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
>> at java.lang.Thread.run(Thread.java:662)
>>
>> Works with some fields not with others...
>>
>> What could be the problem?  It is hard to know with just that exception as
>> it refers to solr's internal files...any indicators will help me debug.
>>
>> Thanks,
>> Ravish
>>
>


RE: Errors when implementing VelocityResponseWriter

2011-02-15 Thread McGibbney, Lewis John
To add to this (which stupidly, I have not mentioned previously) I am using 
Tomcat 7.0.8 as my servlet container. I have a sneaking suspicion that this is 
what is causing the problem, but as per below, I am unsure as to a solution.

From: McGibbney, Lewis John [lewis.mcgibb...@gcu.ac.uk]
Sent: 15 February 2011 17:04
To: solr-user@lucene.apache.org
Subject: RE: Errors when implementing VelocityResponseWriter

Hi Erik thank you for the reply

I have placed all velocity jar files in my /lib directory. As explained below, 
I have added relevant configuration to solrconfig.xml, I am just wondering if 
the config instructions in the wiki are missing something? Can anyone advise on 
this.

As you mentioned, my terminal output suggests that the VelocityResponseWriter 
class is not present and therefore the velocity jar is not present... however 
this is not the case.

I have specified  in solrconfig.xml, is this enough or do I 
need to use an exact path. I have already tried specifying an exact path and it 
does not seem to work either.

Thank you

Lewis

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 15 February 2011 06:48
To: solr-user@lucene.apache.org
Subject: Re: Errors when implementing VelocityResponseWriter

looks like you're missing the Velocity JAR.  It needs to be in some Solr 
visible lib directory.  With 1.4.1 you'll need to put it in /lib.  
In later versions, you can use the  elements in solrconfig.xml to point to 
other directories.

Erik

On Feb 14, 2011, at 10:41 , McGibbney, Lewis John wrote:

> Hello List,
>
> I am currently trying to implement the above in Solr 1.4.1. Having moved 
> velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my 
> webapp /lib directory, then adding queryResponseWriter name="blah" and 
> class="blah" followed by the responseHandler specifics I am shown the 
> following terminal output. I also added  in solrconfig. 
> Can anyone suggest what I have not included in the config that is still 
> required?
>
> Thanks Lewis
>
> SEVERE: org.apache.solr.common.SolrException: Error loading class 
> 'org.apache.solr.response.VelocityResponseWriter'
>at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
>at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
>at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
>at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
>at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
>at org.apache.solr.core.SolrCore.(SolrCore.java:547)
>at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
>at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
>at 
> org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
>at 
> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
>at 
> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
>at 
> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:98)
>at 
> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
>at 
> org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
>at 
> org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.solr.response.VelocityResponseWriter
>at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>at java.security.AccessController.doPrivileged(Native Method)
>at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
>at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>at java.lang.Class.forName0(Native Method)
>at java.lang.Class.forName(Class.java:247)
>at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
>... 21 more
>
> Glasgow Caledonian University is a registered Scottish charity, number 
> SC021474
>
> Winner: Times Higher Education’s Widening Participation

Re: schema.xml configuration for file names?

2011-02-15 Thread alan bonnemaison
Erick,

I think you put your finger on the problem. Our XML files (the ones we get
from our suppliers) do *not* look like that.

That's what a typical file looks like

..

Obviously, nothing like 

By the way, querying q=*:* returned an "HTTP error 500 Null pointer
exception", which leads me to believe that my index is 100% empty.

What I am trying to do cannot be done, correct? I just don't want to waste
anyone's time.

Thanks,

Alan.


On Tue, Feb 15, 2011 at 6:01 AM, Erick Erickson wrote:

> Can we see a small sample of an xml file you're posting? Because it should
> look something like
> 
>   
> R16-500
>more fields here.
>   
> 
>
> Take a look at the Solr admin page after you've indexed data to see what's
> actually in your index, I suspect what's in there isn't what you
> expect.
>
> Try querying q=*:* just for yucks to see what the documents returned look
> like.
>
> I suspect your index doesn't contain anything like what you think, but
> that's only
> a guess...
>
> Best
> Erick
>
> On Mon, Feb 14, 2011 at 7:15 PM, alan bonnemaison 
> wrote:
> > Hello!
> >
> > We receive from our suppliers hardware manufacturing data in XML files.
> On a
> > typical day, we got 25,000 files. That is why I chose to implement Solr.
> >
> > The file names are made of eleven fields separated by tildas like so
> >
> >
> CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML
> >
> > Our R&D guys want to be able search each field of the file XML file names
> > (OR operation) but they don't care to search the file contents. Ideally,
> > they would like to do a query all files where "stbmodel" equal to
> "R16-500"
> > or "result" is "P" or "filedate" is "20110125"...you get the idea.
> >
> > I defined in schema.xml each data field like so (from left to right --
> sorry
> > for the long list):
> >
> >> stored="true"   multiValued="false"/>
> >> stored="true"   multiValued="false"/>
> >> stored="true"   multiValued="false"/>
> >> stored="false"  multiValued="false"/>
> >> stored="fase"   multiValued="false"/>
> >> stored="true"multiValued="false"/>
> >> stored="true"   multiValued="false"/>
> >> stored="true"multiValued="false"/>
> >> stored="true"multiValued="false"/>
> >> stored="true"   multiValued="false"/>
> >> stored="true"   multiValued="false"/>
> >
> > Also, I defined as unique key the field "receiver". But no results are
> > returned by my queries. I made sure to update my index like so: "java
> -jar
> > apache-solr-1.4.1/example/exampledocs/post.jar *XML".
> >
> > I am obviously missing something. Is there a way to configure schema.xml
> to
> > search for file names? I welcome your input.
> >
> > Al.
> >
>



-- 
AB.

Sent from my Gmail account.
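Stefan's earlier point stands for file content, but since the data here lives entirely in the file name, a small client-side indexer can split each name on the tildes and post a regular Solr update document. A sketch under assumptions: only four of the eleven field names (stbmodel, result, filedate, receiver) appear in the thread, so the rest of the FIELDS list below is made up for illustration:

```python
import xml.etree.ElementTree as ET

# Field names in filename order. Only stbmodel, result, filedate and
# receiver come from the thread; the other names are placeholders.
FIELDS = ["site", "phase", "process", "workorder", "line", "station",
          "result", "stbmodel", "receiver", "filedate", "filetime"]

def filename_to_doc(filename: str) -> dict:
    """Map a tilde-separated file name onto schema field names."""
    stem = filename.rsplit(".", 1)[0]          # drop the .XML extension
    parts = stem.split("~")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))

def doc_to_update_xml(doc: dict) -> str:
    """Render the mapping as a Solr <add><doc> XML update request."""
    add = ET.Element("add")
    d = ET.SubElement(add, "doc")
    for name, value in doc.items():
        f = ET.SubElement(d, "field", name=name)
        f.text = value
    return ET.tostring(add, encoding="unicode")

name = ("CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500"
        "~000912239878~20110125~212321.XML")
print(doc_to_update_xml(filename_to_doc(name)))
```

The resulting XML is what post.jar expects; one such document per file, POSTed to /update, followed by a commit.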


Re: Multicore boosting to only 1 core

2011-02-15 Thread Jonathan Rochkind
No. In fact, there's no way to search over multiple cores at once in Solr 
at all, even before you get to your boosting question. Your different 
cores are entirely different Solr indexes; Solr has no built-in way to 
combine searches across multiple Solr instances.


[Well, sort of, it can, with sharding. But sharding is unlikely to be a 
solution to your problem either, UNLESS your problem is that your Solr 
index is so big you want to split it across multiple machines for 
performance.  That is the problem sharding is meant to solve. People 
trying to use it to solve other problems run into trouble.]



On 2/14/2011 1:59 PM, Tanner Postert wrote:

I have a multicore system and I am looking to boost results by date, but
only for 1 core. Is this at all possible?

Basically one of the core's content is very new, and changes all the time,
and if I boost everything by date, that core's content will almost always be
at the top of the results, so I only want to do the date boosting to the
cores that have older content so that their more recent results get boosted
over the older content.
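That said, if the cores are queried separately (or combined via sharding), each core has its own solrconfig.xml, so a recency boost can be configured only in the handlers of the cores that should have it. A hedged sketch; the field name pubdate and the decay constants are illustrative, and the boost-function syntax is standard dismax:

```xml
<!-- Only in the solrconfig.xml of cores that should favor recent docs. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- recip(ms(NOW,pubdate),3.16e-11,1,1) decays from 1.0 toward 0
         as documents age; roughly halves after about a year. -->
    <str name="bf">recip(ms(NOW,pubdate),3.16e-11,1,1)</str>
  </lst>
</requestHandler>
```

The fresh-content core simply omits the bf parameter from its handler defaults.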


Re: schema.xml configuration for file names?

2011-02-15 Thread Jonathan Rochkind
You can't just send arbitrary XML to Solr for update, no.  You need to 
send a Solr Update Request in XML. You can write software that 
transforms that arbitrary XML to a Solr update request, for simple cases 
it could even just be XSLT.  There are also a variety of other mediator 
pieces that come with Solr for doing updates; you can send updates in 
comma-separated-value format, or you can use the DataImportHandler to,
in some not-too-complicated cases, embed the translation from your 
arbitrary XML to Solr documents in your Solr instance itself.


But you can't just send arbitrary XML to the Solr update handler, no.

No matter what method you use to send documents to solr, you're going to 
have to think about what you want your Solr schema to look like -- what 
fields of what types.  And then map your data to it.  In Solr, unlike in 
an rdbms, what you want your schema to look like has a lot to do with 
what kinds of queries you will want it to support, it can't just be done 
based on the nature of the data alone.
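For the simple-case XSLT route mentioned above, here is a self-contained sketch using only the JDK's built-in transformer. The source format (a `report` element) and the mapped field names are invented for illustration; real field names would come from your schema.xml.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XmlToSolrUpdate {

    // Hypothetical stylesheet: maps a made-up <report> element onto a
    // Solr <add><doc> update request.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' "
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:template match='/report'>"
      + "    <add><doc>"
      + "      <field name='stbmodel'><xsl:value-of select='@model'/></field>"
      + "      <field name='result'><xsl:value-of select='@outcome'/></field>"
      + "    </doc></add>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Turn one arbitrary source document into Solr update XML.
    public static String toUpdateXml(String sourceXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(sourceXml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toUpdateXml("<report model='R16-500' outcome='P'/>"));
    }
}
```

The resulting XML can then be POSTed to Solr's update handler as usual.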


Jonathan

On 2/15/2011 12:45 PM, alan bonnemaison wrote:

Erick,

I think you put the finger on the problem. Our XML files (we get from our
suppliers) do *not* look like that.

That's what a typical file looks like

..

Obviously, nothing like

By the way, querying q=*:* retrieved "HTTP error 500 Null pointer
exception", which leads me to believe that my index is 100% empty.

What I am trying to do cannot be done, correct? I just don't want to waste
anyone's time.

Thanks,

Alan.


On Tue, Feb 15, 2011 at 6:01 AM, Erick Ericksonwrote:


Can we see a small sample of an xml file you're posting? Because it should
look something like

   
 R16-500
more fields here.
   


Take a look at the Solr admin page after you've indexed data to see what's
actually in your index, I suspect what's in there isn't what you
expect.

Try querying q=*:* just for yucks to see what the documents returned look
like.

I suspect your index doesn't contain anything like what you think, but
that's only
a guess...

Best
Erick

On Mon, Feb 14, 2011 at 7:15 PM, alan bonnemaison
wrote:

Hello!

We receive from our suppliers hardware manufacturing data in XML files.

On a

typical day, we got 25,000 files. That is why I chose to implement Solr.

The file names are made of eleven fields separated by tildas like so



CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML

Our R&D guys want to be able to search each field of the XML file names
(OR operation), but they don't care to search the file contents. Ideally,
they would like to query all files where "stbmodel" is equal to

"R16-500"

or "result" is "P" or "filedate" is "20110125"...you get the idea.

I defined in schema.xml each data field like so (from left to right --

sorry

for the long list):

   
   
   
   
   
   
   
   
   
   
   

Also, I defined as unique key the field "receiver". But no results are
returned by my queries. I made sure to update my index like so: "java

-jar

apache-solr-1.4.1/example/exampledocs/post.jar *XML".

I am obviously missing something. Is there a way to configure schema.xml

to

search for file names? I welcome your input.

Al.
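As a footnote to the quoted question: the eleven tilde-separated fields are easy to recover client-side before building an update request. A sketch follows; only stbmodel, result, filedate and receiver are named in the original mail, so the other field names and all positions below are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FilenameFields {

    // Positional field names. Only stbmodel, result, filedate and receiver
    // appear in the original mail; the rest are invented placeholders.
    private static final String[] NAMES = {
        "site", "stage", "workcenter", "serial", "station", "slot",
        "result", "stbmodel", "receiver", "filedate", "filetime"
    };

    // Split one filename into a field map suitable for an update document.
    public static Map<String, String> parse(String filename) {
        String base = filename.replaceFirst("(?i)\\.xml$", "");
        String[] parts = base.split("~");
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length && i < parts.length; i++) {
            fields.put(NAMES[i], parts[i]);
        }
        return fields;
    }
}
```

Each map can then be turned into a Solr document (one per file) and posted in batches.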






Re: schema.xml configuration for file names?

2011-02-15 Thread alan bonnemaison
Thank you for your thorough response. Things make more sense now. Back to
the drawing board.

Alan.

On Tue, Feb 15, 2011 at 10:23 AM, Jonathan Rochkind wrote:

> You can't just send arbitrary XML to Solr for update, no.  You need to send
> a Solr Update Request in XML. You can write software that transforms that
> arbitrary XML to a Solr update request, for simple cases it could even just
> be XSLT.  There are also a variety of other mediator pieces that come with
> Solr for doing updates; you can send updates in comma-separated-value
> format, or you can use the Data Import Handler to, in some not-too-complicated
> cases, embed the translation from your arbitrary XML to Solr documents in
> your Solr instance itself.
>
> But you can't just send arbitrary XML to the Solr update handler, no.
>
> No matter what method you use to send documents to solr, you're going to
> have to think about what you want your Solr schema to look like -- what
> fields of what types.  And then map your data to it.  In Solr, unlike in an
> rdbms, what you want your schema to look like has a lot to do with what
> kinds of queries you will want it to support, it can't just be done based on
> the nature of the data alone.
>
> Jonathan
>
>
> On 2/15/2011 12:45 PM, alan bonnemaison wrote:
>
>> Erick,
>>
>> I think you put the finger on the problem. Our XML files (we get from our
>> suppliers) do *not* look like that.
>>
>> That's what a typical file looks like
>>
>> ...> outcome="PASS">> value="NOVAL" />> />...> name="WorkCenterID" value="PREP" />> value="CTCA" />> />> enable_sfcs_comm="true" enable_param_db_comm="false"
>> force_param_db_update="false" driver_platform="LABVIEW" mode="PROD"
>> driver_revision="2.0">
>>
>> Obviously, nothing like
>>
>> By the way, querying q=*:* retrieved "HTTP error 500 Null pointer
>> exception", which leads me to believe that my index is 100% empty.
>>
>> What I am trying to do cannot be done, correct? I just don't want to waste
>> anyone's time.
>>
>> Thanks,
>>
>> Alan.
>>
>>
>> On Tue, Feb 15, 2011 at 6:01 AM, Erick Erickson> >wrote:
>>
>>  Can we see a small sample of an xml file you're posting? Because it
>>> should
>>> look something like
>>> 
>>>   
>>> R16-500
>>>more fields here.
>>>   
>>> 
>>>
>>> Take a look at the Solr admin page after you've indexed data to see
>>> what's
>>> actually in your index, I suspect what's in there isn't what you
>>> expect.
>>>
>>> Try querying q=*:* just for yucks to see what the documents returned look
>>> like.
>>>
>>> I suspect your index doesn't contain anything like what you think, but
>>> that's only
>>> a guess...
>>>
>>> Best
>>> Erick
>>>
>>> On Mon, Feb 14, 2011 at 7:15 PM, alan bonnemaison
>>> wrote:
>>>
 Hello!

 We receive from our suppliers hardware manufacturing data in XML files.

>>> On a
>>>
 typical day, we got 25,000 files. That is why I chose to implement Solr.

 The file names are made of eleven fields separated by tildas like so


 CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML
>>>
 Our R&D guys want to be able search each field of the file XML file
 names
 (OR operation) but they don't care to search the file contents. Ideally,
 they would like to do a query all files where "stbmodel" equal to

>>> "R16-500"
>>>
 or "result" is "P" or "filedate" is "20110125"...you get the idea.

 I defined in schema.xml each data field like so (from left to right --

>>> sorry
>>>
 for the long list):

   >>> stored="true"   multiValued="false"/>
   >>> stored="true"   multiValued="false"/>
   >>> stored="true"   multiValued="false"/>
   >>> stored="false"  multiValued="false"/>
   >>> stored="false"  multiValued="false"/>
   >>> stored="true"multiValued="false"/>
   >>> stored="true"   multiValued="false"/>
   >>> stored="true"multiValued="false"/>
   >>> stored="true"multiValued="false"/>
   >>> stored="true"   multiValued="false"/>
   >>> stored="true"   multiValued="false"/>

 Also, I defined as unique key the field "receiver". But no results are
 returned by my queries. I made sure to update my index like so: "java

>>> -jar
>>>
 apache-solr-1.4.1/example/exampledocs/post.jar *XML".

 I am obviously missing something. Is there a way to configure schema.xml

>>> to
>>>
 search for file names? I welcome your input.

 Al.


>>
>>


-- 
AB.

Sent from my Gmail account.


Question regarding inner entity in dataimporthandler

2011-02-15 Thread Greg Georges
Hello all,

I have searched the forums for the question I am about to ask, never found any 
concrete results. This is my case. I am defining the data config file with the 
document and entity tags. I define with success a basic entity mapped to my 
mysql database, and I then add some inner entities. The problem I have is with 
the one-to-one relationship I have between my "document" entity and its 
"documentcategory" entity. In my document table, the documentcategory foreign 
key is optional. Here is my mapping


   

   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   

   

   
   
   
   
   

   



My first document entity in the database does not have a documentcategory. When 
I run the data importer I get this error message

Unable to execute query: select CategoryID as id, CategoryName as categoryName, 
MetaTitle as categoryMetaTitle, MetaDescription as categoryMetaDescription, 
MetaKeywords as categoryMetakeywords from documentcategory where CategoryID =  
Processing Document # 1

Caused by: com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: You have an 
error in your SQL syntax; check the manual that corresponds to your MySQL 
server version for the right syntax to use near '' at line 1

It seems that since document.categoryId is null, an empty string is used in 
the query. Apparently the importer does not work like a left join, which would 
return results even if the child is null. Anyone know a possible solution? Maybe 
instead of using inner entities, can I define a left join directly in my 
document query? Thanks

BTW: I already tested the config with another child element and everything 
works fine. Only the case with documentcategory which is sometimes null which 
causes problems

Greg
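The left-join idea at the end of the message is probably the cleanest fix, since it removes the inner entity for the optional relation entirely. A sketch, with table and column names inferred from the error message (treat them as assumptions):

```xml
<entity name="document"
        query="SELECT d.DocumentID AS id,
                      c.CategoryName AS categoryName,
                      c.MetaTitle AS categoryMetaTitle
               FROM document d
               LEFT JOIN documentcategory c ON c.CategoryID = d.CategoryID">
  <!-- field mappings as before; category columns are simply null
       for documents without a category -->
</entity>
```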



RE: Question regarding inner entity in dataimporthandler

2011-02-15 Thread Greg Georges
OK, I think I found some information, supposedly TemplateTransformer will 
return an empty string if the value of a variable is null. Some people say to 
use the regex transformer instead, can anyone clarify this? Thanks

-Original Message-
From: Greg Georges [mailto:greg.geor...@biztree.com] 
Sent: 15 février 2011 13:38
To: solr-user@lucene.apache.org
Subject: Question regarding inner entity in dataimporthandler

Hello all,

I have searched the forums for the question I am about to ask, never found any 
concrete results. This is my case. I am defining the data config file with the 
document and entity tags. I define with success a basic entity mapped to my 
mysql database, and I then add some inner entities. The problem I have is with 
the one-to-one relationship I have between my "document" entity and its 
"documentcategory" entity. In my document table, the documentcategory foreign 
key is optional. Here is my mapping


   

   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   
   

   

   
   
   
   
   

   



My first document entity in the database does not have a documentcategory. When 
I run the data importer I get this error message

Unable to execute query: select CategoryID as id, CategoryName as categoryName, 
MetaTitle as categoryMetaTitle, MetaDescription as categoryMetaDescription, 
MetaKeywords as categoryMetakeywords from documentcategory where CategoryID =  
Processing Document # 1

Caused by: com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: You have an 
error in your SQL syntax; check the manual that corresponds to your MySQL 
server version for the right syntax to use near '' at line 1

It seems that since document.categoryId is null, an empty string is used in 
the query. Apparently the importer does not work like a left join, which would 
return results even if the child is null. Anyone know a possible solution? Maybe 
instead of using inner entities, can I define a left join directly in my 
document query? Thanks

BTW: I already tested the config with another child element and everything 
works fine. Only the case with documentcategory which is sometimes null which 
causes problems

Greg



Any way to get back search query with parsed out stop words

2011-02-15 Thread Tanner Postert
I am trying to see if there is a way to get back the resulting searched
query from solr, excluding the stopwords.  Right now when I search for: "the
year in review" I can see in the debug that the parsed query contains: text:"?
year ? review", but that information is mixed in with all the parsed boosting
queries and isn't easily accessible via that XML (and would require me to
always pass debugQuery=true to my production queries).

I am trying to only get back the natural searched terms so I can highlight
them in the returned search results. I know that solr has built in
highlighting capability, but I can't use it because some of the fields
contain HTML themselves and I need to strip it all out when I display the
search results.

right now, if I pass the full searched phrase to the highlighter it looks a
little strange having every occurrence of "the" or "and" highlighted, so I'd
like to only highlight the non-stopwords if I could.

any thoughts or ideas would be appreciated
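A client-side sketch of the kind of term filtering described here, using a tiny hand-rolled stopword set. A real deployment would load the same stopwords.txt the analyzer uses, so the two lists cannot drift apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class HighlightTerms {

    // A few entries from Solr's default stopwords.txt; keep this in sync
    // with the file your analyzer actually uses.
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
        "a", "an", "and", "are", "in", "is", "of", "the", "to"));

    // Returns the query terms worth highlighting, stopwords removed.
    public static List<String> extract(String query) {
        List<String> terms = new ArrayList<>();
        for (String term : query.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty() && !STOPWORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }
}
```

The size of the returned list is also the non-stopword count that is useful when adjusting the dismax mm parameter.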


Re: Any way to get back search query with parsed out stop words

2011-02-15 Thread Ahmet Arslan
> I am trying to only get back the natural searched terms so
> I can highlight
> them in the returned search results. I know that solr has
> built in
> highlighting capability, but I can't use it because some of
> the fields
> contain HTML themselves and I need to strip it all out when
> I display the
> search results.

I would stick with Solr's highlighting. If you strip HTML tags with a 
solr.HTMLStripCharFilterFactory, you can highlight HTML fields without problems. 




  


Re: Multicore boosting to only 1 core

2011-02-15 Thread mike anderson
Could you make an additional date field, call it date_boost, that gets
populated in all of the cores EXCEPT the one with the newest articles, and
then boost on this field? Then when you move articles from the 'newest' core
to the rest of the cores you copy over the date to the date_boost field. (I
haven't used boosting before so I don't know what happens if you try to
boost a field that's empty)

This would boost documents in each index (locally, as desired). Keep in mind
when you get your results back from a distributed shard query that the IDF
is not distributed so your scores aren't reliable for sorting.

-mike
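For the boost itself, the standard dismax recipe is a reciprocal-of-age boost function; with the scheme above it would reference date_boost. A sketch, assuming date_boost is a trie-based date field (ms() requires one on Solr 1.4); the constants control the decay rate and need tuning:

```xml
<!-- in the dismax request handler defaults: newer date_boost values
     score higher; documents without the field get no recency boost -->
<str name="bf">recip(ms(NOW,date_boost),3.16e-11,1,1)</str>
```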


On Tue, Feb 15, 2011 at 1:19 PM, Jonathan Rochkind  wrote:

> No. In fact, there's no way to search over multi-cores at once in Solr at
> all, even before you get to your boosting question. Your different cores are
> entirely different Solr indexes, Solr has no built-in way to combine
> searches accross multiple Solr instances.
>
> [Well, sort of it can, with sharding. But sharding is unlikely to be a
> solution to your problem either, UNLESS you problem is that your solr index
> is so big you want to split it accross multiple machines for performance.
>  That is the problem sharding is meant to solve. People trying to use it to
> solve other problems run into trouble.]
>
>
> On 2/14/2011 1:59 PM, Tanner Postert wrote:
>
>> I have a multicore system and I am looking to boost results by date, but
>> only for 1 core. Is this at all possible?
>>
>> Basically one of the core's content is very new, and changes all the time,
>> and if I boost everything by date, that core's content will almost always
>> be
>> at the top of the results, so I only want to do the date boosting to the
>> cores that have older content so that their more recent results get
>> boosted
>> over the older content.
>>
>


Re: Any way to get back search query with parsed out stop words

2011-02-15 Thread Tanner Postert
ok, I will look at using that filter factory on my content.

But I was also looking at getting the stopword count so I could adjust my mm
parameter based on the number of non-stopwords in the search string, so I
don't run into the dismax stopword issue. Any way around that other than
using a very low mm?

On Tue, Feb 15, 2011 at 1:45 PM, Ahmet Arslan  wrote:

> > I am trying to only get back the natural searched terms so
> > I can highlight
> > them in the returned search results. I know that solr has
> > built in
> > highlighting capability, but I can't use it because some of
> > the fields
> > contain HTML themselves and I need to strip it all out when
> > I display the
> > search results.
>
> I would stick with the solr's highlighting. If you strip html codes with a
> solr.HTMLStripCharFilterFactory, you can highlight html fields without
> problem.
>
>
>
>
>
>


Re: solr.HTMLStripCharFilterFactory not working

2011-02-15 Thread Tanner Postert
nevermind, I think I found my answer here:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg34622.html

I will add the HTML stripper to the data importer and see how that goes

On Tue, Feb 15, 2011 at 3:43 PM, Tanner Postert wrote:

> I have several fields defined and one of the field types includes a
> solr.HTMLStripCharFilterFactory field in the analyzer but it doesn't
> appear to be affecting the field as I would expect.
> I have tried a simple:
>
> 
> followed by the tokenizer
> 
>
> or the combined factory
>
> 
>
> but neither seems to work.
>
> Returned search results from the webtitle & webdescription as well as text
> include the original HTML characters that the title & description fields
> have.
>
> The relevant schema:
>
> 
>  omitNorms="true"/>
>
> 
>   
> 
>
>  words="stopwords.txt" enablePositionIncrements="true"/>
>
>  generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="0" splitOnCaseChange="1"/>
> 
>  protected="protwords.txt"/>
>   
>   
>
> 
>
>  ignoreCase="true" expand="true"/>
>
>  words="stopwords.txt" enablePositionIncrements="true"/>
>
>  generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> catenateAll="0" splitOnCaseChange="1"/>
>
> 
>  protected="protwords.txt"/>
>   
> 
>
>  positionIncrementGap="100" omitNorms="true">
>   
> 
>  words="stopwords.txt"/>
> 
> 
>   
>   
> 
>  ignoreCase="true" expand="true"/>
>  words="stopwords.txt"/>
> 
> 
>   
> 
> 
>
> 
> stored="true"   multiValued="false" />
> stored="true"   multiValued="false" />
> 
>
> stored="true"   multiValued="false" compressed="true" />
stored="true"   multiValued="false" compressed="true" />
> 
>
>stored="true"   multiValued="true" />
> 
> 
>
>multiValued="true" />
> 
> 
>
> 
>
>


Re: solr.HTMLStripCharFilterFactory not working

2011-02-15 Thread Tanner Postert
I am using the data import handler and using the HTMLStripTransformer
doesn't seem to be working either.

I've changed webtitle and webdescription so that they are no longer copied from
title and description in the schema.xml file, and instead set them both as
duplicates of title and description in the data importer query:


 
  
 


On Tue, Feb 15, 2011 at 3:49 PM, Tanner Postert wrote:

> nevermind, I think I found my answer here:
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg34622.html
>
> I will add the HTML stripper to the data importer and see how that goes
>
>
> On Tue, Feb 15, 2011 at 3:43 PM, Tanner Postert 
> wrote:
>
>> I have several fields defined and one of the field types includes a
>> solr.HTMLStripCharFilterFactory field in the analyzer but it doesn't
>> appear to be affecting the field as I would expect.
>> I have tried a simple:
>>
>> 
>> followed by the tokenizer
>> 
>>
>> or the combined factory
>>
>> 
>>
>> but neither seems to work.
>>
>> Returned search results from the webtitle & webdescription as well as text
>> include the original HTML characters that the title & description fields
>> have.
>>
>> The relevant schema:
>>
>> 
>> > omitNorms="true"/>
>>
>> 
>>   
>> 
>>
>> > words="stopwords.txt" enablePositionIncrements="true"/>
>>
>> > generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>> catenateAll="0" splitOnCaseChange="1"/>
>> 
>> > protected="protwords.txt"/>
>>   
>>   
>>
>> 
>>
>> > ignoreCase="true" expand="true"/>
>>
>> > words="stopwords.txt" enablePositionIncrements="true"/>
>>
>> > generateNumberParts="1" catenateWords="0" catenateNumbers="0"
>> catenateAll="0" splitOnCaseChange="1"/>
>>
>> 
>> > protected="protwords.txt"/>
>>   
>> 
>>
>> > positionIncrementGap="100" omitNorms="true">
>>   
>> 
>> > words="stopwords.txt"/>
>> 
>> 
>>   
>>   
>> 
>> > ignoreCase="true" expand="true"/>
>> > words="stopwords.txt"/>
>> 
>> 
>>   
>> 
>> 
>>
>> 
>>   >  stored="true"   multiValued="false" />
>>   >  stored="true"   multiValued="false" />
>> 
>>
>>   >  stored="true"   multiValued="false" compressed="true" />
  >>   >  stored="true"   multiValued="false" compressed="true" />
>> 
>>
>>   > stored="true"   multiValued="true" />
>> 
>> 
>>
>>   > multiValued="true" />
>> 
>> 
>>
>> 
>>
>>
>


Passing parameters to DataImportHandler

2011-02-15 Thread Jason Rutherglen
It'd be nice to be able to pass HTTP parameters into DataImportHandler
that'd be passed into the SQL as parameters, is this possible?


Re: Passing parameters to DataImportHandler

2011-02-15 Thread Tanner Postert
yes it is possible via ${dataimporter.request.param}

see

http://wiki.apache.org/solr/DataImportHandler#Accessing_request_parameters

On Tue, Feb 15, 2011 at 4:45 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:

> It'd be nice to be able to pass HTTP parameters into DataImportHandler
> that'd be passed into the SQL as parameters, is this possible?
>
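In data-config.xml that looks like the following sketch (entity and column names invented); the parameter is then supplied on the request URL, e.g. /dataimport?command=full-import&cid=42:

```xml
<entity name="doc"
        query="SELECT id, title FROM document
               WHERE category_id = '${dataimporter.request.cid}'"/>
```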


Spellchecking with some misspelled words in index source

2011-02-15 Thread Tanner Postert
I'm building my spellcheck index from my content and it seems to be working,
but my problem is that there are a few misspelled words in my content.  For
example: the word Sheriff is misspelled as Sherrif in my content a
couple dozen times (but spelled correctly a couple thousand times). The
results of the spellcheck at first glance indicate that the word is spelled
correctly because it is found in the spellcheck dictionary and has valid
search results. Adding a spellcheck.onlyMorePopular=true to the query
results in the spellcheck returning additional suggestions, but none of them
are for the correct spelling of the word:




sherriff


10




sherri


2319




sherril


155




sherif


19




sherric


4




is this just a strange glitch in my spellcheck dictionary based on my
content? What is strange is that sending the spellchecker sherriff (which is
another misspelling that has results in the index) results in the spellcheck
sending back the correct spelling as the top result.


SolrCloud - how mature it is now?

2011-02-15 Thread go canal
Hello,
Recently I saw some news about Elastic Search, which claims to be the best in 
terms of supporting distributed search.

Looking at Solandra and SolrCloud http://wiki.apache.org/solr/SolrCloud, it 
seems that they are not mature enough yet 
(http://wiki.apache.org/solr/DistributedSearch 'Distributed Search 
Limitations') - I hope I am wrong though. 

I am looking for a solution which can support both on-premise and Cloud 
deployment (which needs better scalability).

Does anyone know the roadmap/milestone plan for SolrCloud? Any real-life 
deployment?
thanks,
canal


  

Re: solr.HTMLStripCharFilterFactory not working

2011-02-15 Thread Koji Sekiguchi

(11/02/16 8:03), Tanner Postert wrote:

I am using the data import handler and using the HTMLStripTransformer
doesn't seem to be working either.

I've changed webtitle and webdescription to not by copied from title and
description in the schema.xml file then set them both to just but duplicates
of title and description in the data importer query:


  
   
  




Just for input (I'm not sure that I could help you), I'm using 
HTMLStripTransformer
with PlainTextEntityProcessor and it works fine with me:


  http://lucene.apache.org/"/>
  

  

  


Koji
--
http://www.rondhuit.com/en/
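The markup in Koji's example was stripped by the archive. The usual shape of that configuration is the following sketch (the URL is taken from the remnants above; the column name is the PlainTextEntityProcessor default). The key parts are transformer="HTMLStripTransformer" on the entity and stripHTML="true" on each field to clean:

```xml
<entity name="x" processor="PlainTextEntityProcessor"
        url="http://lucene.apache.org/"
        transformer="HTMLStripTransformer">
  <field column="plainText" stripHTML="true"/>
</entity>
```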


Reminder: Lucene Revolution 2011 Call For Papers Closing March 2

2011-02-15 Thread Michael Bohlig
Please submit your Call For Participation (CFP) proposals for Lucene Revolution 
2011 by March 2. If you have a great Solr or Lucene talk, this is a fantastic 
opportunity to share it with the community at the largest worldwide conference 
dedicated to Lucene and Solr which will take place at the San Francisco Airport 
Hyatt Regency May 25-26. 

To submit a proposal for a 45-minute presentation, complete the form at: 
http://www.lucidimagination.com/revolution/2011/cfp 

Topics of interest include: 
- Lucene and Solr in the Enterprise (case studies, implementation, return on 
investment, etc.) 
- Use of LucidWorks Enterprise 
- “How We Did It” development case studies 
- Lucene/Solr technology deep dives: features, how to use, etc. 
- Spatial/Geo/local search 
- Lucene and Solr in the Cloud 
- Scalability and performance tuning 
- Large Scale Search 
- Real Time Search (or NRT search) 
- Data Integration/Data Management 
- Lucene & Solr for Mobile Applications 
- Associated technologies: Mahout, Nutch, NLP, etc. 

All accepted speakers will get complimentary conference passes. Financial 
assistance is available for speakers that qualify. 

Submissions must be received by Wednesday, March 2, 2011, 12 Midnight PST 












Registration is now open for Lucene Revolution 2011 at: 
http://lucenerevolution.com/register 

Interested in conference news? Want to be added to the conference mailing list? 
Is your organization interested in sponsorship opportunities? Please send an 
email to: i...@lucenerevolution.org 

Regards, 
Mike 




Michael Bohlig | Lucid Imagination 
Enterprise Marketing 
p +1 650 353 4057 x132 
m+1 650 703 8383 
www.lucidimagination.com 




how to control which hosts can access Solr?

2011-02-15 Thread go canal
Hello,
Looking for advices re: how to control which hosts can access Solr.

Need to configure a list of IP addresses or host names into Solr so that we can 
control which hosts can access Solr.

I cannot find the configuration in solrconfig.xml, so I am wondering how we can 
support this?
 thanks,
canal



  

Re: Passing parameters to DataImportHandler

2011-02-15 Thread Adam Estrada
Yep...Take a look at this example. Map your SQL query to the appropriate fields 
in your index. Create a directory under solr/conf called dataimporthandler and 
reference it in your update command using curl or whatever. Example:
/solr/conf/dataimporthandler



 
 
   

 
 
 
 
 
 
 
 
 
 
 
 
 
   


On Feb 15, 2011, at 6:45 PM, Jason Rutherglen wrote:

> It'd be nice to be able to pass HTTP parameters into DataImportHandler
> that'd be passed into the SQL as parameters, is this possible?



"Cloning" SolrInputDocument

2011-02-15 Thread Jeff Schmidt
In the process of handling a type of web service request, I need to create a 
series of documents for indexing. They differ by only a couple of field values, 
but share quite a few others.  I would like to clone SolrInputDocument and 
adjust a couple of fields, index that, lather, rinse, repeat.

However, org.apache.solr.common.SolrInputDocument (branch_3x) does  not 
implement Cloneable, override clone() to make a deep-copy etc. Also observed by 
looking at the source code is the fact that SolrInputDocument keeps all fields 
in a LinkedHashMap, and also exposes a Map interface.

So, does this sound like a workable idea?  I define all my fields in a 
Map for the first document, and then tweak and re-use 
it. E.g.:

Collection docs = new ArrayList();
Map fields = ...;  //Set up fields for 1st document

SolrInputDocument doc = new SolrInputDocument();
doc.putAll(fields);
docs.add(doc);

//Update values for fields (keys) a and b in fields Map.

doc = new SolrInputDocument();
doc.putAll(fields);
docs.add(doc);

//Update values for fields (keys) a and b in fields Map.

doc = new SolrInputDocument();
doc.putAll(fields);
docs.add(doc);

and so forth. Then:

SolrServer solrServer = getSolrServer();
solrServer.add(docs);
solrServer.commit();

Map.putAll "Copies all of the mappings from the specified map to this map", so 
each document will have its own copy of the fields. I will, of course, have to 
have map values of SolrInputField and instantiate those etc.  Perhaps this is 
not worth the effort and I should be satisfied repeating the same 
doc.addField() method calls.

Thanks,

Jeff
--
Jeff Schmidt
535 Consulting
j...@535consulting.com
(650) 423-1068
http://www.535consulting.com
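One caution about the putAll idea, independent of Solr: Map.putAll copies the mappings, not the mapped objects, so it only behaves like a deep copy if you replace the values between documents rather than mutating them in place. A plain-Java sketch of the difference, using List values to stand in for SolrInputField:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PutAllCopyDemo {

    // Build a "document" by copying the current mappings of the shared field map.
    public static Map<String, List<String>> buildDoc(Map<String, List<String>> fields) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.putAll(fields); // copies key -> value *references*
        return doc;
    }

    public static void main(String[] args) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("a", new ArrayList<>(Arrays.asList("v1")));

        Map<String, List<String>> doc1 = buildDoc(fields);

        // Mutating the shared value in place leaks into the already-built doc...
        fields.get("a").set(0, "v2");
        System.out.println(doc1.get("a")); // [v2] -- not the [v1] we wanted

        // ...but replacing the value before the next putAll is safe.
        fields.put("a", new ArrayList<>(Arrays.asList("v3")));
        Map<String, List<String>> doc2 = buildDoc(fields);
        System.out.println(doc1.get("a")); // still [v2]
        System.out.println(doc2.get("a")); // [v3]
    }
}
```

So the plan in the mail works as long as each "tweak" puts a fresh SolrInputField into the map instead of mutating the one the previous document already holds.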








Re: how to control which hosts can access Solr?

2011-02-15 Thread Grijesh

You cannot do this kind of configuration in solrconfig; you have to
configure it with the help of your network administrator. 

-
Thanx:
Grijesh
http://lucidimagination.com
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-control-which-hosts-can-access-Solr-tp2506270p2507048.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Spellchecking with some misspelled words in index source

2011-02-15 Thread Grijesh

You have to correct the misspelled terms in your content for this to work
properly, because the spell checker will find the term and assume it is a
correct term.

The spell checker will only return suggestions when a word is not found in its
dictionary.

-
Thanx:
Grijesh
http://lucidimagination.com
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Spellchecking-with-some-misspelled-words-in-index-source-tp2505722p2507110.html
Sent from the Solr - User mailing list archive at Nabble.com.
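If cleaning the source content is not practical, the index-based spellchecker can also be told to ignore rare terms via thresholdTokenFrequency, so that a handful of Sherrif occurrences never makes it into the dictionary. A sketch (check that your Solr version supports the option):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <!-- only add terms that occur in at least 1% of documents -->
    <float name="thresholdTokenFrequency">.01</float>
  </lst>
</searchComponent>
```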


Re: Dismax problem

2011-02-15 Thread Grijesh

what type of query you are searching?
what is the type of the field?

-
Thanx:
Grijesh
http://lucidimagination.com
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Dismax-problem-tp2501263p2507147.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: rollback to other versions of index

2011-02-15 Thread Tri Nguyen
Hi,

Wanted to explain my situation in more detail.

I have a master which never adds or deletes documents incrementally.  I just 
run 
the dataimport with autocommit.

Seems like I'll need to make a custom DeletionPolicy to keep more than one 
index 
around.

I'm accessing indices from Solr.  How do I tell solr to use a particular index?

Thanks,

Tri





From: Michael McCandless 
To: solr-user@lucene.apache.org
Sent: Tue, February 15, 2011 5:36:49 AM
Subject: Re: rollback to other versions of index

Lucene is able to do this, if you make a custom DeletionPolicy (which
controls when commit points are deleted).

By default Lucene only saves the most recent commit
(KeepOnlyLastCommitDeletionPolicy), but if your policy keeps more
around, then you can open an IndexReader or IndexWriter on any
IndexCommit.

Any changes (including optimize, and even opening a new IW with
create=true) are safe within a commit; Lucene is fully transactional.

For example, I use this for benchmarking: I save 4 commit points in a
single index.  First is a multi-segment index, second is the same
index with 5% deletions, third is an optimized index, and fourth is
the optimized index with 5% deletions.  This gives me a single index
w/ 4 different commit points, so I can then benchmark searching
against any of those 4.

Mike

On Tue, Feb 15, 2011 at 4:43 AM, Jan Høydahl  wrote:
> Yes and no. The index grows like an onion, adding new segments for each commit.
> There is no API to remove the newly added segments, but I guess you could
> hack something.
> The other problem is that as soon as you trigger an optimize() all history is
> gone as the segments are merged into one. Optimize normally happens
> automatically behind the scenes. You could turn off merging but that will
> badly hurt your performance after some time and ultimately crash your OS.
>
> Since you only need a few versions back, you COULD write your own custom
> mergePolicy, always preserving at least N versions. But beware that a
> "version" may be ONE document or many documents, depending on how you commit
> or if autoCommit is active. So if you go this route you also need strict
> control over your commits.
>
> Perhaps the best option is to handle this on the feeding client side, where
> you keep a buffer of the N last docs. Then you can freely roll back or
> re-index as you choose, based on time, number of docs etc.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 15. feb. 2011, at 01.21, Tri Nguyen wrote:
>
>> Hi,
>>
>> Does solr version each index build?
>>
>> We'd like to be able to roll back to not just a previous version but maybe a
>> few versions before the current one.
>>
>> Thanks,
>>
>> Tri
>
>


Solr the right thing for me?

2011-02-15 Thread Chris

Hello all,

I'm searching for a possibility to:

- Receive an email when a site changed/was added to a web.
- Only index sites, that contain a reg exp in the content.
- Receive the search results in machine readable way (RSS/SOAP/..)

This should be possible to organize in sets.
(set A with 40 Websites, set B with 7 websites)

Does it sound possible with SOLR?
Do I have to expect custom development? If so, how much?

Thank you in advance
Bye, Chris


Re: Solr the right thing for me?

2011-02-15 Thread Gora Mohanty
On Wed, Feb 16, 2011 at 12:18 PM, Chris  wrote:
[...]
> - Receive an email when a site changed/was added to a web.
> - Only index sites, that contain a reg exp in the content.

I think that you might be confused about what Solr does. It is
a search engine, and does not crawl websites. A good possibility
for you might be Nutch, which has built-in search capabilities,
but also interfaces with Solr.

> - Receive the search results in machine readable way (RSS/SOAP/..)

Solr gives you XML/JSON.
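For example, the response format can be selected per request with the wt
parameter (host, port, and core path below are the Solr 1.4 example
server's defaults, assumed to be running):

```shell
# Ask a running Solr instance for JSON instead of the default XML.
curl 'http://localhost:8983/solr/select?q=*:*&wt=json'
```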

> This should be possible to organize in sets.
> (set A with 40 Websites, set B with 7 websites)

Yes, this can be done with separate indexes.

> Does it sound possible with SOLR?
> Do I have to expect custom development? If so, how much?
[...]

Nutch and Solr should meet your needs. There will be a fair
amount of learning to do at the beginning, but there should
not be a need for too much customisation.

Regards,
Gora


clustering with tomcat

2011-02-15 Thread Isha Garg

Hi,
I am using Solr 1.4 with Apache Tomcat. To enable the
clustering feature I followed
http://wiki.apache.org/solr/ClusteringComponent
Please help me with how to add -Dsolr.clustering.enabled=true to
$CATALINA_OPTS. Which steps are required after that?
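For reference, extra JVM options for Tomcat are usually set in a
bin/setenv.sh script, which Tomcat sources at startup if it exists — a
minimal sketch, assuming a standard Tomcat layout:

```shell
# bin/setenv.sh -- create it next to catalina.sh if it is absent;
# Tomcat's startup scripts source it automatically.
CATALINA_OPTS="$CATALINA_OPTS -Dsolr.clustering.enabled=true"
export CATALINA_OPTS
```

Restart Tomcat afterwards so the new JVM flag takes effect.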


Re: how to control which hosts can access Solr?

2011-02-15 Thread Gora Mohanty
On Wed, Feb 16, 2011 at 10:52 AM, Grijesh  wrote:
>
> You cannot do this kind of configuration via solrconfig; you have to
> configure it with the help of your network administrator.
[...]

The normal way to do this on Linux is with rules for iptables that only
allow access to the Solr port for certain hosts.

Regards,
Gora
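For concreteness, such rules might look like the following sketch (the
trusted IP address is illustrative; 8983 is the Solr example server's
default port, and the commands must run as root):

```shell
# Allow the Solr port only from a trusted application server,
# then drop all other traffic to that port.
iptables -A INPUT -p tcp --dport 8983 -s 192.168.1.10 -j ACCEPT
iptables -A INPUT -p tcp --dport 8983 -j DROP
```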


Re: how to control which hosts can access Solr?

2011-02-15 Thread go canal
Thank you,
so a firewall then.

I also saw Solr authentication; maybe I should add that as well.
Thanks,
canal




