How to get solr synonyms in result set.

2013-05-03 Thread Suneel Pandey
Hi, I want to get the list of specific Solr synonym terms in the result set at query time, based on filter criteria. I have implemented synonyms in a .txt file. Thanks - Regards, Suneel Pandey Sr. Software Developer -- View this message in context: http://lucene.472066.n3.nabble.com/How-to

Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-03 Thread Jason Hellman
I have to imagine I'm quibbling with the original assertion that "Solr 4.x is architected with a dependency on Zookeeper" when I say the following: Solr 4.x is not architected with a dependency on Zookeeper. SolrCloud, however, is. As such, if a line of reasoning drives greater concern about

Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-03 Thread Gopal Patwa
Agree with Anshum. Netflix also has a very nice supervisor system for ZooKeeper: if the ZooKeeper nodes go down, it will restart them automatically. http://techblog.netflix.com/2012/04/introducing-exhibitor-supervisor-system.html https://github.com/Netflix/exhibitor On Fri, May 3, 2013 at 6:53 PM, Anshum Gupta w

Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-03 Thread Anshum Gupta
In case all your ZK nodes go down, querying would continue to work fine (as long as no nodes fail) but you'd not be able to add docs. Sent from my iPhone On 03-May-2013, at 17:52, Shawn Heisey wrote: > On 5/3/2013 6:07 PM, Walter Underwood wrote: >> Ideally, the Solr nodes should be able to

Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-03 Thread Shawn Heisey
On 5/3/2013 6:07 PM, Walter Underwood wrote: > Ideally, the Solr nodes should be able to continue as long as no node fails. > Failure of a leader would be bad, failure of non-leader replicas might cause > some timeouts, but could be survivable. > > Of course, nodes could not be added. I have re

Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-03 Thread Walter Underwood
Ideally, the Solr nodes should be able to continue as long as no node fails. Failure of a leader would be bad, failure of non-leader replicas might cause some timeouts, but could be survivable. Of course, nodes could not be added. wunder On May 3, 2013, at 5:05 PM, Otis Gospodnetic wrote: > I

Re: disaster recovery scenarios for solr cloud and zookeeper

2013-05-03 Thread Otis Gospodnetic
I *think* at this point SolrCloud without ZooKeeper is like a ... body without a head? Otis -- Solr & ElasticSearch Support http://sematext.com/ On Fri, May 3, 2013 at 3:21 PM, Dennis Haller wrote: > Hi, > > Solr 4.x is architected with a dependency on Zookeeper, and Zookeeper is > expecte

Re: Configure Shingle Filter to ignore ngrams made of tokens with same start and end

2013-05-03 Thread Steve Rowe
An issue exists for this problem: https://issues.apache.org/jira/browse/LUCENE-3475 On May 3, 2013, at 11:00 AM, Walter Underwood wrote: > The shingle filter should respect positions. If it doesn't, that is worth > filing a bug so we know about it. > > wunder > > On May 3, 2013, at 10:50 AM,

Re: Duplicated Documents Across shards

2013-05-03 Thread Iker Mtnz. Apellaniz
We are currently using version 4.2. We have run tests with a single document, and it gives us a document count of 2. But if we force the document onto the first machine, the one with a single shard, the count gives us 1 document. I've tried using the distrib=false parameter; it gives us no duplicate documents,

disaster recovery scenarios for solr cloud and zookeeper

2013-05-03 Thread Dennis Haller
Hi, Solr 4.x is architected with a dependency on Zookeeper, and Zookeeper is expected to have a very high (perfect?) availability. With 3 or 5 zookeeper nodes, it is possible to manage zookeeper maintenance and online availability to be close to 100%. But what is the worst case for Solr if for som
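
To put numbers on the 3-or-5-node point above: a ZooKeeper ensemble stays available only while a majority of its servers is up. A minimal sketch of that arithmetic follows; the function names are illustrative and not part of any ZooKeeper or Solr API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of ZooKeeper servers that forms a majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size and tolerated failures for 3 and 5 nodes.
for n in (3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this rule a 3-node ensemble survives the loss of one server and a 5-node ensemble survives two, which is why one node at a time can be taken down for maintenance.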

Re: transientCacheSize doesn't seem to have any effect, except on startup

2013-05-03 Thread didier deshommes
On Fri, May 3, 2013 at 11:18 AM, Erick Erickson wrote: > The cores aren't loaded (or at least shouldn't be) for getting the status. > The _names_ of the cores should be returned, but those are (supposed) to be > retrieved from a list rather than loaded cores. So are you sure that's not > what > yo

custom tokenizer error

2013-05-03 Thread Sarita Nair
I am using a custom Tokenizer, as part of analysis chain, for a Solr (4.2.1) field. On trying to index, Solr throws a NullPointerException.  The unit tests for the custom tokenizer work fine. Any ideas as to what is it that I am missing/doing incorrectly will be appreciated. Here is the relevant

Re: Configure Shingle Filter to ignore ngrams made of tokens with same start and end

2013-05-03 Thread Walter Underwood
The shingle filter should respect positions. If it doesn't, that is worth filing a bug so we know about it. wunder On May 3, 2013, at 10:50 AM, Jack Krupansky wrote: > In short, no. I don't think you want to use the shingle filter on a token > stream that has multiple tokens at the same positi

SV: Solr 4 reload failed core

2013-05-03 Thread Peter Kirk
Thanks - I had just found the CREATE command, and I think that's the easiest path for us to take. It will actually basically function as our "reload" workaround works now. Fra: Erick Erickson [erickerick...@gmail.com] Sendt: 3. maj 2013 19:22 Til: solr-u

Re: Configure Shingle Filter to ignore ngrams made of tokens with same start and end

2013-05-03 Thread Jack Krupansky
In short, no. I don't think you want to use the shingle filter on a token stream that has multiple tokens at the same position, otherwise, you will get confused "suggestions", as you've encountered. -- Jack Krupansky -Original Message- From: Rounak Jain Sent: Friday, May 03, 2013 7:3

Re: Delete from Solr Cloud 4.0 index..

2013-05-03 Thread Erick Erickson
Annette: Be a little careful with the index size savings, they really don't mean much for _searching_. The stored field compression significantly reduces the size on disk, but only for the stored data which is only accessed when returning the top N docs. In terms of how many docs you can fit on you

Re: Solr 4 reload failed core

2013-05-03 Thread Erick Erickson
It seems odd, but consider "create" rather than "reload". Create will load up an existing core, think of it as "create in memory" rather than "create on disk" for the case where there's already an index. Best Erick On Fri, May 3, 2013 at 6:27 AM, Peter Kirk wrote: > Hi > > I have a multi-core in
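
As a rough sketch of the CREATE call suggested above, the CoreAdmin request could be built like this. The host, core name and instance directory are placeholders, not values from the thread.

```python
from urllib.parse import urlencode


def core_create_url(solr_url: str, name: str, instance_dir: str) -> str:
    """Build a CoreAdmin CREATE request URL for a core that already exists on disk."""
    params = {
        "action": "CREATE",
        "name": name,                 # core name (placeholder in this example)
        "instanceDir": instance_dir,  # directory holding conf/ and data/
    }
    return solr_url.rstrip("/") + "/admin/cores?" + urlencode(params)


# Hypothetical local install; adjust host, port and names to the real setup.
print(core_create_url("http://localhost:8983/solr", "core1", "core1"))
```

Issuing a GET against the printed URL asks Solr to load the core; if the index already exists in the instance directory, it is opened rather than rebuilt.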

Re: Duplicated Documents Across shards

2013-05-03 Thread Erick Erickson
What version of Solr? The custom routing stuff is quite new so I'm guessing 4x? But this shouldn't be happening. The actual index data for the shards should be in separate directories, they just happen to be on the same physical machine. Try querying each one with &distrib=false to see the counts
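
A minimal sketch of the per-shard check described above, assuming each shard's core is reachable at its own URL. The host and core names are made up for illustration.

```python
from urllib.parse import urlencode


def shard_count_url(core_url: str) -> str:
    """Build a query URL that asks one core for its own document count only."""
    params = {
        "q": "*:*",          # match every document
        "rows": 0,           # only numFound is needed, not the documents
        "distrib": "false",  # do not fan the query out to other shards
        "wt": "json",
    }
    return core_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical core URLs; replace with the real hosts and core names.
cores = [
    "http://host1:8983/solr/shard1",
    "http://host2:8983/solr/shard2",
]
for core in cores:
    print(shard_count_url(core))
```

Comparing the sum of the per-core numFound values with the numFound of a normal distributed query is one way to see whether a document is indexed on more than one shard.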

Re: Query across multiple shards - key fields have different names

2013-05-03 Thread Erick Erickson
I don't think you can. Problem is that the "pseudo join" capability can work "cross core", which means with two separate cores, but last I knew distributed joins aren't supported which is what you're asking for. Really think about flattening your data if at all possible. Best Erick On Thu, May 2,

Re: The HttpSolrServer "add(Collection docs)" method is not atomic.

2013-05-03 Thread Erick Erickson
bq: Is there a way to commit multiple documents/beans in a transaction/together in a way that it succeeds completely or fails completely? Not that I know of. I've seen various "divide and conquer" strategies to identify _which_ document failed, but the general process is usually to re-index the d

Re: commit in solr4 takes a longer time

2013-05-03 Thread Shawn Heisey
On 5/3/2013 9:28 AM, vicky desai wrote: Hi, When an auto commit operation is fired I am getting the following logs INFO: start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} setting the openSearcher to false definitely gave m

Re: socket write error

2013-05-03 Thread Dmitry Kan
After some more debugging I have found out that one of the requests had a size of 4.4 MB. The default maxPostSize in Tomcat 6 is 2 MB (http://tomcat.apache.org/tomcat-6.0-doc/config/ajp.html). Changing that to 10 MB has greatly improved the situation on the Solr side. Dmitry On Fri, May 3, 2013 at 9:
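
The sizes mentioned above can be checked with a few lines of arithmetic. This is only an illustration of the comparison, not how Tomcat enforces the limit internally; the constant and function names are invented for the example.

```python
MB = 1024 * 1024

# Default limit reported in the message above (2 MB), expressed in bytes.
TOMCAT6_DEFAULT_MAX_POST_SIZE = 2 * MB


def fits_post_limit(body_size: int, max_post_size: int = TOMCAT6_DEFAULT_MAX_POST_SIZE) -> bool:
    """Return True if a request body of body_size bytes is within the limit."""
    return body_size <= max_post_size


request_size = int(4.4 * MB)  # the 4.4 MB request from the message

print(fits_post_limit(request_size))           # against the 2 MB default
print(fits_post_limit(request_size, 10 * MB))  # against the raised 10 MB limit
```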

Re: transientCacheSize doesn't seem to have any effect, except on startup

2013-05-03 Thread Erick Erickson
The cores aren't loaded (or at least shouldn't be) for getting the status. The _names_ of the cores should be returned, but those are (supposed) to be retrieved from a list rather than loaded cores. So are you sure that's not what you are seeing? How are you determining whether the cores are actual

Re: any plans to remove int32 limitation on the number of the documents in the index?

2013-05-03 Thread Erick Erickson
My off the cuff thought is that there are significant costs trying to do this that would be paid by 99.999% of setups out there. Also, usually you'll run into other issues (RAM etc) long before you come anywhere close to 2^31 docs. Lucene/Solr often allocates int[maxDoc] for various operations. wh
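
To make the cost argument concrete, here is a back-of-the-envelope calculation of what a single per-document array costs at the int32 limit, and what it would cost with 64-bit entries. It is plain arithmetic, not a measurement of Lucene itself.

```python
MAX_INT32 = 2**31 - 1  # largest value of a signed 32-bit integer
GIB = 1024**3


def array_bytes(max_doc: int, bytes_per_entry: int) -> int:
    """Memory needed for one array holding one entry per document."""
    return max_doc * bytes_per_entry


int_array = array_bytes(MAX_INT32, 4)   # int[maxDoc], 4 bytes per entry
long_array = array_bytes(MAX_INT32, 8)  # the same array with 8-byte entries

print(round(int_array / GIB, 1), "GiB for int[maxDoc]")
print(round(long_array / GIB, 1), "GiB with 64-bit entries")
```

A single such array is already about 8 GiB at the current limit, which is why RAM tends to become the constraint long before the document count does.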

Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
Hi, When an auto commit operation is fired I am getting the following logs INFO: start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} setting the openSearcher to false definitely gave me a lot of performance improvement but wa

Re: commit in solr4 takes a longer time

2013-05-03 Thread Gopal Patwa
Since you have defined the commit option as "Auto Commit" for both hard and soft commits, you don't have to explicitly call commit from the SolrJ client. And openSearcher=false for hard commits will make the hard commit faster, since it only makes sure that recent changes are flushed to disk (for durability) an

Re: Solr metrics in Codahale metrics and Graphite?

2013-05-03 Thread Furkan KAMACI
Has anybody tested Ganglia with JMXTrans in a production environment for SolrCloud? 2013/4/26 Dmitry Kan > Alan, Shawn, > > If backporting to 3.x is hard, no worries, we don't necessarily require the > patch as we are heading to 4.x eventually. It is just much easier within > our organization to

Re: Does Near Real Time get not supported at SolrCloud?

2013-05-03 Thread Timothy Potter
Yes, absolutely - NRT was a big driver for the leader to replica distribution approach in Solr Cloud On Fri, May 3, 2013 at 1:14 AM, Furkan KAMACI wrote: > Do soft commits get distributed to the nodes of SolrCloud? > > 2013/5/3 Otis Gospodnetic > >> NRT works with SolrCloud. >> >> Otis >> Solr & Ela

Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
Hi, After using the following config 500 1000 5000 false When a commit operation is fired I am getting the follow

Re: Performance considerations when using distributed indexing + loadbalancing with Solr cloud

2013-05-03 Thread Edd Grant
Aah I see - very useful. Thanks! On 3 May 2013 15:49, Shawn Heisey wrote: > On 5/3/2013 8:35 AM, Edd Grant wrote: > > Thanks, that's exactly what I was worried about. If I take your suggested > > approach of using CloudSolrServer and the feeder learns which shard > leader > > to target, then if

Re: Performance considerations when using distributed indexing + loadbalancing with Solr cloud

2013-05-03 Thread Shawn Heisey
On 5/3/2013 8:35 AM, Edd Grant wrote: > Thanks, that's exactly what I was worried about. If I take your suggested > approach of using CloudSolrServer and the feeder learns which shard leader > to target, then if the shard leader goes down midway through indexing then > I've lost my ability to index

Re: Performance considerations when using distributed indexing + loadbalancing with Solr cloud

2013-05-03 Thread Edd Grant
Thanks, that's exactly what I was worried about. If I take your suggested approach of using CloudSolrServer and the feeder learns which shard leader to target, then if the shard leader goes down midway through indexing then I've lost my ability to index. Whereas if I take the route of making all up

Re: Delete from Solr Cloud 4.0 index..

2013-05-03 Thread Shawn Heisey
On 5/3/2013 3:22 AM, Annette Newton wrote: > One question Shawn - did you ever get any costings around Zing? Did you > trial it? I never did do a trial. I asked them for a cost and they didn't have an immediate answer, wanted to do a phone call and get a lot of information about my setup. The pr

Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
Hi All, setting the openSearcher flag to true solution worked and it gave me a visible improvement in commit time. One thing to make note of is that while using the SolrJ client we have to call server.commit(false,false), which I was doing incorrectly and hence was not able to see the improvement earlier.

Re: Performance considerations when using distributed indexing + loadbalancing with Solr cloud

2013-05-03 Thread Furkan KAMACI
If you index them with CloudSolrServer, your client will learn from ZooKeeper where the data should go and send it to that shard leader. However, if you use some other arbitrary process, the data will go to any of the nodes and will then be routed to the right place within the cluster. This extra

Re: Performance considerations when using distributed indexing + loadbalancing with Solr cloud

2013-05-03 Thread Edd Grant
Hi, No we're actually POSTing them over plain old http. Our "feeder" process simply points at the HAProxy box and posts merrily away. Cheers, Edd On 3 May 2013 13:17, Furkan KAMACI wrote: > Do you use CloudSolrServer when you push documents into SolrCloud to be > indexed? > > 2013/5/3 Edd Gra

Solr 4 reload failed core

2013-05-03 Thread Peter Kirk
Hi I have a multi-core installation, with 2 cores. Sometimes, when Solr starts up, one of the cores fails (due to an extension to Solr I have, which is waiting on an external service which has yet to initialise). In previous versions of Solr, I could subsequently issue a RELOAD to this core, e

Re: Good Desktop Search?

2013-05-03 Thread Savia Beson
Thanks Paul, I missed that one. On May 3, 2013, at 2:27 PM, Paul Libbrecht wrote: > Savia, > > maybe not very mature yet, but someone on java-us...@lucene.apache.org > announced such a tool the other day. > I'm copying it below. > I do not know of many otherwise. > > paul > >> Hi everybod

Duplicated Documents Across shards

2013-05-03 Thread Iker Mtnz. Apellaniz
Hi, We currently have a SolrCloud implementation running 5 shards on 3 physical machines, so the first machine will have shard number 1, the second machine shards 2 & 4, and the third shards 3 & 5. We noticed that while querying numFoundDocs decreased when we increased the start param. After

Re: Good Desktop Search?

2013-05-03 Thread Paul Libbrecht
Savia, maybe not very mature yet, but someone on java-us...@lucene.apache.org announced such a tool the other day. I'm copying it below. I do not know of many otherwise. paul > Hi everybody, > just a simple question > is there any solr/lucene based desktop search project around someone might

Re: Performance considerations when using distributed indexing + loadbalancing with Solr cloud

2013-05-03 Thread Furkan KAMACI
Do you use CloudSolrServer when you push documents into SolrCloud to be indexed? 2013/5/3 Edd Grant > Hi all, > > I have been playing with Solr Cloud recently and am enjoying the > distributed indexing capability. > > At the moment my SolrCloud consists of 2 leaders and 2 replicas which are > fro

Good Desktop Search?

2013-05-03 Thread Savia Beson
Hi everybody, just a simple question is there any solr/lucene based desktop search project around someone might recommend? I am looking for something for personal use that is kind of mature, at least stable, runs on java and does not require admin rights to install. Nothing too fancy. Thank

Performance considerations when using distributed indexing + loadbalancing with Solr cloud

2013-05-03 Thread Edd Grant
Hi all, I have been playing with Solr Cloud recently and am enjoying the distributed indexing capability. At the moment my SolrCloud consists of 2 leaders and 2 replicas which are fronted by an HAProxy instance. I want to maximise performance for indexing and it occurred to me that the model I us

Re: Delete from Solr Cloud 4.0 index..

2013-05-03 Thread Annette Newton
One question Shawn - did you ever get any costings around Zing? Did you trial it? Thanks. On 3 May 2013 10:03, Annette Newton wrote: > Thanks Shawn. > > I have played around with Soft Commits before and didn't seem to have any > improvement, but with the current load testing I am doing I will

Re: Delete from Solr Cloud 4.0 index..

2013-05-03 Thread Annette Newton
Thanks Shawn. I have played around with Soft Commits before and didn't seem to have any improvement, but with the current load testing I am doing I will give it another go. I have researched docValues and came across the fact that it would increase the index size. With the upgrade to 4.2.1 the i

Re: Rearranging Search Results of a Search?

2013-05-03 Thread Furkan KAMACI
I think this looks like what I am searching for: https://issues.apache.org/jira/browse/SOLR-4465 How about a post filter for Lucene, can it help me for my purpose? 2013/5/3 Otis Gospodnetic > Hi, > > You should use search more often :) > > http://search-lucene.com/?q=scriptable+collector&sort=newestOnT

Re: Does Near Real Time get not supported at SolrCloud?

2013-05-03 Thread Furkan KAMACI
Do soft commits get distributed to the nodes of SolrCloud? 2013/5/3 Otis Gospodnetic > NRT works with SolrCloud. > > Otis > Solr & ElasticSearch Support > http://sematext.com/ > > On May 2, 2013 5:34 AM, "Furkan KAMACI" wrote: > > > > Does Near Real Time get not supported at SolrCloud? > > > > I me

Re: What Happens to Consistency if I kill a Leader and Startup it again?

2013-05-03 Thread Furkan KAMACI
Shawn, thanks for the detailed answer, it explains everything. I think that there is no problem. I will use 4.3 when it is available, and if I see a situation like that I will report it. 2013/5/3 Shawn Heisey > On 5/2/2013 2:19 PM, Furkan KAMACI wrote: > > I see that at my admin page: > > > >

Re: commit in solr4 takes a longer time

2013-05-03 Thread vicky desai
My solrconfig.xml is as follows LUCENE_40 2147483647 simple true 500 1000

Re: commit in solr4 takes a longer time

2013-05-03 Thread Sandeep Mestry
That's not ideal. Can you post solrconfig.xml? On 3 May 2013 07:41, "vicky desai" wrote: > Hi sandeep, > > I made the changes u mentioned and tested again for the same set of docs > but > unfortunately the commit time increased. > > > > -- > View this message in context: > http://lucene.472066.n3