Re: Facet count mismatch between solr simple facet and Json facet API.

2015-11-29 Thread rks_lucene
I am facing the same issue. Thanks for letting me know about the JIRA. I think this is a very big issue especially for those looking at Solr as a NoSQL analytics engine. Ritesh

Soft commit and hard commit

2015-11-29 Thread Midas A
Machine configuration: RAM: 48 GB, CPU: 8 cores, JVM: 36 GB. We are updating 70,000 docs/hr. What should our soft commit and hard commit times be to get the best results? Current configuration: 6 false 60 There are no reads on the master server.
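To put the quoted update rate in perspective, here is a rough arithmetic sketch of how many documents would accumulate between commits. Only the 70,000 docs/hr figure comes from the message; the two intervals are illustrative assumptions, not recommended settings or the poster's configuration.

```python
def docs_per_interval(docs_per_hour: float, interval_seconds: float) -> float:
    """Average number of documents that arrive during one commit interval."""
    return docs_per_hour / 3600.0 * interval_seconds


# 70,000 docs/hr is the rate quoted in the message.
# The intervals below are assumptions chosen only to illustrate the arithmetic.
rate = 70_000
for label, seconds in (("hard commit", 60), ("soft commit", 600)):
    docs = docs_per_interval(rate, seconds)
    print(f"{label} every {seconds}s: ~{docs:.0f} docs per commit")
```

At this rate the index receives roughly 19 documents per second on average, so the commit intervals mainly trade visibility latency against the cost of opening new searchers.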

Re: Facet count mismatch between solr simple facet and Json facet API.

2015-11-29 Thread Vishnu Mishra
Yes we are using distributed search using shard approach.

Re: Setting up Solr on multiple machines

2015-11-29 Thread Salman Ansari
Also I am interested in knowing how to create a collection where the replica and the same shard do not reside on the same machine. So, basically, shard1 with replica2 in one machine and shard2 with replica1 on the other machine. Is that by default when creating a collection of 2 shards and 2 replic

Re: Setting up Solr on multiple machines

2015-11-29 Thread Walter Underwood
Connecting to one Zookeeper node is fine. Until that node fails. Then what does Solr do for cluster information? The entire point of Zookeeper is to share that information in a reliable, fault-tolerant way. Solr can talk to any Zookeeper node and get the same information. wunder Walter Underwo

Re: Setting up Solr on multiple machines

2015-11-29 Thread Salman Ansari
Correct me if I am wrong but my understanding is that even connecting to one zookeeper should be enough as internally that zookeeper will sync Solr server info to other zookeepers in the ensemble (as long as that zookeeper belongs to an ensemble). Having said that, if that particular zookeeper goes

Re: Setting up Solr on multiple machines

2015-11-29 Thread Walter Underwood
Why would that link answer the question? Each Solr connects to one Zookeeper node. If that node goes down, Zookeeper is still available, but the node will need to connect to a new node. Specifying only one zk node is a single point of failure. If that node goes down, Solr cannot continue opera
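The point about listing every ensemble member can be sketched as a small helper that builds the comma-separated connection string passed to the `-z` option. The host names and the `zk_host_string` helper are hypothetical; only the `host:port,host:port` form and the `-z` option come from the thread.

```python
def zk_host_string(hosts, chroot=""):
    """Join ZooKeeper host:port pairs into the comma-separated form used with -z.

    An optional chroot such as "/solr" is appended once, after the last host.
    """
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    return ",".join(hosts) + chroot


# Hypothetical ensemble members; listing all of them lets the client
# fall back to another node if the one it is connected to goes down.
ensemble = ["zk1:2181", "zk2:2181", "zk3:2181"]
print("bin/solr start -c -z " + zk_host_string(ensemble))
```

With a single entry in the list, the string still works, but Solr has nowhere else to reconnect if that node fails.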

Re: Setting up Solr on multiple machines

2015-11-29 Thread Don Bosco Durai
This should answer your question: https://zookeeper.apache.org/doc/r3.2.2/zookeeperOver.html#sc_designGoals On 11/29/15, 12:04 PM, "Salman Ansari" wrote: >my point is that what is the exact difference between the whole list and >one zookeeper? Moreover, I think this issue is related to Windo

Re: Setting up Solr on multiple machines

2015-11-29 Thread Salman Ansari
my point is that what is the exact difference between the whole list and one zookeeper? Moreover, I think this issue is related to Windows command as mentioned here http://stackoverflow.com/questions/28837827/solr-5-0-unable-to-start-solr-with-zookeeper-ensemble On Sun, Nov 29, 2015 at 10:55 PM,

Re: Setting up Solr on multiple machines

2015-11-29 Thread Don Bosco Durai
It is highly recommended to list all, but for testing, you might be able to get away giving only one. If the list doesn’t work, then you might even want to look into zookeeper and see whether they are setup properly. Bosco On 11/29/15, 11:51 AM, "Salman Ansari" wrote: >but the point is:

Re: Setting up Solr on multiple machines

2015-11-29 Thread Salman Ansari
but the point is: do I really need to list all the zookeepers in the ensemble when starting solr or I can just specify one of them? On Sun, Nov 29, 2015 at 10:45 PM, Don Bosco Durai wrote: > You might want to check the logs for why solr is not starting up. > > > Bosco > > > On 11/29/15, 11:30 AM

Re: Setting up Solr on multiple machines

2015-11-29 Thread Don Bosco Durai
You might want to check the logs for why solr is not starting up. Bosco On 11/29/15, 11:30 AM, "Salman Ansari" wrote: >Thanks for your reply. > > > >Actually I am following the official guide to start solr using (on Windows >machines) > > > >bin/solr start -e cloud -z zk1:2181,zk2:2182,zk3:2

Migrating from cores to collections

2015-11-29 Thread William Bell
OK. Been using Cores for 4 years. Want to migrate to collections / Cloud. Do we have to change our queries? http://loadbalancer:8983/solr/corename/select?q=*:* What does this become once we have the collection sharded? Do we need a Load Balancer or just point to one box and run the new query? Or

Re: Setting up Solr on multiple machines

2015-11-29 Thread Salman Ansari
Thanks for your reply. Actually I am following the official guide to start solr using (on Windows machines) bin/solr start -e cloud -z zk1:2181,zk2:2182,zk3:2183 (it is listed here https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble ) However, I am f

Re: Group by function in SolrCloud - when specifying exact shard with composite router (_route_ param)

2015-11-29 Thread Gili Nachum
Yeah, if I hit the right shard I get the right response back (regardless of distrib=T/F). But then, as you said, I'm not supposed to know which shard I hit (requires knowing shard URL, then add replicas mess to that). Bummer. On Sun, Nov 29, 2015 at 8:55 PM, Erick Erickson wrote: > Not quite sure

Re: Single-sharded SolrCloud vs Lucene indexing speed

2015-11-29 Thread Erick Erickson
Of course Lucene will be faster in all cases when replicas are present. Solr is built on Lucene so any overhead at all that Solr adds will cause the total round-trip to be slower. Lucene doesn't have to concern itself with distributing updates to replicas for instance as happens in your first two

Re: Setting up Solr on multiple machines

2015-11-29 Thread Don Bosco Durai
For 2a, assuming you want to tell Solr where to store the local indexes, in SolrCloud I have generally updated the solr.in.sh for the variable SOLR_HOME=. Make sure solr.xml is in that folder and optionally zoo.cfg. For 2b, as Erick mentioned, you need all three params. Solr by default will exp

Re: Setting up Solr on multiple machines

2015-11-29 Thread Erick Erickson
1> I'll pass 2a> yes. 2b> This should be automatic when you create the collection. You should specify numShards=2, replicationFactor=2 and maxShardsPerNode=2. Solr tries hard to distribute the shards and replicas on different machines. If you _really_ require exact placement, you can specify
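As a minimal sketch of the parameters named in this reply, the following builds a Collections API CREATE request with numShards=2, replicationFactor=2 and maxShardsPerNode=2. The base URL and collection name are hypothetical placeholders, and the helper function is only an illustration.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor, max_shards_per_node):
    """Build a Collections API CREATE URL from the three placement parameters."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# "solr1" and "mycollection" are assumptions for illustration only.
print(create_collection_url("http://solr1:8983/solr", "mycollection", 2, 2, 2))
```

The request can be sent to any node in the cluster; Solr then decides where the shards and replicas are placed.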

Re: Solrcloud with Zookeeper in production

2015-11-29 Thread Erick Erickson
Note that you don't need to restart cores, just use the Collections API to reload the entire collection at once and get the config changes. The article you're referencing is from 2013, and things have changed. All of the start/stop operations should be done with the script located in bin/solr. See
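Reloading a whole collection through the Collections API, as suggested here, comes down to one HTTP request. The sketch below only builds the URL; the base URL and collection name are hypothetical.

```python
from urllib.parse import urlencode


def reload_collection_url(base_url, collection):
    """Build a Collections API RELOAD URL for an entire collection."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return f"{base_url}/admin/collections?{query}"


# Hypothetical node and collection name.
print(reload_collection_url("http://solr1:8983/solr", "mycollection"))
```

Issuing this once after updating the configuration in ZooKeeper avoids restarting individual cores.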

Re: Group by function in SolrCloud - when specifying exact shard with composite router (_route_ param)

2015-11-29 Thread Erick Erickson
Not quite sure if I'm reading this right, but a non cloud request with &distrib=false might do the trick. Although you say you're not supposed to know which shard, so I'm not sure this applies... On Sun, Nov 29, 2015 at 4:47 AM, Gili Nachum wrote: > Adding: > >1. Currently, when I query I on
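A rough sketch of the non-distributed request suggested here: the query goes straight to one core and carries distrib=false together with the grouping parameters. The core URL, the `price` field and the grouping function are hypothetical; only `group.func` and `distrib=false` come from the thread.

```python
from urllib.parse import urlencode


def single_core_group_query(core_url, group_func, q="*:*"):
    """Build a select URL that groups by a function on one core only."""
    params = [
        ("q", q),
        ("group", "true"),
        ("group.func", group_func),
        ("distrib", "false"),  # keep the request on the core that receives it
    ]
    return f"{core_url}/select?{urlencode(params)}"


# Hypothetical core name and function; the caller must already know
# which core holds the documents for the routing key of interest.
print(single_core_group_query(
    "http://solr1:8983/solr/collection1_shard1_replica1",
    "div(price,10)",
))
```

The drawback raised in the thread still applies: the client has to know which shard, and which replica of it, to address.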

Re: Block Joins

2015-11-29 Thread Mikhail Khludnev
Hello Rick, If I got you right, it's worth having a look at [child] https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents Let me know if it works. On Sun, Nov 29, 2015 at 5:47 PM, Rick Leir wrote: > Hi all, > I am new to Block Joins, and am trying to follow > > https:/

Fwd: Block Joins

2015-11-29 Thread Rick Leir
Hi all, I am new to Block Joins, and am trying to follow https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers This page shows two forms of block join syntax for this parser q={!child of=}. example queryq={!child of="content_type:parentDoc
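The two block join forms can be sketched as plain string builders. The filter `doc_type:parent` and the inner queries are hypothetical values chosen for illustration, not the ones from the truncated message; the `{!child of=...}` and `{!parent which=...}` local-parameter syntax is the part taken from the parser documentation.

```python
def child_query(all_parents_filter, some_parents_query):
    """Return children of the parents matched by some_parents_query."""
    return f'{{!child of="{all_parents_filter}"}}{some_parents_query}'


def parent_query(all_parents_filter, some_children_query):
    """Return parents that have children matched by some_children_query."""
    return f'{{!parent which="{all_parents_filter}"}}{some_children_query}'


# Hypothetical field names and values.
print(child_query("doc_type:parent", "title:solr"))
print(parent_query("doc_type:parent", "comment:faceting"))
```

In both forms the filter must match every parent document in the index, not just the parents of interest; the query after the closing brace narrows the selection.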

Re: Group by function in SolrCloud - when specifying exact shard with composite router (_route_ param)

2015-11-29 Thread Gili Nachum
Adding: 1. Currently, when I query I only get results from the particular shard I happened to hit (normally I'm not supposed to know which shard I hit). 2. Running Solr 4.7.2 On Sun, Nov 29, 2015 at 2:44 PM, Gili Nachum wrote: > Hi, I'm attempting result grouping >

Group by function in SolrCloud - when specifying exact shard with composite router (_route_ param)

2015-11-29 Thread Gili Nachum
Hi, I'm attempting result grouping with custom function in SolrCloud, by providing a _route_, without success :~( I know that group.func isn't supported in distributed searches, but in my case *I only need the query to gather data

Re: Solrcloud with Zookeeper in production

2015-11-29 Thread GOURAUD Emmanuel
Hi, bootstrap_confdir is used to send config files into zookeeper at the start of the solr instance, however you should notice that if you modify config files in zookeeper you'll need to restart all cores to load changes. Moreover, you must restart the designed core (with bootstrap_cmd) first

Re: Solrcloud with Zookeeper in production

2015-11-29 Thread Mugeesh Husain
Hi, I was following this article http://jayant7k.blogspot.in/2013/06/step-by-step-setting-up-solr-cloud.html for solr.5.3 +zk-3.4.6 version. In the above link, there is a step for configuration files root@solr1$ java -DzkHost=solr1:2181,solr2:2181 -Dbootstrap_confdir=solr/collection1/conf/ -Dnu