Shahzad - As Shawn mentioned, you can get a lot of input from people who use joins in SolrCloud if you start a new thread. I would also suggest taking a look at Solr Streaming Expressions and the Parallel SQL Interface, which cover joining use cases as well.
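As a rough illustration of the Streaming Expressions route, here is a minimal Python sketch that builds an innerJoin expression and the URL for the /stream handler. The collection names (orders, customers), the field names, and the host are hypothetical, and the expression syntax should be checked against the Solr version you run. It only builds the request; it does not send it.

```python
from urllib.parse import urlencode


def search_stream(collection, fields, sort_field):
    """Build a search() stream sorted on the join key.

    qt="/export" asks for the full result set; the fields used there
    are expected to have docValues enabled.
    """
    fl = ",".join(fields)
    return f'search({collection}, q="*:*", fl="{fl}", sort="{sort_field} asc", qt="/export")'


def inner_join(left, right, left_key, right_key):
    """Wrap two sorted streams in an innerJoin on left_key=right_key."""
    return f'innerJoin({left}, {right}, on="{left_key}={right_key}")'


# Hypothetical collections and fields, for illustration only.
expr = inner_join(
    search_stream("orders", ["order_id", "customer_id"], "customer_id"),
    search_stream("customers", ["id", "name"], "id"),
    "customer_id",
    "id",
)

# The expression is passed as the "expr" parameter of the /stream handler.
url = "http://localhost:8983/solr/orders/stream?" + urlencode({"expr": expr})
print(url)
```

Both inner streams are sorted on their join key because innerJoin merges them in order rather than loading either side fully into memory.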
Thanks,
Susheel

On Tue, Feb 9, 2016 at 9:17 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 2/8/2016 10:10 PM, Shahzad Masud wrote:
> > Due to distributed search feature, I might not be able to run
> > SolrCloud. I would appreciate, if you please share that way of setting
> > solr home for a specific context in Jetty-Solr. Its good to seek more
> > information for comparison purposes. Do you think having multiple JVMs
> > would increase or decrease performance. My document base is around 20
> > million rows (in 24 shards), with document size ranging from 100KB -
> > 400 MB. SM
>
> For most people, the *entire point* of running SolrCloud is to do
> distributed search, so to hear that you can't run SolrCloud because of
> distributed search is very confusing to me.
>
> I admit to ignorance when it comes to the join feature in Solr ... but
> it is my understanding that all you need to make joins work properly is
> to have both of the indexes that you are joining running in the same JVM
> and the same Solr instance. If you arrange your SolrCloud replicas so a
> copy of every index is loaded on every server, I think that would
> satisfy this requirement. I may be wrong, but I believe there are
> SolrCloud users that use the join feature.
>
> When you create a config file for a Solr context, whether it's Jetty,
> Tomcat, or some other container, you can set the solr/home JNDI variable
> in the context fragment to set the solr home for that context. I found
> a specific example for Tomcat. I know Jetty can do the same, but I do
> not know how to actually create the context fragment.
>
> https://wiki.apache.org/solr/SolrTomcat#Installing_Solr_instances_under_Tomcat
>
> I need to reiterate one point again. You should only run one Solr
> container per server, with exactly one Solr context installed in that
> server. This is recommended whether you're running SolrCloud or not,
> and whether you're using distributed search or not. One Solr context
> can handle a LOT of indexes.
>
> Running multiple Solr instances per server is only recommended in one
> case: Extremely large indexes where you would need a very large heap.
> Running two JVMs with smaller heaps *might* be more efficient ... but in
> that case, it is usually better to split those indexes between two
> separate servers, each one running only one instance of Solr.
>
> Thanks,
> Shawn
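
To illustrate the requirement Shawn describes, that both indexes live in the same Solr instance, here is a minimal Python sketch of a query using the join query parser with fromIndex. The core names (documents, metadata), the field names, and the host are hypothetical; this is only an assumption about how such a setup might look, not a description of Shahzad's schema.

```python
from urllib.parse import urlencode


def join_query(from_field, to_field, from_index, inner_query):
    """Build a {!join} query string.

    inner_query runs against from_index; documents in the queried core
    are returned when their to_field matches from_field of a hit.
    """
    return f"{{!join from={from_field} to={to_field} fromIndex={from_index}}}{inner_query}"


# Hypothetical cores: "metadata" is joined into "documents".
q = join_query("doc_id", "id", "metadata", "category:report")

# The request goes to the core holding the "to" side of the join.
url = "http://localhost:8983/solr/documents/select?" + urlencode({"q": q, "wt": "json"})
print(q)
print(url)
```

The request is sent to the documents core, and the metadata core named in fromIndex has to be loaded in that same Solr instance for the join to resolve.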