Re: SolrCloud Feedback

Pulkit Singhal Fri, 09 Sep 2011 10:10:20 -0700

I think I understand it a bit better now but wouldn't mind some validation.


1) solr.xml does not become part of ZooKeeper
2) The default looks like this out-of-box:
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="." shard="shard1"/>
  </cores>
so that may leave one wondering where the core's association to a
collection name is made?

It can be made like so:
a) statically in a file:
<core name="collection1" instanceDir="." shard="shard1" collection="myconf" />
b) at start time via java:
java ... -Dcollection.configName=myconf ... -jar start.jar

And I'm guessing that since the core's name ("collection1") for shard1
has already been associated with -Dcollection.configName=myconf in
http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster
once already, adding an additional shard2 with the same core name
("collection1") automatically throws it in with the collection name
("myconf") without any need to specify anything at startup via -D or
statically in the solr.xml file.
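
If that is right, then putting two cores from one Solr instance into
the same collection as separate shards (the setup Jan describes) might
look something like this in solr.xml. This is an untested sketch: the
core names and instanceDir values are made up, and it assumes the
collection attribute from (a) behaves the way I think it does:

```xml
<cores adminPath="/admin/cores" defaultCoreName="core1">
  <!-- core1/core2 and their instanceDir values are hypothetical -->
  <!-- both cores name the same collection but different shards -->
  <core name="core1" instanceDir="core1" shard="shard1" collection="myconf"/>
  <core name="core2" instanceDir="core2" shard="shard2" collection="myconf"/>
</cores>
```

Giving both cores the same shard value instead would, if I read Jan
correctly, make them replicas of one shard.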

Validate away; otherwise I'll just accept any hate mail after making
edits to the Solr wiki directly.

- Pulkit

On Fri, Sep 9, 2011 at 11:38 AM, Pulkit Singhal <pulkitsing...@gmail.com> wrote:
> Hello Jan,
>
> You've made a very good point in (b). I would be happy to make the
> edit to the wiki if I understood your explanation completely.
>
> When you say that it is "looking up what collection that core is part
> of" ... I'm curious how a core is being put under a particular
> collection in the first place? And what that collection is named?
> Obviously you've made it clear that collection1 is really the name of
> the core itself. And where is this association stored for the
> code to look it up?
>
> If not Jan, then perhaps the gurus who wrote Solr Cloud could answer :)
>
> Thanks!
> - Pulkit
>
> On Thu, Feb 10, 2011 at 9:10 AM, Jan Høydahl <jan....@cominvent.com> wrote:
>> Hi,
>>
>> I have so far just tested the examples and got a N by M cluster running. My 
>> feedback:
>>
>> a) First of all, a major update of the SolrCloud Wiki is needed, to clearly 
>> state what is in which version, what are current improvement plans and get 
>> rid of outdated stuff. That said I think there are many good ideas there.
>>
>> b) The "collection" terminology is too much confused with "core", and should 
>> probably be made more distinct. I just tried to configure two cores on the 
>> same Solr instance into the same collection, and that worked fine, both as 
>> distinct shards and as same shard (replica). The wiki examples give the 
>> impression that "collection1" in 
>> localhost:8983/solr/collection1/select?distrib=true is some magic collection 
>> identifier, but what it really does is run the query on the *core* named 
>> "collection1", looking up what collection that core is part of and 
>> distributing the query to all shards in that collection.
>>
>> c) ZK is not designed to store large files. While the files in conf are 
>> normally well below the 1M limit ZK imposes, we should perhaps consider 
>> using a lightweight distributed object or k/v store for holding the /CONFIGS 
>> and let ZK store a reference only
>>
>> d) How are admins supposed to update configs in ZK? Install their favourite 
>> ZK editor?
>>
>> e) We should perhaps not be so afraid to make ZK a requirement for Solr in 
>> v4. Ideally you should interact with a 1-node Solr in the same manner as you 
>> do with a 100-node Solr. An example is the Admin GUI where the "schema" and 
>> "solrconfig" links assume local file. This requires decent tool support to 
>> make ZK interaction intuitive, such as "import" and "export" commands.
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>>
>> On 19. jan. 2011, at 21.07, Mark Miller wrote:
>>
>>> Hello Users,
>>>
>>> A little over a year ago, a few of us started working on what we 
>>> called SolrCloud.
>>>
>>> This initial bit of work was really a combination of laying some base work 
>>> - figuring out how to integrate ZooKeeper with Solr in a limited way, 
>>> dealing with some infrastructure - and picking off some low hanging search 
>>> side fruit.
>>>
>>> The next step is the indexing side. And we plan on starting to tackle that 
>>> sometime soon.
>>>
>>> But first - could you help with some feedback? Some people are using our 
>>> SolrCloud start - I have seen evidence of it ;) Some, even in production.
>>>
>>> I would love to have your help in targeting what we now try and improve. 
>>> Any suggestions or feedback? If you have sent this before, I/others likely 
>>> missed it - send it again!
>>>
>>> I know anyone that has used SolrCloud has some feedback. I know it because 
>>> I've used it too ;) It's still too complicated to set up. There are still 
>>> plenty of pain points. We accepted some compromise trying to fit into what 
>>> Solr was, and not wanting to dig in too far before feeling things out and 
>>> letting users try things out a bit. Thinking that we might be able to 
>>> adjust Solr to be more in favor of SolrCloud as we go, what is the ideal 
>>> state of the work we have currently done?
>>>
>>> If anyone using SolrCloud helps with the feedback, I'll help with the 
>>> coding effort.
>>>
>>> - Mark Miller
>>> -- lucidimagination.com
>>
>>
>
