I can see your point, though I think edge cases would be one concern: if
someone *can* create a very large synonyms file, someone *will* create that
file. What would you set the ZooKeeper max data size to? 50MB? 100MB?
Someone is going to do something bad if there's nothing to tell them not to.
Today SolrCloud just crashes if you try to create a modestly sized synonyms
file, so clearly, at a minimum, some ZooKeeper settings should be configured
out of the box. Any reasonable setting you come up with for ZooKeeper is
virtually guaranteed to fail for some percentage of users over a reasonably
sized user base (which Solr has).

What if I plugged in a 200MB synonyms file just for testing purposes (I
don't care about performance implications)?  I don't think most users would
catch the footnote in the docs that calls out a max synonyms file size.
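
As a rough illustration of the kind of guard rail being asked for here, the
following is a minimal Python sketch of a pre-upload size check. It assumes
ZooKeeper's documented default for jute.maxbuffer (0xfffff bytes, just under
1MB). The function names are hypothetical and are not part of Solr or
ZooKeeper. The real limit applies to the whole serialized request rather than
the file alone, so treat the comparison as approximate.

```python
import os

# ZooKeeper's documented default for jute.maxbuffer: 0xfffff bytes,
# i.e. just under 1MB. Override this if the property has been raised.
DEFAULT_JUTE_MAXBUFFER = 0xFFFFF


def fits_in_znode(size_bytes, limit=DEFAULT_JUTE_MAXBUFFER):
    """Return True if a payload of size_bytes is below the limit.

    Approximate: the actual request also carries the znode path and
    headers, so a file right at the edge may still be rejected.
    """
    return size_bytes < limit


def check_config_file(path, limit=DEFAULT_JUTE_MAXBUFFER):
    """Hypothetical pre-upload check.

    Raises a readable error up front instead of letting the upload
    fail later; returns the file size in bytes when it fits.
    """
    size = os.path.getsize(path)
    if not fits_in_znode(size, limit):
        raise ValueError(
            f"{path} is {size} bytes, at or over the limit of {limit} bytes; "
            "shrink the file or raise jute.maxbuffer on every ZK server "
            "and client"
        )
    return size
```

Calling check_config_file("synonyms.txt") before an upload would surface the
problem as a clear message rather than a crash. Passing a larger limit models
a cluster where jute.maxbuffer has already been raised everywhere.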

Dave


-----Original Message-----
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Tuesday, May 07, 2013 11:53 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud with large synonyms.txt

I'm not so worried about the large-file-in-ZK issue myself.

The concern is that you start storing and accessing lots of large files in
ZK. This is not what it was made for, and everything stays in RAM, so they
guard against this type of usage.

We are talking about a config file that is loaded on Core load though. It's
uploaded and read very rarely. On modern hardware and networks, making that
file 5MB rather than 1MB is not going to ruin your day. It just won't. Solr
does not use ZooKeeper heavily - in a steady state cluster, it doesn't read
or write from ZooKeeper at all to any degree that registers. I'm going to
have to see problems loading these larger config files from ZooKeeper before
I'm worried that it's a problem.

- Mark

On May 7, 2013, at 12:21 PM, Son Nguyen <s...@trancorp.com> wrote:

> Mark,
> 
> I tried to set that property on both ZK (I have only one ZK instance) and
> Solr, but it still didn't work.
> But I read somewhere that ZK is not really designed for keeping large data
> files, so this solution - increasing jute.maxbuffer (if I can implement it)
> should be just temporary.
> 
> Son
> 
> -----Original Message-----
> From: Mark Miller [mailto:markrmil...@gmail.com] 
> Sent: Tuesday, May 07, 2013 9:35 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cloud with large synonyms.txt
> 
> 
> On May 7, 2013, at 10:24 AM, Mark Miller <markrmil...@gmail.com> wrote:
> 
>> 
>> On May 6, 2013, at 12:32 PM, Son Nguyen <s...@trancorp.com> wrote:
>> 
>>> I did some research on the internet and found out that it's because the
>>> ZooKeeper znode size limit is 1MB. I tried to increase the system property
>>> "jute.maxbuffer" but it didn't work.
>>> Does anyone have experience dealing with it?
>> 
>> Perhaps hit up the ZK list? They doc it as simply raising jute.maxbuffer,
>> though you have to do it for each ZK instance.
>> 
>> - Mark
>> 
> 
> "the system property must be set on all servers and clients otherwise
> problems will arise."
> 
> Make sure you try passing it both to ZK *and* to Solr.
> 
> - Mark
> 
