I can see your point, though I think edge cases would be one concern: if someone *can* create a very large synonyms file, someone *will* create that file. What would you set the ZooKeeper max data size to be? 50MB? 100MB? Someone is going to do something bad if there's nothing to tell them not to. Today SolrCloud just crashes if you try to create a modest-sized synonyms file, so clearly at a minimum some ZooKeeper settings should be configured out of the box. Any reasonable setting you come up with for ZooKeeper is virtually guaranteed to fail for some percentage of users over a reasonably sized user base (which Solr has).
What if I plugged in a 200MB synonyms file just for testing purposes (I don't care about performance implications)? I don't think most users would catch the footnote in the docs that calls out a max synonyms file size.

Dave

-----Original Message-----
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Tuesday, May 07, 2013 11:53 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud with large synonyms.txt

I'm not so worried about the large file in zk issue myself.

The concern is that you start storing and accessing lots of large files in ZK. This is not what it was made for, and everything stays in RAM, so they guard against this type of usage.

We are talking about a config file that is loaded on Core load though. It's uploaded and read very rarely. On modern hardware and networks, making that file 5MB rather than 1MB is not going to ruin your day. It just won't.

Solr does not use ZooKeeper heavily - in a steady state cluster, it doesn't read or write from ZooKeeper at all to any degree that registers. I'm going to have to see problems loading these larger config files from ZooKeeper before I'm worried that it's a problem.

- Mark

On May 7, 2013, at 12:21 PM, Son Nguyen <s...@trancorp.com> wrote:

> Mark,
>
> I tried to set that property on both ZK (I have only one ZK instance) and Solr, but it still didn't work.
> But I read somewhere that ZK is not really designed for keeping large data files, so this solution - increasing jute.maxbuffer (if I can implement it) should be just temporary.
>
> Son
>
> -----Original Message-----
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: Tuesday, May 07, 2013 9:35 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cloud with large synonyms.txt
>
>
> On May 7, 2013, at 10:24 AM, Mark Miller <markrmil...@gmail.com> wrote:
>
>>
>> On May 6, 2013, at 12:32 PM, Son Nguyen <s...@trancorp.com> wrote:
>>
>>> I did some researches on internet and found out that because Zookeeper znode size limit is 1MB. I tried to increase the system property "jute.maxbuffer" but it won't work.
>>> Does anyone have experience of dealing with it?
>>
>> Perhaps hit up the ZK list? They doc it as simply raising jute.maxbuffer, though you have to do it for each ZK instance.
>>
>> - Mark
>>
>
> "the system property must be set on all servers and clients otherwise problems will arise."
>
> Make sure you try passing it both to ZK *and* to Solr.
>
> - Mark
>
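
For anyone landing on this thread later, here is a rough sketch of the check discussed above: compare a config file's size against the documented ZooKeeper default for jute.maxbuffer (0xfffff bytes, just under 1MB) and, if it is too big, print a `-Djute.maxbuffer=<bytes>` value to try. The helper names, the 1KB overhead margin, and the 2x headroom are assumptions made for illustration, not anything from ZooKeeper or Solr. How the flag actually reaches each JVM depends on how you start ZK and Solr, so that part is left out.

```python
import os

# Documented ZooKeeper default for jute.maxbuffer: 0xfffff bytes (just under 1 MB).
DEFAULT_JUTE_MAXBUFFER = 0xFFFFF


def fits_in_znode(size_bytes, limit=DEFAULT_JUTE_MAXBUFFER, overhead=1024):
    """Return True if a payload of size_bytes fits under the limit.

    overhead is a hypothetical safety margin for the request envelope
    (path, headers); it is an assumption, not a ZooKeeper constant.
    """
    return size_bytes + overhead <= limit


def suggested_jvm_flag(size_bytes, headroom=2.0):
    """Build a -Djute.maxbuffer flag sized at `headroom` times the file size.

    The 2x headroom is an arbitrary illustrative choice.
    """
    return "-Djute.maxbuffer=%d" % int(size_bytes * headroom)


def check_config_file(path):
    """Return None if the file fits under the default limit, else a flag to try."""
    size = os.path.getsize(path)
    if fits_in_znode(size):
        return None
    return suggested_jvm_flag(size)


if __name__ == "__main__":
    import sys

    for config_path in sys.argv[1:]:
        flag = check_config_file(config_path)
        if flag is None:
            print("%s: fits under the default limit" % config_path)
        else:
            # Per the quoted ZK docs, the same value has to be set on all
            # servers and clients, i.e. every ZK instance and every Solr node.
            print("%s: too large, try %s on ZK and Solr" % (config_path, flag))
```

Whatever value comes out, the point from the quoted docs still applies: it has to be passed to every ZooKeeper instance and to every Solr node, otherwise the mismatch itself causes failures.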