David, have you seen the finite state automaton the synonym lookup is built on? The lookup is very efficient and fast. You have a point though: it is going to fail for someone.

Roman

On 8 May 2013 03:11, "David Parks" <davidpark...@yahoo.com> wrote:
> I can see your point, though I think edge cases would be one concern: if someone *can* create a very large synonyms file, someone *will* create that file. What would you set the zookeeper max data size to be? 50MB? 100MB? Someone is going to do something bad if there's nothing to tell them not to.
> Today solr cloud just crashes if you try to create a modest-sized synonyms file; clearly, at a minimum, some zookeeper settings should be configured out of the box. Any reasonable setting you come up with for zookeeper is virtually guaranteed to fail for some percentage of users over a reasonably sized user base (which solr has).
>
> What if I plugged in a 200MB synonyms file just for testing purposes (I don't care about performance implications)? I don't think most users would catch the footnote in the docs that calls out a max synonyms file size.
>
> Dave
>
>
> -----Original Message-----
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: Tuesday, May 07, 2013 11:53 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Cloud with large synonyms.txt
>
> I'm not so worried about the large-file-in-ZK issue myself.
>
> The concern is that you start storing and accessing lots of large files in ZK. This is not what it was made for, and everything stays in RAM, so they guard against this type of usage.
>
> We are talking about a config file that is loaded on core load, though. It's uploaded and read very rarely. On modern hardware and networks, making that file 5MB rather than 1MB is not going to ruin your day. It just won't. Solr does not use ZooKeeper heavily - in a steady-state cluster, it doesn't read or write from ZooKeeper at all to any degree that registers. I'm going to have to see problems loading these larger config files from ZooKeeper before I'm worried that it's a problem.
>
> - Mark
>
> On May 7, 2013, at 12:21 PM, Son Nguyen <s...@trancorp.com> wrote:
>
> > Mark,
> >
> > I tried to set that property on both ZK (I have only one ZK instance) and Solr, but it still didn't work.
> > But I read somewhere that ZK is not really designed for keeping large data files, so this solution - increasing jute.maxbuffer (if I can implement it) - should be just temporary.
> >
> > Son
> >
> > -----Original Message-----
> > From: Mark Miller [mailto:markrmil...@gmail.com]
> > Sent: Tuesday, May 07, 2013 9:35 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr Cloud with large synonyms.txt
> >
> > On May 7, 2013, at 10:24 AM, Mark Miller <markrmil...@gmail.com> wrote:
> >
> >> On May 6, 2013, at 12:32 PM, Son Nguyen <s...@trancorp.com> wrote:
> >>
> >>> I did some research on the internet and found out that it's because the Zookeeper znode size limit is 1MB. I tried to increase the system property "jute.maxbuffer" but it won't work.
> >>> Does anyone have experience dealing with it?
> >>
> >> Perhaps hit up the ZK list? They doc it as simply raising jute.maxbuffer, though you have to do it for each ZK instance.
> >>
> >> - Mark
> >
> > "the system property must be set on all servers and clients otherwise problems will arise."
> >
> > Make sure you try passing it both to ZK *and* to Solr.
> >
> > - Mark
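
To make the jute.maxbuffer advice in the thread concrete, here is a minimal Java sketch of the client side. The connect string, the znode path, and the 10 MB ceiling are illustrative assumptions, not Solr's actual upload code (Solr normally pushes config files with its zkcli/upconfig tooling); the point is only that the property has to be set as a JVM system property before the ZooKeeper client classes load, and passed as -Djute.maxbuffer=... to every ZooKeeper server as well.

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class LargeConfigUpload {
    public static void main(String[] args) throws Exception {
        // jute.maxbuffer is read by ZooKeeper's serialization layer as a JVM
        // system property, so it must be set before the client classes are
        // loaded (and passed as -Djute.maxbuffer=... to every ZK server too).
        // 10 MB here is an arbitrary example ceiling; remember everything
        // stored in ZK is kept in RAM.
        System.setProperty("jute.maxbuffer", String.valueOf(10 * 1024 * 1024));

        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
        byte[] synonyms = java.nio.file.Files.readAllBytes(
                java.nio.file.Paths.get("synonyms.txt"));

        // With the default 1 MB limit, a write this large is rejected
        // (typically surfacing as a connection loss on the client).
        // The parent path is assumed to already exist.
        zk.create("/configs/mycollection/synonyms.txt", synonyms,
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.close();
    }
}

On the server side the same value is usually supplied through each ZooKeeper instance's JVM flags (for example via SERVER_JVMFLAGS, if your installation uses zookeeper-env.sh), and on the Solr side through its startup parameters; as Mark says above, it must be set on all servers and clients or "problems will arise."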
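
Roman's point about the lookup being built on a finite state automaton refers to Lucene's SynonymMap, which compiles all the synonym rules into a single FST that the synonym filter walks at analysis time. Below is a small, self-contained sketch against the Lucene APIs; class names are from recent Lucene releases (the 2013-era Solr used SynonymFilter/SynonymFilterFactory rather than SynonymGraphFilter), and the example is illustrative rather than Solr's own wiring.

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.synonym.SynonymGraphFilter;
import org.apache.lucene.analysis.synonym.SynonymMap;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.CharsRef;

public class SynonymFstDemo {
    public static void main(String[] args) throws IOException {
        // SynonymMap.Builder compiles every rule into one FST, which is why
        // lookup stays fast even for large synonym sets.
        SynonymMap.Builder builder = new SynonymMap.Builder(true); // dedup entries
        builder.add(new CharsRef("laptop"), new CharsRef("notebook"), true);
        builder.add(new CharsRef("tv"), new CharsRef("television"), true);
        SynonymMap synonyms = builder.build();

        Analyzer analyzer = new Analyzer() {
            @Override
            protected TokenStreamComponents createComponents(String fieldName) {
                StandardTokenizer tokenizer = new StandardTokenizer();
                TokenStream filtered = new SynonymGraphFilter(tokenizer, synonyms, true);
                return new TokenStreamComponents(tokenizer, filtered);
            }
        };

        try (TokenStream ts = analyzer.tokenStream("body", "my new laptop")) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                // Emits both "laptop" and the injected synonym "notebook".
                System.out.println(term.toString());
            }
            ts.end();
        }
        analyzer.close();
    }
}

Because the rules live in one compact automaton, lookup cost barely depends on the size of synonyms.txt; the limits discussed in this thread are about shipping and storing the raw file in ZooKeeper, not about the analyzer.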