Otis,

The index is currently 236GB. I don't know which particular segments were being merged when I reported this problem, but my largest segment now (_8nm) is taking up 133GB, and the largest single file in the index is _8nm.prx, at 71GB.
I'm using a custom C# indexing client, so no SolrJ/StreamingUpdateSolrServer. I'm submitting only one document per HTTP POST, and posts go to ExtractingRequestHandler (aka Solr Cell). The indexing client is multi-threaded. (Multi-threading may help hit the 200-thread limit sooner, but it seems like reducing the # of threads would just postpone hitting that limit, rather than eliminate the problem.) I also have Solr set to auto-commit every 30 minutes or so, to try to keep the index adequately live.

On Fri, Apr 23, 2010 at 1:38 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> Chris,
>
> It looks like Mike already offered several solutions... though I don't know
> what Solr does without looking at the code.
>
> But I'm curious:
> * how big is your index? and do you know how large the segments being merged are?
> * do you batch docs or do you make use of Streaming SolrServer?
> I'm curious, because I've never encountered this problem before...
>
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
> ----- Original Message ----
>> From: Chris Harris <rygu...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Thu, April 22, 2010 6:28:29 PM
>> Subject: Best way to prevent this search lockup (apparently caused during
>> big segment merges)?
>>
>> I'm running Solr 1.4+ under Tomcat 6, with indexing and searching requests
>> simultaneously hitting the same Solr machine. Sometimes Solr, Tomcat, and my
>> (C#) indexing process conspire to render search inoperable. So far I've only
>> noticed this while big segment merges (i.e. merges that take multiple
>> minutes) are taking place.
>>
>> Let me explain the situation as best as I understand it.
>>
>> My indexer has a main loop that looks roughly like this:
>>
>>     while true:
>>         try:
>>             submit a new add or delete request to Solr via HTTP
>>         catch timeoutException:
>>             sleep a few seconds
>>
>> When things are going wrong (i.e., when a large segment merge is
>> happening), this loop is problematic:
>>
>> * When the indexer's request hits Solr, the corresponding thread
>> in Tomcat blocks. (It looks to me like the thread is destined to block
>> until the entire merge is complete. I'll paste in what the Java stack
>> traces look like at the end of the message if they can help diagnose
>> things.)
>> * Because the Solr thread stays blocked for so long, eventually the
>> indexer hits a timeoutException. (That is, it gives up on Solr.)
>> * Hitting the timeout exception doesn't cause the corresponding Tomcat
>> thread to die or unblock. Therefore, each time through the loop,
>> another Solr-handling thread inside Tomcat enters a blocked state.
>> * Eventually so many threads (maxThreads, whose Tomcat default is 200)
>> are blocked that Tomcat starts rejecting all new Solr HTTP requests --
>> including those coming in from the web tier.
>> * Users are unable to search. The problem might self-correct once the
>> merge is complete, but that could be quite a while.
>>
>> What are my options for changing Solr settings or changing my indexing
>> process to avoid this lockup scenario? Do you agree that the segment
>> merge is helping cause the lockup? Do adds and deletes really need to
>> block on segment merges?
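
P.S. For clarity, here's a minimal sketch of the retry loop described above, with the actual Solr POST stubbed out (the `submit` callable and `TimeoutError` are placeholders for my real C# HTTP client, not actual Solr API). It makes the failure mode visible: every client-side timeout abandons a request that is still occupying a Tomcat thread server-side, so the count of blocked threads only grows while the merge runs:

```python
import time

def index_loop(docs, submit, sleep=time.sleep):
    """Roughly the loop from the message above: on timeout, back off
    and retry the same document.

    Each TimeoutError means the client gave up on a request that is
    still blocked inside Tomcat; return the count of such abandoned
    requests to make the thread accumulation visible.
    """
    abandoned = 0
    for doc in docs:
        while True:
            try:
                submit(doc)   # one add/delete per HTTP POST
                break
            except TimeoutError:
                abandoned += 1  # request still holds a Tomcat thread
                sleep(3)        # "sleep a few seconds"
    return abandoned
```

Once the number of abandoned requests reaches Tomcat's maxThreads (200 by default), every new request, searches included, gets rejected until the merge finishes and the blocked threads drain.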