On 10/7/2016 6:19 PM, Mike Lissner wrote:
> Soft commits seem to be exactly the thing for this, but whenever I open a
> new searcher (which soft commits seem to do), the external file is
> reloaded, and all queries are halted until it finishes loading. When I just
> measured, this took about 30 seconds to complete. Most soft commit
> documentation talks about setting up soft commits with <maxTime> of about a
> second.

IMHO, any documentation that recommends autoSoftCommit with a maxTime of one second is bad documentation, and it needs to be fixed. Where have you seen such a recommendation? Unless the index is extremely small and has been thoroughly optimized for NRT (which usually means *no* autowarming), achieving commit times of less than one second is usually not possible.

This is the page that usually comes up when people start talking about commits:

http://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

On the topic of one-second commit latency, that page has this to say:

"Set your soft commit interval to as long as you can stand. Don’t listen to your product manager who says “we need no more than 1 second latency”. Really. Push back hard and see if the /user/ is best served or will even notice. Soft commits and NRT are pretty amazing, but they’re not free."

The intervals for autoCommit and autoSoftCommit that I like to see are at LEAST one minute, and preferably longer if you can stand it.
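For illustration, a solrconfig.xml setup along those lines might look something like this (the exact values are up to you, these are just example intervals):

    <!-- Hard commits: flush the tlog and write segments to disk,
         but do NOT open a new searcher. -->
    <autoCommit>
      <maxTime>300000</maxTime>          <!-- five minutes -->
      <openSearcher>false</openSearcher>
    </autoCommit>

    <!-- Soft commits: open a new searcher so changes become
         visible.  One minute at the very least. -->
    <autoSoftCommit>
      <maxTime>60000</maxTime>           <!-- one minute -->
    </autoSoftCommit>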
> Is there anything I can do to make the external file field not get reloaded
> constantly? It only changes about once a month, and I want to use soft
> commits to power the alerts feature.

Anytime you want changes to show up in your index, you need a new searcher. When you're using an external file field, part of that info comes from the external source, and right now Solr/Lucene has no way of knowing that your external file has not changed, so it must read the file every time it builds a searcher.

I doubt this feature was designed to deal well with an extremely large external file like yours. The code looks like it reads the file line by line, and although I'm sure that process has been optimized as far as it can be, it still takes a lot of time when there are millions of lines.

If the info changes that infrequently, can you just incorporate it directly into the index as a standard field, with the info coming in as part of your normal indexing process? I'm sure the performance would be MUCH better if Solr didn't have to reference the external file.

It seems unlikely that Solr would stop serving queries while setting up a new searcher. The old searcher should continue to serve requests until the new searcher is ready. If that is actually happening, it definitely seems like a bug.

Thanks,
Shawn
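P.S. In case it helps to see the two approaches side by side: an external file field is defined in the schema roughly like this (the field and key names here are made up for illustration):

    <!-- Values come from a file named external_<fieldname> in the
         index data directory, re-read every time a searcher opens. -->
    <fieldType name="rankFile" class="solr.ExternalFileField"
               keyField="id" defVal="0" valType="pfloat"/>
    <field name="document_rank" type="rankFile"/>

The standard-field alternative would be just:

    <!-- Plain numeric field (use whatever numeric type your schema
         defines), populated during normal indexing, so nothing extra
         has to be loaded when a new searcher opens. -->
    <field name="document_rank" type="float" indexed="true" stored="true"/>

with the value sent as part of each document at index time, and a monthly reindex of the affected documents when the values change.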