On 10/27/2011 1:36 AM, Michael Kuhlmann wrote:
Why do you first query for these documents? Why don't you just delete them? It does Solr no harm if no documents are affected by your delete query, and you'll get the number of affected documents in your response anyway. When deleting, Solrj does almost nothing on its own; it just sends the POST request and analyzes the simple response. The behaviour in a get request is similar. We do thousands of update, delete and get requests per minute using Solrj without problems, so your timing problems must come from somewhere else. -Kuli
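As a concrete sketch of Kuli's suggestion: a delete can be posted straight to Solr's update handler as an XML update message, with no preceding search. The query below is a made-up example, not the one from this thread:

```xml
<!-- POST to /solr/update; deletes every document matching the query,
     whether that is zero documents or millions. The field name and
     range here are hypothetical. -->
<delete>
  <query>timestamp:[* TO NOW-7DAYS]</query>
</delete>
```

Nothing becomes visible (or is required of the caller) until a commit is issued, which is the crux of the follow-up below.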
When you do a blind delete, you have to follow it up with a commit. On my larger shards containing data older than approximately one week, a commit is resource intensive and takes 10 to 30 seconds. As much as 75% of the time, there are no updates to my larger shards (10.7 million records each); most of the activity happens on the small shard with the newest data (usually under 500,000 records), which I call the incremental. On almost every update run there are changes to the incremental, but doing a commit on that shard rarely takes more than a second or two.
The long commit times on the larger indexes are a result of cache warming, and almost all of that time is spent warming the filter cache. To answer the next obvious question: autowarmCount=4 on that cache, with a maximum size of 64. We are working as fast as we can on reducing the complexity and size of our filter queries, but it will require significant changes in our application.
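For context, those two numbers live on the filterCache entry in solrconfig.xml. The size and autowarmCount values below are the ones stated above; the cache class and initialSize are illustrative assumptions, not taken from this message:

```xml
<!-- filterCache in solrconfig.xml: autowarmCount controls how many
     cached filter queries are re-executed against the new searcher
     on commit, which is what drives the warm-up cost described above. -->
<filterCache class="solr.FastLRUCache"
             size="64"
             initialSize="64"
             autowarmCount="4"/>
```

With expensive filter queries, even a small autowarmCount like 4 can dominate commit time, since each warmed entry re-runs its filter against the full index.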
Thanks, Shawn