On 4/29/06, Marcus Stratmann <[EMAIL PROTECTED]> wrote:
Yes, on a delete operation. I'm not doing any commits until the end of all delete operations.
I assume this is a delete-by-id and not a delete-by-query? They work very differently. There is some state stored for each pending delete-by-id... a HashMap<String,Integer> with an entry for each id that needs to be deleted. That state shouldn't be that large, though. In fact, delete-by-id does nothing to the Lucene index at all until <commit/>.
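For reference, the two delete forms look roughly like this in Solr's XML update syntax (the id and the query below are just placeholder values):

  <delete><id>12345</id></delete>                  <!-- delete-by-id -->
  <delete><query>status:expired</query></delete>   <!-- delete-by-query -->
  <commit/>                                        <!-- pending deletes get applied here -->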
After reading this I was curious whether using commits during deleting would have any effect, so I tried issuing a commit after every 10,000 deletes (which, I know, is not recommended). But that simply didn't change anything.
Strange... that suggests it's not the state kept in the HashMap.
Meanwhile I found out that I can delete about 10,000 more documents (before getting an OOM) by increasing the heap space by 500M.
Something else is going on.
Unfortunately we need to delete about 200,000 documents on each update, which at that rate would mean adding roughly 10G to the heap space. Not to mention the same number of inserts.
If you are deleting first so you can re-add a newer version of the document, you don't need to... overwriting older documents based on the uniqueKeyField is something Solr does for you!
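In other words, simply re-adding the new version with the same value in the uniqueKeyField is enough. A minimal sketch, assuming the schema declares <uniqueKey>id</uniqueKey> and using placeholder field names:

  <add>
    <doc>
      <field name="id">12345</field>
      <field name="title">new version of this document</field>
    </doc>
  </add>
  <commit/>

Solr drops the old document with id 12345 as part of processing the <add>; no separate <delete> is needed.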
Yes, I thought so, too. And in fact I get an OOM even if I just submit search queries.
Is it possible to use a profiler to see where all the memory is going? It sounds like you may have uncovered a memory leak somewhere. Also, what OS, what JVM, and what appserver are you using?
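For a quick first look without a full profiler, the JVM's built-in hprof agent can show which allocation sites hold the most live memory. The command below is only a sketch and assumes Solr is started via the example Jetty start.jar; adjust it to however the server is actually launched:

  java -Xmx512m -verbose:gc -Xrunhprof:heap=sites,depth=8 -jar start.jar

When the JVM exits it writes a java.hprof.txt file with allocation sites ordered by live bytes, which should be enough to see whether the memory is going to caches, pending deletes, or something else entirely.

-Yonik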