On Mar 14, 2008, at 11:37 AM, Yonik Seeley wrote:
During the add (in DirectUpdateHandler2) docs are kept track of, and during a commit they are checked for dups. That code has been very well tested though, and I've only seen duplicates on a JVM crash/restart. That's because docs are added to the index, but Solr never gets a chance to remove the duplicates (and that info is only kept in RAM).
This is certainly possible. The dupes are all within the same indexed time period, and yep, just checked the logs, Resin crashed right around then (Lock obtain timed out: SingleInstanceLock: write.lock).
So, if we sent 4000 "updates" (replacements of docs that already exist) to an index with a 30-minute autocommit, and it crashed before the commit could happen, would I see both copies like that when it started back up?
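
To picture the scenario being asked about, here is a toy Python model of the behaviour described above. It is only a sketch, not Solr's code: ToyIndex, pending, crash_and_restart and simulate are hypothetical names, and it assumes that added docs reach the on-disk index right away while the record of which ids still need de-duplicating lives only in RAM until the commit.

```python
class ToyIndex:
    """Toy model of an index that de-duplicates at commit time (not Solr's code)."""

    def __init__(self):
        self.on_disk = []     # (doc_id, version) pairs; assumed to survive a crash
        self.pending = set()  # ids added since the last commit; RAM only

    def add(self, doc_id, version):
        # The new copy is appended immediately; the old copy is NOT removed yet.
        self.on_disk.append((doc_id, version))
        self.pending.add(doc_id)

    def commit(self):
        # For every id touched since the last commit, keep only the newest copy.
        last_index = {}
        for i, (doc_id, _) in enumerate(self.on_disk):
            if doc_id in self.pending:
                last_index[doc_id] = i
        self.on_disk = [
            entry
            for i, entry in enumerate(self.on_disk)
            if entry[0] not in self.pending or last_index[entry[0]] == i
        ]
        self.pending.clear()

    def crash_and_restart(self):
        # The on-disk docs remain, but the in-RAM bookkeeping is gone.
        self.pending.clear()

    def copies(self, doc_id):
        return [version for d, version in self.on_disk if d == doc_id]


def simulate(crash_before_commit):
    index = ToyIndex()
    index.add("doc1", "old")
    index.commit()
    index.add("doc1", "new")  # an "update" of an existing doc
    if crash_before_commit:
        index.crash_and_restart()
    index.commit()
    return index.copies("doc1")
```

Under those assumptions, simulate(False) leaves a single copy of doc1, while simulate(True) leaves both the old and the new copy, because the commit after the restart no longer knows that doc1 needs its older copy removed.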