On Thu, 2006-08-03 at 23:53 -0700, Chris Hostetter wrote:
> 1) as new docs come in, add them to a purely in memory index
> 2) when it becomes time to "commit" the new documents, test all queries
> in the cache against this in memory index.
> 3) any query in the cache which has a hit on this in memory index should
> be invalidated, any query which does not have a hit is still valid.
You got it.

> ...this could probably work if the index was purely additive

> check if one of the cached queries matched on the deleted document

Hmm, didn't see that one coming. A quick and dirty fix would be to rebuild
the deleted document from the original source and test the cached queries
against it. I'll have to think of a better solution than that, though.

> the next segment merge could collapse doc ids above deleted docs which
> were totally unrelated to any docs that were added or deleted -- so
> you would think they are still valid even though the doc ids in the
> cache don't correspond to the same documents anymore.

This is not the first time I've thought about low level hooks in the index.
If an optimization could report doc id changes, this would not be a problem,
would it?

> while the "old" IndexSearcher is still being used by external requests
> (and still using it's cache) a new "on deck" IndexSearcher is opened,
> and an internal thread is running queries against it (the results of

I do something similar to that. But all those queries (in some cases tens of
thousands, against a frequently updated index) hog more CPU than I think they
have to. I'm low on CPU (spent on real time collaborative filtering etc.) but
have a more or less unlimited amount of RAM.
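For what it's worth, the invalidation scheme Hoss describes (steps 1-3 above) can be sketched in plain Java without any Lucene at all, modeling a query as a predicate over documents and the "purely in memory index" as a buffer of pending docs. All names here are illustrative, not Lucene API, and this only covers the additive case -- not the deletion and doc id problems discussed above:

```java
import java.util.*;
import java.util.function.Predicate;

// Sketch: buffer new documents, and at commit time run every cached query
// against the buffer. A query that hits any buffered document has its cached
// result invalidated; queries with no hit stay valid.
public class QueryCacheInvalidation {

    static class QueryCache {
        // cached query -> cached result (e.g. matching doc ids)
        final Map<Predicate<String>, List<Integer>> cache = new HashMap<>();
        // stand-in for the "purely in memory index" of new docs
        final List<String> pendingDocs = new ArrayList<>();

        void addDocument(String doc) {      // step 1: buffer incoming docs
            pendingDocs.add(doc);
        }

        void commit() {                     // steps 2 and 3: test and evict
            cache.keySet().removeIf(query ->
                pendingDocs.stream().anyMatch(query));
            pendingDocs.clear();
        }
    }

    public static void main(String[] args) {
        QueryCache qc = new QueryCache();
        Predicate<String> aboutLucene = d -> d.contains("lucene");
        Predicate<String> aboutSolr   = d -> d.contains("solr");
        qc.cache.put(aboutLucene, List.of(1, 7));
        qc.cache.put(aboutSolr,   List.of(3));

        qc.addDocument("new doc about lucene caching");
        qc.commit();

        // only the query that matched the new doc is invalidated
        System.out.println(qc.cache.containsKey(aboutLucene)); // false
        System.out.println(qc.cache.containsKey(aboutSolr));   // true
    }
}
```

With real Lucene the predicate test would be a search against a small in-memory index of the pending documents, but the eviction logic stays the same.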