In the meantime, I had imagined that, although clumsy, federated search
could be used for this purpose: posting the new documents to a group of
servers ('latest updates servers') holding a very limited number of
documents, with very fast "reload / refresh" times, and sending them
again (on a work queue, possibly) to the 'core servers'. Regularly
cleaning the 'latest updates servers' of the documents already posted to
the 'core servers' would keep them lean... Of course, this approach sucks
compared to a proper solution like what James is suggesting :)
Otis - is there an issue I should be looking at for more information
on this?
Yes, in principle, sending updates both to a fresh, forgetful and fast
index and a larger, slower index is what I'm thinking of doing.
The only difference is that I'm talking about having the fresh index
be implemented as a RAMDirectory in the same JVM as the large index.
This means that I can avoid the slowness of cross-disk or
cross-machine replication, I can avoid having to index all documents in
two places, and I cut out the extra moving part of federated search.
On the other hand, I am going to have to write my own piece to handle
the index flushes and federate searches to the fast and large indices.
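
To make the flush-and-federate piece concrete, here is a rough sketch of
the idea as a toy model in plain Python. It does not use Lucene or Solr
at all: TinyIndex, TwoTierIndex and flush_threshold are hypothetical
names made up for illustration, the "fresh" index stands in for the
RAMDirectory and the "large" index for the on-disk one. With Lucene the
flush would presumably merge the in-memory index into the large one
rather than re-add documents one at a time.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (illustrative only): term -> set of doc ids."""

    def __init__(self):
        self.docs = {}                    # doc_id -> stored text
        self.postings = defaultdict(set)  # term -> doc ids containing it

    def add(self, doc_id, text):
        self.delete(doc_id)               # re-adding a doc replaces it
        self.docs[doc_id] = text
        for term in set(text.lower().split()):
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        text = self.docs.pop(doc_id, None)
        if text is None:
            return
        for term in set(text.lower().split()):
            self.postings[term].discard(doc_id)

    def search(self, term):
        return set(self.postings.get(term.lower(), ()))


class TwoTierIndex:
    """Small fresh index in front of a large one, in the same process."""

    def __init__(self, flush_threshold=1000):
        self.fresh = TinyIndex()          # small, cheap to refresh
        self.large = TinyIndex()          # big, expensive to refresh
        self.flush_threshold = flush_threshold

    def add(self, doc_id, text):
        # New documents are indexed once, into the fresh index only.
        self.fresh.add(doc_id, text)
        if len(self.fresh.docs) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Move everything into the large index, then forget it.
        for doc_id, text in self.fresh.docs.items():
            self.large.add(doc_id, text)
        self.fresh = TinyIndex()

    def search(self, term):
        # Federate: query both tiers, but ignore large-index hits for
        # docs that have a newer version sitting in the fresh index.
        superseded = set(self.fresh.docs)
        large_hits = self.large.search(term) - superseded
        return self.fresh.search(term) | large_hits
```

The one subtle part is the search: a document that was updated since
the last flush exists in both tiers, so the stale copy in the large
index has to be masked out until the next flush overwrites it.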
Thanks for your input!
James