I'd like to be able to define, within a single Solr core, a set of indexes in multiple directories. This would be really useful for indexing in Hadoop or integrating with Katta, where an EmbeddedSolrServer is distributed to the Hadoop cluster, indexes are generated in parallel, and the results are returned to Solr slave servers. It seems like this could be done using a custom IndexReaderFactory that opens a MultiReader over the directories; see the sketch below. SolrIndexWriter usage in this context would be limited to incremental updates (if anything).
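A rough sketch of what such a factory might look like against the Solr 1.4-era IndexReaderFactory API (the MultiDirectoryReaderFactory class name and the extraDirs init parameter are made up for illustration; exact signatures vary between versions):

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.core.IndexReaderFactory;

/**
 * Sketch: open a MultiReader over the core's own index directory plus a
 * configurable list of extra directories (e.g. shards copied back from a
 * Hadoop/Katta indexing job).
 */
public class MultiDirectoryReaderFactory extends IndexReaderFactory {

  private final List<String> extraPaths = new ArrayList<String>();

  @Override
  public void init(NamedList args) {
    super.init(args);
    // e.g. <arr name="extraDirs"><str>/data/shard1</str>...</arr> in solrconfig.xml
    List dirs = (List) args.get("extraDirs");
    if (dirs != null) {
      for (Object d : dirs) {
        extraPaths.add(d.toString());
      }
    }
  }

  @Override
  public IndexReader newReader(Directory indexDir, boolean readOnly) throws IOException {
    List<IndexReader> readers = new ArrayList<IndexReader>();
    // The core's normal index still participates, so incremental updates stay visible.
    readers.add(IndexReader.open(indexDir, readOnly));
    for (String path : extraPaths) {
      Directory dir = FSDirectory.open(new File(path));
      readers.add(IndexReader.open(dir, true));
    }
    // closeSubReaders = true so closing the MultiReader closes the sub-readers too.
    return new MultiReader(readers.toArray(new IndexReader[readers.size()]), true);
  }
}
```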
It would also be great for Solr docSet caching to operate at the SegmentReader level, so that small incremental updates don't trigger a massive cache regeneration. Maybe there's a way to trick Solr into doing this today by running one EmbeddedSolrServer instance per large segment/shard and executing a local distributed query across them? That way each EmbeddedSolrServer maintains caches that are not disturbed by updates to the other shards. Ideally, if I had to use multiple cores, I'd rather not have to maintain separate copies of /conf on disk, but could pass the same in-memory representation of solrconfig and schema into each core (see the sketch below)?
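A sketch of that multi-core idea using the Solr 1.4-era core APIs, parsing solrconfig.xml and schema.xml once and handing the same objects to every shard core. Core names, data-dir paths, and the instance dir are placeholders, and constructor signatures differ between Solr versions, so treat this as illustrative only:

```java
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;
import org.apache.solr.core.CoreDescriptor;
import org.apache.solr.core.SolrConfig;
import org.apache.solr.core.SolrCore;
import org.apache.solr.core.SolrResourceLoader;
import org.apache.solr.schema.IndexSchema;

/**
 * Sketch: register several cores (one per large segment/shard) that all share
 * a single in-memory SolrConfig and IndexSchema, instead of each core needing
 * its own conf/ directory on disk.
 */
public class SharedConfigCores {
  public static void main(String[] args) throws Exception {
    SolrResourceLoader loader = new SolrResourceLoader("solr");
    CoreContainer container = new CoreContainer(loader);

    // Parse solrconfig.xml and schema.xml once from the shared instance dir...
    SolrConfig config = new SolrConfig(loader, "solrconfig.xml", null);
    IndexSchema schema = new IndexSchema(config, "schema.xml", null);

    // ...and pass the same objects into every shard core (placeholder data dirs).
    String[] shardDataDirs = { "/data/shard1", "/data/shard2" };
    for (int i = 0; i < shardDataDirs.length; i++) {
      String name = "shard" + (i + 1);
      CoreDescriptor cd = new CoreDescriptor(container, name, "solr");
      SolrCore core = new SolrCore(name, shardDataDirs[i], config, schema, cd);
      container.register(name, core, false);
    }

    // Each shard is then queried through its own EmbeddedSolrServer, whose
    // caches are only invalidated when that shard's index actually changes.
    EmbeddedSolrServer shard1 = new EmbeddedSolrServer(container, "shard1");
  }
}
```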