Hi Michal,
Is there a particular reason to shard your collections like that? If it
was mainly for ease of operations, I'd consider just using the compositeId
router to keep specific types of queries from hotspotting particular nodes.
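To make the compositeId idea concrete, here is a minimal sketch. It
assumes language is used as the routing key, which is only a guess at
what would suit your data. With this router, the part of the document ID
before the "!" is hashed to pick the shard, so documents sharing a prefix
land together. The helper name composite_id is hypothetical, not part of
any Solr API.

```python
def composite_id(route_key: str, doc_id: str) -> str:
    """Build a document ID in the 'routeKey!docId' form used by compositeId.

    route_key is an assumed routing value (here, a language code);
    doc_id is whatever unique ID the document already has.
    """
    if "!" in route_key:
        # "!" is the separator, so it must not appear inside the key itself.
        raise ValueError("route key must not contain '!'")
    return f"{route_key}!{doc_id}"


# Example: two English documents share the "en" prefix.
ids = [composite_id("en", "12345"), composite_id("en", "67890")]
```

The resulting value goes into the collection's unique key field at index
time; the same prefix can later be used to restrict a query to the
shard(s) holding that key.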
If your ingest rate is high, you might also consider making each
"collection" an alias that points to many actual collections, and
periodically closing off a collection and starting a new one. This
limits cache churn and the impact of large merges, since only the
newest collection is still being written to.
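A rough sketch of that rollover, using the Collections API actions
CREATE and CREATEALIAS. The base URL, the collection names and the shard
count are placeholders I made up for illustration, and the code only
builds the request URLs rather than sending them.

```python
from urllib.parse import urlencode

# Assumed address of one Solr node; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


def rollover_urls(alias, existing, new_collection, num_shards):
    """Return (create_url, alias_url) for starting a new collection.

    The first URL creates the new collection; the second re-points the
    alias at the new collection plus all the closed-off ones.
    """
    create_url = collections_api_url(
        "CREATE", name=new_collection, numShards=num_shards
    )
    members = [new_collection] + list(existing)
    alias_url = collections_api_url(
        "CREATEALIAS", name=alias, collections=",".join(members)
    )
    return create_url, alias_url


# Hypothetical names: a monthly rollover behind the alias "docs".
create_url, alias_url = rollover_urls(
    "docs", ["docs_2014_10"], "docs_2014_11", num_shards=2
)
```

Queries go to the alias and see every member collection, while indexing
would go straight to the newest collection. Issuing CREATEALIAS again
with the same alias name replaces the previous definition.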
Michael
On 11/10/14 08:03, Michal Krajňanský wrote:
Hi All,
I have been working on a project that has long used the Lucene indexer.
Currently, the system implements proprietary document routing and index
plugging/unplugging on top of Lucene and, of course, holds a large
body of indexes. Recently an idea came up to migrate from Lucene to
SolrCloud, which appears to be more powerful than our proprietary system.
Could you suggest the best way to seamlessly migrate the system to
SolrCloud, given that reindexing is not an option?
- all the existing indexes represent a single collection in SolrCloud
terms
- the documents are organized in "shards" according to date (integer) and
language (a possibly extensible discrete set)
- the indexes are disjoint
I have been able to convert the existing indexes to the newest Lucene
version and plug them individually into SolrCloud. However, there
remains the question of routing, sharding, etc.
Any insight appreciated.
Best,
Michal Krajnansky