David Smiley created SOLR-15051:
-----------------------------------
Summary: Shared storage -- BlobDirectory (de-duping)
Key: SOLR-15051
URL: https://issues.apache.org/jira/browse/SOLR-15051
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Reporter: David Smiley
Assignee: David Smiley

This proposal is a way to accomplish shared storage in SolrCloud with a few key
characteristics: (A) it uses a Directory implementation, (B) it delegates to a
backing local file Directory as a kind of read/write cache, (C) replicas have
their own "space", (D) files are de-duplicated across replicas via reference
counting, and (E) it uses ZooKeeper, but separately from SolrCloud's own use of it.
The Directory abstraction is a good one, and it helps isolate shared storage
from the rest of SolrCloud, which doesn't need to care about it. Using a normal
file Directory as the backing cache is faster for reads and simpler than the
BlockCache in Solr's HdfsDirectory.
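As a rough illustration of (A) and (B), here is a minimal sketch of such a
Directory, assuming Lucene's FilterDirectory as the base class; the class name,
the BlobStoreClient interface, and its methods are hypothetical, not from any
actual implementation:

{code:java}
import java.io.IOException;
import java.nio.file.Path;
import java.util.Collection;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.FilterDirectory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;

public class BlobDirectory extends FilterDirectory {

  /** Hypothetical client for the shared blob store (S3, HDFS, etc.). */
  private final BlobStoreClient blobStore;

  public BlobDirectory(Path localCachePath, BlobStoreClient blobStore) throws IOException {
    super(FSDirectory.open(localCachePath)); // the local Directory is the read/write cache
    this.blobStore = blobStore;
  }

  @Override
  public void sync(Collection<String> names) throws IOException {
    in.sync(names); // make the files durable locally first
    for (String name : names) {
      blobStore.upload(name, in); // then push each file to shared storage
    }
  }

  @Override
  public IndexInput openInput(String name, IOContext context) throws IOException {
    if (!existsLocally(name)) {
      blobStore.download(name, in); // cache miss: pull the file down first
    }
    return in.openInput(name, context); // all reads go through the local cache
  }

  private boolean existsLocally(String name) throws IOException {
    for (String existing : in.listAll()) {
      if (existing.equals(name)) {
        return true;
      }
    }
    return false;
  }

  /** Hypothetical abstraction over the shared blob store. */
  public interface BlobStoreClient {
    void upload(String name, Directory from) throws IOException;
    void download(String name, Directory to) throws IOException;
  }
}
{code}

A real implementation would also need to handle deletes, partial downloads, and
concurrent access, but the delegation shape is the core idea.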
Replicas having their own space solves the problem of multiple writers (e.g.
replicas of the same shard) trying to own and write to the same files, and it
implies that any of Solr's replica types can be used, together with the
mechanisms that go with them, such as peer-to-peer replication (sometimes
faster/cheaper than pulling from shared storage).
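To make (C) concrete, here is a minimal sketch of a per-replica key layout; the
path scheme is purely hypothetical:

{code:java}
public class BlobPath {
  /**
   * Each replica writes under its own prefix, so two replicas of the same
   * shard never contend for the same blob key.
   */
  public static String keyFor(String collection, String shard,
                              String replicaName, String fileName) {
    return collection + "/" + shard + "/" + replicaName + "/" + fileName;
  }

  public static void main(String[] args) {
    // prints "products/shard1/core_node4/_0.cfs"
    System.out.println(keyFor("products", "shard1", "core_node4", "_0.cfs"));
  }
}
{code}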
A de-duplication feature avoids needless duplication of files across replicas
and from parent shards (i.e. after shard splitting). De-duplication requires a
place to cache directory listings so that they can be shared across replicas
and updated atomically; this is handled via ZooKeeper, e.g. as in the sketch
below.
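One way to do the atomic part is a compare-and-set over a per-file reference
count stored in ZooKeeper, using versioned setData. A minimal sketch, assuming
a hypothetical znode layout where each shared file has a znode holding its
count as text (the znode is assumed to already exist):

{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class BlobRefCounter {

  private final ZooKeeper zk;

  public BlobRefCounter(ZooKeeper zk) {
    this.zk = zk;
  }

  /** Atomically adds delta (+1 on share, -1 on release) to the count at path. */
  public int addRef(String path, int delta)
      throws KeeperException, InterruptedException {
    while (true) {
      Stat stat = new Stat();
      byte[] data = zk.getData(path, false, stat);
      int count = Integer.parseInt(new String(data, StandardCharsets.UTF_8));
      byte[] updated =
          Integer.toString(count + delta).getBytes(StandardCharsets.UTF_8);
      try {
        // setData fails with BadVersion if another replica updated the count
        // since we read it; in that case, re-read and retry.
        zk.setData(path, updated, stat.getVersion());
        return count + delta;
      } catch (KeeperException.BadVersionException e) {
        // lost the race -- loop and retry
      }
    }
  }
}
{code}

When addRef returns 0, no replica references the file anymore and it can be
deleted from shared storage.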
Finally, some sort of Solr daemon / auto-scaling code should be added to
implement "autoAddReplicas", especially for the scenario where the leader is
gone and can't be replicated from directly, yet its index files are still
accessible on shared storage.
For more about shared storage concepts, consider looking at the description in
SOLR-13101.