[ https://issues.apache.org/jira/browse/SOLR-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264214#comment-17264214 ]
Jason Gerlowski commented on SOLR-15051:
----------------------------------------

Some overall thoughts on the proposal:

# I really like the conceptual lines that this draws. Relying strictly on the Directory/DirectoryFactory interface is promising in terms of keeping storage concerns and SolrCloud concerns separate. We've tried Directory-based abstractions before of course (with HdfsDirectory), but this proposal improves on that in concrete ways: index file deduplication/ref-counting, removal of "BlockCache" concept, etc. (This isn't a knock on HdfsDirectory - TLOG/PULL replica types weren't around when HdfsDirectory was introduced, which is really what makes the BlobDirectory design feasible afaict.)
# At the risk of counting unhatched chickens - I also think it's promising that this design can piggy-back on some of the SIP-12 work: especially the concrete BackupRepository implementations that SIP-12 proposes for common blob stores.
# One specific worry I have is that BlobDirectory methods might be insufficient for accurately refcounting files. e.g. If a replica is deleted while the hosting Solr node is down, what will delete the corresponding "space" for that replica in the blob store, decrement the refcounts of shared files, etc? The proposal describes this being done by BlobDirectory - but when the hosting node is down seemingly the relevant BlobDirectory won't be instantiated anywhere to perform those actions. That said I don't know enough about how Directory objects are instantiated and used to say whether this is actually a real concern or whether existing SolrCloud logic will handle these cases appropriately.

Overall I'm in favor of the proposal here as a more flexible alternative (replacement?) for HdfsDirectory. So, +1.
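
To make the concern in point 3 concrete, here is a minimal Java sketch of per-replica "spaces" with reference-counted shared files. It is only an illustration under stated assumptions: the class BlobRefCountSketch and its methods are hypothetical and are not taken from the proposal or from Solr, the bookkeeping is held in memory rather than in ZooKeeper, and the sweepOrphans step is one conceivable out-of-band cleanup, not something the proposal describes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: these names are illustrative and do not exist in Solr.
class BlobRefCountSketch {
    // Shared blob file -> number of replica spaces that reference it.
    private final Map<String, Integer> refCounts = new HashMap<>();
    // Replica name -> files referenced by that replica's space.
    private final Map<String, Set<String>> spaces = new HashMap<>();

    // Record that a replica's space references a file (counted once per replica).
    void addFile(String replica, String file) {
        if (spaces.computeIfAbsent(replica, r -> new HashSet<>()).add(file)) {
            refCounts.merge(file, 1, Integer::sum);
        }
    }

    // Drop a replica's whole space; returns the files no space references anymore.
    Set<String> releaseSpace(String replica) {
        Set<String> deletable = new HashSet<>();
        Set<String> files = spaces.remove(replica);
        if (files == null) {
            return deletable;
        }
        for (String file : files) {
            // Returning null from the remapping function removes the entry.
            Integer remaining =
                refCounts.computeIfPresent(file, (k, v) -> v > 1 ? v - 1 : null);
            if (remaining == null) {
                deletable.add(file);
            }
        }
        return deletable;
    }

    // Assumed cleanup pass: release every space whose replica is not in the
    // given live set. Because it does not run inside the deleted replica's own
    // directory object, it could run on any node.
    Set<String> sweepOrphans(Set<String> liveReplicas) {
        Set<String> deletable = new HashSet<>();
        for (String replica : new HashSet<>(spaces.keySet())) {
            if (!liveReplicas.contains(replica)) {
                deletable.addAll(releaseSpace(replica));
            }
        }
        return deletable;
    }

    int refCount(String file) {
        return refCounts.getOrDefault(file, 0);
    }
}
```

In this sketch, if a replica is removed while its node is down, nothing calls releaseSpace for it, so its files keep a nonzero count until some other actor runs a pass like sweepOrphans against the current list of replicas. Whether existing SolrCloud logic already covers that case is exactly the open question raised in point 3.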
> Shared storage -- BlobDirectory (de-duping)
> -------------------------------------------
>
>                 Key: SOLR-15051
>                 URL: https://issues.apache.org/jira/browse/SOLR-15051
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This proposal is a way to accomplish shared storage in SolrCloud with a few
> key characteristics: (A) using a Directory implementation, (B) delegating to
> a backing local file Directory as a kind of read/write cache, (C) replicas
> have their own "space", (D) de-duplication across replicas via reference
> counting, (E) uses ZK but separately from SolrCloud stuff.
>
> The Directory abstraction is a good one, and helps isolate shared storage
> from the rest of SolrCloud that doesn't care. Using a backing normal file
> Directory is faster for reads and is simpler than the BlockCache of Solr's
> HdfsDirectory. Replicas having their own space solves the problem of multiple
> writers (e.g. of the same shard) trying to own and write to the same space,
> and it implies that any of Solr's replica types can be used, along with what
> goes along with them, like peer-to-peer replication (sometimes faster/cheaper
> than pulling from shared storage). A de-duplication feature solves needless
> duplication of files across replicas and from parent shards (i.e. from shard
> splitting). The de-duplication feature requires a place to cache directory
> listings so that they can be shared across replicas and atomically updated;
> this is handled via ZooKeeper. Finally, some sort of Solr daemon /
> auto-scaling code should be added to implement "autoAddReplicas", especially
> to provide for a scenario where the leader is gone and can't be replicated
> from directly but we can access shared storage.
>
> For more about shared storage concepts, consider looking at the description
> in SOLR-13101 and the linked Google Doc.
>
> *[PROPOSAL DOC|https://docs.google.com/document/d/1kjQPK80sLiZJyRjek_Edhokfc5q9S3ISvFRM2_YeL8M/edit?usp=sharing]*

--
This message was sent by Atlassian Jira (v8.3.4#803005)