[ https://issues.apache.org/jira/browse/SOLR-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170108#comment-17170108 ]
Noble Paul commented on SOLR-14613:
-----------------------------------

{quote}Would be happy to get a bit more context on the value this proposal brings compared to the ongoing effort in PR 1684 because I don't see it.{quote}
The proposal is to provide a consistent view of the Solr cluster across the codebase as plain, simple interfaces. It is not intended to serve only the assign framework. I have been working on the Solr codebase for a very long time, and this is one thing that is sorely missing.

{quote}Also, unclear how the statement in the SimpleMap Javadoc "It is designed to support large datasets without consuming lot of memory" is backed by reality. SimpleMap is a LinkedHashMap{quote}
The purpose of SimpleMap is to replace all uses of NamedList, SimpleOrderedMap and Map in Solr. It will also be implemented by POJOs in the future. LinkedSimpleHashMap is provided as an example of how one can be implemented.

{quote}SimpleMap does not implement Map and this makes it harder for new developers to approach (and its name is misleading).{quote}
Yes, it was done on purpose. The idea is to have an interface that is efficient in both memory and performance. Map is a very bad interface to use generically: it has too many methods and is hard to implement as a generic Map-like interface. New developers do not have to create implementations of SimpleMap. They'll use one of the readily provided implementations such as LinkedSimpleHashMap (which is nothing but a LinkedHashMap). You are seeing SimpleMap as a HashMap. I'm looking at potential impls that may work off a byte[], a POJO, a file, etc. Trust me, I have tried implementing a Map on top of these and it was terrible.

{quote}This interface does not refine the general contracts of the equals and hashCode methods.{quote}
This framework will never have to worry about CharSequence keys. It will always work with Strings.
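To make the SimpleMap argument concrete, here is a minimal sketch, not the code from the patch. Only the names SimpleMap and LinkedSimpleHashMap come from this discussion; the two-method shape of the interface ({{get}} and {{forEachEntry}}), the {{NodeInfo}} class and the {{describe}} helper are assumptions made purely for illustration. It shows why a small String-keyed view is easy to back with either a LinkedHashMap or a plain object, which is much harder to do with the full java.util.Map contract.

```java
import java.util.LinkedHashMap;
import java.util.function.BiConsumer;

public class SimpleMapSketch {

  /** Hypothetical minimal, String-keyed, read-only view. The real SimpleMap may differ. */
  public interface SimpleMap<T> {
    T get(String key);

    void forEachEntry(BiConsumer<String, ? super T> fn);
  }

  /** Map-backed variant: a LinkedHashMap that also exposes the small view. */
  public static class LinkedSimpleHashMap<T> extends LinkedHashMap<String, T>
      implements SimpleMap<T> {
    @Override
    public T get(String key) {
      return super.get(key); // delegates to LinkedHashMap.get(Object)
    }

    @Override
    public void forEachEntry(BiConsumer<String, ? super T> fn) {
      forEach(fn); // insertion order is preserved by LinkedHashMap
    }
  }

  /** POJO-backed variant (hypothetical class): fields are exposed without copying into a map. */
  public static class NodeInfo implements SimpleMap<Object> {
    private final String name;
    private final int cores;

    public NodeInfo(String name, int cores) {
      this.name = name;
      this.cores = cores;
    }

    @Override
    public Object get(String key) {
      switch (key) {
        case "name": return name;
        case "cores": return cores;
        default: return null;
      }
    }

    @Override
    public void forEachEntry(BiConsumer<String, ? super Object> fn) {
      fn.accept("name", name);
      fn.accept("cores", cores);
    }
  }

  /** Consumer code depends only on the small interface, not on the backing store. */
  public static <T> String describe(SimpleMap<T> m) {
    StringBuilder sb = new StringBuilder();
    m.forEachEntry((k, v) -> sb.append(k).append('=').append(v).append(' '));
    return sb.toString().trim();
  }

  public static void main(String[] args) {
    LinkedSimpleHashMap<Object> props = new LinkedSimpleHashMap<>();
    props.put("name", "node1");
    props.put("cores", 4);
    System.out.println(describe(props));                    // name=node1 cores=4
    System.out.println(describe(new NodeInfo("node2", 8))); // name=node2 cores=8
  }
}
```

In this sketch the POJO-backed variant needs two short methods, whereas implementing java.util.Map would also require entrySet, keySet, values, size and the mutation methods.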
There is a small subset of cases where the keys might never be deserialized to String objects (that is for very efficient deserialization and serialization). We will never have to worry about it.

{quote}I believe it's not the responsibility of the plugin to return the final state{quote}
It's possible to implement it in other ways. Any plugin that implements AssignStrategy will have an in-memory model of the SolrCluster after the placement is computed. What we need is for the next computation to start from the previous state. A very simple, naive impl can just return the original SolrCluster object. From my past experience, it is pretty easy to produce a synthetic SolrCluster that represents the new state. Maybe we can provide a utility method to recreate this SolrCluster from the decisions, but that is sub-optimal. Any serious implementation MUST provide a final state, because executing the decisions takes time and other computations usually need to start immediately.

bq. Why do placement code need a URL to the node? Are we planning to allow plugin code to go do whatever they want to do or are we targeting a controlled (and simpler) environment?

You are missing the larger picture. This API is to be used across the Solr codebase. The implementations of these interfaces will most likely live in the {{org.apache.solr.common.cloud}} packages, and you will just consume them in the placement. The placement framework will never need it, nor should it use it.

{quote}A collection has a name so should likely have a getName() method.{quote}
Thanks. I just added it. It was supposed to be there.

{quote}Replica would likely be a better and simpler name{quote}
Yes, Replica was my preferred name.
It was already taken, and reusing it will most likely be a problem, as we will see it all over the codebase and people reading the code will be confused as to which one it is.

{quote}NodeMetrics: This interface actually exposes a large part of the implementation{quote}
Yes. There is nothing wrong with concrete impls; not everything has to be an interface. Let's keep in mind that there are 2 types of interfaces:
* The ones plugins implement. They should be purely interfaces.
* The ones implemented by Solr and handed over to the plugin. It does not matter whether they are interfaces, classes or enums.

What matters is that the API surface area is minimal and easily understood. We should not dogmatically go with "only interfaces".

{quote}WorkOrder:Rather than having a notion of WorkOrder Type,{quote}
Based on my experience implementing the previous framework, there will only be a limited number of WorkOrder types. I see an enum as more suitable (I'm ambivalent on this; we can go either way).

{quote}AssignContext .A system in which everything can be fetched by a minimal number of round trips to remote nodes (i.e. one per node) is preferable.{quote}
Exactly. This is designed with that in mind. You should not make multiple calls to the same node. All attributes of a node are fetched in one call (including system properties, e.g. {{NodeMetrics.Property.SYSPROP.getMetrics("someproperty");}}).

{quote}As a general note, I believe sample code of how these interfaces are to be used would be very useful{quote}
Yes. They're coming.
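Until the real samples land, the "one call per node" point can be illustrated with a rough sketch. Everything named below ({{RemoteFetcher}}, {{SnapshotContext}}, the key strings "freedisk" and "sysprop.someproperty") is hypothetical and is not the API in the patch. The assumption is that the caller declares up front which attributes it needs, so the first lookup for a node fetches all of them in a single request and later lookups are served from that snapshot.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class NodeSnapshotSketch {

  /** Hypothetical remote call: one request returns all requested attributes of one node. */
  public interface RemoteFetcher {
    Map<String, Object> fetch(String node, Set<String> keys);
  }

  /** Hypothetical context that declares the keys up front and fetches each node once. */
  public static class SnapshotContext {
    private final RemoteFetcher fetcher;
    private final Set<String> keys;
    private final Map<String, Map<String, Object>> snapshots = new HashMap<>();
    private int remoteCalls = 0;

    public SnapshotContext(RemoteFetcher fetcher, String... keys) {
      this.fetcher = fetcher;
      this.keys = new LinkedHashSet<>(Arrays.asList(keys));
    }

    /** Returns one attribute; only the first lookup per node triggers a remote call. */
    public Object get(String node, String key) {
      Map<String, Object> snapshot = snapshots.get(node);
      if (snapshot == null) {
        remoteCalls++;
        snapshot = fetcher.fetch(node, keys); // all declared keys in a single round trip
        snapshots.put(node, snapshot);
      }
      return snapshot.get(key);
    }

    public int remoteCalls() {
      return remoteCalls;
    }
  }

  public static void main(String[] args) {
    // Fake fetcher standing in for a real network call.
    RemoteFetcher fake = (node, keys) -> {
      Map<String, Object> values = new HashMap<>();
      for (String k : keys) {
        values.put(k, node + ":" + k);
      }
      return values;
    };

    SnapshotContext ctx = new SnapshotContext(fake, "freedisk", "sysprop.someproperty");
    ctx.get("node1", "freedisk");
    ctx.get("node1", "sysprop.someproperty"); // served from the node1 snapshot
    ctx.get("node2", "freedisk");
    System.out.println(ctx.remoteCalls()); // prints 2: one call per node
  }
}
```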

> Provide a clean API for pluggable replica assignment implementations
> --------------------------------------------------------------------
>
>                 Key: SOLR-14613
>                 URL: https://issues.apache.org/jira/browse/SOLR-14613
>             Project: Solr
>          Issue Type: Improvement
>          Components: AutoScaling
>            Reporter: Andrzej Bialecki
>            Assignee: Ilan Ginzburg
>            Priority: Major
>          Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> As described in SIP-8 the current autoscaling Policy implementation has
> several limitations that make it difficult to use for very large clusters and
> very large collections. SIP-8 also mentions the possible migration path by
> providing alternative implementations of the placement strategies that are
> less complex but more efficient in these very large environments.
> We should review the existing APIs that the current autoscaling engine uses
> ({{SolrCloudManager}}, {{AssignStrategy}}, {{Suggester}} and related
> interfaces) to see if they provide a sufficient and minimal API for plugging
> in alternative autoscaling placement strategies, and if necessary refactor
> the existing APIs.
> Since these APIs are internal it should be possible to do this without
> breaking back-compat.

--
This message was sent by Atlassian Jira (v8.3.4#803005)