[
https://issues.apache.org/jira/browse/SOLR-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170108#comment-17170108
]
Noble Paul commented on SOLR-14613:
-----------------------------------
{quote}Would be happy to get a bit more context on the value this proposal
brings compared to the ongoing effort in PR 1684 because I don't see it.
{quote}
The proposal is to provide a consistent view of the Solr cluster across the
codebase as plain, simple interfaces. It is not intended to serve only the
assign framework. I have been reading and working on the Solr codebase for a
very long time, and this is one thing that is sorely missing.
{quote}Also, unclear how the statement in the SimpleMap Javadoc "It is designed
to support large datasets without consuming lot of memory" is backed by
reality. SimpleMap is a LinkedHashMap
{quote}
The purpose of SimpleMap is to replace all uses of NamedList, SimpleOrderedMap
and Map in Solr. It will also be implemented by POJOs in the future.
LinkedSimpleHashMap is provided as an example of how one can be implemented.
{quote}SimpleMap does not implement Map and this makes it harder for new
developers to approach (and its name is misleading).
{quote}
Yes, it was done on purpose. The idea is to have an interface that is efficient
(in memory and performance). Map is a very bad interface to use generically: it
has too many methods and is hard to implement as a generic Map-like interface.
New developers do not have to create implementations of SimpleMap. They'll use
one of the readily provided implementations, such as LinkedSimpleHashMap (which
is nothing but a LinkedHashMap).
You are seeing SimpleMap as a HashMap. I'm looking at potential impls that may
work off a byte[], a POJO, a file, etc. Trust me, I have tried implementing a
Map on top of these and it was terrible.
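To make the point concrete, here is a minimal sketch of what a small map-like
interface could look like, with one impl backed by a LinkedHashMap and one
backed by a flat array. All names here ({{TinyMap}}, {{LinkedTinyMap}},
{{FlatArrayTinyMap}}) are hypothetical and are not the SimpleMap API from the
patch; the sketch only illustrates why a two-method contract is easier to
implement over non-Map storage than java.util.Map.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical read-only, map-like contract: only two methods are mandatory.
interface TinyMap<T> {
    T get(String key);

    void forEachEntry(BiConsumer<String, ? super T> fn);

    // Default size() just counts entries; impls may override it.
    default int size() {
        int[] count = {0};
        forEachEntry((k, v) -> count[0]++);
        return count[0];
    }
}

// Impl that simply delegates to a LinkedHashMap.
final class LinkedTinyMap<T> implements TinyMap<T> {
    private final Map<String, T> delegate = new LinkedHashMap<>();

    LinkedTinyMap<T> put(String key, T value) {
        delegate.put(key, value);
        return this;
    }

    @Override
    public T get(String key) {
        return delegate.get(key);
    }

    @Override
    public void forEachEntry(BiConsumer<String, ? super T> fn) {
        delegate.forEach(fn);
    }

    @Override
    public int size() {
        return delegate.size();
    }
}

// Impl backed by a flat array of alternating keys and values (no Map at all).
final class FlatArrayTinyMap implements TinyMap<String> {
    private final String[] pairs;

    FlatArrayTinyMap(String... pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException("expected key/value pairs");
        }
        this.pairs = pairs.clone();
    }

    @Override
    public String get(String key) {
        // Linear scan; returns null when the key is absent.
        for (int i = 0; i < pairs.length; i += 2) {
            if (pairs[i].equals(key)) {
                return pairs[i + 1];
            }
        }
        return null;
    }

    @Override
    public void forEachEntry(BiConsumer<String, ? super String> fn) {
        for (int i = 0; i < pairs.length; i += 2) {
            fn.accept(pairs[i], pairs[i + 1]);
        }
    }
}
```

Callers only see {{TinyMap}}, so the backing storage can change without
touching them. Implementing the full java.util.Map contract (entrySet, keySet,
values, remove, and so on) over the flat array would take far more code.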
{quote}This interface does not refine the general contracts of the equals and
hashCode methods.
{quote}
This framework will never have to worry about CharSequence keys. It will always
work with Strings. There is a small subset of cases where the keys may never be
deserialized to String objects (that is for very efficient deserialization and
serialization). We will never have to worry about it.
{quote}
I believe it's not the responsibility of the plugin to return the final state
{quote}
It's possible to implement it in other ways. Any plugin that implements
AssignStrategy will have an in-memory model of the SolrCluster after the
placement is computed. What we need is for the next computation to start from
the previous state. A very simple, naive impl can easily return the original
SolrCluster object. From my experience, it is pretty easy to provide a
synthetic SolrCluster that represents the new state. Maybe we can provide a
utility method to recreate this SolrCluster from the decisions, but that is
sub-optimal. Any serious implementation MUST provide a final state, because
executing the decisions takes time and other computations usually need to start
immediately.
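As a rough illustration of "a synthetic SolrCluster that represents the new
state", the sketch below derives a new immutable snapshot from an old one plus
a list of decisions, without waiting for the decisions to be executed. The
names {{ClusterView}} and {{AddReplica}} are hypothetical and much simpler than
the interfaces in the patch; this is an assumption about the shape, not the
proposed API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical decision: place one replica of a shard on a node.
final class AddReplica {
    final String node;
    final String shard;

    AddReplica(String node, String shard) {
        this.node = node;
        this.shard = shard;
    }
}

// Hypothetical immutable snapshot: node name -> shards hosted on that node.
final class ClusterView {
    private final Map<String, List<String>> shardsByNode;

    ClusterView(Map<String, List<String>> shardsByNode) {
        // Defensive deep copy so the snapshot cannot change after creation.
        Map<String, List<String>> copy = new LinkedHashMap<>();
        shardsByNode.forEach((node, shards) ->
            copy.put(node, Collections.unmodifiableList(new ArrayList<>(shards))));
        this.shardsByNode = Collections.unmodifiableMap(copy);
    }

    List<String> shardsOn(String node) {
        return shardsByNode.getOrDefault(node, Collections.emptyList());
    }

    // Returns the expected final state; the current snapshot is left untouched.
    ClusterView apply(List<AddReplica> decisions) {
        Map<String, List<String>> next = new LinkedHashMap<>();
        shardsByNode.forEach((node, shards) -> next.put(node, new ArrayList<>(shards)));
        for (AddReplica d : decisions) {
            next.computeIfAbsent(d.node, k -> new ArrayList<>()).add(d.shard);
        }
        return new ClusterView(next);
    }
}
```

The next placement computation can start from the result of {{apply}}
immediately, while the decisions themselves are still being executed.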
{quote}Why do placement code need a URL to the node? Are we planning to allow
plugin code to go do whatever they want to do or are we targeting a controlled
(and simpler) environment?
{quote}
You are missing the larger picture. This API is to be used across the Solr
codebase. The implementations of these interfaces will most likely live in the
{{org.apache.solr.common.cloud}} packages and you will just consume them in the
placement. The placement framework will never need the URL, nor should it use
it.
{quote}A collection has a name so should likely have a getName() method.
{quote}
Thanks. I just added it. It was supposed to be there.
{quote}Replica would likely be a better and simpler name
{quote}
Yes, Replica was my preferred name. It was already taken, and reusing it will
most likely be a problem: we will see it all over the codebase, and people
reading the code will be confused as to which one it is.
{quote}NodeMetrics: This interface actually exposes a large part of the
implementation
{quote}
Yes. There is nothing wrong with using concrete impls rather than always using
interfaces. Let's keep in mind that there are 2 types of interfaces:
* The ones plugins implement. They should be purely interfaces.
* The ones implemented by Solr and handed over to the plugin. It does not
matter if they are interfaces/classes/enums. What matters is that the API
surface area is minimal and easily understood. We should not dogmatically go
with "only interfaces".
{quote}WorkOrder:Rather than having a notion of WorkOrder Type,
{quote}
Having implemented the previous framework, I expect there will only be a
limited number of WorkOrder types. I see an enum as more suitable (I'm
ambivalent on this; we can go either way).
{quote}AssignContext .A system in which everything can be fetched by a minimal
number of round trips to remote nodes (i.e. one per node) is preferable.
{quote}
Exactly. This is designed with that in mind. You should not make multiple calls
to the same node. All attributes of a node are fetched in one call (including
system properties {{NodeMetrics.Property.SYSPROP.getMetrics("someproperty");}}).
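A minimal sketch of the "one call per node" idea follows. The class
{{NodeAttributeCache}} and the attribute names are hypothetical and are not
part of AssignContext; the remote call is stood in for by a plain function so
the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: fetches ALL attributes of a node on first access and
// serves every later lookup for that node from memory.
final class NodeAttributeCache {
    private final Function<String, Map<String, Object>> fetchAll;
    private final Map<String, Map<String, Object>> byNode = new HashMap<>();
    private int remoteCalls;

    // fetchAll stands in for the single remote round trip to a node.
    NodeAttributeCache(Function<String, Map<String, Object>> fetchAll) {
        this.fetchAll = fetchAll;
    }

    Object get(String node, String attribute) {
        Map<String, Object> attrs = byNode.get(node);
        if (attrs == null) {
            remoteCalls++; // at most one round trip per node
            attrs = fetchAll.apply(node);
            byNode.put(node, attrs);
        }
        return attrs.get(attribute);
    }

    int remoteCalls() {
        return remoteCalls;
    }
}
```

Asking for a second attribute of the same node, such as a system property,
does not trigger another round trip; only the first lookup for a new node does.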
{quote}As a general note, I believe sample code of how these interfaces are to
be used would be very useful
{quote}
Yes. They're coming.
> Provide a clean API for pluggable replica assignment implementations
> --------------------------------------------------------------------
>
> Key: SOLR-14613
> URL: https://issues.apache.org/jira/browse/SOLR-14613
> Project: Solr
> Issue Type: Improvement
> Components: AutoScaling
> Reporter: Andrzej Bialecki
> Assignee: Ilan Ginzburg
> Priority: Major
> Time Spent: 10h 10m
> Remaining Estimate: 0h
>
> As described in SIP-8 the current autoscaling Policy implementation has
> several limitations that make it difficult to use for very large clusters and
> very large collections. SIP-8 also mentions the possible migration path by
> providing alternative implementations of the placement strategies that are
> less complex but more efficient in these very large environments.
> We should review the existing APIs that the current autoscaling engine uses
> ({{SolrCloudManager}} , {{AssignStrategy}} , {{Suggester}} and related
> interfaces) to see if they provide a sufficient and minimal API for plugging
> in alternative autoscaling placement strategies, and if necessary refactor
> the existing APIs.
> Since these APIs are internal it should be possible to do this without
> breaking back-compat.