[jira] [Commented] (SOLR-13661) A package management system for Solr

Jira Thu, 26 Sep 2019 10:26:09 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938830#comment-16938830
 ]


Jan Høydahl commented on SOLR-13661:
------------------------------------

I have just looked at some of the code and will not have time for a more 
thorough review until the week after next.

Here is a list of my main concerns so far:
 # My main concern is that too many decisions seem to be made with too few 
eyes, combined with a goal of merging very soon.
 # One example of "too few eyes" is that the "package" concept seems to be 
designed for ONE use case only: a customer's internal custom packages, with 
arbitrary local naming of repos and packages. I think before such a feature 
goes mainstream, the design should also include converting some of our contrib 
modules to packages that we release as separate binaries in the mirrors, and 
enable an "apache" Repo as default. That requires some more thought behind 
stable name-spacing, so that e.g. “bin/solr install ltr” will mean the same for 
all customers. Perhaps that would mean some name spacing or name collision 
resolution, so if you have a custom local repo with a package also called 
"ltr", then you get an error which can be resolved by qualifying the package 
name like e.g. "apache:ltr" or "mylocalrepo:ltr".
 # We need a plan for how 3rd party plugin developers can publish their plugins 
on their own web site or on GitHub in a well defined way. The use of 
pf4j-update lib takes care of much of this, and this is also something that can 
be added incrementally, but the design needs to allow for this. My POC has a 
RepositoryFactory class that parses the repo URL (e.g. "bin/solr plugin repo 
add myrepo https://host.com/repo/name") and selects the 
GitHubUpdateRepository if it is a GitHub URL, the ApacheMirrorsUpdateRepository 
if it is an apache.org address, and the default site/FS repo otherwise. Each of 
these handles the download process and signature verification in a different way.
 # Hot/Cold deploy. I don't like systems where you need to spin up a server as 
part of the install. We already have this with setting urlScheme in ZK for 
HTTPS. But ideally it should be possible to do a Solr install including plugins 
before you need to spin up Solr. Elasticsearch uses such a static plugin 
installer (but it also doesn't support hot install). Having a "staging" folder 
where you can drop package ZIP files (or JARs), from which the node can 
self-install packages during first startup, could be one way to handle this.
 # Robustness during upgrades is another concern. The design doc does not 
mention what happens during a Solr upgrade. We should think through the 
scenario for both minor and major version upgrades of Solr, and by that I mean 
rolling upgrades. Having ZK as the only master for which version of a plugin 
should be used is probably not sufficient, as during a rolling Solr upgrade you 
could have one node on 8.3 and another node on, say, 9.0. And you could have 
packageA:1.0 installed but Solr9 requires v2.0 due to removal of some APIs or 
whatnot. In the cold scenario (as in POC) you'd shut down a Solr node, upgrade 
Solr, then run "bin/solr plugin upgrade outdated" before starting that node 
again, and that would make sure it has the correct plugin version. Since you 
cannot upgrade Solr while it is running, perhaps we need to hook in some 
validation on node startup that it does not have any packages that won't work 
with that Solr version, and refuse to start. We would also need some way to 
have two versions of a package installed at the same time, so that instead of 
using the latest, the Solr node selects the newest version that is compatible. 
Then when that node is upgraded it will select the new version of the plugin 
automatically based on Version.java.
 # The package system deserves its own znode in ZooKeeper instead of abusing 
clusterProps.
 # I don't like the concept of an admin needing to "deploy" a package to a 
collection using a command. Rather, the collections should require a set of 
packages (optionally with min version) and fail to start if they are not 
available in the system. If the packages are available in the system, the 
collection should gain access to the package(s) it requires without running a 
deploy command.
 # Simplicity should take the front seat. Don't force users to add 
{{package="my-pkg"}} wherever they today can say 
{{class="com.example.MyPlugin"}}. This is what we have ResourceLoader and class 
loaders for. If we cannot find {{com.example.MyPlugin}} in the main class 
loader, then hunt through every package class loader until there is a match; if 
there is no match, throw ClassNotFound. (I never liked the {{runtimeLib=true}} 
equivalent in the old blob store.)
 # The package design says that a manifest is not required for a package and 
that any plain jar can function as a package just by registering it manually. 
That is ok as an alternative workflow, but most packages (and all packages from 
official Apache and official public 3rd party repos) should be required to have 
a manifest with package name, version, sha-hash, signature and 
version-compatibility.
 # Some packages (contribs) have a ton of jars, thus we should support both 
PF4J's JAR and ZIP plugins. Other plugin formats can be added later (such as 
jar + list of maven coordinates for dependencies)
 # The plugin initialization commands seem complex and unnecessary for a first 
version. An alternative solution to the install/upgrade problem that I have 
been thinking of is to design a well-defined {{Package.java}} base-class, and 
if Solr finds that class in a package/plugin, then it will execute pre-defined 
methods on that class, such as {{upgrade(Version from, Version to)}} and 
{{uninstall()}} that will be called in different phases. PF4J defines a 
Plugin.java class 
([https://github.com/pf4j/pf4j/blob/master/pf4j/src/main/java/org/pf4j/Plugin.java])
 with start(), stop() and delete() methods for the different life cycles. We 
could of course use and extend that, or implement something similar for the 
plugin to be able to do anything it likes as part of installing, upgrading, 
uninstalling... That's why I think we should take out the "setup-command" stuff 
from v1.
 # The FSBlobstore will be very user-unfriendly to browse with sha256 as file 
names. Appending file names has been suggested. Other solutions could be an 
additional <sha256>.properties file alongside the blob file that contains the 
real file name, and possibly other metadata as well, such as what package it is 
part of.
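
To illustrate the name collision handling from point 2, here is a minimal sketch in Java. It is not code from the POC or from the SOLR-13661 patch; the class {{PackageNameResolver}}, its method and the repo-to-packages map are hypothetical names chosen for illustration. It assumes an unqualified name such as "ltr" resolves only if exactly one repo offers it, and that a qualified name such as "apache:ltr" is looked up in that repo alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Solr: resolves a package name to "repo:package".
class PackageNameResolver {

    // repos maps a repo name (e.g. "apache") to the package names it offers.
    static String resolve(String name, Map<String, Set<String>> repos) {
        int colon = name.indexOf(':');
        if (colon >= 0) {
            // Qualified form "repo:package": look only in the named repo.
            String repo = name.substring(0, colon);
            String pkg = name.substring(colon + 1);
            Set<String> offered = repos.get(repo);
            if (offered == null || !offered.contains(pkg)) {
                throw new IllegalArgumentException(
                    "No package '" + pkg + "' in repo '" + repo + "'");
            }
            return repo + ":" + pkg;
        }
        // Unqualified form: collect every repo that offers this package name.
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : repos.entrySet()) {
            if (e.getValue().contains(name)) {
                matches.add(e.getKey() + ":" + name);
            }
        }
        if (matches.isEmpty()) {
            throw new IllegalArgumentException("Unknown package '" + name + "'");
        }
        if (matches.size() > 1) {
            // Name collision: ask the user to qualify the name.
            throw new IllegalStateException(
                "Ambiguous package '" + name + "', qualify it as one of " + matches);
        }
        return matches.get(0);
    }
}
```

With repos "apache" and "mylocalrepo" both offering "ltr", resolving "ltr" fails with the ambiguity error, while "apache:ltr" and "mylocalrepo:ltr" each resolve to themselves.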

Phew. This is what I have so far on the architecture and design level. This 
aspect is more important than code review at this stage, so I'll leave it there 
for now.
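
As a rough illustration of the rolling-upgrade idea in point 5, the following Java sketch picks the newest installed package version that is compatible with the Solr version a node is running. Everything in it is an assumption made for illustration: the class {{CompatiblePackageSelector}}, the {{PackageVersion}} fields and the plain dotted-number version comparison are hypothetical and do not come from the design doc, the POC or Solr's Version.java.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not Solr code: choose a package version per node.
class CompatiblePackageSelector {

    // One installed version of a package with its assumed Solr compatibility range.
    static final class PackageVersion {
        final String version;          // package version, e.g. "1.0"
        final String minSolr;          // lowest supported Solr version (inclusive)
        final String maxSolrExclusive; // first unsupported Solr version (exclusive)

        PackageVersion(String version, String minSolr, String maxSolrExclusive) {
            this.version = version;
            this.minSolr = minSolr;
            this.maxSolrExclusive = maxSolrExclusive;
        }
    }

    // Compare dotted numeric versions such as "8.3" and "9.0"; missing parts count as 0.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Newest installed version whose range contains solrVersion, or empty if none fits.
    static Optional<PackageVersion> select(List<PackageVersion> installed, String solrVersion) {
        PackageVersion best = null;
        for (PackageVersion p : installed) {
            boolean compatible = compare(solrVersion, p.minSolr) >= 0
                && compare(solrVersion, p.maxSolrExclusive) < 0;
            if (compatible && (best == null || compare(p.version, best.version) > 0)) {
                best = p;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

With packageA 1.0 declared for Solr 8.0 up to (not including) 9.0 and packageA 2.0 for 9.0 up to 10.0, a node on 8.3 would select 1.0 and a node on 9.0 would select 2.0. An empty result is the case where startup validation could refuse to start the node.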

Please do not take this feedback as criticism, Noble/Ishan. None of these 
bullets are absolutes, and I do not think that I have the only possible 
solution. But I put a lot of careful thought into the POC which I feel is 
largely lacking here. I'm pretty sure we as a community will arrive at a better 
design by considering and discussing more use cases than the one use case/need 
of your employer.

> A package management system for Solr
> ------------------------------------
>
>                 Key: SOLR-13661
>                 URL: https://issues.apache.org/jira/browse/SOLR-13661
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Noble Paul
>            Assignee: Ishan Chattopadhyaya
>            Priority: Major
>              Labels: package
>         Attachments: plugin-usage.png, repos.png
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Here's the design doc:
> https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

