[
https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938830#comment-16938830
]
Jan Høydahl commented on SOLR-13661:
------------------------------------
I have just looked at some of the code and will not have time for a more
thorough review until week after next.
Here is a list of my main concerns so far:
# My main concern is that too many decisions seem to be made with too few
eyes, combined with a goal of merging very soon.
# One example of "too few eyes" is that the "package" concept seems to be
designed for ONE use case only, customer's internal custom packages, with
arbitrary local naming of repos and packages. I think before such a feature
goes mainstream, the design should also include converting some of our contrib
modules to packages that we release as separate binaries in the mirrors, and
enable an "apache" Repo as default. That requires some more thought behind
stable name-spacing, so that e.g. “bin/solr install ltr” will mean the same for
all customers. Perhaps that would mean some name spacing or name collision
resolution, so if you have a custom local repo with a package also called
"ltr", then you get an error which can be resolved by qualifying the package
name like e.g. "apache:ltr" or "mylocalrepo:ltr".
# We need a plan for how 3rd party plugin developers can publish their plugins
on their own web site or on GitHub in a well defined way. The use of
pf4j-update lib takes care of much of this, and this is also something that can
be added incrementally, but the design needs to allow for this. My POC has a
RepositoryFactory class that parses the repo URL (e.g. "bin/solr plugin repo
add myrepo [https://host.com/repo/name]") and selects the
GitHubUpdateRepository if it is a GitHub URL, the ApacheMirrorsUpdateRepository
if it is an apache.org address, and the default site/FS repo else. Each of
these handle the download process and signature verification in a different way.
# Hot/Cold deploy. I don't like systems where you, as part of the install need
to spin up a server. We already have this with setting urlScheme in ZK for
HTTPS. But ideally it should be possible to do a Solr install including plugins
before you need to spin up Solr. Elasticsearch uses such a static plugin
installer (but also don't support hot install). Having a "staging" folder where
you can drop package ZIP files (or JARs) where the node can self-install
packages during first startup could be one way to handle this.
# Robustness during upgrades is another concern. I don't see mentioned in the
design doc what happens during a Solr upgrade. We should think through the
scenario for both minor and major version upgrade for Solr, and then I mean
rolling upgrade. Having ZK as only master for what version of a plugin should
be used is probably not sufficient, as during a rolling Solr upgrade, you could
have one node on 8.3 and another node on, say, 9.0. And you could have
packageA:1.0 installed but Solr9 requires v2.0 due to removal of some APIs or
what not. In the cold scenario (as in POC) you'd shut down a Solr node, upgrade
Solr, then run "bin/solr plugin upgrade outdated" before starting that node
again, and that would make sure it has the correct plugin version. Since you
cannot upgrade Solr while it is running, perhaps we need to hook in some
validation on node startup that it does not have any packages that won't work
with that Solr version, and refuse to start. And some way to have two versions
of a package installed at the same time, and then instead of using the latest,
the Solr node will select the newest version that is compatible. Then when that
node is upgraded it will select the new version of the plugin automatically
based on Version.java.
# Package system deserves its own Znode in Zookeeper instead of abusing
clusterProps
# I don't like the concept of an admin needing to "deploy" a package to a
collection using a command. Rather, the collections should require a set of
packages (optionally with min version) and fail to start if it is not available
in the system. If the package is available in the system, the collection should
gain access to the package(s) it required without running a deploy command.
# Simplicity should be front seat. Don't force users to have to add
{{package="my-pkg"}} wherever they today can say
{{class="com.example.MyPlugin"}}. This is what we have ResourceLoader and class
loaders for. If we cannot find {{com.example.MyPlugin}} in main class loader,
then hunt through every package class loader until you have a match, if no
match then throw ClassNotFound. (I never liked the {{runtimeLib=true}}
equivalent in the old blob store.)
# The package design says that a manifest is not required for a package and
that any plain jar can function as a package just by registering it manually.
That is ok as an alternative workflow, but most packages (and all official
Apache and from official public 3rd party repos) should be required to have a
manifest with package name, version, she-hash, signature and
version-compatibility.
# Some packages (contribs) have a ton of jars, thus we should support both
PF4J's JAR and ZIP plugins. Other plugin formats can be added later (such as
jar + list of maven coordinates for dependencies)
# The plugin initialization commands seem complex and unnecessary for a first
version. An alternative solution to the install/upgrade problem that I have
been thinking of is to design a well-defined {{Package.java}} base-class, and
if Solr finds that class in a package/plugin, then it will execute pre-defined
methods on that class, such as {{upgrade(Version from, Version to)}} and
{{uninstall()}} what will be called in different phases. PF4J defines a
Plugin.java class
([https://github.com/pf4j/pf4j/blob/master/pf4j/src/main/java/org/pf4j/Plugin.java)]
with start(), stop() and delete() methods for the different life cycles. We
could of course use and extend that, or implement something similar for the
plugin to be able to do anything it likes as part of installing, upgrading,
uninstalling... That's why I think we should take out the "setup-command" stuff
from v1.
# The FSBlobstore will be very user-unfriendly to browse with sha256 as file
names. Appending file names has been suggested. Other solutions could be an
additional <sha256>.properties file alongside the blob file that contains the
real file name, an possibly other metadata as well, such as what package it is
part of.
Puuh. This is what I got so far on the architecture and design level. This
aspect is more important than code review at this stage, so I'll leave it there
for now.
Please do not take this feedback as criticism Noble/Ishan. None of these
bullets are absolutes, and I do not think that I have the only possible
solution. But I put a lot of careful thought into the POC which I feel is
largely lacking here. I'm pretty sure we as a community will arrive at a better
design by considering and discussing more use cases than the one use case/need
of your employer.
> A package management system for Solr
> ------------------------------------
>
> Key: SOLR-13661
> URL: https://issues.apache.org/jira/browse/SOLR-13661
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Noble Paul
> Assignee: Ishan Chattopadhyaya
> Priority: Major
> Labels: package
> Attachments: plugin-usage.png, repos.png
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Here's the design doc:
> https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]