[jira] [Comment Edited] (SOLR-15080) Apache Zeppelin Sandbox Integration

Jason Gerlowski (Jira) Fri, 26 Feb 2021 10:09:04 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291849#comment-17291849
 ]


Jason Gerlowski edited comment on SOLR-15080 at 2/26/21, 6:08 PM:
------------------------------------------------------------------

I've attached an updated version of this patch.  This version massages the CLI 
syntax to more closely fit other {{bin/solr}} commands.  It also cleans up some 
issues around starting/stopping Zeppelin and puts better help text in place 
(which in turn dragged in a few small SolrCLI refactors).  

bq. We drag in some extra interpreters like kotlin and influxdb, in a "perfect 
world" we wouldn't worry about them.

I took a look at this.  Zeppelin offers two downloads - one that includes *all* 
interpreters, and one that only includes a minimal set.  I assumed I'd 
accidentally used the former instead of the latter, but it turns out that the 
patch *does* use the minimal download (it's just not all that minimal).  I'm 
going to open a Zeppelin ticket to discuss slimming down the minimal 
distribution further, but we're stuck for the current Zeppelin release at least.

Still definitely on my list are testing on Windows and a fix for the 
{{update_interpreter}} subcommand.  If I can clear those away soon, I'll be 
looking to merge in the next week or so, so I'd love any testing help that 
people could offer on their own systems.  Eric, I think I addressed most of 
your feedback (other than creating additional zeppelin-solr commands, which we 
can handle independently of the integration here), but if I missed something or 
you've got more suggestions, let me know!

----

To get back to the question around making the nyc311 dataset available: I 
definitely agree that we should allow that, but I'm unsure about the approach, 
so I'd rather tackle it in a separate ticket.

I think I mentioned earlier potentially exposing this using bin/solr's 
{{-e example}} mechanism, but on second thought I'm less sure of this approach.  
Currently, Solr "examples" couple together the node/core topology with the 
dataset: for example, {{-e techproducts}} can only be used with Solr 
standalone, which is less than ideal.  Ideally you could run something like 
{{bin/solr example}} to set up a particular topology or deployment config, and 
then have a command like {{bin/solr exampledata}} capable of loading datasets 
into any of the example topologies.
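
To make the coupling problem concrete, here is a minimal Python sketch.  It is 
not part of the patch and does not reflect how bin/solr is implemented.  The 
topology and dataset names, and the arguments passed to the two proposed 
commands, are assumptions made purely for illustration.

```python
from itertools import product

# Hypothetical names for illustration only; this is not bin/solr code.
TOPOLOGIES = ["standalone", "cloud"]
DATASETS = ["techproducts", "nyc311"]

# Coupled model: each example hard-wires one topology to one dataset.
COUPLED_EXAMPLES = {"techproducts": ("standalone", "techproducts")}


def coupled_combinations():
    """Topology/dataset pairs reachable when an example bundles both."""
    return sorted(set(COUPLED_EXAMPLES.values()))


def decoupled_combinations():
    """Pairs reachable when topology and dataset are chosen separately."""
    return sorted(product(TOPOLOGIES, DATASETS))


def plan(topology, dataset):
    """Return the two proposed (hypothetical) commands for one setup."""
    if topology not in TOPOLOGIES:
        raise ValueError(f"unknown topology: {topology}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    # Command names come from the proposal above; the arguments are assumed.
    return [f"bin/solr example {topology}", f"bin/solr exampledata {dataset}"]


if __name__ == "__main__":
    print("coupled:  ", coupled_combinations())
    print("decoupled:", decoupled_combinations())
    print(plan("cloud", "nyc311"))
```

With the two choices separated, adding a dataset such as nyc311 makes it 
available on every topology without defining a new example per combination.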

Anyway, I'm going to punt on this for now to avoid any sort of rush on sorting 
that out.



> Apache Zeppelin Sandbox Integration  
> -------------------------------------
>
>                 Key: SOLR-15080
>                 URL: https://issues.apache.org/jira/browse/SOLR-15080
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-15080.patch, SOLR-15080.patch
>
>
> With the steady expansion of Solr's "Math Expression" and "Streaming 
> Expression" libraries, Solr has a lot of analytics and data exploration 
> capabilities to show off in a "notebook" environment.  Case in point - the 
> "Visual Guide to Math Expressions" being worked on in SOLR-13105.  These docs 
> make heavy use of screenshots taken from Zeppelin, a popular notebook project 
> run by the ASF.  Interested readers are going to want to try their own hand 
> at replicating the specific visualizations shown off in those docs, and at 
> using Solr's analytics capabilities more broadly.
> Zeppelin isn't hard to set up and run, but there are a few steps that might 
> deter or thwart unfamiliar users.  I'd love to see Solr make this easier by 
> offering some sort of integration point with Zeppelin to get users up and 
> running.
> I'm still up in the air on what form would be best for such an integration.  
> But as a strawman I've attached a patch that creates a "zeppelin" tool for 
> "bin/solr".
> This tool is in the same spirit as our Solr "examples" in that it sets a user 
> up to play with a particular use case without any fuss or configuration on 
> their part.  It will install Zeppelin, the Zeppelin "interpreter" needed to 
> talk to Solr, and the Zeppelin configs necessary to talk to a local Solr.  It 
> contains other commands to start/stop Zeppelin and clean out the Zeppelin 
> sandbox, but draws the line there in terms of exposing Zeppelin functionality 
> more broadly.



