I've got an initial PR for this here: 
https://github.com/apache/incubator-livy/pull/367

It uses Ubuntu Xenial (similar to our Travis environment) and installs the 
necessary Python packages and R. I have not added Spark to the image as it 
gets pulled down as a dependency, but I am considering doing that since it's 
such a large download.

I've got another version of the Dockerfile based off of "maven:3-jdk-8-alpine" 
that works as well. It's quite a bit smaller than the xenial version (521MB vs. 
1.18GB), but I'm still figuring out exactly how I want the Docker image to 
work. A couple of open questions:

- Is it important to keep the Dockerfile as close to the Travis environment as 
possible?
- Will alpine-based images be tough to maintain/support?
- Will we publish some version of the resulting images? If so, size could be 
important.
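
For concreteness, here's a rough sketch of what the alpine-based variant could 
look like. The package names and the pre-fetch step are illustrative, not 
final:

```dockerfile
# Sketch of the alpine-based Livy dev image; package names and the Spark
# pre-fetch step below are illustrative, not settled.
FROM maven:3-jdk-8-alpine

# Build/test dependencies for Livy (Python and R for the language tests).
RUN apk add --no-cache bash python3 py3-pip R R-dev

# Optionally pre-install a Spark distribution so builds/tests don't
# re-download it every time; Livy picks it up via SPARK_HOME.
ENV SPARK_HOME=/opt/spark

WORKDIR /workspace
```

Whether we bake Spark into the image (bigger image, faster builds) or let the 
build pull it down (smaller image, slower first build) ties back to the 
publishing/size question above.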

Damon

On 2022/12/03 01:20:45 Damon Cortesi wrote:
> Coming back to this as I get my dev environment up and running, there's 
> definitely an intermix of dependencies between Spark, Python, and R that I'm 
> still working out.
> 
> For example, when I try to start sparkR I get an error message that "package 
> ‘SparkR’ was built under R version 4.0.4", but locally I have R version 3.5.2 
> installed. Spark 3.3.1 says you need R 3.5+. That said, I think my version 
> of R works with Spark 2 (at least the tests indicate that...)
> 
> It'd be great to have a minimum viable environment with specific versions and 
> I hope to have that in a Docker environment by early next week. :) 
> 
> Currently I'm just basing it off of a Debian image with Java 8, although 
> there are Spark images that could be useful...
> 
> Damon
> 
> On 2022/11/20 18:55:35 larry mccay wrote:
> > Considering there is no download for anything older than 3.2.x on the
> > referred download page, we likely need some change to the README.md to
> > reflect a more modern version.
> > We also need more explicit instructions for installing Spark than just the
> > download. Whether we detail this or point to Spark docs that are sufficient
> > is certainly a consideration.
> > 
> > At the end of the day, we are missing any sort of quick start guide for
> > devs to be able to successfully build and/or run tests.
> > 
> > Thoughts?
> > 
> > On Sat, Nov 19, 2022 at 6:23 PM larry mccay <[email protected]> wrote:
> > 
> > > Hey Folks -
> > >
> > > Our Livy README.md indicates the following:
> > >
> > > To run Livy, you will also need a Spark installation. You can get Spark
> > > releases at https://spark.apache.org/downloads.html.
> > >
> > > Livy requires Spark 2.4+. You can switch to a different version of Spark
> > > by setting the SPARK_HOME environment variable in the Livy server
> > > process, without needing to rebuild Livy.
> > >
> > > Do we have any variation on this setup at this point in the real world?
> > >
> > > What do your dev environments actually look like and how are you
> > > installing what versions of Spark as a dependency?
> > >
> > > Thanks!
> > >
> > > --larry
> > >
> > 
> 
