Integration tests are a great idea.

Yeah, I guess going forward a direct git clone to download templates may be
a viable option.
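Something along these lines would do it - a tiny sh sketch that just prints
the clone command which would replace pio template get (the GitHub
organization and repository name below are only illustrative, not the
confirmed location of the templates):

```shell
#!/bin/sh
# Sketch: build the git command that would replace `pio template get`.
# The GitHub organization and repository name are illustrative only.
clone_template() {
  # $1 = template repository name, $2 = destination directory
  printf 'git clone --depth 1 https://github.com/apache/%s.git %s\n' "$1" "$2"
}

clone_template incubator-predictionio-template-recommender MyRecommendation
```

A --depth 1 clone keeps the download small, which would matter if CI ends up
fetching a template on every build.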

Simon

On Friday, July 22, 2016, Donald Szeto <[email protected]> wrote:

> Hey guys,
>
> This proposal of adding integration tests is awesome!
>
> Echoing Xusen, I recall Pat suggested we could remove pio template get, so
> it should be okay to just git clone the template from somewhere. I think
> Marcin is including the template now because currently templates still use
> artifacts under the old io.prediction package namespace.
>
> Regards,
> Donald
>
> On Friday, July 22, 2016, Xusen Yin <[email protected]> wrote:
>
> > Hi Marcin,
> >
> > Personally I vote for adding integration tests. Thanks for the proposal.
> > One suggestion is about the test scenarios. IMHO there is no need to add
> > the recommendation-engine template inside the PredictionIO codebase. Why
> > not use pio template to download them from Github when testing?
> >
> > Another concern is the docker pull ziemin/pio-testing in the travis
> > config file. And there is also a testing/Dockerfile which starts from
> > ubuntu. So I think either we should use docker pull ubuntu, or we use a
> > pre-built testing Docker image instead of the Dockerfile.
> >
> > Best
> > Xusen Yin
> >
> > > On Jul 22, 2016, at 2:52 PM, Marcin Ziemiński <[email protected]> wrote:
> > >
> > > Hi!
> > >
> > > I have a feeling that PredictionIO is lacking integration tests. TravisCI
> > > is executed only on unit tests residing in the repository. Better tests
> > > are important not only for maintaining the quality of the project, but
> > > also for the sheer comfort of development. Therefore, I tried to come up
> > > with a simple basis for adding and building tests.
> > >
> > >   - Integration tests should be agnostic to environment settings (it
> > >   should not matter whether we use Postgres or HBase)
> > >   - They should be easy to run for developers, and the configuration
> > >   should not pollute their working space
> > >
> > > I have pushed a sequence of commits to my personal fork and run travis
> > > builds on them - Diff with upstream:
> > > <https://github.com/apache/incubator-predictionio/compare/develop...Ziemin:testing-infrastructure>
> > >
> > > The following changes were introduced:
> > >
> > >   - A dedicated Docker image was prepared. This image fetches and
> > >   prepares some possible dependencies for PredictionIO - postgres,
> > >   hbase, spark, elasticsearch.
> > >   Upon container initialization all services are started, including a
> > >   spark standalone cluster. The best way to start it is to use the
> > >   testing/run_docker.sh script, which binds the relevant ports and
> > >   mounts shared directories with the ivy2 cache and PredictionIO's code
> > >   repository. More importantly, it sets up pio's configuration, e.g.:
> > >   $ /run_docker.sh PGSQL HBASE HDFS ~/projects/incubator-predictionio
> > >   '/pio_host/testing/simple_scenario/run_scenario.sh'
> > >   This command should set the metadata repo to PGSQL, event data to
> > >   HBASE and model data to HDFS. The last two arguments are the path to
> > >   the repo and a command to run from inside the container.
> > >   An important thing to note is that the container expects a tar with
> > >   the built distribution to be found in the shared /pio_host directory,
> > >   which is later unpacked.
> > >   The user can then safely execute all pio ... commands. By default the
> > >   container pops up a bash shell if not given any other command.
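> > >   For the example above, the storage selection inside the container
> > >   would correspond to pio-env.sh settings roughly like these (a sketch
> > >   of just the relevant source lines, not the full file):
> > >   PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL
> > >   PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE
> > >   PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=HDFS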
> > >   - Currently there is only one simple test added, which is just a copy
> > >   of the steps mentioned in the quickstart tutorial.
> > >   - .travis.yml was modified to run 4 concurrent builds: one for unit
> > >   tests as previously, and three integration tests for various
> > >   combinations of services:
> > >   env:
> > >     global:
> > >       - PIO_HOME=`pwd`
> > >
> > >     matrix:
> > >       - BUILD_TYPE=Unit
> > >       - BUILD_TYPE=Integration METADATA_REP=PGSQL EVENTDATA_REP=PGSQL MODELDATA_REP=PGSQL
> > >       - BUILD_TYPE=Integration METADATA_REP=ELASTICSEARCH EVENTDATA_REP=HBASE MODELDATA_REP=LOCALFS
> > >       - BUILD_TYPE=Integration METADATA_REP=ELASTICSEARCH EVENTDATA_REP=PGSQL MODELDATA_REP=HDFS
> > >   Here you can find the build logs: travis logs
> > >   <https://travis-ci.org/Ziemin/incubator-predictionio/builds/146753806>
> > >   What is more, to make build times shorter, ivy jars are now cached on
> > >   travis, so that they are available faster in subsequent builds.
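> > >   The caching would be enabled with a section along these lines in
> > >   .travis.yml (an assumed snippet following Travis's documented cache
> > >   syntax):
> > >   cache:
> > >     directories:
> > >       - $HOME/.ivy2/cache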
> > >
> > > The current setup lets developers run tests for different environment
> > > settings in a deterministic way, as well as use travis or other CI tools
> > > in a more convenient way. What is left to do now is to prepare a sensible
> > > set of different tests written in a concise and extensible way. I think
> > > that ideally we could use the python API and add to it a small library
> > > focused strictly on our testing purposes.
> > >
> > > Any insights would be invaluable.
> > >
> > >
> > > Regards,
> > >
> > > -- Marcin
> >
> >
>
