Re: Stateless Builds

Chan Lee Tue, 10 Jan 2017 10:42:09 -0800

I'm making additional changes to the stateless build
(https://issues.apache.org/jira/browse/PIO-47), particularly to ensure that
the change does not affect running production servers using PIO.


First, with regard to @Pat's concerns: You'd still be able to run training
on a separate (possibly ephemeral) machine and use the trained ModelData by
providing engine-instance-id as a CLI argument. PIO would automatically look
for the latest trained model with the same filepath (engineVersion). I will
add some tests to ensure that this process works.
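
As a rough illustration of that lookup, here is a minimal sketch. It is not
taken from the PIO code base: the EngineInstance record, its field names, the
"COMPLETED" status value, and the resolve_instance helper are all
hypothetical, and it assumes that an explicitly supplied engine-instance-id
takes precedence over the "latest model for this engineVersion" rule.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class EngineInstance:
    """Hypothetical metadata record; field names are illustrative only."""
    instance_id: str
    engine_version: str   # assumed to be derived from the engine's filepath
    status: str           # assumed values: "COMPLETED", "FAILED", ...
    end_time: datetime    # when training finished


def resolve_instance(
    instances: Iterable[EngineInstance],
    engine_version: str,
    instance_id: Optional[str] = None,
) -> Optional[EngineInstance]:
    """Pick the trained instance a deploy should load.

    If instance_id is given, return that completed instance (or None).
    Otherwise return the most recently completed instance whose
    engine_version matches, or None if there is no such instance.
    """
    completed = [i for i in instances if i.status == "COMPLETED"]
    if instance_id is not None:
        for inst in completed:
            if inst.instance_id == instance_id:
                return inst
        return None
    matching = [i for i in completed if i.engine_version == engine_version]
    return max(matching, key=lambda i: i.end_time, default=None)
```

With a rule like this, a model trained on an ephemeral machine could be
picked up by a separate deploy machine either by passing its id explicitly
or by sharing the same engineVersion.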

Also @Pat, I'd appreciate it if you could briefly explain the setup and
common tasks for your production servers to make sure I don't miss anything.
Some immediate questions:
1. When you copy the engine directory to multiple machines, do you include
the .jar files (autogenerated during `pio build` in target/) for all the
machines?
2. Are there any immediate needs/tasks that require running pio commands
outside of the engine directory?

Thanks,
Chan

On Sun, Dec 11, 2016 at 1:35 PM, Mars Hall <[email protected]> wrote:

> Pat,
>
> Since I represent one of the containerized platforms (Heroku) that needs
> stateless builds (specifically the ability to run a build without a
> database attached) to deploy PIO at its full potential, I would love to be
> able to contribute more to this discussion. Unfortunately I do not
> understand most of the technicalities described here.
>
> How does someone like me learn about this aspect of PIO? Would you wise
> folks step back and describe what the metadata does today?
>
> Are the three PIO_STORAGE types (data, model, & metadata) documented
> clearly anywhere? What are the metadata options, what does it store, & how
> does it affect the engine lifecycle today?
>
> One of the main sources of confusion I've seen from people trying to work
> with PIO is the multiple storage aspects (data, model, & metadata) combined
> with multiple service types (eventserver & engine) and how these all
> interconnect. Plus some engines have a requirement like Elasticsearch, but
> it's not clear where that's required in the grand scheme.
>
> Thanks for all the efforts to move forward with these changes,
>
> *Mars
>
> On Mon, Dec 12, 2016 at 03:13 Pat Ferrel <[email protected]> wrote:
>
>> OK, so that was too much detail. My immediate question is how to train on
>> one machine and deploy on several others—all referencing the same instance
>> data (model)? Before it was by copying the manifest, now there is no
>> manifest.
>>
>>
>>
>> On Dec 7, 2016, at 5:43 PM, Pat Ferrel <[email protected]> wrote:
>>
>> My first question is how to train on an ephemeral machine to swap models
>> into an already deployed prediction server, because this is what I do all
>> the time. The only way to do this now is train first on dummy data then
>> deploy and re-train as data comes in, but there are other issues and
>> questions below. Some may be slightly off topic of this specific PR.
>>
>>
>> > On Dec 5, 2016, at 10:00 AM, Donald Szeto <[email protected]> wrote:
>> >
>> > Hi all,
>> >
>> > I am moving the discussion of stateless build (
>> > https://github.com/apache/incubator-predictionio/pull/328) here.
>> > Replying to Pat:
>> >
>> >> BTW @chanlee514 @dszeto Are we thinking of a new command, something
>> >> like pio register that would add metadata to the metastore? This would
>> >> need to be run every time the engine.json changed for instance? It would
>> >> also not compile? Is there an alternative? What state does this leave
>> >> us in?
>> >
>> > I imagine we would need pio register after this. Something like what
>> > docker push would do for you today. Changes to engine.json will not
>> > require registration because it is consumed at runtime by pio train and
>> > pio deploy. We are phasing out pio build so that engine templates will
>> > be more friendly with different IDEs.
>>
>> I’m all for removing the manifest and stateless build but I’m not sure we
>> mean the same thing by stateless. My issue is more with stateless commands,
>> or, put differently, a fully flexible workflow. That means all commands
>> read metadata from the metastore, and only one very explicitly sets
>> metadata into the metastore. Doing the write in train doesn't account for
>> the deploy-before-train and multi-tenancy use cases.
>>
>> Deploy then train:
>> 1) pio eventserver to start the EventServer (ES) on any machine
>> 2) pio deploy to get the query server (prediction server) on any machine
>> 3) pio train at any time on any machine and have a mechanism for deployed
>> engines to discover the metadata they need when they need it or have it
>> automatically updated when changed (pick a method: push for deployed
>> engines and pull for train)
>> 4) send input at any time
>>
>> Multi-tenancy:
>> This seems to imply a user-visible id for what today’s nomenclature calls
>> an engine instance id. For multi-tenancy, the user is going to want to set
>> this instance id somewhere and should have stateless commands, not only
>> stateless build.
>>
>> >
>> >> After the push, what action creates the binary (I assume pio build) and
>> >> what action adds metadata to the metastore (I assume pio train)? So does
>> >> this require they run on the same machine? They often do not.
>> > pio build will still create the binary at this point (and will hopefully
>> > be phased out as mentioned). Right now the only metadata that is
>> > disappearing is the engine manifests. Engine instances will still be
>> > written after pio train, and used by pio deploy.
>> >
>> >> One more question. After push how do we run the PredictionServer or
>> >> train on multiple machines? In the past this required copying the
>> >> manifest.json and making sure binaries are in the same location on all
>> >> machines.
>> > "In the same location" is actually a downside IMO of the manifest.json
>> > design. Without manifest.json now, you would need to run pio commands
>> > from a location with a built engine, because instead of looking at
>> > engine manifests, pio will now look locally for engine JARs. So
>> > deployment would still involve copying engine JARs to a remote
>> > deployment machine and running pio commands at the engine template
>> > location with engine-id and engine-version arguments.
>>
>> I guess I also don't understand the need for engine-id and
>> engine-version. Let’s do away with them. There is one metadata object that
>> points to input data id, params, model id, and binary. This id can be
>> assigned by the user.
>>
>> With the above in place we are ready to imagine an EventServer where you
>> POST to pio-ip/dataset/resource-id (no keys) and GET from
>> pio-ip/model/resource-id to do queries. This would allow multi-tenancy and
>> merge the EventServer and PredictionServer under the well-understood banner
>> of REST. Extending this a little further we have all the commands
>> implemented as REST APIs. The CLI becomes some simple scripts or binaries
>> that hit the REST interface and an admin server that hits the same
>> interface.
>>
>> This is compatible with the simple stateless build as a first step as
>> long as we don’t perpetuate hidden instance ids and stateful commands like
>> a train that creates the hidden id. But maybe I misunderstand the code or
>> plans for next steps?
>>
>>
>> >
>> > Regards,
>> > Donald
>>
>>
