I'm making additional changes to the stateless build (https://issues.apache.org/jira/browse/PIO-47), particularly to ensure that the change does not affect running production servers using PIO.

First, with regard to @Pat's concerns: you'd still be able to run training on a separate (possibly ephemeral) machine and use the trained ModelData by providing engine-instance-id as a CLI argument. PIO would automatically look for the latest trained model with the same filepath (engineVersion). I will add some tests to ensure that this process works.

Also @Pat, I'd appreciate it if you could briefly explain the setup and common tasks for your production servers to make sure I don't miss anything. Some immediate questions:

1. When you copy the engine directory to multiple machines, do you include the .jar files (autogenerated during `pio build` in target/) on all the machines?
2. Are there any immediate needs/tasks that involve running pio commands outside of the engine directory?

Thanks,
Chan

On Sun, Dec 11, 2016 at 1:35 PM, Mars Hall <[email protected]> wrote:

> Pat,
>
> Since I represent one of the containerized platforms (Heroku) that needs
> stateless builds (specifically the ability to run a build without a
> database attached) to deploy PIO at its full potential, I would love to be
> able to contribute more to this discussion. Unfortunately I do not
> understand most of the technicalities described here.
>
> How does someone like me learn about this aspect of PIO? Would you wise
> folks step back and describe what the metadata does today?
>
> Are the three PIO_STORAGE types (data, model, & metadata) documented
> clearly anywhere? What are the metadata options, what does it store, & how
> does it affect the engine lifecycle today?
>
> One of the main sources of confusion I've seen from people trying to work
> with PIO is the multiple storage aspects (data, model, & metadata) combined
> with multiple service types (eventserver & engine) and how these all
> interconnect. Plus some engines have a requirement like Elasticsearch, but
> it's not clear where that's required in the grand scheme.
>
> Thanks for all the efforts to move forward with these changes,
>
> *Mars
>
> On Mon, Dec 12, 2016 at 03:13 Pat Ferrel <[email protected]> wrote:
>
>> OK, so that was too much detail. My immediate question is how to train on
>> one machine and deploy on several others, all referencing the same instance
>> data (model)? Before it was by copying the manifest, now there is no
>> manifest.
>>
>> On Dec 7, 2016, at 5:43 PM, Pat Ferrel <[email protected]> wrote:
>>
>> My first question is how to train on an ephemeral machine to swap models
>> into an already deployed prediction server, because this is what I do all
>> the time. The only way to do this now is train first on dummy data then
>> deploy and re-train as data comes in, but there are other issues and
>> questions below. Some may be slightly off topic of this specific PR.
>>
>> > On Dec 5, 2016, at 10:00 AM, Donald Szeto <[email protected]> wrote:
>> >
>> > Hi all,
>> >
>> > I am moving the discussion of stateless build
>> > (https://github.com/apache/incubator-predictionio/pull/328) here.
>> > Replying to Pat:
>> >
>> >> BTW @chanlee514 @dszeto Are we thinking of a new command, something like
>> >> pio register that would add metadata to the metastore? This would need to
>> >> be run every time the engine.json changed for instance? It would also
>> >> not compile? Is there an alternative? What state does this leave us in?
>> >
>> > I imagine we would need pio register after this. Something like what docker
>> > push would do for you today. Changes of engine.json will not require
>> > registration because it is consumed during runtime by pio train and pio
>> > deploy. We are phasing out pio build so that engine templates will be more
>> > friendly with different IDEs.
>>
>> I’m all for removing the manifest and stateless build but I’m not sure we
>> mean the same thing by stateless. My issue is more with stateless commands,
>> or, put differently, a fully flexible workflow. Which means all commands
>> read metadata from the metastore, and only one very explicitly sets
>> metadata into the metastore. Doing the write in train doesn't consider
>> the deploy-before-train and multi-tenancy use cases.
>>
>> Deploy then train:
>> 1) pio eventserver to start the EventServer on any machine
>> 2) pio deploy to get the query server (prediction server) on any machine
>> 3) pio train at any time on any machine and have a mechanism for deployed
>> engines to discover the metadata they need when they need it or have it
>> automatically updated when changed (pick a method: push for deployed
>> engines and pull for train)
>> 4) send input at any time
>>
>> Multi-tenancy:
>> This seems to imply a user visible id for an engine instance id in
>> today’s nomenclature. For multi-tenancy, the user is going to want to set
>> this instance id somewhere and should have stateless commands, not only
>> stateless build.
>>
>> >
>> >> After the push, what action creates the binary (I assume pio build), what
>> >> action adds metadata to the metastore (I assume pio train)? So does this
>> >> require they run on the same machine? They often do not.
>> > pio build will still create the binary at this point (and hopefully be
>> > phased out as mentioned). Right now the only metadata that is disappearing
>> > are engine manifests. Engine instances will still be written after pio
>> > train, and used by pio deploy.
>> >
>> >> One more question. After push how do we run the PredictionServer or train
>> >> on multiple machines? In the past this required copying the manifest.json
>> >> and making sure binaries are in the same location on all machines.
>> > "In the same location" is actually a downside IMO of the manifest.json
>> > design. Without manifest.json now, you would need to run pio commands from
>> > a location with a built engine, because instead of looking at engine
>> > manifests, it will now look locally for engine JARs. So deployment would
>> > still involve copying engine JARs to a remote deployment machine, running
>> > pio commands at the engine template location with engine-id and
>> > engine-version arguments.
>>
>> I guess I also don't understand the need for engine-id and
>> engine-version. Let’s do away with them. There is one metadata object that
>> points to input data id, params, model id, and binary. This id can be
>> assigned by the user.
>>
>> With the above in place we are ready to imagine an EventServer where you
>> POST to pio-ip/dataset/resource-id (no keys) and GET from
>> pio-ip/model/resource-id to do queries. This would allow multi-tenancy and
>> merge the EventServer and PredictionServer under the well understood banner
>> of REST. Extending this a little further we have all the commands
>> implemented as REST APIs. The CLI becomes some simple scripts or binaries
>> that hit the REST interface and an admin server that hits the same
>> interface.
>>
>> This is compatible with the simple stateless build as a first step as
>> long as we don’t perpetuate hidden instance ids and stateful commands like
>> a train that creates the hidden id. But maybe I misunderstand the code or
>> plans for next steps?
>>
>> >
>> > Regards,
>> > Donald
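
For reference, here is a minimal sketch of the model lookup described at the top of this message: use the engine instance named by engine-instance-id when one is given, and otherwise pick the latest trained instance with the same engineVersion. This is not PIO's actual code. The `EngineInstance` fields, the "COMPLETED" status value, the `resolve_instance` helper, and the rule that an explicit id takes precedence are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class EngineInstance:
    """Hypothetical stand-in for an engine instance record in the metadata store."""
    instance_id: str
    engine_version: str
    status: str       # assumed to be "COMPLETED" once training has finished
    end_time: float   # training end time, seconds since the epoch


def resolve_instance(
    instances: Iterable[EngineInstance],
    engine_version: str,
    instance_id: Optional[str] = None,
) -> Optional[EngineInstance]:
    """Pick the engine instance whose model a deploy would load (illustrative only)."""
    # Only instances whose training finished are candidates.
    completed = [i for i in instances if i.status == "COMPLETED"]

    # An explicit id (e.g. passed on the command line) wins if it exists.
    if instance_id is not None:
        for inst in completed:
            if inst.instance_id == instance_id:
                return inst
        return None

    # Otherwise fall back to the most recently trained instance
    # with the same engine version.
    matching = [i for i in completed if i.engine_version == engine_version]
    return max(matching, key=lambda i: i.end_time, default=None)
```

Under these assumptions, a model trained on an ephemeral machine becomes visible to any deploy machine that can read the same metadata store, either by passing its id explicitly or by being the newest completed instance for that engine version.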
