[jira] [Commented] (PIO-51) Enable `pio build/train/deploy` outside of engine directory

Pat Ferrel (JIRA) Sat, 21 Jan 2017 12:39:55 -0800

    [ 
https://issues.apache.org/jira/browse/PIO-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833136#comment-15833136
 ]


Pat Ferrel commented on PIO-51:
-------------------------------

I suggest we not use the path-to-template-code in this way. There are several 
reasons, perhaps the biggest being that several engine-instances may use the 
same code. I have several deployments that make use of this. Only the 
engine.json is unique to the engine-instance. Please read the other reasons 
in this suggested CLI: 

1) pio register/unregister --engine-id <some-id> --engine <path/to/engine.json>
2) add the path-to-template-code to engine.json so it is explicitly part of the 
metadata. It is put there now by `pio build`, which will soon be replaced by 
`sbt build`; this way the source of the path is not hidden but explicit. 
Hiding this path from the user is confusing, and it makes using the same code 
path for more than one engine.json impossible.
3) all commands (sbt build, pio train, and pio deploy) read from the shared 
metadata in the metastore, because they would all require the 
engine-instance-id in the CLI (well, maybe not sbt build).
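
A rough sketch of what points 1 and 2 could look like, in Python for brevity. 
Everything here is hypothetical: the in-memory `metastore` dict stands in for 
the real metastore, `templatePath` is an assumed field name for the 
path-to-template-code, and the ids and paths are made up. It is not the 
existing pio implementation.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical in-memory stand-in for the metastore: engine-id -> metadata.
metastore = {}


def register(engine_id, engine_json_path):
    """Model `pio register`: load engine.json and store it under a user-chosen id."""
    metadata = json.loads(Path(engine_json_path).read_text())
    # "templatePath" is an assumed field name for the path-to-template-code.
    if "templatePath" not in metadata:
        raise ValueError("engine.json must name the template code path explicitly")
    metastore[engine_id] = metadata


def unregister(engine_id):
    """Model `pio unregister`: drop the metadata for one engine-instance."""
    metastore.pop(engine_id, None)


# Two engine.json files that point at the same template code.
workdir = Path(tempfile.mkdtemp())
for name, app in [("engine.json", "shop-a"), ("engine2.json", "shop-b")]:
    (workdir / name).write_text(
        json.dumps({"templatePath": "/opt/templates/recommender", "appName": app})
    )

register("shop-a-recs", workdir / "engine.json")
register("shop-b-recs", workdir / "engine2.json")
```

Both ids end up in the metastore with the same code path but different 
metadata, which is the case a path-derived id cannot express.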

If it is used in the above form, it is important that the engine-instance-id 
*not* be the path to the template code or a hash thereof. It must be settable 
by the user. If we want to create an id when none is supplied, hashing the 
*path-to-engine.json* would be OK, but the id should still be visible to the 
user and required in all commands. The path-to-code is not uniquely 1-to-1 
with each engine-instance; only the engine.json, or engine2.json, or 
xyz-engine.json is unique. The code may be the same for all these engines and 
so should be in the metadata, and therefore in engine.json. I have existing 
deployments where this is true. 
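
To illustrate the fallback described above, here is a minimal sketch in 
Python. The function name `engine_instance_id`, the choice of SHA-1, and the 
example paths are all assumptions for illustration; the point is only that 
the hash input is the path to engine.json, not the path to the code.

```python
import hashlib
from pathlib import Path


def engine_instance_id(engine_json_path, user_id=None):
    """Return the user-supplied id, or fall back to a hash of the engine.json path."""
    if user_id:
        return user_id
    # Hash the absolute path of engine.json itself, never the code directory.
    absolute = str(Path(engine_json_path).resolve())
    return hashlib.sha1(absolute.encode("utf-8")).hexdigest()


# Two engine files living next to the same template code get distinct ids.
id_one = engine_instance_id("/opt/templates/recommender/engine.json")
id_two = engine_instance_id("/opt/templates/recommender/engine2.json")
# A user-supplied id always wins over the derived one.
id_named = engine_instance_id("/opt/templates/recommender/engine.json", "shop-a-recs")
```

Whichever way the id is produced, it would be printed back to the user so it 
can be passed to every later command.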

The benefits:

1) every change to metadata is obvious and well understood
2) no command but pio register touches the metadata; all others read from it
3) all commands run from anywhere and, theoretically, in any order
4) the id we are creating can be used in REST to identify the engine instance 
even when the code to be run is the same
5) code is attached to metadata, not used to identify metadata, so the same 
class can service several engine-instances with different metadata.
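
Benefits 2 and 3 can be sketched as a single writer plus read-only views. 
This is a Python illustration under assumed names (`_metastore`, 
`read_metadata`, `train`, and the `templatePath` field are all hypothetical), 
not how pio currently works.

```python
from types import MappingProxyType

# Hypothetical metastore; only register() is allowed to write to it.
_metastore = {}


def register(engine_id, metadata):
    """The one command that mutates metadata."""
    _metastore[engine_id] = dict(metadata)


def read_metadata(engine_id):
    """Every other command gets a read-only view keyed by the engine id."""
    return MappingProxyType(_metastore[engine_id])


def train(engine_id):
    """Model `pio train`: needs only the id, not the current directory."""
    meta = read_metadata(engine_id)
    return f"training {engine_id} with code at {meta['templatePath']}"


register("shop-a-recs", {"templatePath": "/opt/templates/recommender"})
message = train("shop-a-recs")
```

Because `train` looks everything up by id, it does not matter which 
directory it is run from.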

#4 is not a pipe dream; we are already creating a REST microservice (to be in 
a feature branch) to support the use of this new id. 

> Enable `pio build/train/deploy` outside of engine directory
> -----------------------------------------------------------
>
>                 Key: PIO-51
>                 URL: https://issues.apache.org/jira/browse/PIO-51
>             Project: PredictionIO
>          Issue Type: Improvement
>            Reporter: Chan
>
> Users can now provide the engine directory path as --engine-dir or -ed, and 
> call `pio build/train/deploy` from anywhere.
> The “engineVersion” used to identify a prediction engine is created using the 
> hash of the engine directory path. As a result, the filepath of the engine 
> had to be kept the same in a distributed setup, with multiple machines using 
> the same trained model. This was a point of confusion for some users, which 
> led to this change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
