[ 
https://issues.apache.org/jira/browse/PIO-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15747066#comment-15747066
 ] 

ASF GitHub Bot commented on PIO-47:
-----------------------------------

Github user pferrel commented on the issue:

    https://github.com/apache/incubator-predictionio/pull/328
  
    The file path must be the same on both machines because it is stored 
explicitly in the metadata. If the deploy machine dir is at a different path 
than the train machine dir (actually the build dir in 0.10.0), you'll get the 
completely incomprehensible "one for one" failure. At least this is one reason 
for it. 
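
    A minimal sketch of why mismatched paths break the lookup, assuming (as the 
issue description for PIO-47 says) that engineVersion is a SHA-1 hash of the 
engine filepath. The function name and the example paths are hypothetical, and 
this is not PIO's actual code:

```python
import hashlib


def engine_version(engine_dir: str) -> str:
    """Hypothetical helper: derive a version string from the directory path only."""
    return hashlib.sha1(engine_dir.encode("utf-8")).hexdigest()


# Hypothetical paths: the same engine checked out in two different locations.
train_version = engine_version("/home/build/my-engine")
deploy_version = engine_version("/opt/deploy/my-engine")
same_path_version = engine_version("/home/build/my-engine")

# Different paths give different versions, so a lookup keyed on the
# version from the train machine finds nothing on the deploy machine.
print(train_version == deploy_version)
# The same path always gives the same version.
print(train_version == same_path_version)
```

    Under that assumption the engine contents never enter the hash, so two 
identical engines in different directories look like unrelated versions.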
    
    This is, IMO, a major point of confusion in PIO. I therefore still think 
the ID should be explicit, and engine-id and engine-version should be 
eradicated completely. 
    
    Can someone please tell me why we need anything more than:
    
     - a dataset-id (app)
     - a model-id, which points to the metadata used to create a model from a 
dataset
     - the metadata itself, which would contain the specific engine.json data 
used and the dataset-id (actually the appName in the engine, but we need to 
remove this too), so it would connect the model-id to the dataset-id with all 
the data needed to create the model from the dataset.
    
    In all commands (but one) this would allow the metadata to be retrieved 
from the metastore.
    
    We can then re-do the CLI to be completely stateless, and not require 
anything to be copied between machines except a common knowledge of IDs.
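
    To make the idea concrete, here is a rough sketch of a metastore keyed by 
model-id, where a command needs only the ID to recover everything else. All 
names here (ModelMetadata, Metastore, train_command, the example IDs and 
parameters) are hypothetical illustrations, not an existing PIO API:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ModelMetadata:
    """Hypothetical record linking a model-id to its dataset and engine params."""
    model_id: str
    dataset_id: str
    engine_params: dict  # stands in for the engine.json data used


class Metastore:
    """Hypothetical in-memory stand-in for a shared metadata store."""

    def __init__(self) -> None:
        self._models: Dict[str, ModelMetadata] = {}

    def register(self, meta: ModelMetadata) -> None:
        self._models[meta.model_id] = meta

    def get(self, model_id: str) -> ModelMetadata:
        return self._models[model_id]


def train_command(store: Metastore, model_id: str) -> str:
    """A stateless command: its only input is the model-id."""
    meta = store.get(model_id)
    return f"train {meta.model_id} on dataset {meta.dataset_id}"


store = Metastore()
store.register(ModelMetadata("model-1", "dataset-1", {"algorithm": "example"}))

# Any machine that can reach the store needs only the ID, not a local
# manifest file or a matching directory path.
print(train_command(store, "model-1"))
```

    The one command that cannot work this way is presumably the one that 
first registers the metadata, since nothing exists in the store yet.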
    
    I will write up a proposal for this. Maybe it will make sense, but at 
least we can discuss it in concrete terms.
    



> Remove engine manifest for stateless build
> ------------------------------------------
>
>                 Key: PIO-47
>                 URL: https://issues.apache.org/jira/browse/PIO-47
>             Project: PredictionIO
>          Issue Type: New Feature
>            Reporter: Chan
>
> As discussed in the dev mailing list, removing engine manifest would be the 
> first step in improving the workflow towards a more modular design. 
> - Remove manifest.json completely. `pio build` will be stateless, and will 
> not write anything to the database. This will make it easier to compile/build 
> on PaaS platforms such as Heroku. Later, we can remove `pio build` command 
> entirely, so that PIO is independent of the build tool (sbt).
> - An immediate major disadvantage would be not being able to run pio commands 
> outside of the engine directory. This can be resolved in the next step of 
> creating a general metadata registry.
> - Meanwhile, we can use engineFactory as *engineId*, and SHA-1 hash of 
> engine filepath as *engineVersion* (as before). We can improve this when 
> designing a metadata registry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
