Kartik

OK, yes, so your reply is definitely in the NiFi wheelhouse.

For your original case, where you want to copy the data but retain the
original object, there are a few ways to do it.  One is to actually pull
the data from its original location, send a copy to your analytic system,
and also give a copy back to the original system.

If you truly must keep the original where it was, then there are really
only 'ok' options.  NiFi then needs to act as an idempotent receiver, which
means it will keep state about what it has grabbed a copy of and will avoid
sending it through more than once.  That sounds like no big deal, but it
means maintaining some database, constantly checking the same things, and
putting tension on clustering.  In many ways it isn't conducive to healthy
dataflow.  It can be done, but it isn't fun.
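
To make the idempotent receiver idea concrete, here is a minimal sketch in
plain Python.  It is not NiFi code and not how NiFi implements this; the
function names, the SQLite table, and the choice of (name, size, mtime) as
the "already seen" key are all assumptions made for illustration.

```python
import shutil
import sqlite3
from pathlib import Path


def open_state(db_path: str) -> sqlite3.Connection:
    """Open (or create) the hypothetical state store of files already copied."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        "name TEXT, size INTEGER, mtime_ns INTEGER, "
        "PRIMARY KEY (name, size, mtime_ns))"
    )
    conn.commit()
    return conn


def copy_new_files(source: Path, dest: Path, conn: sqlite3.Connection) -> list:
    """Copy files not seen before; the originals stay where they are."""
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(p for p in source.iterdir() if p.is_file()):
        st = src.stat()
        # Assumed identity of a file: its name, size, and modification time.
        key = (src.name, st.st_size, st.st_mtime_ns)
        already = conn.execute(
            "SELECT 1 FROM seen WHERE name = ? AND size = ? AND mtime_ns = ?",
            key,
        ).fetchone()
        if already:
            continue  # this exact file was sent through on an earlier poll
        shutil.copy2(src, dest / src.name)
        conn.execute("INSERT INTO seen VALUES (?, ?, ?)", key)
        conn.commit()
        copied.append(src.name)
    return copied
```

Every poll has to list the whole directory and look each file up in the
state store, and in a cluster that store would have to be shared between
nodes, which is where the cost comes from.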

So before walking that path: is putting a copy of the data back in the
original system, but not in a directory you are polling, an option?
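
As a rough illustration of that alternative, again in plain Python rather
than NiFi, the sketch below pulls each file out of the polled directory,
delivers one copy for analytics, and puts the data back in a separate
directory that is never polled.  The function name and the three directory
arguments are hypothetical.

```python
import shutil
from pathlib import Path


def pull_and_fan_out(polled: Path, analytics: Path, kept: Path) -> list:
    """Move files out of the polled directory and deliver two copies."""
    analytics.mkdir(parents=True, exist_ok=True)
    kept.mkdir(parents=True, exist_ok=True)
    delivered = []
    for src in sorted(p for p in polled.iterdir() if p.is_file()):
        # Copy for the analytic system.
        shutil.copy2(src, analytics / src.name)
        # Give the data back to the original system, outside the polled dir.
        shutil.move(str(src), str(kept / src.name))
        delivered.append(src.name)
    return delivered
```

Because each file leaves the polled directory once it is handled, the next
poll cannot pick it up again and no "already seen" state is needed.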

Please feel free to subscribe to the mailing list so your notes will get
through without delay.

Thanks
Joe
On Apr 7, 2015 11:36 PM, "Kartik Veerepalli" <[email protected]>
wrote:

> Corey,
>
>
> My apologies for not making myself clear. But, the points you listed are
> exactly what I meant.
>
>
> Joe: I did check out RSync, but we are planning to establish a continuous
> data flow pipeline from a wide range of servers, message buses, etc. to
> HDFS. We think Apache NiFi can be integrated/used as a data flow system
> with the Analytics as a Service Platform that we are building. Thanks for
> the help.
>
>
> Kartik
>
