Kartik,

Thanks for your interest in NiFi!

I know you've gotten a few responses to this already, but you're right - this is something we should address. I think the basic idea is that many people just pick up files from a temp directory and push them to a permanent directory.

But if that doesn't work for you, we could update the processor to do something a bit smarter. One idea that might make sense is to pick up the oldest files first. Then the processor can keep track of the "last modified date" of the last file that it has picked up. This way, we keep only minimal state about what has been pulled in, but we still pull in only new data and avoid deleting the source files.
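
To make the idea concrete, here is a rough sketch in Python. It is not NiFi code and not how the processor works today; `list_new_files`, `poll_once`, and the `handle` callback are names I made up for illustration, and I'm assuming the state is just a single modification timestamp in nanoseconds.

```python
import os


def list_new_files(directory, last_seen_mtime):
    """Return (path, mtime_ns) pairs modified after last_seen_mtime, oldest first."""
    candidates = []
    for entry in os.scandir(directory):
        if entry.is_file():
            mtime = entry.stat().st_mtime_ns
            if mtime > last_seen_mtime:
                candidates.append((entry.path, mtime))
    # Oldest first; the path breaks ties so the order is deterministic.
    candidates.sort(key=lambda pair: (pair[1], pair[0]))
    return candidates


def poll_once(directory, last_seen_mtime, handle):
    """Hand every new file to `handle` and return the updated timestamp.

    The source files are left in place; the returned timestamp is the only
    state that has to be remembered between polls.
    """
    for path, mtime in list_new_files(directory, last_seen_mtime):
        handle(path)
        # Advance the state only after the file was handled successfully.
        last_seen_mtime = mtime
    return last_seen_mtime
```

A caller would start with `state = 0` and then repeat `state = poll_once("/some/dir", state, handle)`. One caveat with this sketch: a file that shows up later with exactly the same modification time as the last file picked up would be skipped, so a real implementation would probably need to handle that case.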

Do you think this solution would help you?

Thanks
-Mark



------ Original Message ------
From: "Kartik Veerepalli" <[email protected]>
To: "[email protected]" <[email protected]>
Sent: 4/7/2015 10:46:11 PM
Subject: Re: Conflict Resolution Strategy

Corey,


My apologies for not making myself clear, but the points you listed are exactly what I meant.


Joe: I did check out rsync, but we are planning to establish a continuous data flow pipeline from a wide range of servers, message buses, etc. to HDFS. We think Apache NiFi can be integrated/used as a data flow system with the Analytics as a Service platform that we are building. Thanks for the help.


Kartik
