Thanks! What about using Kafka together with Flume? I should also mention that
the daily data intake is in the millions and we can't afford to lose even a
single piece of data, which makes high availability a requirement.
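
To make the no-loss requirement concrete, here is a rough sketch of the Kafka
durability settings I assume would be relevant. The values are illustrative
assumptions, not something tested on our cluster, and the idempotence setting
needs Kafka 0.11 or later:

```properties
# Producer side: wait for all in-sync replicas to acknowledge each write
acks=all
# Retry transient send failures instead of dropping the record
retries=2147483647
# Avoid duplicates introduced by retries (Kafka 0.11+)
enable.idempotence=true

# Topic/broker side (assumes the topic is created with replication factor 3)
# Require at least 2 in-sync replicas before a write is acknowledged
min.insync.replicas=2
# Never elect an out-of-sync replica as leader, which could lose acknowledged data
unclean.leader.election.enable=false
```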

Warm Regards

Sidharth Kumar | Mob: +91 8197 555 599/7892 192 367 |  LinkedIn:
www.linkedin.com/in/sidharthkumar2792






On 30-Jun-2017 10:04 AM, "JP gupta" <[email protected]> wrote:

> The ideal sequence should be:
>
> 1. Ingest using Kafka -> validate and process using Spark ->
> write into any NoSQL DB or Hive.
>
> From my recent experience, writing directly to HDFS can be slow depending
> on the data format.
>
>
>
> Thanks
>
> JP
>
>
>
> *From:* Sudeep Singh Thakur [mailto:[email protected]]
> *Sent:* 30 June 2017 09:26
> *To:* Sidharth Kumar
> *Cc:* Maggy; [email protected]
> *Subject:* Re: Kafka or Flume
>
>
>
> In your use case, Kafka would be better because you want some
> transformations and validations.
>
> Kind regards,
> Sudeep Singh Thakur
>
>
>
> On Jun 30, 2017 8:57 AM, "Sidharth Kumar" <[email protected]>
> wrote:
>
> Hi,
>
>
>
> I have a requirement where all transactional data is ingested into
> Hadoop in real time and, before being stored in Hadoop, is processed to
> validate it. If the data fails validation, it will not be stored in
> Hadoop. The validation process also makes use of historical data that
> is stored in Hadoop. So, my question is: which ingestion tool would be
> best for this, Kafka or Flume?
>
>
>
> Any suggestions will be a great help for me.
>
>
> Warm Regards
>
> Sidharth Kumar | Mob: +91 8197 555 599/7892 192 367 |  LinkedIn:
> www.linkedin.com/in/sidharthkumar2792
>
>
>
>
>
>
