Store them in a SequenceFile.
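
A minimal sketch of why a binary format helps, assuming each matrix is packed as raw doubles into a byte array that would then become the value of one SequenceFile record (for example wrapped in a BytesWritable). The MatrixCodec class and its byte layout are hypothetical, not part of Hadoop or hadoop-pot, and the actual SequenceFile write/read calls are left out so the snippet runs with only the JDK. Decoding this way copies bytes instead of parsing text, which is where the Scanner approach spends its time.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: packs a rectangular double matrix into bytes and back.
// Layout (big-endian, ByteBuffer's default): int rows, int cols, then
// rows * cols doubles in row-major order.
final class MatrixCodec {

    static byte[] encode(double[][] m) {
        int rows = m.length;
        int cols = rows == 0 ? 0 : m[0].length;
        ByteBuffer buf = ByteBuffer.allocate(2 * Integer.BYTES + rows * cols * Double.BYTES);
        buf.putInt(rows).putInt(cols);
        for (double[] row : m) {
            if (row.length != cols) {
                throw new IllegalArgumentException("matrix rows must have equal length");
            }
            for (double v : row) {
                buf.putDouble(v);
            }
        }
        return buf.array();
    }

    static double[][] decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int rows = buf.getInt();
        int cols = buf.getInt();
        double[][] m = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                m[r][c] = buf.getDouble();
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Same shape as one vector in the question: 883 x 200.
        double[][] vector = new double[883][200];
        vector[0][0] = 0.125;
        byte[] packed = encode(vector);      // would be the SequenceFile value
        double[][] restored = decode(packed);
        System.out.println(packed.length + " bytes, first value " + restored[0][0]);
    }
}
```

In a real job the key could be the video file name and the value the packed bytes, so a mapper reads four records instead of scanning four text files.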

On Thursday, 18 August 2016, Madhav Sharan <[email protected]> wrote:

> Hi, can someone please recommend a fast way in Hadoop to store and
> retrieve a matrix of double values?
>
> As of now we store the values in text files and then read them in Java
> using an HDFS input stream and Scanner. *[0]* These files are actually
> vectors representing a video file. Each vector is 883 x 200, and for one
> map job we read 4 such vectors, so the *job is to convert 706,400 values
> to double*.
>
> Using this approach we take ~1.5 seconds to convert all these values. I
> can use an external cache server to avoid repeated conversion, but I am
> looking for a better solution.
>
> [0] - https://github.com/USCDataScience/hadoop-pot/blob/master/src/main/java/org/pooledtimeseries/PoT.java#L596
>
> --
> Madhav Sharan
>
>
