Hi, can someone please recommend a fast way in Hadoop to store and
retrieve a matrix of double values?

As of now we store the values in text files and read them in Java using an
HDFS InputStream and a Scanner. *[0]* These files are actually vectors
representing a video file. Each vector is 883 x 200, and for one map job we
read 4 such vectors, so the *job is to convert 706,400 values to double*.

With this approach it takes ~1.5 seconds to convert all these values. I
could use an external cache server to avoid repeated conversion, but I am
looking for a better solution.
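For reference, one alternative I have been considering is storing the
vectors as raw binary doubles instead of text, so reading becomes a
straight byte copy with no parsing. A minimal local-filesystem sketch
(class name `BinaryMatrixIO` and the header layout are my own choices,
not anything from the code linked below; on HDFS the same
DataOutputStream/DataInputStream pattern would wrap an FSDataOutputStream
/ FSDataInputStream):

```java
import java.io.*;
import java.nio.file.*;

public class BinaryMatrixIO {
    // Write an m x n matrix as binary: two ints (rows, cols), then
    // rows*cols doubles at 8 bytes each. No text formatting at all.
    static void writeMatrix(Path path, double[][] m) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(path)))) {
            out.writeInt(m.length);
            out.writeInt(m[0].length);
            for (double[] row : m)
                for (double v : row)
                    out.writeDouble(v);
        }
    }

    // Read the matrix back; readDouble() decodes 8 bytes directly,
    // so there is no Scanner / Double.parseDouble step.
    static double[][] readMatrix(Path path) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(path)))) {
            int rows = in.readInt();
            int cols = in.readInt();
            double[][] m = new double[rows][cols];
            for (int i = 0; i < rows; i++)
                for (int j = 0; j < cols; j++)
                    m[i][j] = in.readDouble();
            return m;
        }
    }

    public static void main(String[] args) throws IOException {
        // An 883 x 200 matrix like the vectors above, with dummy values.
        double[][] matrix = new double[883][200];
        for (int i = 0; i < 883; i++)
            for (int j = 0; j < 200; j++)
                matrix[i][j] = i * 200 + j;
        Path tmp = Files.createTempFile("matrix", ".bin");
        writeMatrix(tmp, matrix);
        double[][] back = readMatrix(tmp);
        System.out.println("rows=" + back.length + " cols=" + back[0].length
                + " last=" + back[882][199]);
        Files.delete(tmp);
    }
}
```

A Hadoop SequenceFile with a Writable wrapping such a byte layout would
give the same benefit plus splittability and compression, if that matters
for these jobs.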

[0] -
https://github.com/USCDataScience/hadoop-pot/blob/master/src/main/java/org/pooledtimeseries/PoT.java#L596


--
Madhav Sharan
