The file uses "^A" as its delimiter.
However, I don't think that is what causes the problem, because
other files that also use "^A" as the delimiter have no such problem.
BTW, the files use "^A" as the delimiter because they are Hive data.
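
In case it helps anyone check this locally, here is a minimal sketch for
summarizing a downloaded copy: it prints the byte count, the line count, and
the number of "^A" (0x01) bytes. The helper name summarize_file is made up for
illustration, and it only inspects local files, so it says nothing about how
HDFS itself stores the data.

```shell
# Hypothetical helper (not part of Hadoop): summarize a local file.
# Prints: <path> bytes=<n> lines=<n> ctrl_a=<n>
summarize_file() {
  f="$1"
  # Total size in bytes; tr strips the padding some wc versions add.
  bytes=$(wc -c < "$f" | tr -d ' ')
  # Number of newline characters.
  lines=$(wc -l < "$f" | tr -d ' ')
  # Keep only 0x01 bytes (the "^A" delimiter), then count them.
  ctrl_a=$(tr -cd '\001' < "$f" | wc -c | tr -d ' ')
  printf '%s bytes=%s lines=%s ctrl_a=%s\n' "$f" "$bytes" "$lines" "$ctrl_a"
}
```

Running `summarize_file AFromGet` and `summarize_file AFromCat` and comparing
the two lines would show whether the local copies differ in size, line count,
or delimiter count.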

On Sat, Jan 7, 2017 at 12:17 AM, Ravi Prakash <[email protected]> wrote:
> Is there a carriage return / new line / some other whitespace which `cat`
> may be appending?
>
> On Thu, Jan 5, 2017 at 6:09 PM, Mungeol Heo <[email protected]> wrote:
>>
>> Hello,
>>
>> Suppose the HDFS file that causes the problem is named A.
>>
>> hdfs dfs -ls A
>> -rw-r--r--   3 web_admin hdfs  868003931 2017-01-04 09:05 A
>>
>> hdfs dfs -get A AFromGet
>> hdfs dfs -cat A > AFromCat
>>
>> ls -l
>> -rw-r--r-- 1 hdfs hadoop 883715443 Jan  5 18:32 AFromGet
>> -rw-r--r-- 1 hdfs hadoop 883715443 Jan  5 18:32 AFromCat
>>
>> hdfs dfs -put AFromGet
>>
>> diff <(hdfs dfs -cat  A) <(hdfs dfs -cat AFromGet)
>> (no output, which means the contents of the two files are the same, at
>> least after "cat")
>>
>> hdfs dfs -checksum A
>> A   MD5-of-262144MD5-of-512CRC32C
>> 000002000000000000040000e667fb4f0dda78101feb2b689af8260b
>>
>> hdfs dfs -checksum AFromGet
>> AFromGet   MD5-of-262144MD5-of-512CRC32C
>> 0000020000000000000400007284759249ff98c7395e6a4bb59343dc
>>
>> I have listed some results above. I wonder why the size of the file
>> changed.
>> Any help would be great!
>>
>> Thank you.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
