[ 
https://issues.apache.org/jira/browse/HADOOP-12047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981646#comment-14981646
 ] 

Kai Zheng commented on HADOOP-12047:
------------------------------------

You raised two good points. 
1) Caching and reusing the buffers sounds good. I didn't do it here because this 
Java RS coder will be deprecated soon, once the new Java coder is in; please 
see HADOOP-12041. It won't really be used in a production system, so for now 
I'm just keeping its logic correct; please note it's already complicated. How 
about recording the idea as a TODO comment for future consideration, in case 
it's really found necessary?
2) Changing the input buffers' positions is intended. For input byte buffers, 
when their data are consumed, their positions are moved forward, though the 
contents of the byte buffers are not affected if ALLOW_CHANGE_INPUTS is false. 
I think I should clarify this in a comment to avoid such confusion.
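
To illustrate point 2, here is a minimal sketch using plain java.nio, not code 
from the patch. The class name and the consume() method are hypothetical 
stand-ins for a coder that reads its input with relative gets. It shows that 
consuming a ByteBuffer advances its position while leaving its contents 
untouched, and that a caller who needs the position preserved could pass a 
duplicate() instead.

```java
import java.nio.ByteBuffer;

public class InputBufferPositionDemo {

    // Hypothetical stand-in for a coder reading its input (not the real RS coder).
    // It XORs all remaining bytes using relative gets.
    static int consume(ByteBuffer input) {
        int xor = 0;
        while (input.hasRemaining()) {
            // Relative get: advances the position, does not modify the contents.
            xor ^= input.get();
        }
        return xor;
    }

    public static void main(String[] args) {
        ByteBuffer input = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});

        // Consuming the buffer directly moves its position to the limit.
        consume(input);
        System.out.println("position after consume: " + input.position());
        // Absolute get: the data itself is unchanged.
        System.out.println("first byte: " + input.get(0));

        // A duplicate shares the contents but has its own position and limit,
        // so the caller's buffer position stays where it was.
        input.rewind();
        consume(input.duplicate());
        System.out.println("position after consuming a duplicate: " + input.position());
    }
}
```

The duplicate() approach is only one option for callers; the coder itself is 
still free to move the position of whatever buffer it is given.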

Does that sound good? If so, I will rebase and update the patch. Thanks!

> Indicate preference not to affect input buffers during coding in erasure coder
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-12047
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12047
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Kai Zheng
>             Fix For: HDFS-7285
>
>         Attachments: HADOOP-12047-HDFS-7285-v1.patch, HADOOP-12047-v2.patch, 
> HADOOP-12047-v3.patch, initial-poc.patch
>
>
> It's good to define and ensure that input buffers are not affected during the 
> coding process in raw erasure coders. The following is copied from the 
> discussion with [~jingzhao] in HDFS-8481:
> bq. In that case we cannot reuse the source buffers I guess? Then do we need 
> to expose this information in the decoder?
> bq. Good catch Jing! Yes, in this case we can't reuse the source buffers here 
> as they need to be passed to callers/applications without being changed. I'm 
> planning to re-implement the Java coders in HADOOP-12041 and related issues; 
> when done, it will be possible to ensure the input buffers are not affected. 
> Benefits of doing this in the coder layer: 1) a clearer contract between coder 
> and caller, in a more general sense, for the inputs; 2) a concrete coder may 
> have specific tweaks to optimize this aspect, ideally with no input data 
> copying at all or, at worst, making the copy, but all transparent to callers; 
> 3) it allows new coders (LRC, HH) to be layered on other primitive coders 
> (RS, XOR) more easily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
