Jeff,

Would Hadoop encryption zones / Transparent Data Encryption (TDE) <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html> address this use case? Files within an encryption zone are encrypted transparently: data is stored encrypted on the DataNodes and is decrypted on the client side. Or would Data Transfer Encryption <https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/SecureMode.html#Data_Encryption_on_Block_data_transfer.> work for you? Both are pretty mature these days, so they are probably worth trying.
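
For reference, here is a rough sketch of the settings behind the two options, assuming a Hadoop 3.x cluster with a KMS already running. The host name kms.example.com is a placeholder, and 9600 is the default KMS port in 3.x; treat this as a starting point and not as a complete or tuned configuration.

```xml
<!-- core-site.xml: tell clients which KMS holds the encryption zone keys.
     kms.example.com is a hypothetical host name. -->
<property>
  <name>hadoop.security.key.provider.path</name>
  <value>kms://http@kms.example.com:9600/kms</value>
</property>

<!-- hdfs-site.xml: encrypt block data on the wire (Data Transfer Encryption). -->
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>
```

With the KMS in place, a zone is typically set up by creating a key with `hadoop key create` and then running `hdfs crypto -createZone -keyName <key> -path <dir>` on an empty directory; the key name and directory are whatever you choose.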
Definitely let me know if the encryption systems I mentioned above don't work for you. I know there are assumptions behind the design, and it doesn't work for all use cases (for example, it doesn't support per-column encryption keys in HBase).

On Mon, Jun 10, 2019 at 8:07 PM Jeff Hubbs <[email protected]> wrote:

> Hi, Wei-Chiu -
>
> I don't know if this is something already in the pipeline for 3.x, but I'd
> like to see a mechanism in HDFS that encrypts blocks pre-storage such that
> I'd only have to manage keys in one place (NameManager?). If that
> capability existed, then I could move blocks around an unsafe network
> and/or not have to worry about my worker nodes having volume-level or
> whole-disk-level encryption. Even if I have Hadoop traffic only crossing a
> LAN that's captive to the cluster, I might still have to worry about worker
> nodes being stolen outright or having the drive(s) taken out of them.
>
> - Jeff
>
> On 6/10/19 8:40 PM, Wei-Chiu Chuang wrote:
>
> Thank you Sudeep for the feedback,
>
> To be more specific, what sort of examples are you looking for?
>
> On another note, I had written some docs of extended length about Hadoop
> code base and internal designs. I should probably make those public to
> share the knowledge (or fix my grammar errors, for that matter)
>
> On Mon, Jun 10, 2019 at 12:11 PM Sudeep Singh Thakur <
> [email protected]> wrote:
>
>> Hi ,
>>
>> Examples are most helpful for developer. Please add examples as much as
>> we can.
>>
>> Thanks
>> Sudeep Thakur
>>
>> On Mon, Jun 10, 2019, 10:38 PM Wei-Chiu Chuang
>> <[email protected]> <[email protected]> wrote:
>>
>>> Hi!
>>>
>>> I am soliciting feedbacks for HDFS roadmap items and wish list in the
>>> future Hadoop releases. A community meetup
>>> <https://www.meetup.com/Hadoop-Contributors/events/262055924/?rv=ea1_v2&_xtd=gatlbWFpbF9jbGlja9oAJGJiNTE1ODdkLTY0MDAtNDFiZS1iOTU5LTM5ZWYyMDU1N2Q4Nw>
>>> is happening soon, and perhaps we can use this thread to converge on things
>>> we should talk about there.
>>>
>>> I am aware of several major features that merged into trunk, such as
>>> RBF, Consistent Standby Serving Reads, as well as some recent features that
>>> merged into 3.2.0 release (storage policy satisfier).
>>>
>>> What else should we be doing? I have a laundry list of supportability
>>> improvement projects, mostly about improving performance or making
>>> performance diagnostics easier. I can share the list if folks are
>>> interested.
>>>
>>> Are there things we should do to make developer's life easier or things
>>> that would be nice to have for downstream applications? I know Sahil
>>> Takiar made a series of improvements in HDFS for Impala recently, and those
>>> improvements are applicable to other downstreamers such as HBase. Or would
>>> it help if we provide more Hadoop API examples?
