A little ping - does anyone have any ideas about why libHDFS zero-copy reads would fail on a single-node cluster?
On Sat, 11 Dec 2021 at 19:55, Pratyush Das <[email protected]> wrote:

> Thanks for the reply!
>
> In my case, the client process should be on the same machine as the single
> datanode because I have set up HDFS on my local machine itself. What other
> reason could there be for zero copy read not being possible?
>
> Some logs about how my datanode is set up -
>
> reikdas@reikdas-HP-Pavilion-x360-Convertible-14-dh1xxx:~$ hdfs dfsadmin -report
> Configured Capacity: 143422484480 (133.57 GB)
> Present Capacity: 65070809088 (60.60 GB)
> DFS Remaining: 61662220288 (57.43 GB)
> DFS Used: 3408588800 (3.17 GB)
> DFS Used%: 5.24%
> Replicated Blocks:
>         Under replicated blocks: 0
>         Blocks with corrupt replicas: 0
>         Missing blocks: 0
>         Missing blocks (with replication factor 1): 0
>         Low redundancy blocks with highest priority to recover: 0
>         Pending deletion blocks: 0
> Erasure Coded Block Groups:
>         Low redundancy block groups: 0
>         Block groups with corrupt internal blocks: 0
>         Missing block groups: 0
>         Low redundancy blocks with highest priority to recover: 0
>         Pending deletion blocks: 0
>
> -------------------------------------------------
> Live datanodes (1):
>
> Name: 127.0.0.1:9866 (localhost)
> Hostname: reikdas-HP-Pavilion-x360-Convertible-14-dh1xxx
> Decommission Status : Normal
> Configured Capacity: 143422484480 (133.57 GB)
> DFS Used: 3408588800 (3.17 GB)
> Non DFS Used: 70994866176 (66.12 GB)
> DFS Remaining: 61662220288 (57.43 GB)
> DFS Used%: 2.38%
> DFS Remaining%: 42.99%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 0
> Last contact: Sat Dec 11 19:51:03 EST 2021
> Last Block Report: Sat Dec 11 19:19:57 EST 2021
> Num of Blocks: 55
>
> Regards,
>
> On Fri, 10 Dec 2021 at 23:33, Chris Nauroth <[email protected]> wrote:
>
>> Hello Pratyush,
>>
>> Zero-copy read is only possible when the client process attempting the
>> read is located on the same machine as a DataNode that hosts a replica of
>> the HDFS block. The implementation relies on the mmap syscall to map the
>> underlying block file directly into the client process's address space.
>> This isn't possible across machine boundaries.
>>
>> If the client process is not co-located with the block, then it falls
>> back to a buffer-copying read, using a ByteBufferPool as a factory for
>> producing buffers to receive copies of the data. This path does not
>> benefit from the performance enhancements of zero-copy read, but at least
>> it still works functionally. In practice, it would be rare that all blocks
>> of a multi-block file would be co-located with a single client, hence the
>> importance of applications scheduling work with locality in mind.
>>
>> When hadoopReadZero fails, it also sets errno. I suspect that if you
>> were to check errno, you would find the error is EPROTONOSUPPORT. LibHDFS
>> uses this error code when 1) zero-copy read is not possible (e.g. the client
>> is not co-located with the block replica), and 2) the caller has not provided
>> any ByteBufferPool for the fallback copying implementation.
>>
>> I suggest adding this to your code:
>>
>> hadoopRzOptionsSetByteBufferPool(opts, ELASTIC_BYTE_BUFFER_POOL_CLASS);
>>
>> This function is documented in the header here:
>>
>> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h#L978-L994
>>
>> With this in place, your client will fall back to a buffer copy when the
>> process is not co-located with the HDFS block. Some of this is
>> demonstrated in the test suite for LibHDFS zero copy too:
>>
>> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_zerocopy.c
>>
>> I see you are already configuring the client to skip checksums.
>> The other alternative is to use Centralized Cache Management
>> <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html>,
>> which will perform an eager checksum validation before calling mlock to pin
>> the block into physical memory explicitly at the DataNode host.
>>
>> Chris Nauroth
>>
>> On Sun, Oct 24, 2021 at 1:42 PM Pratyush Das <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I can successfully load files from HDFS via the C API like -
>>>
>>> #include "hdfs.h"
>>> #include <stdio.h>
>>> #include <string.h>
>>> #include <stdlib.h>
>>> #include <stdint.h>
>>> #include <inttypes.h>
>>>
>>> int main(int argc, char **argv) {
>>>     hdfsFS fs = hdfsConnect("127.0.0.1", 9000);
>>>     const char* readPath = "/lineitem.tbl";
>>>     hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, 0, 0, 0);
>>>     if (!readFile) {
>>>         fprintf(stderr, "Failed to open %s for reading!\n", readPath);
>>>         exit(-1);
>>>     }
>>>     if (!hdfsFileIsOpenForRead(readFile)) {
>>>         fprintf(stderr, "hdfsFileIsOpenForRead: we just opened a file with O_RDONLY, and it did not show up as 'open for read'\n");
>>>         exit(-1);
>>>     }
>>>     int size_in_bytes = hdfsAvailable(fs, readFile);
>>>     fprintf(stderr, "hdfsAvailable: %d\n", size_in_bytes);
>>>     char *buffer;
>>>     buffer = (char*)malloc(sizeof(char)*(size_in_bytes+1));
>>>     memset(buffer, 0, sizeof(buffer));
>>>     int num_read_bytes = 0;
>>>     while (num_read_bytes < size_in_bytes) {
>>>         int rbytes = hdfsRead(fs, readFile, &buffer[num_read_bytes], size_in_bytes);
>>>         num_read_bytes += rbytes;
>>>     }
>>>     printf("%s\n", buffer);
>>>     printf("Total bytes read = %d\n", num_read_bytes);
>>>     free(buffer);
>>>     hdfsCloseFile(fs, readFile);
>>>     hdfsDisconnect(fs);
>>> }
>>>
>>> and I am able to see all the contents of the file printed out
>>> successfully.
>>>
>>> But when I try to use the zero copy API like -
>>>
>>> #include "hdfs.h"
>>> #include <stdio.h>
>>> #include <string.h>
>>> #include <stdlib.h>
>>> #include <stdint.h>
>>> #include <inttypes.h>
>>>
>>> int main(int argc, char **argv) {
>>>     hdfsFS fs = hdfsConnect("127.0.0.1", 9000);
>>>     const char* readPath = "/lineitem.tbl";
>>>     hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, 0, 0, 0);
>>>     if (!readFile) {
>>>         fprintf(stderr, "Failed to open %s for reading!\n", readPath);
>>>         exit(-1);
>>>     }
>>>     if (!hdfsFileIsOpenForRead(readFile)) {
>>>         fprintf(stderr, "hdfsFileIsOpenForRead: we just opened a file with O_RDONLY, and it did not show up as 'open for read'\n");
>>>         exit(-1);
>>>     }
>>>     int size_in_bytes = hdfsAvailable(fs, readFile);
>>>     fprintf(stderr, "hdfsAvailable: %d\n", size_in_bytes);
>>>     struct hadoopRzOptions *opts = NULL;
>>>     opts = hadoopRzOptionsAlloc();
>>>     if (!opts) {
>>>         fprintf(stderr, "Unable to set zero copy options\n");
>>>         exit(-1);
>>>     }
>>>     if (hadoopRzOptionsSetSkipChecksum(opts, 1)) {
>>>         fprintf(stderr, "Unable to set skip checksum\n");
>>>         exit(-1);
>>>     }
>>>     /*if (hadoopRzOptionsSetByteBufferPool(opts, NULL)) {
>>>         fprintf(stderr, "Unable to set byte buffer pool\n");
>>>         exit(-1);
>>>     }*/
>>>     struct hadoopRzBuffer *hbuffer = NULL;
>>>     //hadoopRzBufferFree(readFile, hbuffer);
>>>     hbuffer = hadoopReadZero(readFile, opts, 100);
>>>     if (!hbuffer) {
>>>         fprintf(stderr, "Unable to read zero copy hdfs file\n");
>>>         exit(-1);
>>>     }
>>>     char *buffer;
>>>     buffer = (char*)malloc(sizeof(char)*(size_in_bytes+1));
>>>     memset(buffer, 0, sizeof(buffer));
>>>     buffer = hadoopRzBufferGet(hbuffer);
>>>     int num_read_bytes = hadoopRzBufferLength(hbuffer);
>>>     printf("Actual size = %d\n", size_in_bytes);
>>>     printf("Bytes read = %d\n", num_read_bytes);
>>>     //printf("%s\n", buffer);
>>>     //printf("%s\n", buffer[size_in_bytes - 1000]);
>>>     hdfsCloseFile(fs, readFile);
>>> }
>>>
>>> I get the error - "Unable to read zero copy hdfs file" which means that
>>> hbuffer didn't read anything in.
>>>
>>> Am I doing something incorrectly?
>>>
>>> Thank you,
>>>
>>> --
>>> Pratyush Das
>>>
>>
>
> --
> Pratyush Das

--
Pratyush Das
