Thanks for the reply! In my case, the client process should be on the same machine as the single DataNode, since I have set up HDFS on my local machine itself. What other reason could there be for a zero-copy read not being possible?
Some logs about how my datanode is set up -

reikdas@reikdas-HP-Pavilion-x360-Convertible-14-dh1xxx:~$ hdfs dfsadmin -report
Configured Capacity: 143422484480 (133.57 GB)
Present Capacity: 65070809088 (60.60 GB)
DFS Remaining: 61662220288 (57.43 GB)
DFS Used: 3408588800 (3.17 GB)
DFS Used%: 5.24%
Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0
Erasure Coded Block Groups:
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (1):

Name: 127.0.0.1:9866 (localhost)
Hostname: reikdas-HP-Pavilion-x360-Convertible-14-dh1xxx
Decommission Status : Normal
Configured Capacity: 143422484480 (133.57 GB)
DFS Used: 3408588800 (3.17 GB)
Non DFS Used: 70994866176 (66.12 GB)
DFS Remaining: 61662220288 (57.43 GB)
DFS Used%: 2.38%
DFS Remaining%: 42.99%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 0
Last contact: Sat Dec 11 19:51:03 EST 2021
Last Block Report: Sat Dec 11 19:19:57 EST 2021
Num of Blocks: 55

Regards,

On Fri, 10 Dec 2021 at 23:33, Chris Nauroth <[email protected]> wrote:

> Hello Pratyush,
>
> Zero-copy read is only possible when the client process attempting the
> read is located on the same machine as a DataNode that hosts a replica of
> the HDFS block. The implementation relies on the mmap syscall to map the
> underlying block file directly into the client process's address space.
> This isn't possible across machine boundaries.
>
> If the client process is not co-located with the block, then it falls back
> to a buffer-copying read, using a ByteBufferPool as a factory for producing
> buffers to receive copies of the data. This path does not benefit from the
> performance enhancements of zero-copy read, but at least it still works
> functionally. In practice, it would be rare that all blocks of a
> multi-block file would be co-located with a single client, hence the
> importance of applications scheduling work with locality in mind.
>
> When hadoopReadZero fails, it also sets errno. I suspect that if you were
> to check errno, you would find the error is EPROTONOSUPPORT. LibHDFS uses
> this error code when 1) zero-copy read is not possible (e.g. client is not
> co-located with the block replica), and 2) the caller has not provided any
> ByteBufferPool for the fallback copying implementation.
>
> I suggest adding this to your code:
>
>     hadoopRzOptionsSetByteBufferPool(opts, ELASTIC_BYTE_BUFFER_POOL_CLASS);
>
> This function is documented in the header here:
>
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h#L978-L994
>
> With this in place, your client will fall back to a buffer copy when the
> process is not co-located with the HDFS block. Some of this is
> demonstrated in the test suite for LibHDFS zero copy too:
>
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs-tests/test_libhdfs_zerocopy.c
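>
> Putting those pieces together, the zero-copy call would look roughly like
> the sketch below. This is untested and only illustrative; it also assumes
> <errno.h> is included so errno/strerror can report why hadoopReadZero
> failed.
>
>     struct hadoopRzOptions *opts = hadoopRzOptionsAlloc();
>     hadoopRzOptionsSetSkipChecksum(opts, 1);
>     /* Fall back to copying reads through ElasticByteBufferPool when a
>        zero-copy (mmap) read is not possible. */
>     if (hadoopRzOptionsSetByteBufferPool(opts, ELASTIC_BYTE_BUFFER_POOL_CLASS)) {
>         fprintf(stderr, "Unable to set byte buffer pool\n");
>         exit(-1);
>     }
>     struct hadoopRzBuffer *hbuffer = hadoopReadZero(readFile, opts, 100);
>     if (!hbuffer) {
>         /* EPROTONOSUPPORT here means zero-copy was not possible and no
>            fallback ByteBufferPool was configured. */
>         fprintf(stderr, "hadoopReadZero failed: %s\n", strerror(errno));
>         exit(-1);
>     }
>     const void *data = hadoopRzBufferGet(hbuffer);
>     int32_t len = hadoopRzBufferLength(hbuffer);
>     /* ... consume len bytes starting at data ... */
>     hadoopRzBufferFree(readFile, hbuffer);
>     hadoopRzOptionsFree(opts);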
>
> I see you are already configuring the client to skip checksums. The other
> alternative is to use Centralized Cache Management
> <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html>,
> which will perform an eager checksum validation before calling mlock to pin
> the block into physical memory explicitly at the DataNode host.
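>
> Setting that up is roughly the following (the pool name here is just a
> placeholder; /lineitem.tbl is the file from your example):
>
>     hdfs cacheadmin -addPool testPool
>     hdfs cacheadmin -addDirective -path /lineitem.tbl -pool testPool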
>
> Chris Nauroth
>
>
> On Sun, Oct 24, 2021 at 1:42 PM Pratyush Das <[email protected]> wrote:
>
>> Hi,
>>
>> I can successfully load files from HDFS via the C API like -
>>
>> #include "hdfs.h"
>> #include <stdio.h>
>> #include <string.h>
>> #include <stdlib.h>
>> #include <stdint.h>
>> #include <inttypes.h>
>>
>> int main(int argc, char **argv) {
>>     hdfsFS fs = hdfsConnect("127.0.0.1", 9000);
>>     const char* readPath = "/lineitem.tbl";
>>     hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, 0, 0, 0);
>>     if(!readFile) {
>>         fprintf(stderr, "Failed to open %s for reading!\n", readPath);
>>         exit(-1);
>>     }
>>     if (!hdfsFileIsOpenForRead(readFile)) {
>>         fprintf(stderr, "hdfsFileIsOpenForRead: we just opened a file with O_RDONLY, and it did not show up as 'open for read'\n");
>>         exit(-1);
>>     }
>>     int size_in_bytes = hdfsAvailable(fs, readFile);
>>     fprintf(stderr, "hdfsAvailable: %d\n", size_in_bytes);
>>     char *buffer;
>>     buffer = (char*)malloc(sizeof(char)*(size_in_bytes+1));
>>     memset(buffer, 0, sizeof(buffer));
>>     int num_read_bytes = 0;
>>     while (num_read_bytes < size_in_bytes) {
>>         int rbytes = hdfsRead(fs, readFile, &buffer[num_read_bytes], size_in_bytes);
>>         num_read_bytes += rbytes;
>>     }
>>     printf("%s\n", buffer);
>>     printf("Total bytes read = %d\n", num_read_bytes);
>>     free(buffer);
>>     hdfsCloseFile(fs, readFile);
>>     hdfsDisconnect(fs);
>> }
>>
>> and I am able to see all the contents of the file printed out
>> successfully.
>>
>> But when I try to use the zero copy API like -
>>
>> #include "hdfs.h"
>> #include <stdio.h>
>> #include <string.h>
>> #include <stdlib.h>
>> #include <stdint.h>
>> #include <inttypes.h>
>>
>> int main(int argc, char **argv) {
>>     hdfsFS fs = hdfsConnect("127.0.0.1", 9000);
>>     const char* readPath = "/lineitem.tbl";
>>     hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, 0, 0, 0);
>>     if(!readFile) {
>>         fprintf(stderr, "Failed to open %s for reading!\n", readPath);
>>         exit(-1);
>>     }
>>     if (!hdfsFileIsOpenForRead(readFile)) {
>>         fprintf(stderr, "hdfsFileIsOpenForRead: we just opened a file with O_RDONLY, and it did not show up as 'open for read'\n");
>>         exit(-1);
>>     }
>>     int size_in_bytes = hdfsAvailable(fs, readFile);
>>     fprintf(stderr, "hdfsAvailable: %d\n", size_in_bytes);
>>     struct hadoopRzOptions *opts = NULL;
>>     opts = hadoopRzOptionsAlloc();
>>     if (!opts) {
>>         fprintf(stderr, "Unable to set zero copy options\n");
>>         exit(-1);
>>     }
>>     if (hadoopRzOptionsSetSkipChecksum(opts, 1)) {
>>         fprintf(stderr, "Unable to set skip checksum\n");
>>         exit(-1);
>>     }
>>     /*if (hadoopRzOptionsSetByteBufferPool(opts, NULL)) {
>>         fprintf(stderr, "Unable to set byte buffer pool\n");
>>         exit(-1);
>>     }*/
>>     struct hadoopRzBuffer *hbuffer = NULL;
>>     //hadoopRzBufferFree(readFile, hbuffer);
>>     hbuffer = hadoopReadZero(readFile, opts, 100);
>>     if (!hbuffer) {
>>         fprintf(stderr, "Unable to read zero copy hdfs file\n");
>>         exit(-1);
>>     }
>>     char *buffer;
>>     buffer = (char*)malloc(sizeof(char)*(size_in_bytes+1));
>>     memset(buffer, 0, sizeof(buffer));
>>     buffer = hadoopRzBufferGet(hbuffer);
>>     int num_read_bytes = hadoopRzBufferLength(hbuffer);
>>     printf("Actual size = %d\n", size_in_bytes);
>>     printf("Bytes read = %d\n", num_read_bytes);
>>     //printf("%s\n", buffer);
>>     //printf("%s\n", buffer[size_in_bytes - 1000]);
>>     hdfsCloseFile(fs, readFile);
>> }
>>
>> I get the error - "Unable to read zero copy hdfs file" which means that
>> hbuffer didn't read anything in.
>>
>> Am I doing something incorrectly?
>>
>> Thank you,
>>
>> --
>> Pratyush Das
>>

-- 
Pratyush Das
