That is an interesting idea, but I was hoping for something like a query parameter, e.g. “&filesystem=s3a”, so that I can choose the filesystem the way I can in the Java API.
From: Wei-Chiu Chuang <[email protected]>
Sent: Wednesday, May 22, 2019 9:50 AM
To: Joseph Henry <[email protected]>
Cc: [email protected]
Subject: Re: Webhdfs and S3

You can start 2 HttpFS servers (or even more): let one set fs.defaultFS to s3a://, and the other to hdfs. Will that work for you? Or is this not what you need?

On Wed, May 22, 2019 at 3:40 PM Joseph Henry <[email protected]> wrote:

I thought about that, but we need to be able to access storage in native HDFS as well as S3 in the same cluster. If we change fs.defaultFS, then I would not be able to access the HDFS storage.

From: Wei-Chiu Chuang <[email protected]>
Sent: Wednesday, May 22, 2019 9:36 AM
To: Joseph Henry <[email protected]>
Cc: [email protected]
Subject: Re: Webhdfs and S3

I've never tried, but it seems possible to start an HttpFS server with fs.defaultFS = s3a://your-bucket. The HttpFS server speaks the WebHDFS protocol, so your WebHDFS client can use it, and for each WebHDFS request the HttpFS server translates it into the corresponding FileSystem API call. If fs.defaultFS is the s3a:// URI, it may be able to talk to S3.

On Wed, May 22, 2019 at 3:29 PM Joseph Henry <[email protected]> wrote:

Hey, I am not sure if this is the correct mailing list for this question, but I will start here. Our client application needs to support accessing S3 buckets from Hadoop. We can do this with the Java API using the s3a:// scheme, but we also need a way to access the same files in S3 via the HDFS REST API. Is there a way to access the data stored in S3 via WebHDFS?

Thanks,
Joseph Henry
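For anyone following along, a minimal sketch of what the S3-facing instance in the two-server setup might look like. This is an assumption based on the suggestion in the thread, not a tested configuration; the bucket name and credential values are placeholders.

```xml
<!-- core-site.xml for the S3-facing HttpFS instance (sketch).
     Bucket name and credentials below are placeholders, not real values. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>s3a://your-bucket</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```

A second HttpFS instance would keep fs.defaultFS pointed at the hdfs:// URI, and the client would pick a filesystem by choosing which server's WebHDFS endpoint to call, e.g. `curl 'http://s3-httpfs-host:14000/webhdfs/v1/?op=LISTSTATUS'` (14000 being the HttpFS default port).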
