Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-04-04 Thread via GitHub
lidavidm commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2750178633 > Then the only remaining I/O we need to address is the table metadata file which stores a single json string. This is the only reason to keep the `InputFile` and `OutputFile` interfaces

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-25 Thread via GitHub
zhjwpku commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2751059392 I'll refactor the FileIO as per Gang's suggestion. Thank you all for your valuable input. Starting with a minimal implementation and expanding it as needed seems like an ideal approach.

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-25 Thread via GitHub
pitrou commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2750938941 > Good question! IMHO, a streaming JSON reader is an overkill in this case. That's why I propose to define a minimal FileIO like: > > ``` > class FileIO { > public: > virt

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-25 Thread via GitHub
lidavidm commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2750433013 That seems reasonable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-25 Thread via GitHub
lidavidm commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2750434350 It will mean different implementations won't be exactly drop-in portable but I think that's too big a goal to tackle anyways -- This is an automated message from the Apache Git Service

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-25 Thread via GitHub
lidavidm commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2750412560 Yeah, in that case I don't think we need InputFile/OutputFile. Is the idea that basically, iceberg-cpp would effectively not deal with I/O (or would only deal with I/O of metadata

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-25 Thread via GitHub
wgtmac commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2750431481 Exactly. Does that make sense to you? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-25 Thread via GitHub
wgtmac commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2750307640 Good question! IMHO, a streaming JSON reader is an overkill in this case. That's why I propose to define a minimal FileIO like: ``` class FileIO { public: virtual expected Del

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-24 Thread via GitHub
wgtmac commented on code in PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#discussion_r2011241905 ## src/iceberg/io/fs_file_reader.cc: ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements.

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-24 Thread via GitHub
wgtmac commented on code in PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#discussion_r2011224442 ## src/iceberg/file_io.h: ## @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NO

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-24 Thread via GitHub
zhjwpku commented on code in PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#discussion_r2011195005 ## src/iceberg/io/fs_file_reader.cc: ## @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements.

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-24 Thread via GitHub
zhjwpku commented on code in PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#discussion_r2011192149 ## src/iceberg/file_io.h: ## @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the N

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-24 Thread via GitHub
wgtmac commented on code in PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#discussion_r2010260510 ## src/iceberg/file_io.h: ## Review Comment: Why do we need this? ## src/iceberg/file_io.h: ## @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache S

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-03-23 Thread via GitHub
lidavidm commented on code in PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#discussion_r2009380801 ## src/iceberg/file_io.h: ## @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-16 Thread via GitHub
zhjwpku commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2597349794 > > Can we provide both? > > I think we can. It is pretty straight-forward to do something similar to `arrow::RandomnAccessFile`: https://github.com/apache/arrow/blob/main/cpp/src/

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-16 Thread via GitHub
wgtmac commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2597327334 > Can we provide both? I think we can. It is pretty straight-forward to do something similar to `arrow::RandomnAccessFile`: https://github.com/apache/arrow/blob/main/cpp/src/arrow/i

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-16 Thread via GitHub
zhjwpku commented on code in PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#discussion_r1918650611 ## src/iceberg/io/fs_file_io.h: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
zhjwpku commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2594455927 > Is my understanding correct that this FileIo pertains to locations that are local, on HDFS, or on S3? Yeah, I hope this FileIO to be extended to other storages. -- This is an a

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
zhjwpku commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2594498304 > For instance, async vs sync: [#2 (comment)](https://github.com/apache/iceberg-cpp/issues/2#issuecomment-2522607394) Can we provide both? > > Or, whether the core librar

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
MisterRaindrop commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2594372750 Is my understanding correct that this FileIo pertains to locations that are local, on HDFS, or on S3? -- This is an automated message from the Apache Git Service. To respond to t

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
lidavidm commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2594348439 For instance, async vs sync: https://github.com/apache/iceberg-cpp/issues/2#issuecomment-2522607394 Or, whether the core library should do any IO at all: https://github.com/apache

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
zhjwpku commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2594324922 > Has iceberg-cpp decided on an IO strategy already? > > It might be more productive to start writing the IO-less components, such as parsing the various metadata files, etc.

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
lidavidm commented on code in PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#discussion_r1917497458 ## src/iceberg/io/file_io.h: ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See th

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
pitrou commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2593303684 Has iceberg-cpp decided on an IO strategy already? It might be more productive to start writing the IO-less components, such as parsing the various metadata files, etc. -- This is

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
wgtmac commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2593244484 cc @lidavidm @MisterRaindrop -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

Re: [PR] feat: add file_io and local fs impl [iceberg-cpp]

2025-01-15 Thread via GitHub
zhjwpku commented on PR #30: URL: https://github.com/apache/iceberg-cpp/pull/30#issuecomment-2592893837 @wgtmac @Fokko @pitrou @Xuanwo @raulcd Please take a look at this early implementation of FileIO, I plan to add more interfaces, before that, I hope to hear some advices, thanks. -- T