wgtmac commented on issue #33: URL: https://github.com/apache/iceberg-cpp/issues/33#issuecomment-2614188731
> It also depends whether Arrow would be a public or private dependency for iceberg-cpp. If a public dependency, going for nanoarrow is certainly safer at this point.

I'd say the `ArrowArray` C data interface should be public at the minimum (as per the requirement in this issue description). To make it easy for downstream projects to use, we need a wrapper around `ArrowArray`, which is exactly the purpose of `nanoarrow` and `sparrow`.

> All this being said, I think that you want:
>
> Use ArrowArray from the [Arrow C data interface](https://arrow.apache.org/docs/format/CDataInterface.html#structure-definitions) and users are free to use their own Arrow implementations.
>
> ...so that you don't have to force your users to pin a nanoarrow or sparrow version. I don't personally expose nanoarrow.h in any projects I use it in (the interface is always in terms of ArrowArrayStream).

The downside of this approach (i.e. option 2) is that downstream projects have to decide for themselves how to handle the Arrow array conversion (choosing among nanoarrow, sparrow, or their own implementation) and manage that dependency themselves.

How about providing two interfaces: one is the `ArrowArray` C data interface and the other is the `nanoarrow::UniqueArray` provided by the bundled nanoarrow? Downstream projects would then still be free to plug in their own Arrow array implementation and use only the C data interface.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@iceberg.apache.org
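
For illustration only, the two-interface idea from the comment above might be sketched as follows. This is a minimal sketch under stated assumptions, not iceberg-cpp's actual API: the `sketch` namespace, `ExportInt32`, `ReadBatch`, `ReadBatchOwned`, and `OwnedArray` are all hypothetical names. `OwnedArray` is a simplified stand-in for an RAII wrapper in the spirit of `nanoarrow::UniqueArray`; it is not nanoarrow's class. Only the `ArrowArray` struct layout is taken from the Arrow C data interface specification.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// ArrowArray as laid out in the Arrow C data interface specification.
#ifndef ARROW_C_DATA_INTERFACE
#define ARROW_C_DATA_INTERFACE
struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};
#endif

namespace sketch {  // hypothetical namespace, not part of iceberg-cpp

// Keeps the int32 values alive for as long as the exported array lives.
struct Int32Holder {
  std::vector<int32_t> values;
  const void* buffers[2];  // [0] validity bitmap (absent), [1] data
};

// Release callback: frees the holder and marks the array as released.
inline void ReleaseInt32(ArrowArray* array) {
  delete static_cast<Int32Holder*>(array->private_data);
  array->release = nullptr;
}

// Hypothetical producer: exports a non-nullable int32 array.
inline void ExportInt32(std::vector<int32_t> values, ArrowArray* out) {
  auto* holder = new Int32Holder{std::move(values), {nullptr, nullptr}};
  holder->buffers[1] = holder->values.data();
  out->length = static_cast<int64_t>(holder->values.size());
  out->null_count = 0;
  out->offset = 0;
  out->n_buffers = 2;
  out->n_children = 0;
  out->buffers = holder->buffers;
  out->children = nullptr;
  out->dictionary = nullptr;
  out->release = &ReleaseInt32;
  out->private_data = holder;
}

// Simplified RAII owner; an assumption standing in for a bundled wrapper.
class OwnedArray {
 public:
  OwnedArray() = default;
  explicit OwnedArray(ArrowArray* src) { Move(src, &array_); }
  OwnedArray(OwnedArray&& other) noexcept { Move(&other.array_, &array_); }
  OwnedArray& operator=(OwnedArray&& other) noexcept {
    if (this != &other) {
      Reset();
      Move(&other.array_, &array_);
    }
    return *this;
  }
  OwnedArray(const OwnedArray&) = delete;
  OwnedArray& operator=(const OwnedArray&) = delete;
  ~OwnedArray() { Reset(); }

  ArrowArray* get() { return &array_; }

  // Calls the producer's release callback if the array is still live.
  void Reset() {
    if (array_.release != nullptr) array_.release(&array_);
  }

 private:
  // Shallow-copies the struct and marks the source as moved-from.
  static void Move(ArrowArray* src, ArrowArray* dst) {
    *dst = *src;
    src->release = nullptr;
  }
  ArrowArray array_{};  // zero-initialized, so release == nullptr
};

// Interface 1: plain C data interface. The caller must call out->release.
inline void ReadBatch(ArrowArray* out) { ExportInt32({1, 2, 3}, out); }

// Interface 2: convenience wrapper layered on top of interface 1.
inline OwnedArray ReadBatchOwned() {
  ArrowArray raw;
  ReadBatch(&raw);
  return OwnedArray(&raw);
}

}  // namespace sketch
```

In this sketch the wrapper-returning function is implemented purely in terms of the C-struct function, so a downstream project that prefers sparrow or its own implementation could ignore it and consume `ReadBatch` directly.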