It’s not exactly like any other 3rd party crate - this library explicitly 
depends on (and extends) datafusion.

This means that new versions of datafusion-python (an official Datafusion
project) now depend on an unofficial extension project upgrading to the newer
Datafusion version first, before the updated datafusion-python crate can be
released.

From: Kevin Liu <[email protected]>
Date: Tuesday, March 31, 2026 at 2:22 AM
To: [email protected] <[email protected]>
Subject: Re: [DISCUSS] Question on pulling in contrib content to 
datafusion-python

I think we can treat it as pulling any other 3rd party crate/library.
I see that it's marked as an optional dependency [1], which is great. It's
also added as a feature [2]; I would suggest making it explicit that this
is a community contribution, instead of apache. So maybe rename the feature
to `community_json` or something similar.
We can also document in LICENSE/NOTICE/README that the library is a
community contribution not affiliated with the Apache Software Foundation.

Best,
Kevin Liu


[1]
https://github.com/apache/datafusion-python/pull/1466/files#diff-bac59d6e5ada615de3d27a8e8f87d272613a80b5d3e4a7e2c2e4a08e63dcf0a1R56
[2]
https://github.com/apache/datafusion-python/pull/1466/files#diff-bac59d6e5ada615de3d27a8e8f87d272613a80b5d3e4a7e2c2e4a08e63dcf0a1R78

On Mon, Mar 30, 2026 at 9:45 AM Andrew Lamb <[email protected]> wrote:

> Another thing to consider is the maintenance burden (maybe not that bad)
>
> In my mind if we are going to distribute datafusion-python with the json
> functions, we should bring datafusion-json functions under apache
> governance. Otherwise we might end up with a situation like a security
> issue in an Apache product due to some other crate
>
> Of course, we already do this with the other third-party dependencies (like
> `hashbrown` for example) so maybe it isn't that different 🤔
>
> I think the most important thing about bringing in code like that is that
> we ensure that its IP provenance is clear (e.g. that the original authors
> have made the donation explicitly under the Apache license).
>
> I am not sure who wrote the code in datafusion-json -- could we get them to
> make the PR instead of a third party?
>
> Andrew
>
> On Mon, Mar 30, 2026 at 10:57 AM Luke Kim <[email protected]> wrote:
>
> > We (Spice AI) use the json crate and it would be nice to have it in, but I
> > think the API should be reviewed for consistency before making it official
> > and having people depend on it.
> >
> > It aligns to the PostgreSQL syntax but not exactly/completely.
> >
> >
> >
> > On Mon, Mar 30, 2026 at 7:39 AM, Tim Saucer <[email protected]> wrote:
> >
> > Hi all,
> >
> > A recent PR[1] has been opened to bring in json scalar functions from the
> > datafusion-contrib crate datafusion-functions-json. Before I move forward
> > with either approving or closing this PR, I was wondering how the broader
> > community felt about adding outside content like this. The code from
> > datafusion-contrib is unofficial, so I'm hesitant to include it in our
> > official release.
> >
> > I could see a second route which would be to add python support for all of
> > those functions inside that contrib crate. But that means someone who
> > maintains that code will also need to publish python packages in addition
> > to their current rust code. It's not a huge burden, but it is additional
> > work.
> >
> > I'd appreciate any thoughts you have on non-official crate functions being
> > included.
> >
> > [1]: https://github.com/apache/datafusion-python/pull/1466
> >
> >
>
