We (Pydantic) are the original authors of `datafusion-functions-json`. We
would be more than happy to donate it and to make any changes necessary to
align it with Postgres if that's desired. I would personally like to see it
in `datafusion-cli` as well. FWIW DuckDB bundles JSON functions out of the
box: https://duckdb.org/docs/current/data/json/json_functions

On Tue, Mar 31, 2026 at 1:57 AM Vignesh Siva <[email protected]>
wrote:

> Hi All,
>
>
> Thank you for articulating your crucial points regarding the proposed
> integration of `datafusion-json` from `datafusion-contrib` into
> `datafusion-python`. Your perspective that `datafusion-json` acts as a
> direct extension of DataFusion itself, rather than a standalone third-party
> dependency, is well-taken and highlights a significant governance and
> dependency challenge that we absolutely need to address.
>
> The concern that an official Apache project like `datafusion-python` could
> become dependent on an unofficial extension (which may not consistently
> keep pace with DataFusion core updates) is indeed serious, particularly
> regarding potential upgrade blocking. To fully grasp the extent of this
> risk, could you elaborate further on the specific scenarios where you
> foresee `datafusion-python` upgrades being directly impeded by the
> `datafusion-json` dependency? Understanding the precise mechanisms of this
> potential blockage will be vital as we consider viable pathways forward.
>
> It would also be helpful to discuss potential mitigation strategies. Are
> there approaches we could explore that would allow for the desired
> functionality while robustly addressing the core dependency and governance
> issues you've raised, perhaps by more clearly decoupling the lifecycles or
> by formalizing the `datafusion-json` project under Apache auspices? Your
> insights on how best to navigate these complexities, ensuring both
> functionality and the long-term health and integrity of
> `datafusion-python`, would be greatly appreciated.
>
> Thank you.
>
>
> Regards,
> Vignesh
>
> On Tue, 31 Mar 2026 at 12:18, Phillip LeBlanc <[email protected]>
> wrote:
>
> > It’s not exactly like any other 3rd party crate - this library explicitly
> > depends on (and extends) datafusion.
> >
> > This means that each new version of datafusion-python (an official
> > Datafusion project) now depends on an unofficial extension project first
> > upgrading to the newer Datafusion version before the updated
> > datafusion-python crate can be released.
> >
> > From: Kevin Liu <[email protected]>
> > Date: Tuesday, March 31, 2026 at 2:22 AM
> > To: [email protected] <[email protected]>
> > Subject: Re: [DISCUSS] Question on pulling in contrib content to
> > datafusion-python
> >
> > I think we can treat it as pulling any other 3rd party crate/library.
> > I see that it's marked as an optional dependency [1], which is great. It's
> > also added as a feature [2]; I would suggest making it explicit that this
> > is a community contribution rather than an Apache one. So maybe rename the
> > feature to `community_json` or something similar.
> > We can also document in LICENSE/NOTICE/README that the library is a
> > community contribution not affiliated with the Apache Software Foundation.
> >
> > Best,
> > Kevin Liu
> >
> >
> > [1]
> > https://github.com/apache/datafusion-python/pull/1466/files#diff-bac59d6e5ada615de3d27a8e8f87d272613a80b5d3e4a7e2c2e4a08e63dcf0a1R56
> > [2]
> > https://github.com/apache/datafusion-python/pull/1466/files#diff-bac59d6e5ada615de3d27a8e8f87d272613a80b5d3e4a7e2c2e4a08e63dcf0a1R78
> >
> > On Mon, Mar 30, 2026 at 9:45 AM Andrew Lamb <[email protected]>
> > wrote:
> >
> > > Another thing to consider is the maintenance burden (maybe not that bad).
> > >
> > > In my mind, if we are going to distribute datafusion-python with the json
> > > functions, we should bring the datafusion-json functions under Apache
> > > governance. Otherwise we might end up with a situation like a security
> > > issue in an Apache product due to some other crate.
> > >
> > > Of course, we already do this with the other third-party dependencies
> > > (like `hashbrown`, for example), so maybe it isn't that different 🤔
> > >
> > > I think the most important thing about bringing in code like that is that
> > > we ensure that its IP provenance is clear (e.g. that the original authors
> > > have made the donation explicitly under the Apache license).
> > >
> > > I am not sure who wrote the code in datafusion-json -- could we get them
> > > to make the PR instead of a third party?
> > >
> > > Andrew
> > >
> > > On Mon, Mar 30, 2026 at 10:57 AM Luke Kim <[email protected]> wrote:
> > >
> > > > We (Spice AI) use the json crate and it would be nice to have it in, but I
> > > > think the API should be reviewed for consistency before making it official
> > > > and having people depend on it.
> > > >
> > > > It aligns with the PostgreSQL syntax, but not exactly or completely.
> > > >
> > > >
> > > >
> > > > On Mon, Mar 30, 2026 at 7:39 AM, Tim Saucer <[email protected]> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > A recent PR [1] has been opened to bring in JSON scalar functions from the
> > > > datafusion-contrib crate datafusion-functions-json. Before I move forward
> > > > with either approving or closing this PR, I was wondering how the broader
> > > > community felt about adding outside content like this. The code from
> > > > datafusion-contrib is unofficial, so I'm hesitant to include it in our
> > > > official release.
> > > >
> > > > I could see a second route, which would be to add Python support for all of
> > > > those functions inside that contrib crate. But that means someone who
> > > > maintains that code will also need to publish Python packages in addition
> > > > to their current Rust code. It's not a huge burden, but it is additional
> > > > work.
> > > >
> > > > I'd appreciate any thoughts you have on non-official crate functions being
> > > > included.
> > > >
> > > > [1]: https://github.com/apache/datafusion-python/pull/1466
> > > >
> > > >
> > >
> >
>
