We (Pydantic) are the original authors of `datafusion-functions-json`. We would be more than happy to donate it and to make any changes necessary to align it with Postgres if that's desired. I would personally like to see it in `datafusion-cli` as well. FWIW DuckDB bundles JSON functions out of the box: https://duckdb.org/docs/current/data/json/json_functions
On Tue, Mar 31, 2026 at 1:57 AM Vignesh Siva <[email protected]> wrote:

> Hi All,
>
> Thank you for articulating your crucial points regarding the proposed integration of `datafusion-json` from `datafusion-contrib` into `datafusion-python`. Your perspective that `datafusion-json` acts as a direct extension of DataFusion itself, rather than a standalone third-party dependency, is well-taken and highlights a significant governance and dependency challenge that we absolutely need to address.
>
> The concern that an official Apache project like `datafusion-python` could become dependent on an unofficial extension (which may not consistently keep pace with DataFusion core updates) is indeed serious, particularly regarding potential upgrade blocking. To fully grasp the extent of this risk, could you elaborate further on the specific scenarios where you foresee `datafusion-python` upgrades being directly impeded by the `datafusion-json` dependency? Understanding the precise mechanisms of this potential blockage will be vital as we consider viable pathways forward.
>
> It would also be helpful to discuss potential mitigation strategies. Are there approaches we could explore that would allow for the desired functionality while robustly addressing the core dependency and governance issues you've raised, perhaps by more clearly decoupling the lifecycles or by formalizing the `datafusion-json` project under Apache auspices? Your insights on how best to navigate these complexities, ensuring both functionality and the long-term health and integrity of `datafusion-python`, would be greatly appreciated.
>
> Thank you.
>
> Regards,
> Vignesh
>
> On Tue, 31 Mar 2026 at 12:18, Phillip LeBlanc <[email protected]> wrote:
>
> > It’s not exactly like any other 3rd party crate - this library explicitly depends on (and extends) datafusion.
> >
> > This means that new versions of datafusion-python (an official DataFusion project) now depend on an unofficial extension project to first upgrade to the newer DataFusion version before the updated datafusion-python crate can be released.
> >
> > From: Kevin Liu <[email protected]>
> > Date: Tuesday, March 31, 2026 at 2:22 AM
> > To: [email protected] <[email protected]>
> > Subject: Re: [DISCUSS] Question on pulling in contrib content to datafusion-python
> >
> > I think we can treat it as pulling any other 3rd party crate/library.
> > I see that it's marked as an optional dependency [1], which is great. It's also added as a feature [2]; I would suggest making it explicit that this is a community contribution rather than an Apache one. So maybe rename the feature to `community_json` or something similar.
> > We can also document in LICENSE/NOTICE/README that the library is a community contribution not affiliated with the Apache Software Foundation.
> >
> > Best,
> > Kevin Liu
> >
> > [1] https://github.com/apache/datafusion-python/pull/1466/files#diff-bac59d6e5ada615de3d27a8e8f87d272613a80b5d3e4a7e2c2e4a08e63dcf0a1R56
> > [2] https://github.com/apache/datafusion-python/pull/1466/files#diff-bac59d6e5ada615de3d27a8e8f87d272613a80b5d3e4a7e2c2e4a08e63dcf0a1R78
> >
> > On Mon, Mar 30, 2026 at 9:45 AM Andrew Lamb <[email protected]> wrote:
> >
> > > Another thing to consider is the maintenance burden (maybe not that bad).
> > >
> > > In my mind if we are going to distribute datafusion-python with the json functions, we should bring datafusion-json functions under Apache governance. Otherwise we might end up with a situation like a security issue in an Apache product due to some other crate.
> > >
> > > Of course, we already do this with the other third-party dependencies (like `hashbrown` for example) so maybe it isn't that different 🤔
> > >
> > > I think the most important thing about bringing in code like that is that we ensure that its IP provenance is clear (e.g. that the original authors have made the donation explicitly under the Apache license).
> > >
> > > I am not sure who wrote the code in datafusion-json -- could we get them to make the PR instead of a third party?
> > >
> > > Andrew
> > >
> > > On Mon, Mar 30, 2026 at 10:57 AM Luke Kim <[email protected]> wrote:
> > >
> > > > We (Spice AI) use the json crate and it would be nice to have it in, but I think the API should be reviewed for consistency before making it official and having people depend on it.
> > > >
> > > > It aligns with the PostgreSQL syntax but not exactly/completely.
> > > >
> > > > On Mon, Mar 30, 2026 at 7:39 AM, Tim Saucer <[email protected]> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > A recent PR[1] has been opened to bring in json scalar functions from the datafusion-contrib crate datafusion-functions-json. Before I move forward with either approving or closing this PR, I was wondering how the broader community felt about adding outside content like this. The code from datafusion-contrib is unofficial, so I'm hesitant to include it in our official release.
> > > >
> > > > I could see a second route which would be to add Python support for all of those functions inside that contrib crate. But that means someone who maintains that code will also need to publish Python packages in addition to their current Rust code. It's not a huge burden, but it is additional work.
> > > >
> > > > I'd appreciate any thoughts you have on non-official crate functions being included.
> > > >
> > > > [1]: https://github.com/apache/datafusion-python/pull/1466
