CuteChuanChuan commented on PR #22473: URL: https://github.com/apache/datafusion/pull/22473#issuecomment-4805663597
> Thanks @CuteChuanChuan I would expect something like below. Just extend existing `arrays_zip_inner` to accept fieldnames and in datafusion-spark, create another constructor that accepts field names.
>
> Spark caller will create a function with necessary fields.
>
> It can be constructor or some `.with_field_names` to avoid recreate function
>
> ```mermaid
> flowchart TB
>     subgraph spark["datafusion-spark"]
>         SparkUDF["SparkArraysZip (UDF)<br/>function/array/arrays_zip.rs"]
>         SparkNew["new()<br/>names from arg_fields"]
>         SparkCtor["with_field_names(Vec<String>)<br/>caller-supplied names"]
>         SparkMod["function/array/mod.rs<br/>+ pub mod arrays_zip<br/>+ make_udf_function!<br/>+ export_functions!<br/>+ functions() entry"]
>
>         SparkUDF --> SparkNew
>         SparkUDF --> SparkCtor
>         SparkMod -.registers.-> SparkUDF
>     end
>
>     subgraph nested["datafusion-functions-nested"]
>         ArraysZipUDF["ArraysZip (UDF)<br/>names: '1','2','3'..."]
>         Wrapper["arrays_zip_inner(args)<br/>(private wrapper)<br/>generates default names"]
>         Pub["pub fn arrays_zip_inner_with_names<br/>(args, field_names)"]
>         Perfect["try_perfect_list_zip<br/>(args, field_names)"]
>
>         ArraysZipUDF --> Wrapper
>         Wrapper --> Pub
>         Pub --> Perfect
>     end
>
>     SparkNew -->|resolve names<br/>then call| Pub
>     SparkCtor -->|resolve names<br/>then call| Pub
> ```
>
> ## Name resolution inside `SparkArraysZip`
>
> ```mermaid
> flowchart LR
>     Start["SparkArraysZip.field_names"]
>     Some["Some(names)"]
>     None["None"]
>     Validate["validate length<br/>matches args"]
>     FromArgs["arg_fields[i].name()<br/>for each i"]
>     Use["use names"]
>
>     Start --> Some
>     Start --> None
>     Some --> Validate --> Use
>     None -->|Spark default:<br/>column/alias names| FromArgs --> Use
> ```

Got it. Thanks for providing these details.❤️ I will give it a try.

-- This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
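
For reference, the name-resolution flow in the second quoted diagram could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: `resolve_field_names` and `default_field_names` are hypothetical helpers, plain `&str` values stand in for `arg_fields[i].name()`, and a `String` error stands in for whatever error type the real implementation would return.

```rust
/// Hypothetical helper mirroring the "Name resolution" diagram.
/// `configured` stands in for `SparkArraysZip.field_names`;
/// `arg_names` stands in for the names read from `arg_fields[i].name()`.
fn resolve_field_names(
    configured: Option<&[String]>,
    arg_names: &[&str],
) -> Result<Vec<String>, String> {
    match configured {
        // Caller-supplied names: validate that the count matches the arguments.
        Some(names) => {
            if names.len() != arg_names.len() {
                return Err(format!(
                    "arrays_zip: got {} field names for {} arguments",
                    names.len(),
                    arg_names.len()
                ));
            }
            Ok(names.to_vec())
        }
        // No names supplied: fall back to the argument (column/alias) names.
        None => Ok(arg_names.iter().map(|n| n.to_string()).collect()),
    }
}

/// Hypothetical sketch of the default names the non-Spark wrapper
/// would generate: "1", "2", "3", ...
fn default_field_names(n: usize) -> Vec<String> {
    (1..=n).map(|i| i.to_string()).collect()
}
```

Under these assumptions, calling `resolve_field_names(None, &["a", "b"])` yields the argument names, while passing `Some(...)` with a mismatched length returns an error before anything would be handed to `arrays_zip_inner_with_names`.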
