nikfio opened a new issue, #3102: URL: https://github.com/apache/arrow-adbc/issues/3102
### What happened? Hello everyone, I am happily experiencing adbc engine with polars using Python language, communicating to and from a DuckDB instance. I discovered that if I write two times the exact same dataframe, for any possible reason. There is no mode available that would let me write to the same table and overwrite the elements that are already in. Because with "append" or "create_append" modes, the same elements are written appending exactly new elments, so I end up having doubles rows with exact same elements. Meanwhile, "replace" deletes completely the existing table and create a new one to insert new fresh data: but, in this case, if the new dataframe has elements not present in the previous table. The deletion takes away any old element no more available in the new dataframe, and that i think it is a waste. Long story short, I think in my humble opinion that it should be available, when I write a new dataframe to an axisting table of a SQL compliant DB, there has to be the the possibility of overwriting automatically the rows with same elements if already present in the existing table. Double elements example generated with append mode ``` import polars as pl import adbc_driver_sqlite.dbapi as litedbapi # Crate example dataframe df = pl.DataFrame({ "id": [1, 2, 3], "name": ["Alice", "Bob", "Charlie"], "age": [25, 30, 35] }) con = litedbapi.connect(uri=r"db2.duckdb") # First write df.write_database( table_name="people", connection=con, if_table_exists ="append", engine="adbc" ) con.commit() # Second write df.write_database( table_name="people", connection=con, if_table_exists ="append", engine="adbc" ) con.commit() # Verify query="SELECT * FROM people" result = pl.read_database(query=query, connection=con) print(result) ``` The result is ``` shape: (6, 3) ┌─────┬─────────┬─────┐ │ id ┆ name ┆ age │ │ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ i64 │ ╞═════╪═════════╪═════╡ │ 1 ┆ Alice ┆ 25 │ │ 2 ┆ Bob ┆ 30 │ │ 3 ┆ Charlie ┆ 35 │ │ 1 ┆ Alice ┆ 25 │ │ 2 ┆ Bob ┆ 30 │ │ 3 ┆ Charlie ┆ 35 │ └─────┴─────────┴─────┘ ``` Old elements gone away example with replace mode ``` df = pl.DataFrame({ "id": [1, 2, 3], "name": ["Alice", "Bob", "Charlie"], "age": [25, 30, 35] }) con = litedbapi.connect(uri=r"db.duckdb") # First write df.write_database( table_name="people", connection=con, if_table_exists ="replace", engine="adbc" ) con.commit() # Verify query="SELECT * FROM people" result = pl.read_database(query=query, connection=con) print(result) ``` An we have here: ``` shape: (3, 3) ┌─────┬─────────┬─────┐ │ id ┆ name ┆ age │ │ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ i64 │ ╞═════╪═════════╪═════╡ │ 1 ┆ Alice ┆ 25 │ │ 2 ┆ Bob ┆ 30 │ │ 3 ┆ Charlie ┆ 35 │ └─────┴─────────┴─────┘ ``` Then let's proceed with the second write with replace mode ``` # Second write df = pl.DataFrame({ "id": [2, 3, 4], "name": ["Bob", "Charlie", "Nathan"], "age": [30, 35, 40] }) df.write_database( table_name="people", connection=con, if_table_exists ="replace", engine="adbc" ) # Verify query="SELECT * FROM people" result = pl.read_database(query=query, connection=con) print(result) ``` And we have: ``` shape: (3, 3) ┌─────┬─────────┬─────┐ │ id ┆ name ┆ age │ │ --- ┆ --- ┆ --- │ │ i64 ┆ str ┆ i64 │ ╞═════╪═════════╪═════╡ │ 2 ┆ Bob ┆ 30 │ │ 3 ┆ Charlie ┆ 35 │ │ 4 ┆ Nathan ┆ 40 │ └─────┴─────────┴─────┘ ``` And so you can see the first of the first write it has gone away. Or, does anyone have any suggestion on how I could achieve that with the present fucntions/implementation? Many thanks, Nick ### Stack Trace _No response_ ### How can we reproduce the bug? _No response_ ### Environment/Setup _No response_ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org