vandop opened a new issue, #3134:
URL: https://github.com/apache/arrow-adbc/issues/3134
### What happened?
# Go ADBC Schema validation strictness causes failures with Dremio
## Summary
The Go ADBC FlightSQL driver performs strict schema validation that fails
when distributed query engines like Dremio return data from multiple endpoints
with minor schema inconsistencies (e.g., nullable differences). This appears to
be overly strict for real-world distributed systems where schema inference at
planning time may not match the actual runtime schema.
## Problem Description
### Current Behavior
1. The `readInfo` function validates each endpoint's returned schema against the schema returned by GetFlightInfo.
### Failure Scenario
When querying Dremio (and likely other distributed engines), queries fail
with errors like:
```
endpoint 0 returned inconsistent schema: expected schema:
fields: 1
- test_value: type=int32
but got schema:
fields: 1
- test_value: type=int32, nullable
```
### Root Cause
Dremio cannot always guarantee that the schema inferred at planning time
(obtained during GetFlightInfo) matches the actual schema returned by each
execution endpoint.
## Impact
This affects real-world usage where:
- Simple queries that work in native SQL clients fail through ADBC, or at
least through the paths that enforce this strictness.
## Possible Solution
### Configuration-based relaxed validation
Add driver options to control schema validation strictness:
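```python
from adbc_driver_flightsql import dbapi

uri = "grpc://localhost:32010"  # placeholder endpoint; adjust for your server

# Proposed option (does not exist yet): skip schema validation entirely
conn = dbapi.connect(uri, db_kwargs={
    "adbc.flight.sql.skip_schema_validation": "true",
})

# Proposed option (does not exist yet): relaxed validation that ignores nullable differences
conn = dbapi.connect(uri, db_kwargs={
    "adbc.flight.sql.relaxed_schema_validation": "true",
})
```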
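To make the intended semantics of the two options concrete, here is a minimal pure-Python sketch of the comparison they would control. It is not the driver's code: `Field`, `schemas_match`, and the `mode` values are hypothetical names used only for illustration, and the field model is a simplified stand-in for an Arrow schema.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Field:
    """Simplified stand-in for an Arrow field (hypothetical, not the Arrow API)."""
    name: str
    type: str
    nullable: bool = False


Schema = Tuple[Field, ...]


def schemas_match(expected: Schema, actual: Schema, mode: str = "strict") -> bool:
    """Compare the planning-time schema with the schema an endpoint returned.

    `mode` is a hypothetical knob mirroring the proposed options:
    "strict" (field-by-field equality), "relaxed" (ignore nullability),
    or "skip" (no validation at all).
    """
    if mode == "skip":
        return True
    if mode == "relaxed":
        # Normalize the nullable flag on both sides so it cannot cause a mismatch.
        expected = tuple(replace(f, nullable=True) for f in expected)
        actual = tuple(replace(f, nullable=True) for f in actual)
    elif mode != "strict":
        raise ValueError(f"unknown validation mode: {mode!r}")
    return expected == actual


# The failure scenario from this report: same name and type, different nullability.
planned = (Field("test_value", "int32", nullable=False),)
returned = (Field("test_value", "int32", nullable=True),)

strict_ok = schemas_match(planned, returned, "strict")
relaxed_ok = schemas_match(planned, returned, "relaxed")
# A genuine type change should still be rejected in relaxed mode.
type_change_ok = schemas_match(planned, (Field("test_value", "int64"),), "relaxed")
```

In this sketch, relaxed mode still rejects differences in field names or types and only tolerates the nullable flag, which is the distinction between the two proposed options.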
## Questions for Maintainers
1. **Is the current strict validation intentional** for data integrity
reasons, or is it an implementation artifact?
2. **Would configurable validation be acceptable** to balance data integrity
with real-world compatibility?
3. **Are there existing patterns** in other ADBC drivers for handling schema
inconsistencies?
## Environment
- **ADBC Go/Python Version**: 19
- **Server**: Dremio
- **Language**: Go and Python
## Code References
- Schema validation in metadata operations:
`go/adbc/driver/flightsql/flightsql_connection.go:709`
## Workarounds
None known. Executing the query succeeds, but reading the results triggers
the strict-validation error in both Go and Python.
### Stack Trace
_No response_
### How can we reproduce the bug?
1. Start a Dremio instance (for example, in Docker).
2. Execute `SHOW SCHEMAS` or `SELECT 1 "Test"` and try to print the results.
### Environment/Setup
_No response_