metegenez opened a new issue, #4198:
URL: https://github.com/apache/arrow-adbc/issues/4198
## Summary
Proposing an ODBC adapter driver for ADBC, analogous to the existing JDBC
adapter (`java/driver/jdbc/`). This driver would wrap ODBC data sources and
expose them through the ADBC interface, converting row-oriented ODBC result
sets into Arrow columnar format.
## Motivation
ADBC currently has a JDBC adapter, but it is Java-only: using it from C/C++,
Go, Python, or Rust means embedding a JVM and paying JNI overhead. An ODBC
adapter would complement it by providing a **language-agnostic**, **C-level**
bridge to any ODBC-accessible database.
Key advantages of an ODBC adapter over the JDBC adapter:
- **Binary/native interface**: ODBC is a C API, so the adapter can be
implemented in C/C++ and consumed from any language via the ADBC C API — no JVM
required.
- **Performance**: Data stays in native memory without JNI crossings or Java
object overhead. Bulk column-wise fetching (`SQLBindCol` +
`SQL_ATTR_ROW_ARRAY_SIZE`) can be mapped efficiently to Arrow arrays.
- **Broader reach**: Many databases and tools that ship ODBC drivers don't
have native ADBC drivers. An ODBC adapter would make them accessible through
ADBC without a dedicated driver per source (e.g., legacy enterprise databases,
proprietary data sources, Excel/Access via ODBC).
- **Existing ecosystem**: ODBC driver managers (unixODBC, iODBC, Windows
built-in) are mature and widely deployed.
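
To make the performance point concrete, here is a rough sketch of the per-column conversion after a block fetch. It is written in Python for brevity (the adapter itself would be C/C++), and the function name and its arguments are hypothetical stand-ins: `values` and `indicators` play the role of the arrays bound with `SQLBindCol`, and `rows_fetched` the row count reported for the block. The only ODBC fact it relies on is that a NULL cell is flagged with `SQL_NULL_DATA` (-1) in the indicator array.

```python
import struct

SQL_NULL_DATA = -1  # indicator value ODBC uses to mark a NULL cell


def int32_column_to_arrow(values, indicators, rows_fetched):
    """Build Arrow-style buffers for one bound 32-bit integer column.

    `values` and `indicators` stand in for the arrays bound with SQLBindCol;
    `rows_fetched` stands in for the count reported after a block fetch.
    Returns (validity_bitmap, data_buffer, null_count).
    """
    validity = bytearray((rows_fetched + 7) // 8)  # one bit per row, LSB first
    data = bytearray()
    null_count = 0
    for i in range(rows_fetched):
        if indicators[i] == SQL_NULL_DATA:
            null_count += 1
            data += struct.pack("<i", 0)  # the slot is still allocated for nulls
        else:
            validity[i // 8] |= 1 << (i % 8)
            data += struct.pack("<i", values[i])
    return bytes(validity), bytes(data), null_count
```

In a C implementation on a little-endian platform, the bound value array for a fixed-width type should already have the layout of an Arrow data buffer, so the per-row work would presumably reduce to building the validity bitmap from the indicator array.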
## Prior art in this repo
- The JDBC adapter (`java/driver/jdbc/`) demonstrates the wrapper pattern:
`JdbcDriver` → `JdbcConnection` → `JdbcStatement` → Arrow conversion via
`JdbcToArrow`.
- There is already an ODBC benchmark in `dev/bench/odbc/` that demonstrates
ODBC bulk fetching with `SQLBindCol` + row-array-size, suggesting the project
has already considered ODBC performance characteristics.
## Possible architecture
```
┌─────────────────────────────┐
│ ADBC C API                  │
│ (AdbcDatabase/Connection/   │
│  Statement)                 │
└──────────┬──────────────────┘
           │
┌──────────▼──────────────────┐
│ ODBC Adapter (C/C++)        │
│ - Type mapping (SQL_C_*     │
│   → Arrow types)            │
│ - Bulk fetch via            │
│   SQL_ATTR_ROW_ARRAY_SIZE   │
│ - Metadata mapping          │
│   (SQLTables/SQLColumns     │
│   → GetObjects)             │
└──────────┬──────────────────┘
           │
┌──────────▼──────────────────┐
│ ODBC Driver Manager         │
│ (unixODBC / iODBC /         │
│  Windows ODBC)              │
└──────────┬──────────────────┘
           │
┌──────────▼──────────────────┐
│ Database ODBC Driver        │
└─────────────────────────────┘
```
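
As an illustration of the metadata-mapping box, the sketch below groups the flat rows that `SQLTables` returns (`TABLE_CAT`, `TABLE_SCHEM`, `TABLE_NAME`, `TABLE_TYPE`, `REMARKS`) into the nested catalog → schema → table shape that `GetObjects` uses. It is Python for brevity, the function name is hypothetical, and plain dicts and lists stand in for the Arrow struct and list arrays a real adapter would build. Columns and constraints (which would come from `SQLColumns` and friends) are left out.

```python
def tables_to_get_objects(rows):
    """Group SQLTables-style rows into a GetObjects-like nested structure.

    rows: iterable of (TABLE_CAT, TABLE_SCHEM, TABLE_NAME, TABLE_TYPE, REMARKS).
    """
    catalogs = {}  # catalog name -> {schema name -> [table dicts]}
    for cat, schem, name, table_type, _remarks in rows:
        schemas = catalogs.setdefault(cat, {})
        schemas.setdefault(schem, []).append(
            {"table_name": name, "table_type": table_type}
        )
    # dicts keep insertion order, so output order follows the result set
    return [
        {
            "catalog_name": cat,
            "catalog_db_schemas": [
                {"db_schema_name": schem, "db_schema_tables": tables}
                for schem, tables in schemas.items()
            ],
        }
        for cat, schemas in catalogs.items()
    ]
```

Drivers that leave the catalog or schema column NULL would yield a `None` key here; how to present that through ADBC is probably one more item for the quirks discussion.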
## Design considerations / open questions
1. **Which ODBC driver manager to target?** unixODBC is most common on
Linux, iODBC on macOS, and Windows has built-in support. The adapter could
compile against the standard ODBC headers (`sql.h`, `sqlext.h`), which are
largely driver-manager-agnostic, and link against whichever driver manager the
platform provides.
2. **Type mapping complexity**: ODBC drivers report SQL types (`SQL_*`), and
the adapter has to pick a `SQL_C_*` buffer type for each and map it to an Arrow
type. Some databases have quirks (similar to `JdbcQuirks` in the JDBC adapter),
so a quirks/dialect system may be needed.
3. **Bulk fetching strategy**: ODBC supports `SQL_ATTR_ROW_ARRAY_SIZE` for
block cursors, which maps naturally to Arrow record batches. The existing
`dev/bench/odbc/` code already demonstrates this pattern.
4. **Parameter binding**: Arrow → ODBC parameter binding for prepared
statements (analogous to `JdbcParameterBinder`).
5. **Unicode handling**: ODBC has separate ANSI and Unicode (`W`-suffixed)
entry points, e.g. `SQLExecDirect` vs `SQLExecDirectW`. The adapter should
handle both transparently and hand UTF-8 to Arrow.
6. **Scope**: Should this live in `c/driver/odbc/` as a C/C++
implementation? This would make it usable from all ADBC language bindings
automatically.
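
To give item 2 some shape, here is one way a type-mapping table with per-driver overrides could look. This is only a sketch in Python (the adapter would be C/C++). The numeric SQL type codes are the ones defined in the ODBC headers, but `DEFAULT_MAPPING`, `FALLBACK`, `EXAMPLE_QUIRKS`, and `resolve_type` are hypothetical names, and the Arrow type names are plain strings rather than real Arrow types.

```python
# Numeric SQL type codes as defined in the ODBC headers.
SQL_CHAR, SQL_INTEGER, SQL_SMALLINT = 1, 4, 5
SQL_REAL, SQL_DOUBLE, SQL_VARCHAR = 7, 8, 12
SQL_BIGINT, SQL_BIT, SQL_TYPE_DATE = -5, -7, 91

# SQL type -> (C type to bind with, Arrow type name). Illustrative defaults only.
DEFAULT_MAPPING = {
    SQL_BIT: ("SQL_C_BIT", "bool"),
    SQL_SMALLINT: ("SQL_C_SSHORT", "int16"),
    SQL_INTEGER: ("SQL_C_SLONG", "int32"),
    SQL_BIGINT: ("SQL_C_SBIGINT", "int64"),
    SQL_REAL: ("SQL_C_FLOAT", "float32"),
    SQL_DOUBLE: ("SQL_C_DOUBLE", "float64"),
    SQL_CHAR: ("SQL_C_CHAR", "utf8"),
    SQL_VARCHAR: ("SQL_C_CHAR", "utf8"),
    SQL_TYPE_DATE: ("SQL_C_TYPE_DATE", "date32"),
}

FALLBACK = ("SQL_C_CHAR", "utf8")  # assumption: fetch unknown types as text

# Hypothetical quirk: a driver whose narrow strings are not UTF-8, so text
# columns are bound as wide characters and transcoded instead.
EXAMPLE_QUIRKS = {SQL_VARCHAR: ("SQL_C_WCHAR", "utf8")}


def resolve_type(sql_type, quirks=None):
    """Pick a binding for `sql_type`, letting per-driver quirks override defaults."""
    if quirks and sql_type in quirks:
        return quirks[sql_type]
    return DEFAULT_MAPPING.get(sql_type, FALLBACK)
```

For example, `resolve_type(SQL_VARCHAR, EXAMPLE_QUIRKS)` picks the wide-character binding, while `resolve_type(SQL_INTEGER, EXAMPLE_QUIRKS)` falls through to the default. Keeping quirks as a sparse overlay means most drivers would need no entry at all.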
## Related
- Existing JDBC adapter: `java/driver/jdbc/`
- ODBC benchmark: `dev/bench/odbc/`
- ADBC explicitly positions itself as complementary to JDBC/ODBC (from
README): *"Like JDBC/ODBC, the goal is to provide a generic API for multiple
databases"*