metegenez opened a new issue, #4198:
URL: https://github.com/apache/arrow-adbc/issues/4198

   ## Summary
   
   Proposing an ODBC adapter driver for ADBC, analogous to the existing JDBC 
adapter (`java/driver/jdbc/`). This driver would wrap ODBC data sources and 
expose them through the ADBC interface, converting row-oriented ODBC result 
sets into Arrow columnar format.
   
   ## Motivation
   
   ADBC currently has a JDBC adapter, but it is Java-only: using it from 
C/C++, Go, Python, or Rust means embedding a JVM and paying JNI overhead. An 
ODBC adapter would complement the JDBC adapter by providing a 
**language-agnostic**, **C-level** bridge to any ODBC-accessible database.
   
   Key advantages of an ODBC adapter over the JDBC adapter:
   
   - **Binary/native interface**: ODBC is a C API, so the adapter can be 
implemented in C/C++ and consumed from any language via the ADBC C API — no JVM 
required.
   - **Performance**: Data stays in native memory without JNI crossings or Java 
object overhead. Bulk column-wise fetching (`SQLBindCol` + 
`SQL_ATTR_ROW_ARRAY_SIZE`) can be mapped efficiently to Arrow arrays.
   - **Broader reach**: Many databases and tools that ship ODBC drivers don't 
have native ADBC drivers. An ODBC adapter would make them accessible through 
ADBC without a dedicated driver (e.g., legacy enterprise databases, proprietary 
data sources, and Excel/Access via ODBC).
   - **Existing ecosystem**: ODBC driver managers (unixODBC, iODBC, Windows 
built-in) are mature and widely deployed.
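
To make the bulk-fetch point concrete, here is a minimal sketch in C of the conversion step only, under stated assumptions. It takes the value buffer and length/indicator buffer that a driver would fill for one column after `SQLBindCol` with a row array size greater than 1, and produces Arrow-style values plus a validity bitmap. `SketchInt32Column`, `sketch_fill_int32`, and `SKETCH_NULL_DATA` are hypothetical names, `int64_t` stands in for `SQLLEN`, and the fixed 64-row capacity only keeps the example short. The sketch does not call ODBC itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same value as SQL_NULL_DATA in sql.h; redefined here so the sketch
 * builds without ODBC headers installed. */
#define SKETCH_NULL_DATA (-1)

/* Hypothetical holder for one fetched 32-bit integer column. */
typedef struct {
  int32_t values[64];  /* Arrow data buffer */
  uint8_t validity[8]; /* Arrow validity bitmap: LSB-first, 1 = valid */
  int64_t length;
  int64_t null_count;
} SketchInt32Column;

/* Copy one block of rows from column-wise bound buffers into the
 * Arrow-style column. indicators[i] == SKETCH_NULL_DATA marks a NULL. */
void sketch_fill_int32(const int32_t* bound_values, const int64_t* indicators,
                       int64_t rows_fetched, SketchInt32Column* out) {
  assert(rows_fetched >= 0 && rows_fetched <= 64);
  memset(out->validity, 0, sizeof(out->validity));
  out->length = rows_fetched;
  out->null_count = 0;
  for (int64_t i = 0; i < rows_fetched; i++) {
    if (indicators[i] == SKETCH_NULL_DATA) {
      out->values[i] = 0; /* slot value is unspecified for nulls; zero it */
      out->null_count++;
    } else {
      out->values[i] = bound_values[i];
      out->validity[i / 8] |= (uint8_t)(1u << (i % 8));
    }
  }
}
```

For fixed-width types the value buffer already has the Arrow layout, so a real implementation could probably bind the Arrow data buffer directly and only build the bitmap from the indicators. That is a design option to evaluate, not something this sketch demonstrates.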
   
   ## Prior art in this repo
   
   - The JDBC adapter (`java/driver/jdbc/`) demonstrates the wrapper pattern: 
`JdbcDriver` → `JdbcConnection` → `JdbcStatement` → Arrow conversion via 
`JdbcToArrow`.
   - There is already an ODBC benchmark in `dev/bench/odbc/` that demonstrates 
ODBC bulk fetching with `SQLBindCol` + row-array-size, suggesting the project 
has already considered ODBC performance characteristics.
   
   ## Possible architecture
   
   ```
   ┌─────────────────────────────┐
   │     ADBC C API              │
   │  (AdbcDatabase/Connection/  │
   │   Statement)                │
   └──────────┬──────────────────┘
              │
   ┌──────────▼──────────────────┐
   │   ODBC Adapter (C/C++)      │
   │  - Type mapping (SQL_C_*    │
   │    → Arrow types)           │
   │  - Bulk fetch via           │
   │    SQL_ATTR_ROW_ARRAY_SIZE  │
   │  - Metadata mapping         │
   │    (SQLTables/SQLColumns    │
   │    → GetObjects)            │
   └──────────┬──────────────────┘
              │
   ┌──────────▼──────────────────┐
   │   ODBC Driver Manager       │
   │  (unixODBC / iODBC /        │
   │   Windows ODBC)             │
   └──────────┬──────────────────┘
              │
   ┌──────────▼──────────────────┐
   │   Database ODBC Driver      │
   └─────────────────────────────┘
   ```
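
As a rough illustration of the type-mapping box in the diagram, the sketch below maps ODBC SQL type codes (as reported by `SQLDescribeCol`) to Arrow C Data Interface format strings. The `SKETCH_SQL_*` names and `sketch_arrow_format_for_sql_type` are hypothetical. The numeric values mirror `sql.h`/`sqlext.h` so the example compiles without ODBC headers, and the choice of microsecond timestamps is an assumption rather than a settled design. Unmapped types return `NULL` so a quirks layer could decide what to do with them.

```c
#include <stddef.h>

/* Values mirror the SQL_* constants in sql.h / sqlext.h. */
enum {
  SKETCH_SQL_CHAR = 1,
  SKETCH_SQL_INTEGER = 4,
  SKETCH_SQL_SMALLINT = 5,
  SKETCH_SQL_REAL = 7,
  SKETCH_SQL_DOUBLE = 8,
  SKETCH_SQL_VARCHAR = 12,
  SKETCH_SQL_TYPE_DATE = 91,
  SKETCH_SQL_TYPE_TIMESTAMP = 93,
  SKETCH_SQL_VARBINARY = -3,
  SKETCH_SQL_BIGINT = -5,
  SKETCH_SQL_BIT = -7,
  SKETCH_SQL_WVARCHAR = -9
};

/* Return the Arrow C Data Interface format string for an ODBC SQL type,
 * or NULL when this sketch has no default mapping for it. */
const char* sketch_arrow_format_for_sql_type(int sql_type) {
  switch (sql_type) {
    case SKETCH_SQL_BIT:            return "b";    /* boolean */
    case SKETCH_SQL_SMALLINT:       return "s";    /* int16 */
    case SKETCH_SQL_INTEGER:        return "i";    /* int32 */
    case SKETCH_SQL_BIGINT:         return "l";    /* int64 */
    case SKETCH_SQL_REAL:           return "f";    /* float32 */
    case SKETCH_SQL_DOUBLE:         return "g";    /* float64 */
    case SKETCH_SQL_CHAR:
    case SKETCH_SQL_VARCHAR:
    case SKETCH_SQL_WVARCHAR:       return "u";    /* utf8 string */
    case SKETCH_SQL_VARBINARY:      return "z";    /* binary */
    case SKETCH_SQL_TYPE_DATE:      return "tdD";  /* date32 (days) */
    case SKETCH_SQL_TYPE_TIMESTAMP: return "tsu:"; /* timestamp[us], no tz */
    default:                        return NULL;   /* defer to quirks layer */
  }
}
```

A caller would pass the data type reported for each result column and fall back to a per-database override whenever the function returns `NULL`.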
   
   ## Design considerations / open questions
   
   1. **Which ODBC driver manager to target?** unixODBC is most common on 
Linux, iODBC on macOS, and Windows has built-in support. The adapter could 
compile against the standard ODBC headers (`sql.h`, `sqlext.h`), which are 
largely driver-manager-agnostic, and link against whichever driver manager is 
available.
   2. **Type mapping complexity**: ODBC has `SQL_C_*` types that need mapping 
to Arrow types. Some databases have quirks (similar to `JdbcQuirks` in the JDBC 
adapter). A quirks/dialect system may be needed.
   3. **Bulk fetching strategy**: ODBC supports `SQL_ATTR_ROW_ARRAY_SIZE` for 
block cursors, which maps naturally to Arrow record batches. The existing 
`dev/bench/odbc/` code already demonstrates this pattern.
   4. **Parameter binding**: Arrow → ODBC parameter binding for prepared 
statements (analogous to `JdbcParameterBinder`).
   5. **Unicode handling**: ODBC exposes separate ANSI and Unicode entry 
points (e.g., `SQLExecDirect` vs `SQLExecDirectW`). The adapter should handle 
both transparently.
   6. **Scope**: Should this live in `c/driver/odbc/` as a C/C++ 
implementation? This would make it usable from all ADBC language bindings 
automatically.
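
On the Unicode question (item 5), Arrow strings are UTF-8, so data returned through the wide-character ODBC functions would need transcoding. Below is a minimal sketch of that step in C. It assumes 16-bit UTF-16 code units, which does not hold for every driver manager build (some use a wider `wchar_t`). `sketch_utf16_to_utf8` is a hypothetical helper name, and a real implementation might prefer an existing transcoding library.

```c
#include <stddef.h>
#include <stdint.h>

/* Transcode UTF-16 code units to UTF-8.
 * Returns the number of bytes written, or (size_t)-1 if the output
 * buffer is too small or the input contains an unpaired surrogate. */
size_t sketch_utf16_to_utf8(const uint16_t* in, size_t in_len,
                            uint8_t* out, size_t out_cap) {
  size_t o = 0;
  for (size_t i = 0; i < in_len; i++) {
    uint32_t cp = in[i];
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      /* High surrogate: must be followed by a low surrogate. */
      if (i + 1 >= in_len || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF) {
        return (size_t)-1;
      }
      cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(in[i + 1] - 0xDC00);
      i++;
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      return (size_t)-1; /* low surrogate without a preceding high one */
    }
    size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    if (o + need > out_cap) return (size_t)-1;
    if (need == 1) {
      out[o++] = (uint8_t)cp;
    } else if (need == 2) {
      out[o++] = (uint8_t)(0xC0 | (cp >> 6));
      out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
    } else if (need == 3) {
      out[o++] = (uint8_t)(0xE0 | (cp >> 12));
      out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
      out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
    } else {
      out[o++] = (uint8_t)(0xF0 | (cp >> 18));
      out[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
      out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
      out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
    }
  }
  return o;
}
```

The same routine run in reverse would be needed for SQL text and string parameters passed to the wide-character entry points, since ADBC accepts UTF-8.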
   
   ## Related
   
   - Existing JDBC adapter: `java/driver/jdbc/`
   - ODBC benchmark: `dev/bench/odbc/`
   - ADBC explicitly positions itself as complementary to JDBC/ODBC (from 
README): *"Like JDBC/ODBC, the goal is to provide a generic API for multiple 
databases"*

