This is an automated email from the ASF dual-hosted git repository. yiguolei pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push: new 9d15ab7c04a [fix] arrow flight (#2211) 9d15ab7c04a is described below commit 9d15ab7c04a0fb7d2360141e51e52caaf817e2e9 Author: Xinyi Zou <zouxiny...@gmail.com> AuthorDate: Wed Mar 19 12:33:50 2025 +0800 [fix] arrow flight (#2211) ## Versions - [x] dev - [x] 3.0 - [x] 2.1 - [ ] 2.0 ## Languages - [ ] Chinese - [ ] English ## Docs Checklist - [ ] Checked by AI - [ ] Test Cases Built --- docs/db-connect/arrow-flight-sql-connect.md | 152 +++++++++++++++++++-- .../current/db-connect/arrow-flight-sql-connect.md | 152 +++++++++++++++++++-- .../db-connect/arrow-flight-sql-connect.md | 152 +++++++++++++++++++-- .../db-connect/arrow-flight-sql-connect.md | 152 +++++++++++++++++++-- .../db-connect/arrow-flight-sql-connect.md | 152 +++++++++++++++++++-- .../db-connect/arrow-flight-sql-connect.md | 152 +++++++++++++++++++-- 6 files changed, 840 insertions(+), 72 deletions(-) diff --git a/docs/db-connect/arrow-flight-sql-connect.md b/docs/db-connect/arrow-flight-sql-connect.md index a122ff16b8d..b2a10f63c73 100644 --- a/docs/db-connect/arrow-flight-sql-connect.md +++ b/docs/db-connect/arrow-flight-sql-connect.md @@ -451,7 +451,9 @@ try ( ### Choice of Jdbc and Java connection methods -Compared with the traditional `jdbc:mysql` connection method, the performance test of the Arrow Flight SQL connection method of Jdbc and Java can be found at [GitHub Issue](https://github.com/apache/doris/issues/25514). Here are some usage suggestions based on the test conclusions. +[JDBC/Java Arrow Flight SQL Sample](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/README.md) is a JDBC/Java demo using Arrow FLight SQL. You can use it to test various connection methods for sending queries to Arrow Flight Server, helping you understand how to use Arrow FLight SQL and test performance. Implemented in [Add Arrow Flight Sql demo for Java](https://github.com/apache/doris/pull/45306). + +Compared with the traditional `jdbc:mysql` connection method, the performance test of the Arrow Flight SQL connection method of Jdbc and Java can be found in Section 6.2 of [GitHub Issue](https://github.com/apache/doris/issues/25514). Here are some usage suggestions based on the test conclusions. 1. For the above three Java Arrow Flight SQL connection methods, if the subsequent data analysis will be based on the row-based data format, then use jdbc:arrow-flight-sql, which will return data in the JDBC ResultSet format; if the subsequent data analysis can be based on the Arrow format or other column-based data formats, then use Flight AdbcDriver or Flight JdbcDriver to directly return data in the Arrow format, which will avoid row-column conversion and use the characteristics of Ar [...] @@ -471,26 +473,152 @@ Arrow Flight currently has no official plan to support Spark and Flink ([GitHub The community previously referred to the open source [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connector) and used FlightClient in Spark to connect to Doris for testing. It was found that the data format conversion between Arrow and Doris Block is faster, which is 10 times the conversion speed between CSV format and Doris Block, and it has better support for complex types such as Map and Array. This is because the Arrow data format has a high compression rate and lo [...] +### Support BI tools + +Since Doris v2.1.8, BI tools such as DBeaver are supported to connect to Doris using the `arrow-flight-sql` protocol. 
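The JDBC URL these tools use is the same `jdbc:arrow-flight-sql` form shown in the FAQ below. The following is a minimal sketch, not the official sample: it assumes the Arrow `flight-sql-jdbc-driver` jar is on the classpath, that `{FE_HOST}` and `{arrow_flight_sql_port}` are replaced with your FE address and the port configured in `fe.conf`, and that `root` with an empty password is a valid Doris account.

```java
// Hedged sketch: open a jdbc:arrow-flight-sql connection, run one query,
// and print the first column of each row. Replace the {placeholders} first.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ArrowFlightSqlJdbcSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:arrow-flight-sql://{FE_HOST}:{arrow_flight_sql_port}"
                + "?useServerPrepStmts=false&cachePrepStmts=true&useEncryption=false";
        try (Connection conn = DriverManager.getConnection(url, "root", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT version()")) {
            while (rs.next()) {
                System.out.println(rs.getObject(1));
            }
        }
    }
}
```

A BI tool such as DBeaver is configured with essentially the same driver and URL in its connection settings.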
For DBeaver's method of connecting to Doris using the `arrow-flight-sql` Driver, refer to: [how-to-use-jdbc-driver-with-dbeaver-client](https://www.dremio.com/blog/jdbc-driver-for-arrow-flight-sql/#h-how-to-use-jdbc-driver-with-dbeaver-client), [client-applications/clients/dbeaver/](https://docs.dremio.com/current/sonar/client-applications/clients/dbeaver/?_gl=1*1epgwh0*_ [...] + +## Extended Application + +### Multiple BEs return results in parallel + +Doris will aggregate the results of a query on all BE nodes to one BE node by default. In Mysql/JDBC queries, FE will request query results from this aggregated data node. In Arrow Flight SQL queries, FE will wrap the IP/Port of this node in the Endpoint and return it to ADBC Client. ADBC Client will request the BE node corresponding to this Endpoint to pull data. + +If the query is just a simple Select to pull data from Doris, without Join, Sort, Window Function and other operators with data Shuffle behavior, the query can be split according to Tablet granularity. Now Doris Spark/Flink Connector uses this method to implement parallel data reading, which is divided into two steps: +1. Execute `explain sql`, and the ScanOperator in the query plan returned by FE contains all Tablet ID Lists of Scan. +2. Split the original SQL into multiple SQLs based on the Tablet ID List above. Each SQL only reads part of the Tablet. The usage is similar to `SELECT * FROM t1 TABLET(10001,10002) limit 1000;`. The multiple SQLs after splitting can be executed in parallel. Refer to [Support select table sample](https://github.com/apache/doris/pull/10170). + +If the outermost layer of the query is aggregation, the SQL is similar to `select k1, sum(k2) from xxx group by k1`. After Doris v3.4, execute `set enable_parallel_result_sink=true;` to allow each BE node of a query to return query results independently. After receiving the Endpoint list returned by FE, ADBC Client pulls results from multiple BE nodes in parallel. However, please note that when the aggregation result is very small, returning multiple BEs will increase the pressure on R [...] + +### Multiple BEs share the same IP accessible from outside the cluster + +If there is a Doris cluster, its FE node can be accessed from outside the cluster, and all its BE nodes can only be accessed from inside the cluster. This is fine when using Mysql Client and JDBC to connect to Doris to execute queries, and the query results will be returned by the Doris FE node. However, using Arrow Flight SQL to connect to Doris cannot execute queries, because ADBC Client needs to connect to the Doris BE node to pull query results, but the Doris BE node is not allowed [...] + +In a production environment, it is often inconvenient to expose the Doris BE node outside the cluster. However, you can add a reverse proxy (such as Nginx) to all Doris BE nodes. When the client outside the cluster connects to Nginx, it will be randomly routed to a Doris BE node. By default, the Arrow Flight SQL query results will be randomly saved on a Doris BE node. If it is different from the Doris BE node randomly routed by Nginx, data forwarding is required within the Doris BE node. + +Starting from Doris v2.1.8, you can configure `public_host` and `arrow_flight_sql_proxy_port` in `be.conf` of all Doris BE nodes to the IP and port shared by multiple Doris BE nodes and accessible outside the cluster. The query results can be correctly forwarded and returned to the ADBC Client. 
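On the proxy side, the sketch below is one possible Nginx `stream` (TCP) configuration; it is an illustrative assumption only, and the upstream entries must be replaced with your own BE hosts and the `arrow_flight_sql_port` value from `be.conf`.

```conf
# Hypothetical Nginx stream proxy in front of the BE Arrow Flight SQL ports.
stream {
    upstream doris_be_arrow_flight {
        server {be1 ip}:{be.conf:arrow_flight_sql_port};
        server {be2 ip}:{be.conf:arrow_flight_sql_port};
        server {be3 ip}:{be.conf:arrow_flight_sql_port};
    }
    server {
        listen {nginx port};
        proxy_pass doris_be_arrow_flight;
    }
}
```

With such a proxy in place, point `public_host` and `arrow_flight_sql_proxy_port` in every BE node's `be.conf` at the shared Nginx address: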
+```conf +public_host={nginx ip} +arrow_flight_sql_proxy_port={nginx port} +``` + ## FAQ -1. ARM environment reports an error `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`. If the Linux kernel version is <= 4.19.90, you need to upgrade to 4.19.279 or above, or recompile Doris BE in the environment of the lower version of the Linux kernel. For specific compilation methods, refer to the document <docs/dev/install/source-install/compilation-arm> +1. Q: ARM environment reports an error `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`. + +A: If the Linux kernel version is <= 4.19.90, you need to upgrade to 4.19.279 or above, or recompile Doris BE in the environment of the lower version of the Linux kernel. For specific compilation methods, refer to the document <docs/dev/install/source-install/compilation-arm> + +Cause: This is because there is a compatibility issue between the old version of the Linux kernel and Arrow. `cpp: arrow::RecordBatch::MakeEmpty()` will get stuck when constructing Arrow Record Batch, causing Doris BE's Arrow Flight Server to fail to respond to Doris FE's Arrow Flight Server's RPC request within 5000ms, causing FE to return rpc timeout failed to Client. When Spark and Flink read Doris, they also convert the query results into Arrow Record Batch and return them, so the sa [...] + +The Linux kernel version of kylinv10 SP2 and SP3 is only 4.19.90-24.4.v2101.ky10.aarch64 at most. The kernel version cannot be upgraded further. Doris BE can only be recompiled on kylinv10. If the problem still exists after compiling Doris BE with the new version of ldb_toolchain, you can try to compile it with the lower version of ldb_toolchain v0.17. If your ARM environment cannot connect to the external network, Huawei Cloud provides ARM + kylinv10, and Alibaba Cloud provides x86 + kylinv10 + +2. Q: Prepared statement passes parameters and reports errors. + +A: Currently, `jdbc:arrow-flight-sql` and Java ADBC/JDBCDriver do not support prepared statement parameter passing. For example, `select * from xxx where id=?` will report an error `parameter ordinal 1 out of range`. This is a bug in Arrow Flight SQL ([GitHub Issue](https://github.com/apache/arrow/issues/40118)). + +3. Q: How to modify the batch size read by `jdbc:arrow-flight-sql` each time to improve performance in some scenarios. + +A: By modifying `setTargetBatchSize` in the `makeJdbcConfig` method in the `org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader` file, the default is 1024, and then saving the modified file to the local directory with the same path name, so as to overwrite the original file and take effect. + +4. Q: ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel reading. + +A: The `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to implement parallel reading of multiple endpoints, using the method `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQuery, the row-format JDBC ResultSet is converted back to the Arrow column format. I [...] + +5. Q: Specify the database name in the URL. + +A: As of Arrow v15.0, Arrow JDBC Connector does not support specifying database name in URL. 
For example, specifying connection to `test` database in `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` is invalid, and you can only execute SQL `use database` manually. Arrow v18.0 supports specifying database name in URL, but there are still bugs in actual testing. + +6. Q: Python ADBC prints `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`. + +A: Ignore this Warning when using Python. This is a problem with Python ADBC Client and will not affect queries. + +7. Q: Python reports an error `grpc: received message larger than max (20748753 vs. 16777216)`. + +A: Refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) and add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` in Database Option. + +8. Q: Error `invalid bearer token` is reported. + +A: Execute `SET PROPERTY FOR 'root' 'max_user_connections' = '10000';` to change the current maximum number of connections for the current user to 10000; add qe_max_connection=30000 and arrow_flight_token_cache_size=8000 in `fe.conf` and restart FE. + +The connection between the ADBC Client and the Arrow Flight Server is essentially a long link, which requires Auth Token, Connection, and Session to be cached on the Server. After the connection is created, it will not be disconnected immediately at the end of a single query. The Client needs to send a close() request to clean it up, but in fact, the Client often does not send a close request, so the Auth Token, Connection, and Session will be saved on the Arrow Flight Server for a lon [...] + +As of Doris v2.1.8, Arrow Flight connections and Mysql/JDBC connections use the same connection limit, including the total number of connections of all FE users `qe_max_connection` and the number of connections of a single user `max_user_connections` in `UserProperty`. But the default `qe_max_connection` and `max_user_connections` are 1024 and 100 respectively. Arrow Flight SQL is often used to replace JDBC scenarios, but the JDBC connection will be released immediately after the query e [...] + +9. Q: Java Arrow Flight SQL reads the Datatime type and returns a timestamp instead of the formatted time. + +A: Java Arrow Flight SQL needs to convert the timestamp by itself when reading the Datatime type. Refer to [Add java parsing datetime type in arrow flight sql sample #48578](https://github.com/apache/doris/pull/48578). Using Python Arrow Flight SQL to read the Datatime type returns `2025-03-03 17:23:28Z`, while Java Arrow Flight SQL returns `1740993808`. + +10. Q: Java Arrow Flight JDBC Client reports an error `Configuration does not provide a mapping for array column 2` when reading Array nested types. + +A: Refer to [`sample/arrow-flight-sql`](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/src/main/java/doris/arrowflight/demo/FlightAdbcDriver.java) to use JAVA ADBC Client. + +Python ADBC Client, JAVA ADBC Client, and Java JDBC DriverManager all have no problem reading Array nested types. Only Java Arrow Flight JDBC Client has problems. In fact, the compatibility of Arrow Flight JDBC Client is not guaranteed. It is not officially developed by Arrow, but by a third-party database company Dremio. Other compatibility issues have been found before, so it is recommended to use JAVA ADBC Client first. + +## Release Note + +> Arrow Flight SQL protocol is supported since Doris 2.1. 
As of Doris 2.1.9, the issues fixed are listed based on Doris 2.1 series versions. Doris 3.0 series versions are self-checked. + +### v2.1.9 + +1. Fix the problem of Doris data serialization to Arrow. +[Fix UT DataTypeSerDeArrowTest of Array/Map/Struct/Bitmap/HLL/Decimal256 types](https://github.com/apache/doris/pull/48944) +- Failed to read `Decimal256` type; +- Subtle error in reading `DatetimeV2` type; +- Incorrect result in reading `DateV2` type; +- Error when reading `IPV4/IPV6` type result is NULL; + +2. Fix the problem that Doris Arrow Flight SQL query fails and returns empty result, without returning real error information. +[Fix query result is empty and not return query error message](https://github.com/apache/doris/pull/45023) + +### v2.1.8 + +1. Support BI tools such as DBeaver to connect to Doris using the `arrow-flight-sql` protocol, and support the correct display of metadata trees. +[Support arrow-flight-sql protocol getStreamCatalogs, getStreamSchemas, getStreamTables #46217](https://github.com/apache/doris/pull/46217). + +2. When multiple BEs share the same IP that is accessible to the outside of the cluster, the query results can be correctly forwarded and returned to the ADBC Client. +[Arrow flight server supports data forwarding when BE uses public vip](https://github.com/apache/doris/pull/43281) + +3. Support multiple endpoints to read in parallel. +[Arrow Flight support multiple endpoints](https://github.com/apache/doris/pull/44286) + +4. Fix query error `FE not found arrow flight schema`. +[Fix FE not found arrow flight schema](https://github.com/apache/doris/pull/43960) + +5. Fix error `BooleanBuilder::AppendValues` when reading columns that allow NULL. +[Fix Doris NULL column conversion to arrow batch](https://github.com/apache/doris/pull/43929) + +6. Fix `show processlist` displays duplicate Connection IDs. +[Fix arrow-flight-sql ConnectContext to use a unified ID #46284](https://github.com/apache/doris/pull/46284) + +### v2.1.7 + +1. Fix frequent log printing `Connection wait_timeout`. +[Fix kill timeout FlightSqlConnection and FlightSqlConnectProcessor close](https://github.com/apache/doris/pull/41770) + +2. Fix Arrow Flight Bearer Token expiration from Cache. +[Fix Arrow Flight bearer token cache evict after expired](https://github.com/apache/doris/pull/41754) + +### v2.1.6 -Problem cause: This is because there is a compatibility issue between the old version of the Linux kernel and Arrow. `cpp: arrow::RecordBatch::MakeEmpty()` will get stuck when constructing Arrow Record Batch, causing Doris BE's Arrow Flight Server to fail to respond to Doris FE's Arrow Flight Server RPC request within 5000ms, causing FE to return rpc timeout failed to Client. When Spark and Flink read Doris, they also convert the query results into Arrow Record Batch and return them, so [...] +1. Fix query error `0.0.0.0:xxx, connection refused`. +[Fix return result from FE Arrow Flight server error 0.0.0.0:xxx, connection refused](https://github.com/apache/doris/pull/40002) -The Linux kernel version of kylinv10 SP2 and SP3 is only up to 4.19.90-24.4.v2101.ky10.aarch64. You cannot continue to upgrade the kernel version. You can only recompile Doris BE on kylinv10. If the problem still exists after compiling Doris BE with the new version of ldb_toolchain, you can try to compile it with the lower version of ldb_toolchain v0.17. If your ARM environment cannot connect to the Internet, Huawei Cloud provides ARM + kylinv10, and Alibaba Cloud provides x86 + kylinv10 +2. 
Fix query error `Reach limit of connections`. +[Fix exceed user property max connection cause Reach limit of connections #39127](https://github.com/apache/doris/pull/39127) -2. Currently, `jdbc:arrow-flight-sql` and Java ADBC/JDBCDriver do not support prepared statement passing parameters. For example, `select * from xxx where id=?` will report an error `parameter ordinal 1 out of range`. This is a bug in Arrow Flight SQL ([GitHub Issue](https://github.com/apache/arrow/issues/40118)) +In previous versions, execute `SET PROPERTY FOR 'root' 'max_user_connections' = '1024';` to modify the current maximum number of connections for the current user to 1024, which can be temporarily circumvented. -3. Modification `jdbc:arrow-flight-sql` The batch size of each read can improve performance in some scenarios. By modifying the `setTargetBatchSize` in the `makeJdbcConfig` method in the `org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader` file, the default is 1024, and then saving the modified file to the local directory with the same name, it will overwrite the original file to take effect. +Because the previous version only limits the number of Arrow Flight connections to less than `qe_max_connection/2`, `qe_max_connection` is the total number of connections for all fe users, the default is 1024, and does not limit the number of Arrow Flight connections for a single user to less than `max_user_connections` in `UserProperty`, the default is 100, so when the number of Arrow Flight connections exceeds the upper limit of the current user's connection number, an error `Reach lim [...] -4. ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel reading, and the `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to implement parallel reading of multiple Endpoints, using the method `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQue [...] +For details of the problem, see: [Questions](https://ask.selectdb.com/questions/D18b1/2-1-4-ban-ben-python-shi-yong-arrow-flight-sql-lian-jie-bu-hui-duan-kai-lian-jie-shu-zhan-man-da-dao-100/E1ic1?commentId=10070000000005324) -5. As of Arrow v15.0, Arrow JDBC Connector does not support specifying the database name in the URL. For example, `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` specifies that the connection to the `test` database is invalid. You can only execute the SQL `use database` manually. +3. Add Conf `arrow_flight_result_sink_buffer_size_rows` to support modifying the ArrowBatch size of query results returned in a single time, the default is 4096 * 8. +[Add config arrow_flight_result_sink_buffer_size_rows](https://github.com/apache/doris/pull/38221) -6. There is a bug in Doris 2.1.4 version. There is a chance of error when reading large amounts of data. This bug is fixed in [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) PR. Upgrading Doris 2.1.5 version can solve this problem. For details of the problem, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) +### v2.1.5 -7. `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant` Ignore this warning when using Python. 
This is a problem with the Python ADBC Client and will not affect the query. +1. Fix the problem that Arrow Flight SQL query results are empty. +[Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) -8. Python reports an error `grpc: received message larger than max (20748753 vs. 16777216)`. Refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) to add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` in Database Option. +Doris v2.1.4 may report an error when reading large amounts of data. For details, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) -9. Before Doris version 2.1.7, the error `Reach limit of connections` is reported. This is because there is no limit on the number of Arrow Flight connections for a single user, which is less than `max_user_connections` in `UserProperty`, which is 100 by default. You can modify the current maximum number of connections for Billie user to 100 by `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in `fe.conf` to limit the overall number [...] +### Doris Arrow Flight v2.1.4 and earlier versions are not perfect. It is recommended to upgrade before use. diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/db-connect/arrow-flight-sql-connect.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/db-connect/arrow-flight-sql-connect.md index 21f0b1eb3f2..19b5abaaba4 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/db-connect/arrow-flight-sql-connect.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/db-connect/arrow-flight-sql-connect.md @@ -451,7 +451,9 @@ try ( ### JDBC 和 Java 连接方式的选择 -对比传统的 `jdbc:mysql` 连接方式,Jdbc 和 Java 的 Arrow Flight SQL 连接方式的性能测试见 [GitHub Issue](https://github.com/apache/doris/issues/25514),这里基于测试结论给出一些使用建议。 +[JDBC/Java Arrow Flight SQL Sample](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/README.md) 是 JDBC/Java 使用 Arrow FLight SQL 的 demo,你可以使用它来测试向 Arrow Flight Server 发送查询的多种连接方法,帮助你了解如何使用 Arrow FLight SQL 并测试性能。在 [Add Arrow Flight Sql demo for Java](https://github.com/apache/doris/pull/45306) 中实现。 + +对比传统的 `jdbc:mysql` 连接方式,Jdbc 和 Java 的 Arrow Flight SQL 连接方式的性能测试见 Section 6.2 of [GitHub Issue](https://github.com/apache/doris/issues/25514),这里基于测试结论给出一些使用建议。 1. 上述三种 Java Arrow Flight SQL 连接方式的选择上,如果后续的数据分析将基于行存的数据格式,那么使用 jdbc:arrow-flight-sql,这将返回 JDBC ResultSet 格式的数据;如果后续的数据分析可以基于 Arrow 格式或其他列存数据格式,那么使用 Flight AdbcDriver 或 Flight JdbcDriver 直接返回 Arrow 格式的数据,这将避免行列转换,并可利用 Arrow 的特性加速数据解析。 @@ -471,26 +473,152 @@ Arrow Flight 官方目前没有支持 Spark 和 Flink 的计划(见 [GitHub Is 社区之前参考开源的 [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connector),在 Spark 中使用 FlightClient 连接 Doris 测试,发现 Arrow 与 Doris Block 之间数据格式转换的速度更快,是 CSV 格式与 Doris Block 之间转换速度的 10 倍,而且对 Map,Array 等复杂类型的支持更好,这是因为 Arrow 数据格式的压缩率高,传输时网络开销小。不过目前 Doris Arrow Flight 还没有实现多节点并行读取,仍是将查询结果汇总到一台 BE 节点后返回,对简单的批量导出数据而言,性能可能没有 Doris Spark Connector 快,后者支持 Tablet 级别的并行读取。如果你希望在 Spark 使用 Arrow Flight SQL 连接 Doris,可以参考开源的 [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connec [...] 
+### 支持 BI 工具 + +自 Doris v2.1.8 开始,支持 DBeaver 等 BI 工具使用 `arrow-flight-sql` 协议连接 Doris。DBeaver 使用 `arrow-flight-sql` Driver 连接 Doris 的方法参考: [how-to-use-jdbc-driver-with-dbeaver-client](https://www.dremio.com/blog/jdbc-driver-for-arrow-flight-sql/#h-how-to-use-jdbc-driver-with-dbeaver-client),[client-applications/clients/dbeaver/](https://docs.dremio.com/current/sonar/client-applications/clients/dbeaver/?_gl=1*1epgwh0*_gcl_au*MjUyNjE1ODM0LjE3MzQwMDExNDg)。 + +## 扩展应用 + +### 多 BE 并行返回结果 + +Doris 默认会将一个查询在所有 BE 节点上的结果汇总聚合到一个 BE 节点上,在 Mysql/JDBC 查询中 FE 会向这个汇总数据的节点请求查询结果,在 Arrow Flight SQL 查询中 FE 会将这个节点的 IP/Port 包装在 Endpoint 中返回给 ADBC Client,ADBC Client 会请求这个 Endpoint 对应的 BE 节点拉取数据。 + +如果查询只是简单的 Select 从 Doris 拉取数据,没有 Join、Sort、窗口函数等有数据 Shuffle 行为的算子,可以将查询按照 Tablet 粒度拆分,现在 Doris Spark/Flink Connector 就是用的这个方法实现并行读取数据,分为两个步骤: + 1. 执行 `explain sql`,FE 返回的查询计划中 ScanOperator 包含 Scan 的所有 Tablet ID List。 + 2. 依据上面的 Tablet ID List 将原始 SQL 拆分为多个 SQL,每个 SQL 只读取部分 Tablet,用法类似 `SELECT * FROM t1 TABLET(10001,10002) limit 1000;`,拆分后的多个 SQL 可以并行执行。参考 [Support select table sample](https://github.com/apache/doris/pull/10170)。 + +如果查询最外层是聚合,SQL 类似 `select k1, sum(k2) from xxx group by k1`,Doris v3.4 版本后,执行 `set enable_parallel_result_sink=true;` 后允许一个查询的每个 BE 节点独立返回查询结果,ADBC Client 收到 FE 返回的 Endpoint 列表后并行从多个 BE 节点拉取结果。不过注意当聚合结果很小时,多 BE 返回会增加 RPC 的压力。具体实现参考 [support parallel result sink](https://github.com/apache/doris/pull/36053)。理论上除了最外层是排序的查询,其他查询都可以支持每个 BE 节点并行返回结果,不过暂时没有这方便的需求,没有更进一步实现。 + +### 多 BE 共享同一个可供集群外部访问的 IP + +如果存在一个 Doris 集群,它的 FE 节点可以被集群外部访问,它的所有 BE 节点只可以被集群内部访问。这在使用 Mysql Client 和 JDBC 连接 Doris 执行查询是没问题的,查询结果将由 Doris FE 节点返回给。但使用 Arrow Flight SQL 连接 Doris 无法执行查询,因为 ADBC Client 需要连接 Doris BE 节点拉取查询结果,但 Doris BE 节点不允许被集群外部访问。 + +在生产环境中,很多时候不方便将 Doris BE 节点暴露到集群外。但可以为所有 Doris BE 节点增加了一层反向代理(比如 Nginx),集群外部的 Client 连接 Nginx 时会随机路由到一台 Doris BE 节点上。默认情况下,Arrow Flight SQL 查询结果会随机保存在一台 Doris BE 节点上,如果和 Nginx 随机路由的 Doris BE 节点不同,需要在 Doris BE 节点内部做一次数据转发。 + +自 Doris v2.1.8 开始,你可以在所有 Doris BE 节点的 `be.conf` 中将 `public_host` 和 `arrow_flight_sql_proxy_port` 配置成多 Doris BE 节点共享的可供集群外部访问的 IP 和端口,查询结果可以正确转发后返回 ADBC Client。 +```conf + public_host={nginx ip} + arrow_flight_sql_proxy_port={nginx port} +``` + ## FAQ -1. ARM 环境报错 `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`,如果 Linux 内核版本 <= 4.19.90,需要升级到 4.19.279 及以上,或者在低版本 Linux 内核的环境中重新编译 Doris BE,具体编译方法参考文档<docs/dev/install/source-install/compilation-arm> +1. Q: ARM 环境报错 `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`。 + + A: 如果 Linux 内核版本 <= 4.19.90,需要升级到 4.19.279 及以上,或者在低版本 Linux 内核的环境中重新编译 Doris BE,具体编译方法参考文档<docs/dev/install/source-install/compilation-arm> + + 问题原因:这是因为老版本 Linux 内核和 Arrow 存在兼容性问题,`cpp: arrow::RecordBatch::MakeEmpty()` 构造 Arrow Record Batch 时会卡住,导致 Doris BE 的 Arrow Flight Server 在 5000ms 内没有回应 Doris FE 的 Arrow Flight Server 的 RPC 请求,导致 FE 给 Client 返回 rpc timeout failed。Spark 和 Flink 读取 Doris 时也是将查询结果转换成 Arrow Record Batch 后返回,所以也存在同样的问题。 + + kylinv10 SP2 和 SP3 的 Linux 内核版本最高只有 4.19.90-24.4.v2101.ky10.aarch64,无法继续升级内核版本,只能在 kylinv10 上重新编译 Doris BE,如果使用新版本 ldb_toolchain 编译 Doris BE 后问题依然存在,可以尝试使用低版本 ldb_toolchain v0.17 编译,如果你的 ARM 环境无法连外网,华为云提供 ARM + kylinv10,阿里云提供 x86 + kylinv10 + +2. 
Q: prepared statement 传递参数报错。 + + A: 目前 `jdbc:arrow-flight-sql` 和 Java ADBC/JDBCDriver 不支持 prepared statement 传递参数,类似`select * from xxx where id=?`,将报错 `parameter ordinal 1 out of range`,这是 Arrow Flight SQL 的一个 BUG([GitHub Issue](https://github.com/apache/arrow/issues/40118))。 + +3. Q: 如何修改 `jdbc:arrow-flight-sql` 每次读取的批次大小,在某些场景下提升性能。 + + A: 通过修改`org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader`文件中`makeJdbcConfig`方法中的 `setTargetBatchSize`,默认是 1024,然后将修改后的文件保存到本地同名路径目录下,从而覆盖原文件生效。 + +4. Q: ADBC v0.10,JDBC 和 Java ADBC/JDBCDriver 不支持并行读取。 + + A: 没有实现`stmt.executePartitioned()`这个方法,只能使用原生的 FlightClient 实现并行读取多个 Endpoints, 使用方法`sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`,此外,ADBC V0.10 默认的 AdbcStatement 实际是 JdbcStatement,executeQuery 后将行存格式的 JDBC ResultSet 又重新转成的 Arrow 列存格式,预期到 ADBC 1.0.0 时 Java ADBC 将功能完善 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490)。 + +5. Q: 在 URL 中指定 database name。 + + A: 截止 Arrow v15.0,Arrow JDBC Connector 不支持在 URL 中指定 database name,比如 `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` 中指定连接`test` database 无效,只能手动执行 SQL `use database`。Arrow v18.0 支持了在 URL 中指定 database name,但实测仍有 BUG。 + +6. Q: Python ADBC print `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`。 + + A: 使用 Python 时忽略这个 Warning,这是 Python ADBC Client 的问题,不会影响查询。 + +7. Q: Python 报错 `grpc: received message larger than max (20748753 vs. 16777216)`。 + + A: 参考 [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) 在 Database Option 中增加 `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value`. + +8. Q: 报错 `invalid bearer token`。 + + A: 执行 `SET PROPERTY FOR 'root' 'max_user_connections' = '10000';` 修改当前用户的当前最大连接数到 10000;在 `fe.conf` 增加 qe_max_connection=30000 和 arrow_flight_token_cache_size=8000 并重启 FE。 + + ADBC Client 和 Arrow Flight Server 端之间的连接本质上是个长链接,需要在 Server 缓存 Auth Token、Connection、Session,连接创建后不会在单个查询结束时立即断开,需要 Client 发送 close() 请求后清理,但实际上 Client 经常不会发送 close 请求,所以 Auth Token、Connection、Session 会长时间在 Arrow Flight Server 上保存,默认会在 3 天后超时断开,或者在连接数超过 `arrow_flight_token_cache_size` 的限制后依据 LRU 淘汰。 + + 截止 Doris v2.1.8,Arrow Flight 连接和 Mysql/JDBC 连接使用相同的连接数限制,包括 FE 所有用户的总连接数 `qe_max_connection` 和单个用户的连接数 `UserProperty` 中的 `max_user_connections`。但默认的 `qe_max_connection` 和 `max_user_connections` 分别是 1024 和 100。Arrow Flight SQL 常用来取代使用 JDBC 的场景,但 JDBC 连接会在查询结束后立即释放,所以使用 Arrow Flight SQL 时,Doris 默认的连接数限制太小,经常导致连接数超过 `arrow_flight_token_cache_size` 的限制后将仍在被使用的连接淘汰。 + +9. Q: Java Arrow Flight SQL 读取 Datatime 类型返回时间戳,而不是格式化时间。 + + A: Java Arrow Flight SQL 读取 Datatime 类型需要自行转换时间戳,参考 [Add java parsing datetime type in arrow flight sql sample #48578](https://github.com/apache/doris/pull/48578)。用 Python Arrow Flight SQL 读取 Datatime 类型返回结果为 `2025-03-03 17:23:28Z`,而 Java Arrow Flight SQL 返回 `1740993808`。 + +10. 
Q: Java Arrow Flight JDBC Client 读取 Array 嵌套类型报错 `Configuration does not provide a mapping for array column 2`。 + + A: 参考 [`sample/arrow-flight-sql`](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/src/main/java/doris/arrowflight/demo/FlightAdbcDriver.java) 使用 JAVA ADBC Client。 + + Python ADBC Client、JAVA ADBC Client、Java JDBC DriverManager 读取 Array 嵌套类型都没问题,只有 Java Arrow Flight JDBC Client 有问题,实际上 Arrow Flight JDBC Client 的兼容性不好保证,不是 Arrow 官方开发的,由一个第三方数据库公司 Dremio 开发,之前还发现过其他兼容性问题,所以建议优先用 JAVA ADBC Client。 + +## Release Note + +> 自 Doris 2.1 开始支持 Arrow Flight SQL 协议,截止 Doris 2.1.9,以 Doris 2.1 系列版本为准列举修复的问题,Doris 3.0 系列版本自行对照。 + +### v2.1.9 + +1. 修复 Doris 数据序列化到 Arrow 的问题。 + [Fix UT DataTypeSerDeArrowTest of Array/Map/Struct/Bitmap/HLL/Decimal256 types](https://github.com/apache/doris/pull/48944) + - 读取 `Decimal256` 类型失败; + - 读取 `DatetimeV2` 类型微妙部分错误; + - 读取 `DateV2` 类型结果不正确; + - 读取 `IPV4/IPV6` 类型结果为 NULL 时报错; + +2. 修复 Doris Arrow Flight SQL 查询失败返回空结果,没有返回真实的错误信息。 + [Fix query result is empty and not return query error message](https://github.com/apache/doris/pull/45023) + +### v2.1.8 + +1. 支持 DBeaver 等 BI 工具使用 `arrow-flight-sql` 协议连接 Doris,支持正确显示元数据树。 + [Support arrow-flight-sql protocol getStreamCatalogs, getStreamSchemas, getStreamTables #46217](https://github.com/apache/doris/pull/46217)。 + +2. 支持多 BE 共享同一个可供集群外部访问的 IP 时,查询结果可以正确转发后返回 ADBC Client。 + [Arrow flight server supports data forwarding when BE uses public vip](https://github.com/apache/doris/pull/43281) + +3. 支持多个 Endpoint 并行读取。 + [Arrow Flight support multiple endpoints](https://github.com/apache/doris/pull/44286) + +4. 修复查询报错 `FE not found arrow flight schema`。 + [Fix FE not found arrow flight schema](https://github.com/apache/doris/pull/43960) + +5. 修复读取允许 NULL 的列报错 `BooleanBuilder::AppendValues`。 + [Fix Doris NULL column conversion to arrow batch](https://github.com/apache/doris/pull/43929) + +6. 修复 `show processlist` 显示重复的 Connection ID。 + [Fix arrow-flight-sql ConnectContext to use a unified ID #46284](https://github.com/apache/doris/pull/46284) + +### v2.1.7 + +1. 修复频繁打印日志 `Connection wait_timeout`。 + [Fix kill timeout FlightSqlConnection and FlightSqlConnectProcessor close](https://github.com/apache/doris/pull/41770) -问题原因:这是因为老版本 Linux 内核和 Arrow 存在兼容性问题,`cpp: arrow::RecordBatch::MakeEmpty()` 构造 Arrow Record Batch 时会卡住,导致 Doris BE 的 Arrow Flight Server 在 5000ms 内没有回应 Doris FE 的 Arrow Flight Server 的 RPC 请求,导致 FE 给 Client 返回 rpc timeout failed。Spark 和 Flink 读取 Doris 时也是将查询结果转换成 Arrow Record Batch 后返回,所以也存在同样的问题。 +2. 修复 Arrow Flight Bearer Token 过期后从 Cache 中淘汰。 + [Fix Arrow Flight bearer token cache evict after expired](https://github.com/apache/doris/pull/41754) -kylinv10 SP2 和 SP3 的 Linux 内核版本最高只有 4.19.90-24.4.v2101.ky10.aarch64,无法继续升级内核版本,只能在 kylinv10 上重新编译 Doris BE,如果使用新版本 ldb_toolchain 编译 Doris BE 后问题依然存在,可以尝试使用低版本 ldb_toolchain v0.17 编译,如果你的 ARM 环境无法连外网,华为云提供 ARM + kylinv10,阿里云提供 x86 + kylinv10 +### v2.1.6 -2. 目前 `jdbc:arrow-flight-sql` 和 Java ADBC/JDBCDriver 不支持 prepared statement 传递参数,类似`select * from xxx where id=?`,将报错 `parameter ordinal 1 out of range`,这是 Arrow Flight SQL 的一个 BUG([GitHub Issue](https://github.com/apache/arrow/issues/40118)) +1. 修复查询报错 `0.0.0.0:xxx, connection refused`。 + [Fix return result from FE Arrow Flight server error 0.0.0.0:xxx, connection refused](https://github.com/apache/doris/pull/40002) -3. 
修改 `jdbc:arrow-flight-sql` 每次读取的批次大小,在某些场景下可以提升性能,通过修改`org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader`文件中`makeJdbcConfig`方法中的 `setTargetBatchSize`,默认是 1024,然后将修改后的文件保存到本地同名路径目录下,从而覆盖原文件生效。 +2. 修复查询报错 `Reach limit of connections`。 + [Fix exceed user property max connection cause Reach limit of connections #39127](https://github.com/apache/doris/pull/39127) + + 之前的版本执行 `SET PROPERTY FOR 'root' 'max_user_connections' = '1024';` 修改当前用户的当前最大连接数到 1024,可临时规避。 -4. ADBC v0.10,JDBC 和 Java ADBC/JDBCDriver 还不支持并行读取,没有实现`stmt.executePartitioned()`这个方法,只能使用原生的 FlightClient 实现并行读取多个 Endpoints, 使用方法`sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`,此外,ADBC V0.10 默认的 AdbcStatement 实际是 JdbcStatement,executeQuery 后将行存格式的 JDBC ResultSet 又重新转成的 Arrow 列存格式,预期到 ADBC 1.0.0 时 Java ADBC 将功能完善 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490)。 + 因为之前的版本只限制 Arrow Flight 连接数小于 `qe_max_connection/2`,`qe_max_connection` 是 fe 所有用户的总连接数,默认 1024,没有限制单个用户的 Arrow Flight 连接数小于 `UserProperty` 中的 `max_user_connections`,默认 100,所以当 Arrow Flight 连接数超过当前用户连接数上限时将报错 `Reach limit of connections`,所以需调大当前用户的 `max_user_connections`。 + + 问题详情见:[Questions](https://ask.selectdb.com/questions/D18b1/2-1-4-ban-ben-python-shi-yong-arrow-flight-sql-lian-jie-bu-hui-duan-kai-lian-jie-shu-zhan-man-da-dao-100/E1ic1?commentId=10070000000005324) -5. 截止 Arrow v15.0,Arrow JDBC Connector 不支持在 URL 中指定 database name,比如 `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` 中指定连接`test` database 无效,只能手动执行 SQL `use database`。 +3. 增加 Conf `arrow_flight_result_sink_buffer_size_rows`,支持修改单次返回的查询结果 ArrowBatch 大小,默认 4096 * 8。 + [Add config arrow_flight_result_sink_buffer_size_rows](https://github.com/apache/doris/pull/38221) -6. Doris 2.1.4 version 存在一个 Bug,读取大数据量时有几率报错,在 [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) 修复,升级 Doris 2.1.5 version 可以解决。问题详情见:[Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) +### v2.1.5 -7. `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant` 使用 Python 时忽略这个 Warning,这是 Python ADBC Client 的问题,这不会影响查询。 +1. 修复Arrow Flight SQL 查询结果为空。 + [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) -8. Python 报错 `grpc: received message larger than max (20748753 vs. 16777216)`,参考 [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) 在 Database Option 中增加 `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value`. + Doris v2.1.4 读取大数据量时有几率报错,问题详情见:[Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) -9. Doris version 2.1.7 版本之前,报错 `Reach limit of connections`,这是因为没有限制单个用户的 Arrow Flight 连接数小于 `UserProperty` 中的 `max_user_connections`,默认 100,可以通过 `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';` 修改 Billie 用户的当前最大连接数到 100,或者在 `fe.conf` 中增加 `arrow_flight_token_cache_size=50` 来限制整体的 Arrow Flight 连接数。Doris version 2.1.7 版本之前 Arrow Flight 连接默认 3 天 超时断开,只强制连接数小于 `qe_max_connection/2`,超过时依据 lru 淘汰,`qe_max_connection` 是 fe 所有用户的总连接数,默认 1024。具体可以看 `arrow_flight_token_cache_size` 这个 co [...] 
+### v2.1.4 及之前的版本 Doris Arrow Flight 不够完善,建议升级后使用。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/db-connect/arrow-flight-sql-connect.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/db-connect/arrow-flight-sql-connect.md index df5765507da..19b5abaaba4 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/db-connect/arrow-flight-sql-connect.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/db-connect/arrow-flight-sql-connect.md @@ -451,7 +451,9 @@ try ( ### JDBC 和 Java 连接方式的选择 -对比传统的 `jdbc:mysql` 连接方式,Jdbc 和 Java 的 Arrow Flight SQL 连接方式的性能测试见 [GitHub Issue](https://github.com/apache/doris/issues/25514),这里基于测试结论给出一些使用建议。 +[JDBC/Java Arrow Flight SQL Sample](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/README.md) 是 JDBC/Java 使用 Arrow FLight SQL 的 demo,你可以使用它来测试向 Arrow Flight Server 发送查询的多种连接方法,帮助你了解如何使用 Arrow FLight SQL 并测试性能。在 [Add Arrow Flight Sql demo for Java](https://github.com/apache/doris/pull/45306) 中实现。 + +对比传统的 `jdbc:mysql` 连接方式,Jdbc 和 Java 的 Arrow Flight SQL 连接方式的性能测试见 Section 6.2 of [GitHub Issue](https://github.com/apache/doris/issues/25514),这里基于测试结论给出一些使用建议。 1. 上述三种 Java Arrow Flight SQL 连接方式的选择上,如果后续的数据分析将基于行存的数据格式,那么使用 jdbc:arrow-flight-sql,这将返回 JDBC ResultSet 格式的数据;如果后续的数据分析可以基于 Arrow 格式或其他列存数据格式,那么使用 Flight AdbcDriver 或 Flight JdbcDriver 直接返回 Arrow 格式的数据,这将避免行列转换,并可利用 Arrow 的特性加速数据解析。 @@ -471,26 +473,152 @@ Arrow Flight 官方目前没有支持 Spark 和 Flink 的计划(见 [GitHub Is 社区之前参考开源的 [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connector),在 Spark 中使用 FlightClient 连接 Doris 测试,发现 Arrow 与 Doris Block 之间数据格式转换的速度更快,是 CSV 格式与 Doris Block 之间转换速度的 10 倍,而且对 Map,Array 等复杂类型的支持更好,这是因为 Arrow 数据格式的压缩率高,传输时网络开销小。不过目前 Doris Arrow Flight 还没有实现多节点并行读取,仍是将查询结果汇总到一台 BE 节点后返回,对简单的批量导出数据而言,性能可能没有 Doris Spark Connector 快,后者支持 Tablet 级别的并行读取。如果你希望在 Spark 使用 Arrow Flight SQL 连接 Doris,可以参考开源的 [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connec [...] +### 支持 BI 工具 + +自 Doris v2.1.8 开始,支持 DBeaver 等 BI 工具使用 `arrow-flight-sql` 协议连接 Doris。DBeaver 使用 `arrow-flight-sql` Driver 连接 Doris 的方法参考: [how-to-use-jdbc-driver-with-dbeaver-client](https://www.dremio.com/blog/jdbc-driver-for-arrow-flight-sql/#h-how-to-use-jdbc-driver-with-dbeaver-client),[client-applications/clients/dbeaver/](https://docs.dremio.com/current/sonar/client-applications/clients/dbeaver/?_gl=1*1epgwh0*_gcl_au*MjUyNjE1ODM0LjE3MzQwMDExNDg)。 + +## 扩展应用 + +### 多 BE 并行返回结果 + +Doris 默认会将一个查询在所有 BE 节点上的结果汇总聚合到一个 BE 节点上,在 Mysql/JDBC 查询中 FE 会向这个汇总数据的节点请求查询结果,在 Arrow Flight SQL 查询中 FE 会将这个节点的 IP/Port 包装在 Endpoint 中返回给 ADBC Client,ADBC Client 会请求这个 Endpoint 对应的 BE 节点拉取数据。 + +如果查询只是简单的 Select 从 Doris 拉取数据,没有 Join、Sort、窗口函数等有数据 Shuffle 行为的算子,可以将查询按照 Tablet 粒度拆分,现在 Doris Spark/Flink Connector 就是用的这个方法实现并行读取数据,分为两个步骤: + 1. 执行 `explain sql`,FE 返回的查询计划中 ScanOperator 包含 Scan 的所有 Tablet ID List。 + 2. 
依据上面的 Tablet ID List 将原始 SQL 拆分为多个 SQL,每个 SQL 只读取部分 Tablet,用法类似 `SELECT * FROM t1 TABLET(10001,10002) limit 1000;`,拆分后的多个 SQL 可以并行执行。参考 [Support select table sample](https://github.com/apache/doris/pull/10170)。 + +如果查询最外层是聚合,SQL 类似 `select k1, sum(k2) from xxx group by k1`,Doris v3.4 版本后,执行 `set enable_parallel_result_sink=true;` 后允许一个查询的每个 BE 节点独立返回查询结果,ADBC Client 收到 FE 返回的 Endpoint 列表后并行从多个 BE 节点拉取结果。不过注意当聚合结果很小时,多 BE 返回会增加 RPC 的压力。具体实现参考 [support parallel result sink](https://github.com/apache/doris/pull/36053)。理论上除了最外层是排序的查询,其他查询都可以支持每个 BE 节点并行返回结果,不过暂时没有这方便的需求,没有更进一步实现。 + +### 多 BE 共享同一个可供集群外部访问的 IP + +如果存在一个 Doris 集群,它的 FE 节点可以被集群外部访问,它的所有 BE 节点只可以被集群内部访问。这在使用 Mysql Client 和 JDBC 连接 Doris 执行查询是没问题的,查询结果将由 Doris FE 节点返回给。但使用 Arrow Flight SQL 连接 Doris 无法执行查询,因为 ADBC Client 需要连接 Doris BE 节点拉取查询结果,但 Doris BE 节点不允许被集群外部访问。 + +在生产环境中,很多时候不方便将 Doris BE 节点暴露到集群外。但可以为所有 Doris BE 节点增加了一层反向代理(比如 Nginx),集群外部的 Client 连接 Nginx 时会随机路由到一台 Doris BE 节点上。默认情况下,Arrow Flight SQL 查询结果会随机保存在一台 Doris BE 节点上,如果和 Nginx 随机路由的 Doris BE 节点不同,需要在 Doris BE 节点内部做一次数据转发。 + +自 Doris v2.1.8 开始,你可以在所有 Doris BE 节点的 `be.conf` 中将 `public_host` 和 `arrow_flight_sql_proxy_port` 配置成多 Doris BE 节点共享的可供集群外部访问的 IP 和端口,查询结果可以正确转发后返回 ADBC Client。 +```conf + public_host={nginx ip} + arrow_flight_sql_proxy_port={nginx port} +``` + ## FAQ -1. ARM 环境报错 `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`,如果 Linux 内核版本 <= 4.19.90,需要升级到 4.19.279 及以上,或者在低版本 Linux 内核的环境中重新编译 Doris BE,具体编译方法参考文档<docs/dev/install/source-install/compilation-arm> +1. Q: ARM 环境报错 `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`。 + + A: 如果 Linux 内核版本 <= 4.19.90,需要升级到 4.19.279 及以上,或者在低版本 Linux 内核的环境中重新编译 Doris BE,具体编译方法参考文档<docs/dev/install/source-install/compilation-arm> + + 问题原因:这是因为老版本 Linux 内核和 Arrow 存在兼容性问题,`cpp: arrow::RecordBatch::MakeEmpty()` 构造 Arrow Record Batch 时会卡住,导致 Doris BE 的 Arrow Flight Server 在 5000ms 内没有回应 Doris FE 的 Arrow Flight Server 的 RPC 请求,导致 FE 给 Client 返回 rpc timeout failed。Spark 和 Flink 读取 Doris 时也是将查询结果转换成 Arrow Record Batch 后返回,所以也存在同样的问题。 + + kylinv10 SP2 和 SP3 的 Linux 内核版本最高只有 4.19.90-24.4.v2101.ky10.aarch64,无法继续升级内核版本,只能在 kylinv10 上重新编译 Doris BE,如果使用新版本 ldb_toolchain 编译 Doris BE 后问题依然存在,可以尝试使用低版本 ldb_toolchain v0.17 编译,如果你的 ARM 环境无法连外网,华为云提供 ARM + kylinv10,阿里云提供 x86 + kylinv10 + +2. Q: prepared statement 传递参数报错。 + + A: 目前 `jdbc:arrow-flight-sql` 和 Java ADBC/JDBCDriver 不支持 prepared statement 传递参数,类似`select * from xxx where id=?`,将报错 `parameter ordinal 1 out of range`,这是 Arrow Flight SQL 的一个 BUG([GitHub Issue](https://github.com/apache/arrow/issues/40118))。 + +3. Q: 如何修改 `jdbc:arrow-flight-sql` 每次读取的批次大小,在某些场景下提升性能。 + + A: 通过修改`org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader`文件中`makeJdbcConfig`方法中的 `setTargetBatchSize`,默认是 1024,然后将修改后的文件保存到本地同名路径目录下,从而覆盖原文件生效。 + +4. Q: ADBC v0.10,JDBC 和 Java ADBC/JDBCDriver 不支持并行读取。 + + A: 没有实现`stmt.executePartitioned()`这个方法,只能使用原生的 FlightClient 实现并行读取多个 Endpoints, 使用方法`sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`,此外,ADBC V0.10 默认的 AdbcStatement 实际是 JdbcStatement,executeQuery 后将行存格式的 JDBC ResultSet 又重新转成的 Arrow 列存格式,预期到 ADBC 1.0.0 时 Java ADBC 将功能完善 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490)。 + +5. 
Q: 在 URL 中指定 database name。 + + A: 截止 Arrow v15.0,Arrow JDBC Connector 不支持在 URL 中指定 database name,比如 `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` 中指定连接`test` database 无效,只能手动执行 SQL `use database`。Arrow v18.0 支持了在 URL 中指定 database name,但实测仍有 BUG。 + +6. Q: Python ADBC print `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`。 + + A: 使用 Python 时忽略这个 Warning,这是 Python ADBC Client 的问题,不会影响查询。 + +7. Q: Python 报错 `grpc: received message larger than max (20748753 vs. 16777216)`。 + + A: 参考 [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) 在 Database Option 中增加 `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value`. + +8. Q: 报错 `invalid bearer token`。 + + A: 执行 `SET PROPERTY FOR 'root' 'max_user_connections' = '10000';` 修改当前用户的当前最大连接数到 10000;在 `fe.conf` 增加 qe_max_connection=30000 和 arrow_flight_token_cache_size=8000 并重启 FE。 + + ADBC Client 和 Arrow Flight Server 端之间的连接本质上是个长链接,需要在 Server 缓存 Auth Token、Connection、Session,连接创建后不会在单个查询结束时立即断开,需要 Client 发送 close() 请求后清理,但实际上 Client 经常不会发送 close 请求,所以 Auth Token、Connection、Session 会长时间在 Arrow Flight Server 上保存,默认会在 3 天后超时断开,或者在连接数超过 `arrow_flight_token_cache_size` 的限制后依据 LRU 淘汰。 + + 截止 Doris v2.1.8,Arrow Flight 连接和 Mysql/JDBC 连接使用相同的连接数限制,包括 FE 所有用户的总连接数 `qe_max_connection` 和单个用户的连接数 `UserProperty` 中的 `max_user_connections`。但默认的 `qe_max_connection` 和 `max_user_connections` 分别是 1024 和 100。Arrow Flight SQL 常用来取代使用 JDBC 的场景,但 JDBC 连接会在查询结束后立即释放,所以使用 Arrow Flight SQL 时,Doris 默认的连接数限制太小,经常导致连接数超过 `arrow_flight_token_cache_size` 的限制后将仍在被使用的连接淘汰。 + +9. Q: Java Arrow Flight SQL 读取 Datatime 类型返回时间戳,而不是格式化时间。 + + A: Java Arrow Flight SQL 读取 Datatime 类型需要自行转换时间戳,参考 [Add java parsing datetime type in arrow flight sql sample #48578](https://github.com/apache/doris/pull/48578)。用 Python Arrow Flight SQL 读取 Datatime 类型返回结果为 `2025-03-03 17:23:28Z`,而 Java Arrow Flight SQL 返回 `1740993808`。 + +10. Q: Java Arrow Flight JDBC Client 读取 Array 嵌套类型报错 `Configuration does not provide a mapping for array column 2`。 + + A: 参考 [`sample/arrow-flight-sql`](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/src/main/java/doris/arrowflight/demo/FlightAdbcDriver.java) 使用 JAVA ADBC Client。 + + Python ADBC Client、JAVA ADBC Client、Java JDBC DriverManager 读取 Array 嵌套类型都没问题,只有 Java Arrow Flight JDBC Client 有问题,实际上 Arrow Flight JDBC Client 的兼容性不好保证,不是 Arrow 官方开发的,由一个第三方数据库公司 Dremio 开发,之前还发现过其他兼容性问题,所以建议优先用 JAVA ADBC Client。 + +## Release Note + +> 自 Doris 2.1 开始支持 Arrow Flight SQL 协议,截止 Doris 2.1.9,以 Doris 2.1 系列版本为准列举修复的问题,Doris 3.0 系列版本自行对照。 + +### v2.1.9 + +1. 修复 Doris 数据序列化到 Arrow 的问题。 + [Fix UT DataTypeSerDeArrowTest of Array/Map/Struct/Bitmap/HLL/Decimal256 types](https://github.com/apache/doris/pull/48944) + - 读取 `Decimal256` 类型失败; + - 读取 `DatetimeV2` 类型微妙部分错误; + - 读取 `DateV2` 类型结果不正确; + - 读取 `IPV4/IPV6` 类型结果为 NULL 时报错; + +2. 修复 Doris Arrow Flight SQL 查询失败返回空结果,没有返回真实的错误信息。 + [Fix query result is empty and not return query error message](https://github.com/apache/doris/pull/45023) + +### v2.1.8 + +1. 支持 DBeaver 等 BI 工具使用 `arrow-flight-sql` 协议连接 Doris,支持正确显示元数据树。 + [Support arrow-flight-sql protocol getStreamCatalogs, getStreamSchemas, getStreamTables #46217](https://github.com/apache/doris/pull/46217)。 + +2. 支持多 BE 共享同一个可供集群外部访问的 IP 时,查询结果可以正确转发后返回 ADBC Client。 + [Arrow flight server supports data forwarding when BE uses public vip](https://github.com/apache/doris/pull/43281) + +3. 
支持多个 Endpoint 并行读取。 + [Arrow Flight support multiple endpoints](https://github.com/apache/doris/pull/44286) + +4. 修复查询报错 `FE not found arrow flight schema`。 + [Fix FE not found arrow flight schema](https://github.com/apache/doris/pull/43960) + +5. 修复读取允许 NULL 的列报错 `BooleanBuilder::AppendValues`。 + [Fix Doris NULL column conversion to arrow batch](https://github.com/apache/doris/pull/43929) + +6. 修复 `show processlist` 显示重复的 Connection ID。 + [Fix arrow-flight-sql ConnectContext to use a unified ID #46284](https://github.com/apache/doris/pull/46284) + +### v2.1.7 + +1. 修复频繁打印日志 `Connection wait_timeout`。 + [Fix kill timeout FlightSqlConnection and FlightSqlConnectProcessor close](https://github.com/apache/doris/pull/41770) -问题原因:这是因为老版本 Linux 内核和 Arrow 存在兼容性问题,`cpp: arrow::RecordBatch::MakeEmpty()` 构造 Arrow Record Batch 时会卡住,导致 Doris BE 的 Arrow Flight Server 在 5000ms 内没有回应 Doris FE 的 Arrow Flight Server 的 RPC 请求,导致 FE 给 Client 返回 rpc timeout failed。Spark 和 Flink 读取 Doris 时也是将查询结果转换成 Arrow Record Batch 后返回,所以也存在同样的问题。 +2. 修复 Arrow Flight Bearer Token 过期后从 Cache 中淘汰。 + [Fix Arrow Flight bearer token cache evict after expired](https://github.com/apache/doris/pull/41754) -kylinv10 SP2 和 SP3 的 Linux 内核版本最高只有 4.19.90-24.4.v2101.ky10.aarch64,无法继续升级内核版本,只能在 kylinv10 上重新编译 Doris BE,如果使用新版本 ldb_toolchain 编译 Doris BE 后问题依然存在,可以尝试使用低版本 ldb_toolchain v0.17 编译,如果你的 ARM 环境无法连外网,华为云提供 ARM + kylinv10,阿里云提供 x86 + kylinv10 +### v2.1.6 -2. 目前 `jdbc:arrow-flight-sql` 和 Java ADBC/JDBCDriver 不支持 prepared statement 传递参数,类似`select * from xxx where id=?`,将报错 `parameter ordinal 1 out of range`,这是 Arrow Flight SQL 的一个 BUG([GitHub Issue](https://github.com/apache/arrow/issues/40118)) +1. 修复查询报错 `0.0.0.0:xxx, connection refused`。 + [Fix return result from FE Arrow Flight server error 0.0.0.0:xxx, connection refused](https://github.com/apache/doris/pull/40002) -3. 修改 `jdbc:arrow-flight-sql` 每次读取的批次大小,在某些场景下可以提升性能,通过修改`org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader`文件中`makeJdbcConfig`方法中的 `setTargetBatchSize`,默认是 1024,然后将修改后的文件保存到本地同名路径目录下,从而覆盖原文件生效。 +2. 修复查询报错 `Reach limit of connections`。 + [Fix exceed user property max connection cause Reach limit of connections #39127](https://github.com/apache/doris/pull/39127) + + 之前的版本执行 `SET PROPERTY FOR 'root' 'max_user_connections' = '1024';` 修改当前用户的当前最大连接数到 1024,可临时规避。 -4. ADBC v0.10,JDBC 和 Java ADBC/JDBCDriver 还不支持并行读取,没有实现`stmt.executePartitioned()`这个方法,只能使用原生的 FlightClient 实现并行读取多个 Endpoints, 使用方法`sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`,此外,ADBC V0.10 默认的 AdbcStatement 实际是 JdbcStatement,executeQuery 后将行存格式的 JDBC ResultSet 又重新转成的 Arrow 列存格式,预期到 ADBC 1.0.0 时 Java ADBC 将功能完善 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490)。 + 因为之前的版本只限制 Arrow Flight 连接数小于 `qe_max_connection/2`,`qe_max_connection` 是 fe 所有用户的总连接数,默认 1024,没有限制单个用户的 Arrow Flight 连接数小于 `UserProperty` 中的 `max_user_connections`,默认 100,所以当 Arrow Flight 连接数超过当前用户连接数上限时将报错 `Reach limit of connections`,所以需调大当前用户的 `max_user_connections`。 + + 问题详情见:[Questions](https://ask.selectdb.com/questions/D18b1/2-1-4-ban-ben-python-shi-yong-arrow-flight-sql-lian-jie-bu-hui-duan-kai-lian-jie-shu-zhan-man-da-dao-100/E1ic1?commentId=10070000000005324) -5. 截止 Arrow v15.0,Arrow JDBC Connector 不支持在 URL 中指定 database name,比如 `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` 中指定连接`test` database 无效,只能手动执行 SQL `use database`。 +3. 
增加 Conf `arrow_flight_result_sink_buffer_size_rows`,支持修改单次返回的查询结果 ArrowBatch 大小,默认 4096 * 8。 + [Add config arrow_flight_result_sink_buffer_size_rows](https://github.com/apache/doris/pull/38221) -6. Doris 2.1.4 version 存在一个 Bug,读取大数据量时有几率报错,在 [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) 这个 pr 修复,升级 Doris 2.1.5 version 可以解决。问题详情见:[Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) +### v2.1.5 -7. `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant` 使用 Python 时忽略这个 Warning,这是 Python ADBC Client 的问题,这不会影响查询。 +1. 修复Arrow Flight SQL 查询结果为空。 + [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) -8. Python 报错 `grpc: received message larger than max (20748753 vs. 16777216)`,参考 [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) 在 Database Option 中增加 `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value`. + Doris v2.1.4 读取大数据量时有几率报错,问题详情见:[Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) -9. Doris version 2.1.7 版本之前,报错 `Reach limit of connections`,这是因为没有限制单个用户的 Arrow Flight 连接数小于 `UserProperty` 中的 `max_user_connections`,默认 100,可以通过 `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';` 修改 Billie 用户的当前最大连接数到 100,或者在 `fe.conf` 中增加 `arrow_flight_token_cache_size=50` 来限制整体的 Arrow Flight 连接数。Doris version 2.1.7 版本之前 Arrow Flight 连接默认 3 天 超时断开,只强制连接数小于 `qe_max_connection/2`,超过时依据 lru 淘汰,`qe_max_connection` 是 fe 所有用户的总连接数,默认 1024。具体可以看 `arrow_flight_token_cache_size` 这个 co [...] +### v2.1.4 及之前的版本 Doris Arrow Flight 不够完善,建议升级后使用。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/db-connect/arrow-flight-sql-connect.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/db-connect/arrow-flight-sql-connect.md index df5765507da..19b5abaaba4 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/db-connect/arrow-flight-sql-connect.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/db-connect/arrow-flight-sql-connect.md @@ -451,7 +451,9 @@ try ( ### JDBC 和 Java 连接方式的选择 -对比传统的 `jdbc:mysql` 连接方式,Jdbc 和 Java 的 Arrow Flight SQL 连接方式的性能测试见 [GitHub Issue](https://github.com/apache/doris/issues/25514),这里基于测试结论给出一些使用建议。 +[JDBC/Java Arrow Flight SQL Sample](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/README.md) 是 JDBC/Java 使用 Arrow FLight SQL 的 demo,你可以使用它来测试向 Arrow Flight Server 发送查询的多种连接方法,帮助你了解如何使用 Arrow FLight SQL 并测试性能。在 [Add Arrow Flight Sql demo for Java](https://github.com/apache/doris/pull/45306) 中实现。 + +对比传统的 `jdbc:mysql` 连接方式,Jdbc 和 Java 的 Arrow Flight SQL 连接方式的性能测试见 Section 6.2 of [GitHub Issue](https://github.com/apache/doris/issues/25514),这里基于测试结论给出一些使用建议。 1. 
上述三种 Java Arrow Flight SQL 连接方式的选择上,如果后续的数据分析将基于行存的数据格式,那么使用 jdbc:arrow-flight-sql,这将返回 JDBC ResultSet 格式的数据;如果后续的数据分析可以基于 Arrow 格式或其他列存数据格式,那么使用 Flight AdbcDriver 或 Flight JdbcDriver 直接返回 Arrow 格式的数据,这将避免行列转换,并可利用 Arrow 的特性加速数据解析。 @@ -471,26 +473,152 @@ Arrow Flight 官方目前没有支持 Spark 和 Flink 的计划(见 [GitHub Is 社区之前参考开源的 [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connector),在 Spark 中使用 FlightClient 连接 Doris 测试,发现 Arrow 与 Doris Block 之间数据格式转换的速度更快,是 CSV 格式与 Doris Block 之间转换速度的 10 倍,而且对 Map,Array 等复杂类型的支持更好,这是因为 Arrow 数据格式的压缩率高,传输时网络开销小。不过目前 Doris Arrow Flight 还没有实现多节点并行读取,仍是将查询结果汇总到一台 BE 节点后返回,对简单的批量导出数据而言,性能可能没有 Doris Spark Connector 快,后者支持 Tablet 级别的并行读取。如果你希望在 Spark 使用 Arrow Flight SQL 连接 Doris,可以参考开源的 [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connec [...] +### 支持 BI 工具 + +自 Doris v2.1.8 开始,支持 DBeaver 等 BI 工具使用 `arrow-flight-sql` 协议连接 Doris。DBeaver 使用 `arrow-flight-sql` Driver 连接 Doris 的方法参考: [how-to-use-jdbc-driver-with-dbeaver-client](https://www.dremio.com/blog/jdbc-driver-for-arrow-flight-sql/#h-how-to-use-jdbc-driver-with-dbeaver-client),[client-applications/clients/dbeaver/](https://docs.dremio.com/current/sonar/client-applications/clients/dbeaver/?_gl=1*1epgwh0*_gcl_au*MjUyNjE1ODM0LjE3MzQwMDExNDg)。 + +## 扩展应用 + +### 多 BE 并行返回结果 + +Doris 默认会将一个查询在所有 BE 节点上的结果汇总聚合到一个 BE 节点上,在 Mysql/JDBC 查询中 FE 会向这个汇总数据的节点请求查询结果,在 Arrow Flight SQL 查询中 FE 会将这个节点的 IP/Port 包装在 Endpoint 中返回给 ADBC Client,ADBC Client 会请求这个 Endpoint 对应的 BE 节点拉取数据。 + +如果查询只是简单的 Select 从 Doris 拉取数据,没有 Join、Sort、窗口函数等有数据 Shuffle 行为的算子,可以将查询按照 Tablet 粒度拆分,现在 Doris Spark/Flink Connector 就是用的这个方法实现并行读取数据,分为两个步骤: + 1. 执行 `explain sql`,FE 返回的查询计划中 ScanOperator 包含 Scan 的所有 Tablet ID List。 + 2. 依据上面的 Tablet ID List 将原始 SQL 拆分为多个 SQL,每个 SQL 只读取部分 Tablet,用法类似 `SELECT * FROM t1 TABLET(10001,10002) limit 1000;`,拆分后的多个 SQL 可以并行执行。参考 [Support select table sample](https://github.com/apache/doris/pull/10170)。 + +如果查询最外层是聚合,SQL 类似 `select k1, sum(k2) from xxx group by k1`,Doris v3.4 版本后,执行 `set enable_parallel_result_sink=true;` 后允许一个查询的每个 BE 节点独立返回查询结果,ADBC Client 收到 FE 返回的 Endpoint 列表后并行从多个 BE 节点拉取结果。不过注意当聚合结果很小时,多 BE 返回会增加 RPC 的压力。具体实现参考 [support parallel result sink](https://github.com/apache/doris/pull/36053)。理论上除了最外层是排序的查询,其他查询都可以支持每个 BE 节点并行返回结果,不过暂时没有这方便的需求,没有更进一步实现。 + +### 多 BE 共享同一个可供集群外部访问的 IP + +如果存在一个 Doris 集群,它的 FE 节点可以被集群外部访问,它的所有 BE 节点只可以被集群内部访问。这在使用 Mysql Client 和 JDBC 连接 Doris 执行查询是没问题的,查询结果将由 Doris FE 节点返回给。但使用 Arrow Flight SQL 连接 Doris 无法执行查询,因为 ADBC Client 需要连接 Doris BE 节点拉取查询结果,但 Doris BE 节点不允许被集群外部访问。 + +在生产环境中,很多时候不方便将 Doris BE 节点暴露到集群外。但可以为所有 Doris BE 节点增加了一层反向代理(比如 Nginx),集群外部的 Client 连接 Nginx 时会随机路由到一台 Doris BE 节点上。默认情况下,Arrow Flight SQL 查询结果会随机保存在一台 Doris BE 节点上,如果和 Nginx 随机路由的 Doris BE 节点不同,需要在 Doris BE 节点内部做一次数据转发。 + +自 Doris v2.1.8 开始,你可以在所有 Doris BE 节点的 `be.conf` 中将 `public_host` 和 `arrow_flight_sql_proxy_port` 配置成多 Doris BE 节点共享的可供集群外部访问的 IP 和端口,查询结果可以正确转发后返回 ADBC Client。 +```conf + public_host={nginx ip} + arrow_flight_sql_proxy_port={nginx port} +``` + ## FAQ -1. ARM 环境报错 `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`,如果 Linux 内核版本 <= 4.19.90,需要升级到 4.19.279 及以上,或者在低版本 Linux 内核的环境中重新编译 Doris BE,具体编译方法参考文档<docs/dev/install/source-install/compilation-arm> +1. 
Q: ARM 环境报错 `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`。 + + A: 如果 Linux 内核版本 <= 4.19.90,需要升级到 4.19.279 及以上,或者在低版本 Linux 内核的环境中重新编译 Doris BE,具体编译方法参考文档<docs/dev/install/source-install/compilation-arm> + + 问题原因:这是因为老版本 Linux 内核和 Arrow 存在兼容性问题,`cpp: arrow::RecordBatch::MakeEmpty()` 构造 Arrow Record Batch 时会卡住,导致 Doris BE 的 Arrow Flight Server 在 5000ms 内没有回应 Doris FE 的 Arrow Flight Server 的 RPC 请求,导致 FE 给 Client 返回 rpc timeout failed。Spark 和 Flink 读取 Doris 时也是将查询结果转换成 Arrow Record Batch 后返回,所以也存在同样的问题。 + + kylinv10 SP2 和 SP3 的 Linux 内核版本最高只有 4.19.90-24.4.v2101.ky10.aarch64,无法继续升级内核版本,只能在 kylinv10 上重新编译 Doris BE,如果使用新版本 ldb_toolchain 编译 Doris BE 后问题依然存在,可以尝试使用低版本 ldb_toolchain v0.17 编译,如果你的 ARM 环境无法连外网,华为云提供 ARM + kylinv10,阿里云提供 x86 + kylinv10 + +2. Q: prepared statement 传递参数报错。 + + A: 目前 `jdbc:arrow-flight-sql` 和 Java ADBC/JDBCDriver 不支持 prepared statement 传递参数,类似`select * from xxx where id=?`,将报错 `parameter ordinal 1 out of range`,这是 Arrow Flight SQL 的一个 BUG([GitHub Issue](https://github.com/apache/arrow/issues/40118))。 + +3. Q: 如何修改 `jdbc:arrow-flight-sql` 每次读取的批次大小,在某些场景下提升性能。 + + A: 通过修改`org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader`文件中`makeJdbcConfig`方法中的 `setTargetBatchSize`,默认是 1024,然后将修改后的文件保存到本地同名路径目录下,从而覆盖原文件生效。 + +4. Q: ADBC v0.10,JDBC 和 Java ADBC/JDBCDriver 不支持并行读取。 + + A: 没有实现`stmt.executePartitioned()`这个方法,只能使用原生的 FlightClient 实现并行读取多个 Endpoints, 使用方法`sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`,此外,ADBC V0.10 默认的 AdbcStatement 实际是 JdbcStatement,executeQuery 后将行存格式的 JDBC ResultSet 又重新转成的 Arrow 列存格式,预期到 ADBC 1.0.0 时 Java ADBC 将功能完善 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490)。 + +5. Q: 在 URL 中指定 database name。 + + A: 截止 Arrow v15.0,Arrow JDBC Connector 不支持在 URL 中指定 database name,比如 `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` 中指定连接`test` database 无效,只能手动执行 SQL `use database`。Arrow v18.0 支持了在 URL 中指定 database name,但实测仍有 BUG。 + +6. Q: Python ADBC print `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`。 + + A: 使用 Python 时忽略这个 Warning,这是 Python ADBC Client 的问题,不会影响查询。 + +7. Q: Python 报错 `grpc: received message larger than max (20748753 vs. 16777216)`。 + + A: 参考 [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) 在 Database Option 中增加 `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value`. + +8. Q: 报错 `invalid bearer token`。 + + A: 执行 `SET PROPERTY FOR 'root' 'max_user_connections' = '10000';` 修改当前用户的当前最大连接数到 10000;在 `fe.conf` 增加 qe_max_connection=30000 和 arrow_flight_token_cache_size=8000 并重启 FE。 + + ADBC Client 和 Arrow Flight Server 端之间的连接本质上是个长链接,需要在 Server 缓存 Auth Token、Connection、Session,连接创建后不会在单个查询结束时立即断开,需要 Client 发送 close() 请求后清理,但实际上 Client 经常不会发送 close 请求,所以 Auth Token、Connection、Session 会长时间在 Arrow Flight Server 上保存,默认会在 3 天后超时断开,或者在连接数超过 `arrow_flight_token_cache_size` 的限制后依据 LRU 淘汰。 + + 截止 Doris v2.1.8,Arrow Flight 连接和 Mysql/JDBC 连接使用相同的连接数限制,包括 FE 所有用户的总连接数 `qe_max_connection` 和单个用户的连接数 `UserProperty` 中的 `max_user_connections`。但默认的 `qe_max_connection` 和 `max_user_connections` 分别是 1024 和 100。Arrow Flight SQL 常用来取代使用 JDBC 的场景,但 JDBC 连接会在查询结束后立即释放,所以使用 Arrow Flight SQL 时,Doris 默认的连接数限制太小,经常导致连接数超过 `arrow_flight_token_cache_size` 的限制后将仍在被使用的连接淘汰。 + +9. 
Q: Java Arrow Flight SQL 读取 Datatime 类型返回时间戳,而不是格式化时间。 + + A: Java Arrow Flight SQL 读取 Datatime 类型需要自行转换时间戳,参考 [Add java parsing datetime type in arrow flight sql sample #48578](https://github.com/apache/doris/pull/48578)。用 Python Arrow Flight SQL 读取 Datatime 类型返回结果为 `2025-03-03 17:23:28Z`,而 Java Arrow Flight SQL 返回 `1740993808`。 + +10. Q: Java Arrow Flight JDBC Client 读取 Array 嵌套类型报错 `Configuration does not provide a mapping for array column 2`。 + + A: 参考 [`sample/arrow-flight-sql`](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/src/main/java/doris/arrowflight/demo/FlightAdbcDriver.java) 使用 JAVA ADBC Client。 + + Python ADBC Client、JAVA ADBC Client、Java JDBC DriverManager 读取 Array 嵌套类型都没问题,只有 Java Arrow Flight JDBC Client 有问题,实际上 Arrow Flight JDBC Client 的兼容性不好保证,不是 Arrow 官方开发的,由一个第三方数据库公司 Dremio 开发,之前还发现过其他兼容性问题,所以建议优先用 JAVA ADBC Client。 + +## Release Note + +> 自 Doris 2.1 开始支持 Arrow Flight SQL 协议,截止 Doris 2.1.9,以 Doris 2.1 系列版本为准列举修复的问题,Doris 3.0 系列版本自行对照。 + +### v2.1.9 + +1. 修复 Doris 数据序列化到 Arrow 的问题。 + [Fix UT DataTypeSerDeArrowTest of Array/Map/Struct/Bitmap/HLL/Decimal256 types](https://github.com/apache/doris/pull/48944) + - 读取 `Decimal256` 类型失败; + - 读取 `DatetimeV2` 类型微妙部分错误; + - 读取 `DateV2` 类型结果不正确; + - 读取 `IPV4/IPV6` 类型结果为 NULL 时报错; + +2. 修复 Doris Arrow Flight SQL 查询失败返回空结果,没有返回真实的错误信息。 + [Fix query result is empty and not return query error message](https://github.com/apache/doris/pull/45023) + +### v2.1.8 + +1. 支持 DBeaver 等 BI 工具使用 `arrow-flight-sql` 协议连接 Doris,支持正确显示元数据树。 + [Support arrow-flight-sql protocol getStreamCatalogs, getStreamSchemas, getStreamTables #46217](https://github.com/apache/doris/pull/46217)。 + +2. 支持多 BE 共享同一个可供集群外部访问的 IP 时,查询结果可以正确转发后返回 ADBC Client。 + [Arrow flight server supports data forwarding when BE uses public vip](https://github.com/apache/doris/pull/43281) + +3. 支持多个 Endpoint 并行读取。 + [Arrow Flight support multiple endpoints](https://github.com/apache/doris/pull/44286) + +4. 修复查询报错 `FE not found arrow flight schema`。 + [Fix FE not found arrow flight schema](https://github.com/apache/doris/pull/43960) + +5. 修复读取允许 NULL 的列报错 `BooleanBuilder::AppendValues`。 + [Fix Doris NULL column conversion to arrow batch](https://github.com/apache/doris/pull/43929) + +6. 修复 `show processlist` 显示重复的 Connection ID。 + [Fix arrow-flight-sql ConnectContext to use a unified ID #46284](https://github.com/apache/doris/pull/46284) + +### v2.1.7 + +1. 修复频繁打印日志 `Connection wait_timeout`。 + [Fix kill timeout FlightSqlConnection and FlightSqlConnectProcessor close](https://github.com/apache/doris/pull/41770) -问题原因:这是因为老版本 Linux 内核和 Arrow 存在兼容性问题,`cpp: arrow::RecordBatch::MakeEmpty()` 构造 Arrow Record Batch 时会卡住,导致 Doris BE 的 Arrow Flight Server 在 5000ms 内没有回应 Doris FE 的 Arrow Flight Server 的 RPC 请求,导致 FE 给 Client 返回 rpc timeout failed。Spark 和 Flink 读取 Doris 时也是将查询结果转换成 Arrow Record Batch 后返回,所以也存在同样的问题。 +2. 修复 Arrow Flight Bearer Token 过期后从 Cache 中淘汰。 + [Fix Arrow Flight bearer token cache evict after expired](https://github.com/apache/doris/pull/41754) -kylinv10 SP2 和 SP3 的 Linux 内核版本最高只有 4.19.90-24.4.v2101.ky10.aarch64,无法继续升级内核版本,只能在 kylinv10 上重新编译 Doris BE,如果使用新版本 ldb_toolchain 编译 Doris BE 后问题依然存在,可以尝试使用低版本 ldb_toolchain v0.17 编译,如果你的 ARM 环境无法连外网,华为云提供 ARM + kylinv10,阿里云提供 x86 + kylinv10 +### v2.1.6 -2. 目前 `jdbc:arrow-flight-sql` 和 Java ADBC/JDBCDriver 不支持 prepared statement 传递参数,类似`select * from xxx where id=?`,将报错 `parameter ordinal 1 out of range`,这是 Arrow Flight SQL 的一个 BUG([GitHub Issue](https://github.com/apache/arrow/issues/40118)) +1. 
修复查询报错 `0.0.0.0:xxx, connection refused`。 + [Fix return result from FE Arrow Flight server error 0.0.0.0:xxx, connection refused](https://github.com/apache/doris/pull/40002) -3. 修改 `jdbc:arrow-flight-sql` 每次读取的批次大小,在某些场景下可以提升性能,通过修改`org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader`文件中`makeJdbcConfig`方法中的 `setTargetBatchSize`,默认是 1024,然后将修改后的文件保存到本地同名路径目录下,从而覆盖原文件生效。 +2. 修复查询报错 `Reach limit of connections`。 + [Fix exceed user property max connection cause Reach limit of connections #39127](https://github.com/apache/doris/pull/39127) + + 之前的版本执行 `SET PROPERTY FOR 'root' 'max_user_connections' = '1024';` 修改当前用户的当前最大连接数到 1024,可临时规避。 -4. ADBC v0.10,JDBC 和 Java ADBC/JDBCDriver 还不支持并行读取,没有实现`stmt.executePartitioned()`这个方法,只能使用原生的 FlightClient 实现并行读取多个 Endpoints, 使用方法`sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`,此外,ADBC V0.10 默认的 AdbcStatement 实际是 JdbcStatement,executeQuery 后将行存格式的 JDBC ResultSet 又重新转成的 Arrow 列存格式,预期到 ADBC 1.0.0 时 Java ADBC 将功能完善 [GitHub Issue](https://github.com/apache/arrow-adbc/issues/1490)。 + 因为之前的版本只限制 Arrow Flight 连接数小于 `qe_max_connection/2`,`qe_max_connection` 是 fe 所有用户的总连接数,默认 1024,没有限制单个用户的 Arrow Flight 连接数小于 `UserProperty` 中的 `max_user_connections`,默认 100,所以当 Arrow Flight 连接数超过当前用户连接数上限时将报错 `Reach limit of connections`,所以需调大当前用户的 `max_user_connections`。 + + 问题详情见:[Questions](https://ask.selectdb.com/questions/D18b1/2-1-4-ban-ben-python-shi-yong-arrow-flight-sql-lian-jie-bu-hui-duan-kai-lian-jie-shu-zhan-man-da-dao-100/E1ic1?commentId=10070000000005324) -5. 截止 Arrow v15.0,Arrow JDBC Connector 不支持在 URL 中指定 database name,比如 `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` 中指定连接`test` database 无效,只能手动执行 SQL `use database`。 +3. 增加 Conf `arrow_flight_result_sink_buffer_size_rows`,支持修改单次返回的查询结果 ArrowBatch 大小,默认 4096 * 8。 + [Add config arrow_flight_result_sink_buffer_size_rows](https://github.com/apache/doris/pull/38221) -6. Doris 2.1.4 version 存在一个 Bug,读取大数据量时有几率报错,在 [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) 这个 pr 修复,升级 Doris 2.1.5 version 可以解决。问题详情见:[Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) +### v2.1.5 -7. `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant` 使用 Python 时忽略这个 Warning,这是 Python ADBC Client 的问题,这不会影响查询。 +1. 修复Arrow Flight SQL 查询结果为空。 + [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) -8. Python 报错 `grpc: received message larger than max (20748753 vs. 16777216)`,参考 [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) 在 Database Option 中增加 `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value`. + Doris v2.1.4 读取大数据量时有几率报错,问题详情见:[Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) -9. 
Doris version 2.1.7 版本之前,报错 `Reach limit of connections`,这是因为没有限制单个用户的 Arrow Flight 连接数小于 `UserProperty` 中的 `max_user_connections`,默认 100,可以通过 `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';` 修改 Billie 用户的当前最大连接数到 100,或者在 `fe.conf` 中增加 `arrow_flight_token_cache_size=50` 来限制整体的 Arrow Flight 连接数。Doris version 2.1.7 版本之前 Arrow Flight 连接默认 3 天 超时断开,只强制连接数小于 `qe_max_connection/2`,超过时依据 lru 淘汰,`qe_max_connection` 是 fe 所有用户的总连接数,默认 1024。具体可以看 `arrow_flight_token_cache_size` 这个 co [...] +### v2.1.4 及之前的版本 Doris Arrow Flight 不够完善,建议升级后使用。 diff --git a/versioned_docs/version-2.1/db-connect/arrow-flight-sql-connect.md b/versioned_docs/version-2.1/db-connect/arrow-flight-sql-connect.md index a122ff16b8d..b2a10f63c73 100644 --- a/versioned_docs/version-2.1/db-connect/arrow-flight-sql-connect.md +++ b/versioned_docs/version-2.1/db-connect/arrow-flight-sql-connect.md @@ -451,7 +451,9 @@ try ( ### Choice of Jdbc and Java connection methods -Compared with the traditional `jdbc:mysql` connection method, the performance test of the Arrow Flight SQL connection method of Jdbc and Java can be found at [GitHub Issue](https://github.com/apache/doris/issues/25514). Here are some usage suggestions based on the test conclusions. +[JDBC/Java Arrow Flight SQL Sample](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/README.md) is a JDBC/Java demo using Arrow FLight SQL. You can use it to test various connection methods for sending queries to Arrow Flight Server, helping you understand how to use Arrow FLight SQL and test performance. Implemented in [Add Arrow Flight Sql demo for Java](https://github.com/apache/doris/pull/45306). + +Compared with the traditional `jdbc:mysql` connection method, the performance test of the Arrow Flight SQL connection method of Jdbc and Java can be found in Section 6.2 of [GitHub Issue](https://github.com/apache/doris/issues/25514). Here are some usage suggestions based on the test conclusions. 1. For the above three Java Arrow Flight SQL connection methods, if the subsequent data analysis will be based on the row-based data format, then use jdbc:arrow-flight-sql, which will return data in the JDBC ResultSet format; if the subsequent data analysis can be based on the Arrow format or other column-based data formats, then use Flight AdbcDriver or Flight JdbcDriver to directly return data in the Arrow format, which will avoid row-column conversion and use the characteristics of Ar [...] @@ -471,26 +473,152 @@ Arrow Flight currently has no official plan to support Spark and Flink ([GitHub The community previously referred to the open source [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connector) and used FlightClient in Spark to connect to Doris for testing. It was found that the data format conversion between Arrow and Doris Block is faster, which is 10 times the conversion speed between CSV format and Doris Block, and it has better support for complex types such as Map and Array. This is because the Arrow data format has a high compression rate and lo [...] +### Support BI tools + +Since Doris v2.1.8, BI tools such as DBeaver are supported to connect to Doris using the `arrow-flight-sql` protocol. 
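+In such tools this generally amounts to registering the Arrow Flight SQL JDBC driver (class `org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver`, shipped in the `flight-sql-jdbc-driver` artifact) and pointing it at FE's `arrow_flight_sql_port` with a URL of the form `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}?useEncryption=false`. Treat this as a sketch only; follow the linked guides below for the exact DBeaver steps.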
For DBeaver's method of connecting to Doris using the `arrow-flight-sql` Driver, refer to: [how-to-use-jdbc-driver-with-dbeaver-client](https://www.dremio.com/blog/jdbc-driver-for-arrow-flight-sql/#h-how-to-use-jdbc-driver-with-dbeaver-client), [client-applications/clients/dbeaver/](https://docs.dremio.com/current/sonar/client-applications/clients/dbeaver/?_gl=1*1epgwh0*_ [...] + +## Extended Application + +### Multiple BEs return results in parallel + +Doris will aggregate the results of a query on all BE nodes to one BE node by default. In Mysql/JDBC queries, FE will request query results from this aggregated data node. In Arrow Flight SQL queries, FE will wrap the IP/Port of this node in the Endpoint and return it to ADBC Client. ADBC Client will request the BE node corresponding to this Endpoint to pull data. + +If the query is just a simple Select to pull data from Doris, without Join, Sort, Window Function and other operators with data Shuffle behavior, the query can be split according to Tablet granularity. Now Doris Spark/Flink Connector uses this method to implement parallel data reading, which is divided into two steps: +1. Execute `explain sql`, and the ScanOperator in the query plan returned by FE contains all Tablet ID Lists of Scan. +2. Split the original SQL into multiple SQLs based on the Tablet ID List above. Each SQL only reads part of the Tablet. The usage is similar to `SELECT * FROM t1 TABLET(10001,10002) limit 1000;`. The multiple SQLs after splitting can be executed in parallel. Refer to [Support select table sample](https://github.com/apache/doris/pull/10170). + +If the outermost layer of the query is aggregation, the SQL is similar to `select k1, sum(k2) from xxx group by k1`. After Doris v3.4, execute `set enable_parallel_result_sink=true;` to allow each BE node of a query to return query results independently. After receiving the Endpoint list returned by FE, ADBC Client pulls results from multiple BE nodes in parallel. However, please note that when the aggregation result is very small, returning multiple BEs will increase the pressure on R [...] + +### Multiple BEs share the same IP accessible from outside the cluster + +If there is a Doris cluster, its FE node can be accessed from outside the cluster, and all its BE nodes can only be accessed from inside the cluster. This is fine when using Mysql Client and JDBC to connect to Doris to execute queries, and the query results will be returned by the Doris FE node. However, using Arrow Flight SQL to connect to Doris cannot execute queries, because ADBC Client needs to connect to the Doris BE node to pull query results, but the Doris BE node is not allowed [...] + +In a production environment, it is often inconvenient to expose the Doris BE node outside the cluster. However, you can add a reverse proxy (such as Nginx) to all Doris BE nodes. When the client outside the cluster connects to Nginx, it will be randomly routed to a Doris BE node. By default, the Arrow Flight SQL query results will be randomly saved on a Doris BE node. If it is different from the Doris BE node randomly routed by Nginx, data forwarding is required within the Doris BE node. + +Starting from Doris v2.1.8, you can configure `public_host` and `arrow_flight_sql_proxy_port` in `be.conf` of all Doris BE nodes to the IP and port shared by multiple Doris BE nodes and accessible outside the cluster. The query results can be correctly forwarded and returned to the ADBC Client. 
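+Note that the Arrow Flight SQL port speaks gRPC (HTTP/2), so the reverse proxy has to pass gRPC traffic through (for Nginx, for example, a `grpc_pass` location or a plain TCP `stream` proxy; the exact proxy setup is an assumption of this example and depends on your environment). With the proxy in place, the `be.conf` settings below point Doris at it: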
+```conf +public_host={nginx ip} +arrow_flight_sql_proxy_port={nginx port} +``` + ## FAQ -1. ARM environment reports an error `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`. If the Linux kernel version is <= 4.19.90, you need to upgrade to 4.19.279 or above, or recompile Doris BE in the environment of the lower version of the Linux kernel. For specific compilation methods, refer to the document <docs/dev/install/source-install/compilation-arm> +1. Q: ARM environment reports an error `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`. + +A: If the Linux kernel version is <= 4.19.90, you need to upgrade to 4.19.279 or above, or recompile Doris BE in the environment of the lower version of the Linux kernel. For specific compilation methods, refer to the document <docs/dev/install/source-install/compilation-arm> + +Cause: This is because there is a compatibility issue between the old version of the Linux kernel and Arrow. `cpp: arrow::RecordBatch::MakeEmpty()` will get stuck when constructing Arrow Record Batch, causing Doris BE's Arrow Flight Server to fail to respond to Doris FE's Arrow Flight Server's RPC request within 5000ms, causing FE to return rpc timeout failed to Client. When Spark and Flink read Doris, they also convert the query results into Arrow Record Batch and return them, so the sa [...] + +The Linux kernel version of kylinv10 SP2 and SP3 is only 4.19.90-24.4.v2101.ky10.aarch64 at most. The kernel version cannot be upgraded further. Doris BE can only be recompiled on kylinv10. If the problem still exists after compiling Doris BE with the new version of ldb_toolchain, you can try to compile it with the lower version of ldb_toolchain v0.17. If your ARM environment cannot connect to the external network, Huawei Cloud provides ARM + kylinv10, and Alibaba Cloud provides x86 + kylinv10 + +2. Q: Prepared statement passes parameters and reports errors. + +A: Currently, `jdbc:arrow-flight-sql` and Java ADBC/JDBCDriver do not support prepared statement parameter passing. For example, `select * from xxx where id=?` will report an error `parameter ordinal 1 out of range`. This is a bug in Arrow Flight SQL ([GitHub Issue](https://github.com/apache/arrow/issues/40118)). + +3. Q: How to modify the batch size read by `jdbc:arrow-flight-sql` each time to improve performance in some scenarios. + +A: By modifying `setTargetBatchSize` in the `makeJdbcConfig` method in the `org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader` file, the default is 1024, and then saving the modified file to the local directory with the same path name, so as to overwrite the original file and take effect. + +4. Q: ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel reading. + +A: The `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to implement parallel reading of multiple endpoints, using the method `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQuery, the row-format JDBC ResultSet is converted back to the Arrow column format. I [...] + +5. Q: Specify the database name in the URL. + +A: As of Arrow v15.0, Arrow JDBC Connector does not support specifying database name in URL. 
For example, specifying connection to `test` database in `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` is invalid, and you can only execute SQL `use database` manually. Arrow v18.0 supports specifying database name in URL, but there are still bugs in actual testing. + +6. Q: Python ADBC prints `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`. + +A: Ignore this Warning when using Python. This is a problem with Python ADBC Client and will not affect queries. + +7. Q: Python reports an error `grpc: received message larger than max (20748753 vs. 16777216)`. + +A: Refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) and add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` in Database Option. + +8. Q: Error `invalid bearer token` is reported. + +A: Execute `SET PROPERTY FOR 'root' 'max_user_connections' = '10000';` to change the current maximum number of connections for the current user to 10000; add qe_max_connection=30000 and arrow_flight_token_cache_size=8000 in `fe.conf` and restart FE. + +The connection between the ADBC Client and the Arrow Flight Server is essentially a long link, which requires Auth Token, Connection, and Session to be cached on the Server. After the connection is created, it will not be disconnected immediately at the end of a single query. The Client needs to send a close() request to clean it up, but in fact, the Client often does not send a close request, so the Auth Token, Connection, and Session will be saved on the Arrow Flight Server for a lon [...] + +As of Doris v2.1.8, Arrow Flight connections and Mysql/JDBC connections use the same connection limit, including the total number of connections of all FE users `qe_max_connection` and the number of connections of a single user `max_user_connections` in `UserProperty`. But the default `qe_max_connection` and `max_user_connections` are 1024 and 100 respectively. Arrow Flight SQL is often used to replace JDBC scenarios, but the JDBC connection will be released immediately after the query e [...] + +9. Q: Java Arrow Flight SQL reads the Datatime type and returns a timestamp instead of the formatted time. + +A: Java Arrow Flight SQL needs to convert the timestamp by itself when reading the Datatime type. Refer to [Add java parsing datetime type in arrow flight sql sample #48578](https://github.com/apache/doris/pull/48578). Using Python Arrow Flight SQL to read the Datatime type returns `2025-03-03 17:23:28Z`, while Java Arrow Flight SQL returns `1740993808`. + +10. Q: Java Arrow Flight JDBC Client reports an error `Configuration does not provide a mapping for array column 2` when reading Array nested types. + +A: Refer to [`sample/arrow-flight-sql`](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/src/main/java/doris/arrowflight/demo/FlightAdbcDriver.java) to use JAVA ADBC Client. + +Python ADBC Client, JAVA ADBC Client, and Java JDBC DriverManager all have no problem reading Array nested types. Only Java Arrow Flight JDBC Client has problems. In fact, the compatibility of Arrow Flight JDBC Client is not guaranteed. It is not officially developed by Arrow, but by a third-party database company Dremio. Other compatibility issues have been found before, so it is recommended to use JAVA ADBC Client first. + +## Release Note + +> Arrow Flight SQL protocol is supported since Doris 2.1. 
As of Doris 2.1.9, the issues fixed are listed based on Doris 2.1 series versions. Doris 3.0 series versions are self-checked. + +### v2.1.9 + +1. Fix the problem of Doris data serialization to Arrow. +[Fix UT DataTypeSerDeArrowTest of Array/Map/Struct/Bitmap/HLL/Decimal256 types](https://github.com/apache/doris/pull/48944) +- Failed to read `Decimal256` type; +- Subtle error in reading `DatetimeV2` type; +- Incorrect result in reading `DateV2` type; +- Error when reading `IPV4/IPV6` type result is NULL; + +2. Fix the problem that Doris Arrow Flight SQL query fails and returns empty result, without returning real error information. +[Fix query result is empty and not return query error message](https://github.com/apache/doris/pull/45023) + +### v2.1.8 + +1. Support BI tools such as DBeaver to connect to Doris using the `arrow-flight-sql` protocol, and support the correct display of metadata trees. +[Support arrow-flight-sql protocol getStreamCatalogs, getStreamSchemas, getStreamTables #46217](https://github.com/apache/doris/pull/46217). + +2. When multiple BEs share the same IP that is accessible to the outside of the cluster, the query results can be correctly forwarded and returned to the ADBC Client. +[Arrow flight server supports data forwarding when BE uses public vip](https://github.com/apache/doris/pull/43281) + +3. Support multiple endpoints to read in parallel. +[Arrow Flight support multiple endpoints](https://github.com/apache/doris/pull/44286) + +4. Fix query error `FE not found arrow flight schema`. +[Fix FE not found arrow flight schema](https://github.com/apache/doris/pull/43960) + +5. Fix error `BooleanBuilder::AppendValues` when reading columns that allow NULL. +[Fix Doris NULL column conversion to arrow batch](https://github.com/apache/doris/pull/43929) + +6. Fix `show processlist` displays duplicate Connection IDs. +[Fix arrow-flight-sql ConnectContext to use a unified ID #46284](https://github.com/apache/doris/pull/46284) + +### v2.1.7 + +1. Fix frequent log printing `Connection wait_timeout`. +[Fix kill timeout FlightSqlConnection and FlightSqlConnectProcessor close](https://github.com/apache/doris/pull/41770) + +2. Fix Arrow Flight Bearer Token expiration from Cache. +[Fix Arrow Flight bearer token cache evict after expired](https://github.com/apache/doris/pull/41754) + +### v2.1.6 -Problem cause: This is because there is a compatibility issue between the old version of the Linux kernel and Arrow. `cpp: arrow::RecordBatch::MakeEmpty()` will get stuck when constructing Arrow Record Batch, causing Doris BE's Arrow Flight Server to fail to respond to Doris FE's Arrow Flight Server RPC request within 5000ms, causing FE to return rpc timeout failed to Client. When Spark and Flink read Doris, they also convert the query results into Arrow Record Batch and return them, so [...] +1. Fix query error `0.0.0.0:xxx, connection refused`. +[Fix return result from FE Arrow Flight server error 0.0.0.0:xxx, connection refused](https://github.com/apache/doris/pull/40002) -The Linux kernel version of kylinv10 SP2 and SP3 is only up to 4.19.90-24.4.v2101.ky10.aarch64. You cannot continue to upgrade the kernel version. You can only recompile Doris BE on kylinv10. If the problem still exists after compiling Doris BE with the new version of ldb_toolchain, you can try to compile it with the lower version of ldb_toolchain v0.17. If your ARM environment cannot connect to the Internet, Huawei Cloud provides ARM + kylinv10, and Alibaba Cloud provides x86 + kylinv10 +2. 
Fix query error `Reach limit of connections`. +[Fix exceed user property max connection cause Reach limit of connections #39127](https://github.com/apache/doris/pull/39127) -2. Currently, `jdbc:arrow-flight-sql` and Java ADBC/JDBCDriver do not support prepared statement passing parameters. For example, `select * from xxx where id=?` will report an error `parameter ordinal 1 out of range`. This is a bug in Arrow Flight SQL ([GitHub Issue](https://github.com/apache/arrow/issues/40118)) +In previous versions, execute `SET PROPERTY FOR 'root' 'max_user_connections' = '1024';` to modify the current maximum number of connections for the current user to 1024, which can be temporarily circumvented. -3. Modification `jdbc:arrow-flight-sql` The batch size of each read can improve performance in some scenarios. By modifying the `setTargetBatchSize` in the `makeJdbcConfig` method in the `org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader` file, the default is 1024, and then saving the modified file to the local directory with the same name, it will overwrite the original file to take effect. +Because the previous version only limits the number of Arrow Flight connections to less than `qe_max_connection/2`, `qe_max_connection` is the total number of connections for all fe users, the default is 1024, and does not limit the number of Arrow Flight connections for a single user to less than `max_user_connections` in `UserProperty`, the default is 100, so when the number of Arrow Flight connections exceeds the upper limit of the current user's connection number, an error `Reach lim [...] -4. ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel reading, and the `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to implement parallel reading of multiple Endpoints, using the method `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQue [...] +For details of the problem, see: [Questions](https://ask.selectdb.com/questions/D18b1/2-1-4-ban-ben-python-shi-yong-arrow-flight-sql-lian-jie-bu-hui-duan-kai-lian-jie-shu-zhan-man-da-dao-100/E1ic1?commentId=10070000000005324) -5. As of Arrow v15.0, Arrow JDBC Connector does not support specifying the database name in the URL. For example, `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` specifies that the connection to the `test` database is invalid. You can only execute the SQL `use database` manually. +3. Add Conf `arrow_flight_result_sink_buffer_size_rows` to support modifying the ArrowBatch size of query results returned in a single time, the default is 4096 * 8. +[Add config arrow_flight_result_sink_buffer_size_rows](https://github.com/apache/doris/pull/38221) -6. There is a bug in Doris 2.1.4 version. There is a chance of error when reading large amounts of data. This bug is fixed in [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) PR. Upgrading Doris 2.1.5 version can solve this problem. For details of the problem, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) +### v2.1.5 -7. `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant` Ignore this warning when using Python. 
This is a problem with the Python ADBC Client and will not affect the query. +1. Fix the problem that Arrow Flight SQL query results are empty. +[Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) -8. Python reports an error `grpc: received message larger than max (20748753 vs. 16777216)`. Refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) to add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` in Database Option. +Doris v2.1.4 may report an error when reading large amounts of data. For details, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) -9. Before Doris version 2.1.7, the error `Reach limit of connections` is reported. This is because there is no limit on the number of Arrow Flight connections for a single user, which is less than `max_user_connections` in `UserProperty`, which is 100 by default. You can modify the current maximum number of connections for Billie user to 100 by `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in `fe.conf` to limit the overall number [...] +### Doris Arrow Flight v2.1.4 and earlier versions are not perfect. It is recommended to upgrade before use. diff --git a/versioned_docs/version-3.0/db-connect/arrow-flight-sql-connect.md b/versioned_docs/version-3.0/db-connect/arrow-flight-sql-connect.md index a122ff16b8d..b2a10f63c73 100644 --- a/versioned_docs/version-3.0/db-connect/arrow-flight-sql-connect.md +++ b/versioned_docs/version-3.0/db-connect/arrow-flight-sql-connect.md @@ -451,7 +451,9 @@ try ( ### Choice of Jdbc and Java connection methods -Compared with the traditional `jdbc:mysql` connection method, the performance test of the Arrow Flight SQL connection method of Jdbc and Java can be found at [GitHub Issue](https://github.com/apache/doris/issues/25514). Here are some usage suggestions based on the test conclusions. +[JDBC/Java Arrow Flight SQL Sample](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/README.md) is a JDBC/Java demo using Arrow FLight SQL. You can use it to test various connection methods for sending queries to Arrow Flight Server, helping you understand how to use Arrow FLight SQL and test performance. Implemented in [Add Arrow Flight Sql demo for Java](https://github.com/apache/doris/pull/45306). + +Compared with the traditional `jdbc:mysql` connection method, the performance test of the Arrow Flight SQL connection method of Jdbc and Java can be found in Section 6.2 of [GitHub Issue](https://github.com/apache/doris/issues/25514). Here are some usage suggestions based on the test conclusions. 1. For the above three Java Arrow Flight SQL connection methods, if the subsequent data analysis will be based on the row-based data format, then use jdbc:arrow-flight-sql, which will return data in the JDBC ResultSet format; if the subsequent data analysis can be based on the Arrow format or other column-based data formats, then use Flight AdbcDriver or Flight JdbcDriver to directly return data in the Arrow format, which will avoid row-column conversion and use the characteristics of Ar [...] 
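+
+The same row-versus-columnar consideration applies outside Java. As a minimal Python sketch (host, port, credentials and the table name are placeholders, not values from this guide), the ADBC Flight SQL driver can hand the result back directly as an Arrow table, skipping the row-by-row ResultSet path:
+
+```python
+import adbc_driver_manager
+import adbc_driver_flightsql
+import adbc_driver_flightsql.dbapi as flight_sql
+
+# Connect to the FE Arrow Flight SQL port; user/password are placeholders.
+conn = flight_sql.connect(
+    uri="grpc://{FE_HOST}:{fe.conf:arrow_flight_sql_port}",
+    db_kwargs={
+        adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
+        adbc_driver_manager.DatabaseOptions.PASSWORD.value: "",
+        # Optional: raise the gRPC message size limit, see the FAQ below.
+        adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value: "104857600",
+    },
+)
+cursor = conn.cursor()
+cursor.execute("SELECT * FROM test_db.test_table LIMIT 1000;")  # hypothetical table
+table = cursor.fetch_arrow_table()  # columnar Arrow data, no row/column conversion
+print(table.num_rows, table.schema)
+cursor.close()
+conn.close()
+```
+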
@@ -471,26 +473,152 @@ Arrow Flight currently has no official plan to support Spark and Flink ([GitHub The community previously referred to the open source [Spark-Flight-Connector](https://github.com/qwshen/spark-flight-connector) and used FlightClient in Spark to connect to Doris for testing. It was found that the data format conversion between Arrow and Doris Block is faster, which is 10 times the conversion speed between CSV format and Doris Block, and it has better support for complex types such as Map and Array. This is because the Arrow data format has a high compression rate and lo [...] +### Support BI tools + +Since Doris v2.1.8, BI tools such as DBeaver are supported to connect to Doris using the `arrow-flight-sql` protocol. For DBeaver's method of connecting to Doris using the `arrow-flight-sql` Driver, refer to: [how-to-use-jdbc-driver-with-dbeaver-client](https://www.dremio.com/blog/jdbc-driver-for-arrow-flight-sql/#h-how-to-use-jdbc-driver-with-dbeaver-client), [client-applications/clients/dbeaver/](https://docs.dremio.com/current/sonar/client-applications/clients/dbeaver/?_gl=1*1epgwh0*_ [...] + +## Extended Application + +### Multiple BEs return results in parallel + +Doris will aggregate the results of a query on all BE nodes to one BE node by default. In Mysql/JDBC queries, FE will request query results from this aggregated data node. In Arrow Flight SQL queries, FE will wrap the IP/Port of this node in the Endpoint and return it to ADBC Client. ADBC Client will request the BE node corresponding to this Endpoint to pull data. + +If the query is just a simple Select to pull data from Doris, without Join, Sort, Window Function and other operators with data Shuffle behavior, the query can be split according to Tablet granularity. Now Doris Spark/Flink Connector uses this method to implement parallel data reading, which is divided into two steps: +1. Execute `explain sql`, and the ScanOperator in the query plan returned by FE contains all Tablet ID Lists of Scan. +2. Split the original SQL into multiple SQLs based on the Tablet ID List above. Each SQL only reads part of the Tablet. The usage is similar to `SELECT * FROM t1 TABLET(10001,10002) limit 1000;`. The multiple SQLs after splitting can be executed in parallel. Refer to [Support select table sample](https://github.com/apache/doris/pull/10170). + +If the outermost layer of the query is aggregation, the SQL is similar to `select k1, sum(k2) from xxx group by k1`. After Doris v3.4, execute `set enable_parallel_result_sink=true;` to allow each BE node of a query to return query results independently. After receiving the Endpoint list returned by FE, ADBC Client pulls results from multiple BE nodes in parallel. However, please note that when the aggregation result is very small, returning multiple BEs will increase the pressure on R [...] + +### Multiple BEs share the same IP accessible from outside the cluster + +If there is a Doris cluster, its FE node can be accessed from outside the cluster, and all its BE nodes can only be accessed from inside the cluster. This is fine when using Mysql Client and JDBC to connect to Doris to execute queries, and the query results will be returned by the Doris FE node. However, using Arrow Flight SQL to connect to Doris cannot execute queries, because ADBC Client needs to connect to the Doris BE node to pull query results, but the Doris BE node is not allowed [...] + +In a production environment, it is often inconvenient to expose the Doris BE node outside the cluster. 
However, you can add a reverse proxy (such as Nginx) to all Doris BE nodes. When the client outside the cluster connects to Nginx, it will be randomly routed to a Doris BE node. By default, the Arrow Flight SQL query results will be randomly saved on a Doris BE node. If it is different from the Doris BE node randomly routed by Nginx, data forwarding is required within the Doris BE node. + +Starting from Doris v2.1.8, you can configure `public_host` and `arrow_flight_sql_proxy_port` in `be.conf` of all Doris BE nodes to the IP and port shared by multiple Doris BE nodes and accessible outside the cluster. The query results can be correctly forwarded and returned to the ADBC Client. +```conf +public_host={nginx ip} +arrow_flight_sql_proxy_port={nginx port} +``` + ## FAQ -1. ARM environment reports an error `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`. If the Linux kernel version is <= 4.19.90, you need to upgrade to 4.19.279 or above, or recompile Doris BE in the environment of the lower version of the Linux kernel. For specific compilation methods, refer to the document <docs/dev/install/source-install/compilation-arm> +1. Q: ARM environment reports an error `get flight info statement failed, arrow flight schema timeout, TimeoutException: Waited 5000 milliseconds for io.grpc.stub.Client`. + +A: If the Linux kernel version is <= 4.19.90, you need to upgrade to 4.19.279 or above, or recompile Doris BE in the environment of the lower version of the Linux kernel. For specific compilation methods, refer to the document <docs/dev/install/source-install/compilation-arm> + +Cause: This is because there is a compatibility issue between the old version of the Linux kernel and Arrow. `cpp: arrow::RecordBatch::MakeEmpty()` will get stuck when constructing Arrow Record Batch, causing Doris BE's Arrow Flight Server to fail to respond to Doris FE's Arrow Flight Server's RPC request within 5000ms, causing FE to return rpc timeout failed to Client. When Spark and Flink read Doris, they also convert the query results into Arrow Record Batch and return them, so the sa [...] + +The Linux kernel version of kylinv10 SP2 and SP3 is only 4.19.90-24.4.v2101.ky10.aarch64 at most. The kernel version cannot be upgraded further. Doris BE can only be recompiled on kylinv10. If the problem still exists after compiling Doris BE with the new version of ldb_toolchain, you can try to compile it with the lower version of ldb_toolchain v0.17. If your ARM environment cannot connect to the external network, Huawei Cloud provides ARM + kylinv10, and Alibaba Cloud provides x86 + kylinv10 + +2. Q: Prepared statement passes parameters and reports errors. + +A: Currently, `jdbc:arrow-flight-sql` and Java ADBC/JDBCDriver do not support prepared statement parameter passing. For example, `select * from xxx where id=?` will report an error `parameter ordinal 1 out of range`. This is a bug in Arrow Flight SQL ([GitHub Issue](https://github.com/apache/arrow/issues/40118)). + +3. Q: How to modify the batch size read by `jdbc:arrow-flight-sql` each time to improve performance in some scenarios. + +A: By modifying `setTargetBatchSize` in the `makeJdbcConfig` method in the `org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader` file, the default is 1024, and then saving the modified file to the local directory with the same path name, so as to overwrite the original file and take effect. + +4. 
Q: ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel reading. + +A: The `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to implement parallel reading of multiple endpoints, using the method `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQuery, the row-format JDBC ResultSet is converted back to the Arrow column format. I [...] + +5. Q: Specify the database name in the URL. + +A: As of Arrow v15.0, Arrow JDBC Connector does not support specifying database name in URL. For example, specifying connection to `test` database in `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` is invalid, and you can only execute SQL `use database` manually. Arrow v18.0 supports specifying database name in URL, but there are still bugs in actual testing. + +6. Q: Python ADBC prints `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant`. + +A: Ignore this Warning when using Python. This is a problem with Python ADBC Client and will not affect queries. + +7. Q: Python reports an error `grpc: received message larger than max (20748753 vs. 16777216)`. + +A: Refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) and add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` in Database Option. + +8. Q: Error `invalid bearer token` is reported. + +A: Execute `SET PROPERTY FOR 'root' 'max_user_connections' = '10000';` to change the current maximum number of connections for the current user to 10000; add qe_max_connection=30000 and arrow_flight_token_cache_size=8000 in `fe.conf` and restart FE. + +The connection between the ADBC Client and the Arrow Flight Server is essentially a long link, which requires Auth Token, Connection, and Session to be cached on the Server. After the connection is created, it will not be disconnected immediately at the end of a single query. The Client needs to send a close() request to clean it up, but in fact, the Client often does not send a close request, so the Auth Token, Connection, and Session will be saved on the Arrow Flight Server for a lon [...] + +As of Doris v2.1.8, Arrow Flight connections and Mysql/JDBC connections use the same connection limit, including the total number of connections of all FE users `qe_max_connection` and the number of connections of a single user `max_user_connections` in `UserProperty`. But the default `qe_max_connection` and `max_user_connections` are 1024 and 100 respectively. Arrow Flight SQL is often used to replace JDBC scenarios, but the JDBC connection will be released immediately after the query e [...] + +9. Q: Java Arrow Flight SQL reads the Datatime type and returns a timestamp instead of the formatted time. + +A: Java Arrow Flight SQL needs to convert the timestamp by itself when reading the Datatime type. Refer to [Add java parsing datetime type in arrow flight sql sample #48578](https://github.com/apache/doris/pull/48578). Using Python Arrow Flight SQL to read the Datatime type returns `2025-03-03 17:23:28Z`, while Java Arrow Flight SQL returns `1740993808`. + +10. Q: Java Arrow Flight JDBC Client reports an error `Configuration does not provide a mapping for array column 2` when reading Array nested types. 
+ +A: Refer to [`sample/arrow-flight-sql`](https://github.com/apache/doris/blob/master/samples/arrow-flight-sql/java/src/main/java/doris/arrowflight/demo/FlightAdbcDriver.java) to use JAVA ADBC Client. + +Python ADBC Client, JAVA ADBC Client, and Java JDBC DriverManager all have no problem reading Array nested types. Only Java Arrow Flight JDBC Client has problems. In fact, the compatibility of Arrow Flight JDBC Client is not guaranteed. It is not officially developed by Arrow, but by a third-party database company Dremio. Other compatibility issues have been found before, so it is recommended to use JAVA ADBC Client first. + +## Release Note + +> Arrow Flight SQL protocol is supported since Doris 2.1. As of Doris 2.1.9, the issues fixed are listed based on Doris 2.1 series versions. Doris 3.0 series versions are self-checked. + +### v2.1.9 + +1. Fix the problem of Doris data serialization to Arrow. +[Fix UT DataTypeSerDeArrowTest of Array/Map/Struct/Bitmap/HLL/Decimal256 types](https://github.com/apache/doris/pull/48944) +- Failed to read `Decimal256` type; +- Subtle error in reading `DatetimeV2` type; +- Incorrect result in reading `DateV2` type; +- Error when reading `IPV4/IPV6` type result is NULL; + +2. Fix the problem that Doris Arrow Flight SQL query fails and returns empty result, without returning real error information. +[Fix query result is empty and not return query error message](https://github.com/apache/doris/pull/45023) + +### v2.1.8 + +1. Support BI tools such as DBeaver to connect to Doris using the `arrow-flight-sql` protocol, and support the correct display of metadata trees. +[Support arrow-flight-sql protocol getStreamCatalogs, getStreamSchemas, getStreamTables #46217](https://github.com/apache/doris/pull/46217). + +2. When multiple BEs share the same IP that is accessible to the outside of the cluster, the query results can be correctly forwarded and returned to the ADBC Client. +[Arrow flight server supports data forwarding when BE uses public vip](https://github.com/apache/doris/pull/43281) + +3. Support multiple endpoints to read in parallel. +[Arrow Flight support multiple endpoints](https://github.com/apache/doris/pull/44286) + +4. Fix query error `FE not found arrow flight schema`. +[Fix FE not found arrow flight schema](https://github.com/apache/doris/pull/43960) + +5. Fix error `BooleanBuilder::AppendValues` when reading columns that allow NULL. +[Fix Doris NULL column conversion to arrow batch](https://github.com/apache/doris/pull/43929) + +6. Fix `show processlist` displays duplicate Connection IDs. +[Fix arrow-flight-sql ConnectContext to use a unified ID #46284](https://github.com/apache/doris/pull/46284) + +### v2.1.7 + +1. Fix frequent log printing `Connection wait_timeout`. +[Fix kill timeout FlightSqlConnection and FlightSqlConnectProcessor close](https://github.com/apache/doris/pull/41770) + +2. Fix Arrow Flight Bearer Token expiration from Cache. +[Fix Arrow Flight bearer token cache evict after expired](https://github.com/apache/doris/pull/41754) + +### v2.1.6 -Problem cause: This is because there is a compatibility issue between the old version of the Linux kernel and Arrow. `cpp: arrow::RecordBatch::MakeEmpty()` will get stuck when constructing Arrow Record Batch, causing Doris BE's Arrow Flight Server to fail to respond to Doris FE's Arrow Flight Server RPC request within 5000ms, causing FE to return rpc timeout failed to Client. 
When Spark and Flink read Doris, they also convert the query results into Arrow Record Batch and return them, so [...] +1. Fix query error `0.0.0.0:xxx, connection refused`. +[Fix return result from FE Arrow Flight server error 0.0.0.0:xxx, connection refused](https://github.com/apache/doris/pull/40002) -The Linux kernel version of kylinv10 SP2 and SP3 is only up to 4.19.90-24.4.v2101.ky10.aarch64. You cannot continue to upgrade the kernel version. You can only recompile Doris BE on kylinv10. If the problem still exists after compiling Doris BE with the new version of ldb_toolchain, you can try to compile it with the lower version of ldb_toolchain v0.17. If your ARM environment cannot connect to the Internet, Huawei Cloud provides ARM + kylinv10, and Alibaba Cloud provides x86 + kylinv10 +2. Fix query error `Reach limit of connections`. +[Fix exceed user property max connection cause Reach limit of connections #39127](https://github.com/apache/doris/pull/39127) -2. Currently, `jdbc:arrow-flight-sql` and Java ADBC/JDBCDriver do not support prepared statement passing parameters. For example, `select * from xxx where id=?` will report an error `parameter ordinal 1 out of range`. This is a bug in Arrow Flight SQL ([GitHub Issue](https://github.com/apache/arrow/issues/40118)) +In previous versions, execute `SET PROPERTY FOR 'root' 'max_user_connections' = '1024';` to modify the current maximum number of connections for the current user to 1024, which can be temporarily circumvented. -3. Modification `jdbc:arrow-flight-sql` The batch size of each read can improve performance in some scenarios. By modifying the `setTargetBatchSize` in the `makeJdbcConfig` method in the `org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader` file, the default is 1024, and then saving the modified file to the local directory with the same name, it will overwrite the original file to take effect. +Because the previous version only limits the number of Arrow Flight connections to less than `qe_max_connection/2`, `qe_max_connection` is the total number of connections for all fe users, the default is 1024, and does not limit the number of Arrow Flight connections for a single user to less than `max_user_connections` in `UserProperty`, the default is 100, so when the number of Arrow Flight connections exceeds the upper limit of the current user's connection number, an error `Reach lim [...] -4. ADBC v0.10, JDBC and Java ADBC/JDBCDriver do not support parallel reading, and the `stmt.executePartitioned()` method is not implemented. You can only use the native FlightClient to implement parallel reading of multiple Endpoints, using the method `sqlClient=new FlightSqlClient, execute=sqlClient.execute(sql), endpoints=execute.getEndpoints(), for(FlightEndpoint endpoint: endpoints)`. In addition, the default AdbcStatement of ADBC V0.10 is actually JdbcStatement. After executeQue [...] +For details of the problem, see: [Questions](https://ask.selectdb.com/questions/D18b1/2-1-4-ban-ben-python-shi-yong-arrow-flight-sql-lian-jie-bu-hui-duan-kai-lian-jie-shu-zhan-man-da-dao-100/E1ic1?commentId=10070000000005324) -5. As of Arrow v15.0, Arrow JDBC Connector does not support specifying the database name in the URL. For example, `jdbc:arrow-flight-sql://{FE_HOST}:{fe.conf:arrow_flight_sql_port}/test?useServerPrepStmts=false` specifies that the connection to the `test` database is invalid. You can only execute the SQL `use database` manually. +3. 
Add Conf `arrow_flight_result_sink_buffer_size_rows` to support modifying the ArrowBatch size of query results returned in a single time, the default is 4096 * 8. +[Add config arrow_flight_result_sink_buffer_size_rows](https://github.com/apache/doris/pull/38221) -6. There is a bug in Doris 2.1.4 version. There is a chance of error when reading large amounts of data. This bug is fixed in [Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) PR. Upgrading Doris 2.1.5 version can solve this problem. For details of the problem, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) +### v2.1.5 -7. `Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant` Ignore this warning when using Python. This is a problem with the Python ADBC Client and will not affect the query. +1. Fix the problem that Arrow Flight SQL query results are empty. +[Fix arrow flight result sink #36827](https://github.com/apache/doris/pull/36827) -8. Python reports an error `grpc: received message larger than max (20748753 vs. 16777216)`. Refer to [Python: grpc: received message larger than max (20748753 vs. 16777216) #2078](https://github.com/apache/arrow-adbc/issues/2078) to add `adbc_driver_flightsql.DatabaseOptions.WITH_MAX_MSG_SIZE.value` in Database Option. +Doris v2.1.4 may report an error when reading large amounts of data. For details, see: [Questions](https://ask.selectdb.com/questions/D1Ia1/arrow-flight-sql-shi-yong-python-de-adbc-driver-lian-jie-doris-zhi-xing-cha-xun-sql-du-qu-bu-dao-shu-ju) -9. Before Doris version 2.1.7, the error `Reach limit of connections` is reported. This is because there is no limit on the number of Arrow Flight connections for a single user, which is less than `max_user_connections` in `UserProperty`, which is 100 by default. You can modify the current maximum number of connections for Billie user to 100 by `SET PROPERTY FOR 'Billie' 'max_user_connections' = '1000';`, or add `arrow_flight_token_cache_size=50` in `fe.conf` to limit the overall number [...] +### Doris Arrow Flight v2.1.4 and earlier versions are not perfect. It is recommended to upgrade before use. --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org