GitHub user andr-sokolov edited a comment on the discussion: [Proposal] Iceberg 
subsystem for datalake_fdw — design proposal

**Comparing the performance of Greenplum, StarRocks and Trino when reading 
Iceberg tables**

_Greenplum cluster for the test_

The cluster consists of 5 hosts: a master, a standby and 3 segment hosts. Each 
host has 4 CPU cores and 16 GB of RAM. Each segment host runs 4 primaries, one 
primary per CPU core. The same hardware was used to run the TPC-H queries on 
Trino and StarRocks. I used TEA (https://github.com/lithium-tech/tea) to read 
Iceberg tables from Greenplum 6.

_Results_

<img width="1280" height="625" alt="1" 
src="https://github-production-user-asset-6210df.s3.amazonaws.com/105369428/589551651-ba2a6703-22c0-409f-959f-47e87f58a930.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20260508%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20260508T125610Z&X-Amz-Expires=300&X-Amz-Signature=6cd32f47024864cfc22d6fdae80419befaeb4d093835d1a47c73f81cec1cf975&X-Amz-SignedHeaders=host&response-content-type=image%2Fjpeg"
 />


The horizontal axis shows the TPC-H query numbers, and the vertical axis shows 
the execution time in seconds. The exact numbers are in the attached 
[html](https://github.com/user-attachments/files/27520864/11.html) file. Dark 
red means that the query failed.

The `explain analyze verbose` output for each query is in [this 
file](https://github.com/user-attachments/files/27520800/explain.txt).
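For anyone who wants to collect comparable plans on their own cluster, here is a minimal sketch of one way to do it. It is not necessarily how the file above was produced. The helper names `explain_sql` and `collect_plans`, the `q*.sql` file naming and the database name are all assumptions, and it only handles files that contain a single statement (a TPC-H query that creates a view first would need extra handling).

```python
import subprocess
from pathlib import Path


def explain_sql(query_sql: str) -> str:
    """Prefix a single SQL statement with EXPLAIN ANALYZE VERBOSE."""
    body = query_sql.strip().rstrip(";")
    return f"EXPLAIN ANALYZE VERBOSE {body};"


def collect_plans(query_dir: Path, out_file: Path, dbname: str) -> None:
    """Run each q*.sql file (assumed naming) through psql and save the plans."""
    with out_file.open("w") as out:
        for path in sorted(query_dir.glob("q*.sql")):
            result = subprocess.run(
                # -X skips ~/.psqlrc, -c runs one command and exits
                ["psql", "-X", "-d", dbname, "-c", explain_sql(path.read_text())],
                capture_output=True,
                text=True,
            )
            # Keep stderr as well, so failed queries are visible in the output
            out.write(f"-- {path.name}\n{result.stdout}{result.stderr}\n")
```

Usage would look like `collect_plans(Path("queries"), Path("explain.txt"), "tpch")`, where the directory and database names are placeholders.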

Greenplum was faster only on the q06 query; the other queries ran 
significantly slower.

GitHub link: 
https://github.com/apache/cloudberry/discussions/1683#discussioncomment-16852933


