Hi all,

It has been one year since Apache Cloudberry officially joined the
Apache Incubator — a great time to look back on our journey so far!

Our Apache Cloudberry Roadmap was the first large open discussion we
launched after entering the Incubator, aiming to gather community
feedback and align on development priorities. Now, after a year of
active collaboration and contributions from many community members,
we’d like to share a brief recap of our progress.

1. Cherry-pick from Greenplum to Cloudberry (Highest Priority)

This task is almost done. You can track the progress here:
https://lists.apache.org/thread/bf4n0p6jt8x2wnsmgwqwmqqboy4kq0st

2. PostgreSQL Kernel Upgrade

The kernel upgrade from PostgreSQL 14 to PostgreSQL 16 is in progress:
https://lists.apache.org/thread/1b5sr96315txsvs1zg65vsd1n01kf0ql

3. Performance and Usability

a) Support hybrid Row-Column storage, inspired by Partition Attributes
Across (https://www.vldb.org/conf/2001/P169.pdf), which has the same
write performance as AO tables and the same read performance as AOCS
tables. We will also integrate the latest compression algorithms and
encoding algorithms (such as dictionary encoding) into it.
  - This has been done, see
https://github.com/apache/cloudberry/tree/main/contrib/pax_storage
b) Refactor the Materialized view and query for external tables.
c) Support parallel execution in ORCA.
  - See https://github.com/apache/cloudberry/pull/1398
d) Parallel query optimization to support more SQL operators.
  - See https://github.com/apache/cloudberry/pull/1261
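
For readers new to the idea behind item a), here is a minimal sketch of
the Partition Attributes Across layout and of dictionary encoding. It
only illustrates the general technique from the paper and is not the
pax_storage implementation; the function names and the rows_per_page
parameter are hypothetical.

```python
from typing import Any, Dict, Hashable, List, Sequence, Tuple

Row = Tuple[Any, ...]


def rows_to_pax_pages(rows: Sequence[Row], rows_per_page: int) -> List[List[List[Any]]]:
    """Split rows into pages; inside each page keep one list per attribute.

    Rows stay together at page granularity (as in a row store), while
    values of the same attribute are adjacent inside the page (as in a
    column store). rows_per_page is a hypothetical tuning knob.
    """
    if rows_per_page <= 0:
        raise ValueError("rows_per_page must be positive")
    pages: List[List[List[Any]]] = []
    for start in range(0, len(rows), rows_per_page):
        chunk = rows[start:start + rows_per_page]
        # zip(*chunk) transposes this page's rows into per-attribute lists.
        pages.append([list(column) for column in zip(*chunk)])
    return pages


def dictionary_encode(values: Sequence[Hashable]) -> Tuple[List[Hashable], List[int]]:
    """Replace each value with an index into a list of distinct values."""
    dictionary: List[Hashable] = []
    position: Dict[Hashable, int] = {}
    codes: List[int] = []
    for value in values:
        if value not in position:
            position[value] = len(dictionary)
            dictionary.append(value)
        codes.append(position[value])
    return dictionary, codes


if __name__ == "__main__":
    sample = [(1, "a"), (2, "b"), (3, "a")]
    print(rows_to_pax_pages(sample, rows_per_page=2))
    print(dictionary_encode([row[1] for row in sample]))
```

A scan that needs a single attribute touches only that attribute's list
in each page, while an insert still lands in one page.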

4. Availability Improvements

a) Support hot (read-only) standby
  - See https://github.com/apache/cloudberry/pull/1268
b) Robust resource groups isolation - IO/CPU/Memory/Network
  - Already supported in the kernel

5. Functionality Improvements

* Pg_hint_plan for ORCA: done.

6. Streaming / Real-time

a) Implementing the kafka_fdw extension to enable streaming data from
Kafka to Cloudberry.
  - See https://github.com/cloudberry-contrib/kafka_fdw
b) Integration with Flink CDC / Kafka connector to support near
real-time data integration.
  - See
    - Flink Connector JDBC -
https://github.com/apache/flink-connector-jdbc/commit/544275c8c8b03426b71192b0dde39bc51c041bab
c) Support Dynamic Tables.
  - See https://cloudberry.apache.org/docs/performance/use-dynamic-tables

7. Utilities and Ecosystem

a) Cherry-pick the latest commits from the original Greenplum projects
to Cloudberry, including cloudberry-pxf, cloudberry-gpbackup,
cloudberry-gpbackup-s3-plugin, cloudberry-go-libs.
  - cloudberry-gpbackup has been renamed to cloudberry-backup, and its
codebase has been synced with Greenplum’s archived version:
https://github.com/apache/cloudberry-backup
  - cloudberry-go-libs: synced with Greenplum’s archived version:
https://github.com/apache/cloudberry-go-libs
  - cloudberry-gpbackup-s3-plugin: this repo has been archived and its
core files have been merged into cloudberry-backup:
https://github.com/apache/cloudberry-backup/tree/main/plugins/s3plugin
  - cloudberry-pxf: syncing the archived commits to Cloudberry is still
in progress.
b) Support PGRX to enable writing UDFs in Rust in Cloudberry.
  - See https://github.com/cloudberry-contrib/pgrx
c) DBeaver for Cloudberry
  - DBeaver has supported Cloudberry since version 25.2.2:
https://github.com/dbeaver/dbeaver/releases
d) JDBC/ODBC for Cloudberry
  - We can use the PostgreSQL JDBC/ODBC drivers for Cloudberry
e) Integrations with other ASF projects
  - Apache SeaTunnel (done)
  - Apache MADlib (WIP)
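
As a small illustration of item d), the sketch below builds a connection
URL in the format the PostgreSQL JDBC driver accepts
(jdbc:postgresql://host:port/database). The helper name is hypothetical,
and the default port of 5432 is an assumption; use whatever port your
Cloudberry coordinator listens on.

```python
from urllib.parse import quote


def postgres_jdbc_url(host: str, database: str, port: int = 5432) -> str:
    """Build a PostgreSQL JDBC URL; port 5432 is an assumed default."""
    # Percent-encode the database name so spaces and other special
    # characters do not break the URL.
    return f"jdbc:postgresql://{host}:{port}/{quote(database, safe='')}"


if __name__ == "__main__":
    print(postgres_jdbc_url("localhost", "postgres"))
```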

8. Release Management

a) First Apache release
  - The first Apache release can be downloaded here:
https://cloudberry.apache.org/releases

b) Release Process
  - Documented release procedures following Apache guidelines
    - See https://github.com/apache/cloudberry/wiki
  - Automated release preparation and verification tools
    - See https://github.com/apache/cloudberry/tree/main/devops/release
  - Release notes and migration guides for each version
  - Security vulnerability handling process
    - See https://github.com/apache/cloudberry/blob/main/SECURITY.md

c) Pipelines
  - Introduced new build, test, and deployment workflows for
Cloudberry based on GitHub Actions and Docker.
    - See https://github.com/apache/cloudberry/tree/main/.github/workflows

9. Website, Documents, and Marketing

We’ve also made steady progress on documentation, website content, and
community outreach, strengthening Cloudberry’s visibility and
engagement in both the Apache and PostgreSQL ecosystems.

Note: if anything is missing, your comments are welcome.

~~~

This milestone would not have been possible without the help of our
community contributors, mentors, and everyone who has been part of
this journey.

Thank you all for your continued support and contributions! Let’s keep
up the great work and make Cloudberry even better.


Best,
Dianjin Wang
