Hi all,

It has been one year since Apache Cloudberry officially joined the Apache Incubator, a great time to look back on our journey so far!

Our Apache Cloudberry Roadmap was the first large open discussion we launched after entering the Incubator, aiming to gather community feedback and align on development priorities. Now, after a year of active collaboration and contributions from many community members, we’d like to share a brief recap of our progress.

1. Cherry-pick from Greenplum to Cloudberry (Highest Priority)

   This task is almost done. You can track the progress here:
   https://lists.apache.org/thread/bf4n0p6jt8x2wnsmgwqwmqqboy4kq0st

2. PostgreSQL Kernel Upgrade

   The kernel upgrade from PostgreSQL 14 to PostgreSQL 16 is in progress:
   https://lists.apache.org/thread/1b5sr96315txsvs1zg65vsd1n01kf0ql

3. Performance and Usability

   a) Support hybrid Row-Column storage, inspired by Partition Attributes Across (https://www.vldb.org/conf/2001/P169.pdf), which has the same write performance as AO tables and the same read performance as AOCS tables. We will also integrate the latest compression algorithms and encoding algorithms (such as dictionary encoding) into it.
      - This has been done, see https://github.com/apache/cloudberry/tree/main/contrib/pax_storage
   b) Refactor the Materialized view and query for external tables.
   c) Support parallel execution in ORCA.
      - See https://github.com/apache/cloudberry/pull/1398
   d) Parallel query optimization to support more SQL operators.
      - See https://github.com/apache/cloudberry/pull/1261

4. Availability Improvements

   a) Support hot (read-only) standby
      - See https://github.com/apache/cloudberry/pull/1268
   b) Robust resource group isolation
      - IO/CPU/Memory/Network
      - Already supported in the kernel

5. Functionality Improvements

   - pg_hint_plan for ORCA: done.

6. Streaming / Real-time

   a) Implementing the kafka_fdw extension to enable streaming data from Kafka to Cloudberry.
      - See https://github.com/cloudberry-contrib/kafka_fdw
   b) Integration with Flink CDC / Kafka connector to support near real-time data integration.
      - See Flink Connector JDBC: https://github.com/apache/flink-connector-jdbc/commit/544275c8c8b03426b71192b0dde39bc51c041bab
   c) Support Dynamic Tables.
      - See https://cloudberry.apache.org/docs/performance/use-dynamic-tables

7. Utilities and Ecosystem

   a) Cherry-pick the latest commits from the original Greenplum projects to Cloudberry, including cloudberry-pxf, cloudberry-gpbackup, cloudberry-gpbackup-s3-plugin, cloudberry-go-libs.
      - cloudberry-gpbackup: renamed to cloudberry-backup, and its codebase has been synced with Greenplum’s archived version: https://github.com/apache/cloudberry-backup
      - cloudberry-go-libs: synced with Greenplum’s archived version: https://github.com/apache/cloudberry-go-libs
      - cloudberry-gpbackup-s3-plugin: this repo has been archived and its core files have been merged into cloudberry-backup: https://github.com/apache/cloudberry-backup/tree/main/plugins/s3plugin
      - cloudberry-pxf: syncing the archived commits to Cloudberry is still in progress.
   b) Support PGRX to enable writing UDFs in Rust in Cloudberry.
      - See https://github.com/cloudberry-contrib/pgrx
   c) DBeaver for Cloudberry
      - DBeaver has supported Cloudberry since version 25.2.2: https://github.com/dbeaver/dbeaver/releases
   d) JDBC/ODBC for Cloudberry
      - The PostgreSQL JDBC/ODBC drivers can be used with Cloudberry.
   e) Integrations with other ASF projects
      - Apache SeaTunnel (done)
      - Apache MADlib (WIP)

8. Release Management

   a) First Apache release
      - The first Apache release can be downloaded here: https://cloudberry.apache.org/releases
   b) Release Process
      - Documented release procedures following Apache guidelines
        - See https://github.com/apache/cloudberry/wiki
      - Automated release preparation and verification tools
        - See https://github.com/apache/cloudberry/tree/main/devops/release
      - Release notes and migration guides for each version
      - Security vulnerability handling process
        - See https://github.com/apache/cloudberry/blob/main/SECURITY.md
   c) Pipelines
      - Introduce the new build, test, and deployment workflows for Cloudberry based on GitHub Actions and Docker.
        - See https://github.com/apache/cloudberry/tree/main/.github/workflows

9. Website, Documents, and Marketing

   We’ve also made steady progress on documentation, website content, and community outreach, strengthening Cloudberry’s visibility and engagement in both the Apache and PostgreSQL ecosystems.

Note: if anything has been missed, your comments are welcome.

~~~

This milestone would not have been possible without the help of our community contributors, mentors, and everyone who has been part of this journey. Thank you all for your continued support and contributions!

Let’s keep up the great work and make Cloudberry even better.

Best,
Dianjin Wang
