Hi all,

As part of an Apache RAT audit of the Cloudberry (Incubating) codebase,
I’ve been reviewing the repository layout and came across the following
directory:

https://github.com/apache/cloudberry/tree/main/deploy

This directory contains deployment scripts and configurations that appear
tightly coupled to HashData’s internal infrastructure. Several files
reference specific paths, services, or environment assumptions that don’t
appear usable or reproducible by the broader open-source community.

A few open questions for discussion:

- Is the `deploy/` directory still in active use or maintained?
- Was it contributed primarily to support internal workflows, or is it
  intended to serve as a general deployment guide for the community?
- If it isn’t currently usable without access to internal resources, should
  we clarify that in the README or revisit how it’s presented?
- Would it be valuable to provide a community-accessible deployment path
  using standard open tools (e.g., Docker Compose, Helm charts, Terraform)?
- Does the current setup align with ASF expectations around transparency
  and community-driven development?

While it’s entirely understandable that early artifacts might reflect their
original environment, continued reliance on internal infrastructure could
make onboarding harder for external users — and may not reflect the Apache
spirit of openness and independence.

I’m not proposing any specific changes yet — just raising this for context
gathering and community input. Happy to follow up with a GitHub issue later
if there’s interest in improving deployability for all users.

Thanks in advance for your thoughts.

Best,
-=e

-- 
Ed Espino
Apache Cloudberry (Incubating) & MADlib
