Hi all,

As part of an Apache RAT audit of the Cloudberry (Incubating) codebase, I’ve been reviewing the repository layout and came across the following directory:
https://github.com/apache/cloudberry/tree/main/deploy

This directory appears to contain deployment scripts and configurations that seem tightly coupled to HashData’s internal infrastructure. Several files reference specific paths, services, or assumptions about the environment that don’t appear usable or reproducible by the broader open-source community.

A few open questions for discussion:

- Is the `deploy/` directory still in active use or maintained?
- Was it contributed primarily to support internal workflows, or is it intended to serve as a general deployment guide for the community?
- If it isn’t currently usable without access to internal resources, should we clarify that in the README or revisit how it’s presented?
- Would it be valuable to provide a community-accessible deployment path using standard open tools (e.g., Docker Compose, Helm charts, Terraform)?
- Does the current setup align with ASF expectations around transparency and community-driven development?

While it’s entirely understandable that early artifacts might reflect their original environment, continued reliance on internal infrastructure could make onboarding harder for external users, and may not reflect the Apache spirit of openness and independence.

I’m not proposing any specific changes yet, just raising this for context gathering and community input. Happy to follow up with a GitHub issue later if there’s interest in improving deployability for all users.

Thanks in advance for your thoughts.

Best,
-=e

--
Ed Espino
Apache Cloudberry (Incubating) & MADlib