Hello everyone,

I’d like to briefly introduce a project I have submitted to the Apache Incubator: *Debo*, a lightweight, open-source Hadoop cluster management system inspired by Apache Ambari. You can find the initial discussion thread here:

[1] https://lists.apache.org/thread/bp0rqvf4xvmdb3n1m2fzhwym95841s5v

As I continue engaging with the incubation process, I’ve learned that *an ASF project requires a minimum of three active (P)PMC members to vote on releases*. At this stage, I am actively working to improve the project by:

- Refactoring and cleaning up the codebase,
- Adding automated tests, including make check-style test coverage, and
- Preparing an initial beta-quality release, which I expect to finalize within about a month, unless I discover any critical missing features during final testing.

With that in mind, I’d like to ask: *Are there any active (P)PMC members who are interested in following or supporting Debo and potentially participating in future release voting?*

Your guidance and involvement would be highly valuable as the project advances.

------------------------------

*How Debo Works*

*Installation*

Debo installs Hadoop ecosystem components by downloading their binary distributions from publicly available sources. However, these are *not official Apache releases*, and their long-term availability cannot be guaranteed.

I have also tested installation from source, but found that it can be significantly slower on commodity hardware and often requires a full development environment to be present on the user's machine. This adds resource overhead and can leave build artifacts behind when cleanup is incomplete. Given that Hadoop is intended to run on commodity hardware with limited resources, this approach isn't always ideal.

An alternative is to host pre-built binaries in a dedicated location and configure Debo to download from there. However, that makes installation depend on a storage endpoint that someone must keep maintained and available, which is a reliability risk of its own.

Currently, Debo uses the publicly available binaries when possible. A *hybrid approach*, using binaries when available and falling back to source builds when necessary, may be the best long-term strategy. I’d appreciate any community input or suggestions on how to align this with ASF best practices.
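
To make the hybrid approach easier to discuss, here is a minimal sketch of the fallback logic in Python. It is only an illustration and not Debo's actual code: the names `install_hybrid`, `BinaryUnavailable`, and `InstallResult`, as well as the `fetch_binary` and `build_from_source` callbacks, are hypothetical and assumed purely for this example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class BinaryUnavailable(Exception):
    """Raised by a fetcher when a binary cannot be downloaded from a source."""


@dataclass
class InstallResult:
    component: str
    method: str  # "binary" or "source"
    origin: str  # mirror URL that was used, or "local build"


def install_hybrid(
    component: str,
    mirrors: Sequence[str],
    fetch_binary: Callable[[str, str], None],
    build_from_source: Callable[[str], None],
) -> InstallResult:
    """Try each binary source in order; build from source only if all fail.

    All parameters are hypothetical: `fetch_binary(component, url)` is assumed
    to raise BinaryUnavailable on failure, and `build_from_source(component)`
    is assumed to perform a full local build.
    """
    for url in mirrors:
        try:
            fetch_binary(component, url)
            return InstallResult(component, "binary", url)
        except BinaryUnavailable:
            continue  # this source is down or lacks the artifact; try the next
    # No binary source worked, so fall back to the slower source build.
    build_from_source(component)
    return InstallResult(component, "source", "local build")
```

Passing the download and build steps in as callbacks keeps the decision logic separate from network and compiler details, so the same function could cover public binaries, a dedicated storage endpoint, or both in sequence.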

*Reporting Functionality*

Debo includes a reporting module that provides a visual, user-friendly overview of cluster health and performance. It does this by combining:

- *Component-level statistics*, such as output from hdfs dfsadmin -report, and
- *System-level metrics*, like CPU, memory, and disk usage obtained from the operating system.

This combined data is parsed and displayed through a visually appealing interface, enabling users to monitor both individual component behavior and overall system status at a glance.

------------------------------

Thank you for your time and consideration. I’m looking forward to any feedback or interest in supporting Debo's development and incubation journey.

Best regards,
*Surafel Temesgen*
Proposer of the Debo Project