Seeking (P)PMC Interest for the "Debo" Project (Hadoop Cluster Management Tool)

Surafel Temesgen Mon, 16 Jun 2025 00:07:23 -0700

Hello everyone,

I’d like to briefly introduce a project I have submitted to the Apache
Incubator: *Debo*, a lightweight, open-source Hadoop cluster management
system inspired by Apache Ambari. You can find the initial discussion
thread here:
[1] https://lists.apache.org/thread/bp0rqvf4xvmdb3n1m2fzhwym95841s5v


As I continue engaging with the incubation process, I’ve learned that *an
ASF release needs at least three +1 votes from (P)PMC members*. At this
stage, I am actively working to improve the project by:

   - Refactoring and cleaning up the codebase,
   - Adding automated tests, including "make check"-style test coverage, and
   - Preparing an initial beta-quality release, which I expect to finalize
     within about a month, unless I discover any critical missing features
     during final testing.

With that in mind, I’d like to ask:
*Are there any active (P)PMC members who are interested in following or
supporting Debo and potentially participating in future release voting?*
Your guidance and involvement would be highly valuable as the project
advances.
------------------------------

*How Debo Works*

*Installation*
Debo installs Hadoop ecosystem components by downloading their binary
distributions from publicly available sources. However, these are *not
official Apache releases*, and their long-term availability cannot be
guaranteed.

I have also tested installation from source, but found that it can be
significantly slower on commodity hardware and often requires a full
development environment to be present on the user's machine. This adds
resource overhead and can leave build artifacts behind if cleanup is
incomplete. Given that Hadoop is designed to run on commodity hardware,
this approach isn't always ideal.

An alternative is to host pre-built binaries in a dedicated location and
configure Debo to download from there. However, that adds a dependency on
the availability of a maintained storage endpoint, which introduces
potential reliability risks.

Currently, Debo uses the publicly available binaries when possible. A *hybrid
approach* (using binaries when available and falling back to source builds
when necessary) may be the best long-term strategy. I’d appreciate any
community input or suggestions on how to align this with ASF best practices.
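
A minimal sketch of the hybrid approach described above, written in Python
purely for illustration; it is not Debo's actual implementation. The names
fetch_first_available, install_component, and build_from_source are
hypothetical, and the download step is injectable so the fallback logic can
be exercised without network access.

```python
import urllib.request
from typing import Callable, Iterable, Optional

# Hypothetical callable types used only in this sketch.
Downloader = Callable[[str, str], object]   # (url, dest) -> anything
Builder = Callable[[str, str], None]        # (component name, dest) -> None


def fetch_first_available(
    urls: Iterable[str], dest: str, download: Downloader
) -> Optional[str]:
    """Try each binary location in order; return the URL that worked, or None."""
    for url in urls:
        try:
            download(url, dest)
            return url
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses,
            # so an unreachable or missing mirror ends up here.
            continue
    return None


def install_component(
    name: str,
    binary_urls: Iterable[str],
    dest: str,
    build_from_source: Builder,
    download: Downloader = urllib.request.urlretrieve,
) -> str:
    """Prefer a pre-built binary; fall back to a source build if none is reachable."""
    if fetch_first_available(binary_urls, dest, download) is not None:
        return "binary"
    build_from_source(name, dest)
    return "source"
```

The sketch leaves out checksum or signature verification of the downloaded
archive, which a real installer would presumably perform before unpacking.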

*Reporting Functionality*
Debo includes a reporting module that provides a visual, user-friendly
overview of cluster health and performance. It does this by combining:

   - *Component-level statistics*, such as output from "hdfs dfsadmin -report",
     and
   - *System-level metrics*, like CPU, memory, and disk usage obtained from
     the operating system.

This combined data is parsed and displayed through a visually appealing
interface, enabling users to monitor both individual component behavior and
overall system status at a glance.
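
The combination of the two data sources can be sketched as follows. This is
an illustrative Python example, not Debo's reporting code: SAMPLE_REPORT is a
hand-written stand-in for the cluster summary that "hdfs dfsadmin -report"
prints (the exact fields vary by Hadoop version), and parse_dfs_summary,
leading_bytes, and cluster_overview are hypothetical names. Only disk usage
is read from the operating system here; CPU and memory would need
platform-specific sources.

```python
import shutil

# Hand-written stand-in for the summary section of the report; not captured
# from a real cluster.
SAMPLE_REPORT = """\
Configured Capacity: 1000 (1000 B)
DFS Used: 250 (250 B)
DFS Used%: 25.00%
-------------------------------------------------
Live datanodes (1):

Name: 127.0.0.1:9866 (localhost)
"""


def parse_dfs_summary(report: str) -> dict:
    """Collect 'Key: value' lines from the cluster summary, stopping at the datanode list."""
    summary = {}
    for raw in report.splitlines():
        line = raw.strip()
        if line.startswith(("Live datanodes", "Dead datanodes")):
            break  # per-datanode sections follow; this sketch ignores them
        key, sep, value = line.partition(":")
        if sep and value.strip():
            summary[key.strip()] = value.strip()
    return summary


def leading_bytes(value: str) -> int:
    """Extract the byte count from a value such as '1000 (1000 B)'."""
    return int(value.split()[0])


def cluster_overview(report: str, path: str = "/") -> dict:
    """Merge component-level HDFS figures with OS-level disk usage for one path."""
    dfs = parse_dfs_summary(report)
    disk = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return {
        "dfs_capacity_bytes": leading_bytes(dfs["Configured Capacity"]),
        "dfs_used_bytes": leading_bytes(dfs["DFS Used"]),
        "dfs_used_percent": dfs["DFS Used%"],
        "disk_total_bytes": disk.total,
        "disk_free_bytes": disk.free,
    }
```

In practice the report text would come from running the command on a cluster
node, and the resulting dictionary would feed whatever view layer renders the
overview.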
------------------------------

Thank you for your time and consideration. I’m looking forward to any
feedback or interest in supporting Debo's development and incubation
journey.

Best regards,
*Surafel Temesgen*
Proposer of the Debo Project
