> On Apr 5, 2025, at 11:43 AM, Joshua DeWeese <josh.dewe...@gmail.com> wrote:
>
> Hi,
> I was wondering about the possibility of adding, as a feature to make,
> the addition of a standard makefile fragments library.

On a related note, I released a library called "make-booster":
https://github.com/david-a-wheeler/make-booster

It provides additional useful facilities, e.g., added support for
large data pipelines & Python. Below is a summary.

It might be useful if make pointed to some of these support systems,
or made it easier to download/install them.

--- David A. Wheeler

==== INFO ABOUT MAKE-BOOSTER ====

Make-booster

This project (contained in this directory and below) provides utility
routines intended to greatly simplify data processing (particularly a
data pipeline) using GNU make. It includes some mechanisms specifically
to help Python, as well as general-purpose mechanisms that can be useful
in any system. In particular, it helps reliably reproduce results, and
it automatically determines what needs to run and runs only that
(producing a significant speedup in most cases).

Specific capabilities

In particular:

• It provides mechanisms to ensure that if a Python script is modified
  (including one that is transitively included by other Python scripts),
  or its internal inputs are modified, all the processes that depend on
  that script (or internal inputs) are rerun. This dependency
  calculation for Python scripts is done automatically by a tool
  included in this package.
• It provides general-purpose mechanisms to help do the same for other
  programming languages.
• By default it enables "Delete on Error" to avoid accidentally
  including corrupted data in final results.
• It supports "grouped targets" to correctly handle processes that
  generate multiple files, without requiring GNU make version 4.3 or
  later.
• It automatically runs tests as appropriate if some file is changed,
  but only if the test could change its results (by examining transitive
  dependencies). We include default mechanisms for doing that in Python,
  and hooks to support other languages.
• It will run source code scans as appropriate if a file is changed.
  It includes defaults to do that in Python and shell, and hooks to do
  that with other languages.

For example, imagine that Python file BBB.py says import CC, and file
CC.py reads from file F.txt (and CC.py declares its INPUTS= as described
below). Now if you modify file F.txt or CC.py, any rule that runs BBB.py
will automatically be re-run in the correct order when you use make,
even if you didn't directly edit BBB.py.

In tests with over 1000 files the overhead for GNU make to figure out
"what to do" was only 0.07 seconds when there was nothing to do. The
first time you ever use it on a project there's some work for it to do
to record information, but that is a one-time cost and even that doesn't
take too long (depending on your project's size).

The approaches used here are not new to software development; people who
use compiled programming languages have used them for decades. However,
many people who use dynamic languages (like Python) to implement data
pipelines are unaware that these mechanisms exist, and we didn't find
ready-made mechanisms to do this for data processing pipelines. So this
is a small set of tools on top of GNU make to do the same thing for data
pipelines as is already done for some projects that use compiled
languages.
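
To make the BBB.py / CC.py / F.txt example concrete, here is a minimal Python sketch of a transitive dependency calculation. It is an illustration only, not make-booster's actual tool: the function names `direct_deps` and `transitive_deps`, the in-memory `sources` dictionary, and the assumption that `INPUTS` is a module-level list of string literals are all hypothetical.

```python
import ast


def direct_deps(source):
    """Return (imported module names, declared input files) for one script.

    Assumes INPUTS is a module-level list of string literals; that format
    is a guess made for this sketch.
    """
    imports, inputs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imports.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            imports.add(node.module.split(".")[0])
        elif isinstance(node, ast.Assign):
            names = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if "INPUTS" in names:
                inputs.update(ast.literal_eval(node.value))
    return imports, inputs


def transitive_deps(script, sources):
    """Collect every file that `script` depends on, directly or indirectly.

    `sources` maps a file name to its source text (a stand-in for reading
    the project's files from disk).
    """
    seen, deps, stack = set(), set(), [script]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        deps.add(name)
        imports, inputs = direct_deps(sources[name])
        deps.update(inputs)
        for module in imports:
            path = module + ".py"
            if path in sources:  # skip stdlib and third-party modules
                stack.append(path)
    return deps


# Hypothetical project mirroring the example in the text.
sources = {
    "BBB.py": "import CC\nprint(CC.load())\n",
    "CC.py": "INPUTS = ['F.txt']\n"
             "def load():\n"
             "    return open(INPUTS[0]).read()\n",
}

# Prints a line that could serve as the prerequisite list of a make rule.
print("BBB.py depends on:", " ".join(sorted(transitive_deps("BBB.py", sources))))
```

A list like this is what lets make decide that a rule running BBB.py is out of date whenever CC.py or F.txt is newer than the rule's target, even though BBB.py itself is unchanged.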