Hello,

I could find only one thread in this mailing list's archives on this subject: deciding whether to rebuild based on a file's checksum instead of its timestamp. Other build systems such as SCons already support this, and it would be great to have it in GNU Make too.

I have attached a patch against the current CVS that adds a --use-checksum option to GNU Make. It is only a proof of concept, but it shows that this feature can considerably speed up a remake.

As you would expect, simply touching a file no longer causes it to be recompiled. A less obvious case: suppose you change only a comment in test.c. With standard make, the chain of rebuilds is:

    test.c -> test.o -> test

With a checksum you get only:

    test.c -> test.o

because the resulting .o file is unchanged. This scenario is the one that surprised me most, since it is very common and can save a lot of time at link time.

The biggest problem is how to store the information. In the patch, the checksum for file a is saved in the file a.checksum, but I don't think this is a reasonable solution; hiding these files in a subdirectory is probably not such a bad idea. Concurrent access is not a problem when using files, because they are used in almost the same way as the timestamp information is used now; in the worst case the hash will differ and the file will be recompiled.

Besides using a better algorithm to hash the file (MD5 is my first thought) and hopefully finding another way to store the data (though I still think files are the best choice), do you have other ideas or suggestions?

Regards,
Giuseppe
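
P.S. To make the intended behaviour concrete, here is a minimal sketch of the rebuild decision and of the "hidden subdirectory" storage idea. It is not taken from the patch: the helper names needs_rebuild and checksum_path, and the directory name .checksums, are hypothetical and chosen only for illustration. As in the patch, a stored checksum of 0 is treated as "nothing recorded".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decide whether a prerequisite forces a rebuild.
   NEWER_THAN_TARGET is nonzero if the timestamp check alone would rebuild.
   STORED is the checksum recorded on the previous run (0 = none recorded).
   CURRENT is the checksum of the file as it is now.  */
static int
needs_rebuild (int newer_than_target, unsigned long stored,
               unsigned long current)
{
  if (!newer_than_target)
    return 0;                   /* Timestamp says up to date.  */
  if (stored == 0)
    return 1;                   /* No record: fall back to timestamps.  */
  return stored != current;     /* Rebuild only if contents changed.  */
}

/* Hypothetical helper: build the checksum file name for NAME inside a
   hidden subdirectory next to it, for example
   "src/test.c" -> "src/.checksums/test.c.checksum".
   Returns 0 on success, -1 if BUF is too small.  */
static int
checksum_path (const char *name, char *buf, size_t size)
{
  const char *slash;
  int dirlen, n;

  assert (name != NULL && buf != NULL);
  slash = strrchr (name, '/');
  dirlen = slash ? (int) (slash - name + 1) : 0;
  n = snprintf (buf, size, "%.*s.checksums/%s.checksum",
                dirlen, name, name + dirlen);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

With this shape, a touched but otherwise unchanged file gives needs_rebuild (1, h, h) == 0, which is the "touch does not recompile" case described above.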

? checksum_patch.diff
Index: file.c
===================================================================
RCS file: /sources/make/make/file.c,v
retrieving revision 1.90
diff -u -r1.90 file.c
--- file.c 4 Nov 2007 21:54:01 -0000 1.90
+++ file.c 11 Apr 2008 21:20:54 -0000
@@ -1,6 +1,6 @@
-/* Target file management for GNU Make.
+/* Target file management fo GNU Make.
 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997,
-1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software
+1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
 Foundation, Inc.
 This file is part of GNU Make.
 
@@ -189,8 +189,80 @@
       f->last = new;
     }
 
+  new->last_checksum = read_checksum (new);
+
+
   return new;
 }
+
+
+/* Compute the checksum for the file.  */
+
+int
+compute_checksum(struct file *new)
+{
+  int checksum = 0;
+  FILE *f;
+  char buffer [4096];
+
+  f = fopen (new->name, "r");
+  if (f != NULL)
+    {
+      size_t nbr;
+      int i;
+      do
+        {
+          nbr = fread (buffer, 4096, 1, f);
+
+          for (i = 0; i < nbr; i++)
+            checksum = 21 * checksum + 23 * buffer[i];
+
+        }
+      while (nbr);
+      fclose (f);
+    }
+  return checksum;
+}
+
+int
+read_checksum(struct file *new)
+{
+  int checksum = 0;
+  FILE *f;
+  char * checksum_file = (char*) xmalloc (strlen (new->name) + 10);
+
+  sprintf (checksum_file, "%s.checksum", new->name);
+
+  f = fopen (checksum_file, "r");
+  if (f != NULL)
+    {
+      fread (&checksum, 4, 1, f);
+      fclose (f);
+    }
+
+
+  free (checksum_file);
+  return checksum;
+}
+
+void
+write_checksum(struct file *new)
+{
+  FILE *f;
+  char * checksum_file = (char*) xmalloc (strlen (new->name) + 10);
+
+  sprintf (checksum_file, "%s.checksum", new->name);
+
+  f = fopen (checksum_file, "w");
+  if (f != NULL)
+    {
+      fwrite (&new->checksum, 4, 1, f);
+      fclose (f);
+    }
+
+  free (checksum_file);
+}
+
 
 /* Rehash FILE to NAME.  This is not as simple as resetting
    the `hname' member, since it must be put in a new hash bucket,
Index: filedef.h
===================================================================
RCS file: /sources/make/make/filedef.h,v
retrieving revision 2.30
diff -u -r2.30 filedef.h
--- filedef.h 4 Jul 2007 19:35:18 -0000 2.30
+++ filedef.h 11 Apr 2008 21:20:54 -0000
@@ -1,6 +1,6 @@
 /* Definition of target file data structures for GNU Make.
 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997,
-1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software
+1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
 Foundation, Inc.
 This file is part of GNU Make.
 
@@ -94,6 +94,8 @@
                                    pattern-specific variables.  */
     unsigned int considered:1;  /* equal to 'considered' if file has been
                                    considered on current scan of goal chain */
+    int checksum;               /* Actual checksum of the file.  */
+    int last_checksum;          /* Last checksum registered on the file.  */
   };
 
 
@@ -103,6 +105,9 @@
 
 struct file *lookup_file (const char *name);
 struct file *enter_file (const char *name);
+int compute_checksum(struct file *new);
+int read_checksum(struct file *new);
+void write_checksum(struct file *new);
 struct dep *parse_prereqs (char *prereqs);
 void remove_intermediates (int sig);
 void snap_deps (void);
Index: main.c
===================================================================
RCS file: /sources/make/make/main.c,v
retrieving revision 1.227
diff -u -r1.227 main.c
--- main.c 4 Nov 2007 21:54:01 -0000 1.227
+++ main.c 11 Apr 2008 21:20:57 -0000
@@ -1,6 +1,6 @@
 /* Argument parsing and main program of GNU Make.
 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997,
-1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software
+1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
 Foundation, Inc.
 This file is part of GNU Make.
 
@@ -226,6 +226,12 @@
 unsigned int default_job_slots = 1;
 static unsigned int master_job_slots = 0;
 
+
+/* Define if the checksum of a file should be considered.  */
+
+int use_checksum = 0;
+
+
 /* Value of job_slots that means no limit.  */
 
 static unsigned int inf_jobs = 0;
@@ -365,6 +371,8 @@
                               Consider FILE to be infinitely new.\n"),
     N_("\
   --warn-undefined-variables  Warn when an undefined variable is referenced.\n"),
+    N_("\
+  --use-checksum              Use the files checksum.\n"),
     NULL
   };
 
@@ -411,6 +419,7 @@
     { 'S', flag_off, &keep_going_flag, 1, 1, 0, 0, &default_keep_going_flag,
       "no-keep-going" },
     { 't', flag, &touch_flag, 1, 1, 1, 0, 0, "touch" },
+    { 'U', flag, &use_checksum, 1, 1, 1, 0, 0, "checksum" },
     { 'v', flag, &print_version_flag, 1, 1, 0, 0, 0, "version" },
     { CHAR_MAX+3, string, &verbosity_flags, 1, 1, 0, 0, 0,
       "verbosity" },
@@ -432,6 +441,7 @@
     { "new-file", required_argument, 0, 'W' },
     { "assume-new", required_argument, 0, 'W' },
     { "assume-old", required_argument, 0, 'o' },
+    { "use-checksum", optional_argument, 0, 'U' },
     { "max-load", optional_argument, 0, 'l' },
     { "dry-run", no_argument, 0, 'n' },
     { "recon", no_argument, 0, 'n' },
Index: make.h
===================================================================
RCS file: /sources/make/make/make.h,v
retrieving revision 1.131
diff -u -r1.131 make.h
--- make.h 4 Nov 2007 21:54:01 -0000 1.131
+++ make.h 11 Apr 2008 21:20:57 -0000
@@ -517,6 +517,7 @@
 
 extern unsigned int commands_started;
 
+extern int use_checksum;
 extern int handling_fatal_signal;
 
 
Index: remake.c
===================================================================
RCS file: /sources/make/make/remake.c,v
retrieving revision 1.137
diff -u -r1.137 remake.c
--- remake.c 5 Nov 2007 14:15:20 -0000 1.137
+++ remake.c 11 Apr 2008 21:20:59 -0000
@@ -1,6 +1,6 @@
 /* Basic dependency engine for GNU Make.
 Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997,
-1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007 Free Software
+1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008 Free Software
 Foundation, Inc.
 This file is part of GNU Make.
 
@@ -505,7 +505,6 @@
               d->file->dontcare = file->dontcare;
             }
 
-
           dep_status |= check_dep (d->file, depth, this_mtime, &maybe_make);
 
           /* Restore original dontcare flag. */
@@ -515,6 +514,7 @@
           if (! d->ignore_mtime)
             must_make = maybe_make;
 
+
           check_renamed (d->file);
 
           {
@@ -546,6 +546,7 @@
   /* Now we know whether this target needs updating.
      If it does, update all the intermediate files we depend on.  */
 
+
   if (must_make || always_make_flag)
     {
       for (d = file->deps; d != 0; d = d->next)
@@ -764,6 +765,11 @@
       DBF (DB_VERBOSE, _("Recipe of `%s' is being run.\n"));
       return 0;
     }
+  else if (use_checksum)
+    {
+      file->checksum = compute_checksum (file);
+      write_checksum (file);
+    }
 
   switch (file->update_status)
     {
@@ -946,8 +952,31 @@
       check_renamed (file);
       mtime = file_mtime (file);
       check_renamed (file);
-      if (mtime == NONEXISTENT_MTIME || mtime > this_mtime)
-        *must_make_ptr = 1;
+
+      if (mtime == NONEXISTENT_MTIME)
+        {
+          *must_make_ptr = 1;
+        }
+      else if(mtime > this_mtime)
+        {
+          if (use_checksum && file->last_checksum )
+            {
+              file->checksum = compute_checksum (file);
+
+              if (file->checksum != file->last_checksum)
+                *must_make_ptr = 1;
+
+            }
+          else
+            *must_make_ptr = 1;
+
+          if (use_checksum)
+            {
+              if (!file->checksum)
+                file->checksum = compute_checksum (file);
+              write_checksum (file);
+            }
+        }
     }
   else
     {
@@ -972,10 +1001,33 @@
           check_renamed (file);
           mtime = file_mtime (file);
           check_renamed (file);
-          if (mtime != NONEXISTENT_MTIME && mtime > this_mtime)
-            /* If the intermediate file actually exists and is newer, then we
-               should remake from it.  */
-            *must_make_ptr = 1;
+          if (mtime != NONEXISTENT_MTIME)
+            {
+              *must_make_ptr = 1;
+            }
+          else if(mtime > this_mtime)
+            {
+              if (use_checksum && file->last_checksum)
+                {
+                  file->checksum = compute_checksum (file);
+                  if (file->checksum != file->last_checksum)
+                    *must_make_ptr = 1;
+                }
+              else
+                /* If the intermediate file actually exists and is newer, then we
+                   should remake from it.  */
+                *must_make_ptr = 1;
+
+
+              if (use_checksum)
+                {
+                  if (!file->checksum)
+                    file->checksum = compute_checksum (file);
+                  write_checksum (file);
+                }
+
+
+            }
           else
             {
               /* Otherwise, update all non-intermediate files we depend on, if
@@ -1002,20 +1054,20 @@
                   if (lastd == 0)
                     {
                       file->deps = d->next;
-                      free_dep (d);
+                      free_dep (d);
                       d = file->deps;
                     }
                   else
                     {
                       lastd->next = d->next;
-                      free_dep (d);
+                      free_dep (d);
                       d = lastd->next;
                     }
                   continue;
                 }
 
               d->file->parent = file;
-              maybe_make = *must_make_ptr;
+              maybe_make = *must_make_ptr;
               dep_status |= check_dep (d->file, depth, this_mtime,
                                        &maybe_make);
               if (! d->ignore_mtime)
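
For anyone experimenting with the patch, here is a minimal sketch of how the file hash could be computed with a byte-accurate read loop. It calls fread with an element size of 1 so that the return value is a byte count (with a size of 4096 and a count of 1, fread reports whole 4096-byte items, that is 0 or 1). The hash is 32-bit FNV-1a, used here only as a simple stand-in: it is neither MD5 nor the multiply-add from the patch. The names fnv1a_update and file_checksum are hypothetical and do not exist in GNU Make.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define FNV_OFFSET_BASIS 2166136261u   /* 32-bit FNV offset basis.  */
#define FNV_PRIME 16777619u            /* 32-bit FNV prime.  */

/* Fold LEN bytes of DATA into HASH using FNV-1a (xor, then multiply).  */
static uint32_t
fnv1a_update (uint32_t hash, const unsigned char *data, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    {
      hash ^= data[i];
      hash *= FNV_PRIME;
    }
  return hash;
}

/* Hash the contents of PATH into *OUT.
   Returns 0 on success, -1 if the file cannot be opened or read.  */
static int
file_checksum (const char *path, uint32_t *out)
{
  unsigned char buffer[4096];
  uint32_t hash = FNV_OFFSET_BASIS;
  size_t nbytes;
  int failed;
  FILE *f;

  assert (path != NULL && out != NULL);
  f = fopen (path, "rb");       /* Binary mode: hash the raw bytes.  */
  if (f == NULL)
    return -1;

  /* Element size 1, count sizeof buffer: fread returns bytes read,
     so a short final block is hashed too.  */
  while ((nbytes = fread (buffer, 1, sizeof buffer, f)) > 0)
    hash = fnv1a_update (hash, buffer, nbytes);

  failed = ferror (f);
  fclose (f);
  if (failed)
    return -1;

  *out = hash;
  return 0;
}
```

Reporting failure through the return value, rather than returning 0 as the checksum, keeps "could not read the file" distinct from a file whose hash happens to be 0.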

_______________________________________________
Bug-make mailing list
Bug-make@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-make