Hi Eric, > I've added > documentation to the autoconf manual to mention the algorithmic speed being > dependent on the quality of the underlying m4 implementation, and > recommending > m4_append over m4_append_uniq when duplicates can be tolerated.
Thanks for doing that. > But I'm hoping that for M4 1.6, I can make m4_append behave linearithmically > (O > (n log n)), by implementing the same optimization as using bash's += operator > for growing the contents of variables. The idea is that if the prefix of the > new definition is the same pointer as the prior definition, then m4 should > geometrically grow the (over-allocated) storage ... With that implementation, m4_append behaves linearly: O(n) where n is the number of m4_append plus the total length of the arguments. Or, if you exclude m4_append calls that append the empty string, O(n) where n is the total length of the arguments = length of the resulting string. > But since we can append an arbitrary > amount of bytes in one insertion, we can end up with fewer than n insertions > between reallocs, so I think that the more pessamistic amortized O(n log n) > is > a better characterization of the overall growth. Eh? Even in that case it's O(n). Proof: Let n be the sum of lengths of the arguments. Let 1/q (0 < q < 1) be the growth factor. Then - the number of byte copies from the argument strings to working memory is exactly = n, - the number of zeroed bytes and of copied bytes spent in reallocation is: <= n*q + O(1) for the last reallocation, <= n*q^2 + O(1) for the second-to-last reallocation, ... (at most log n / log q + O(1) reallocations). So in total : <= n * (q + q^2 + ...) + O(log n) = n *q/(1-q) + O(log n) = O(n). Bruno