Hi Eric,

> I've added 
> documentation to the autoconf manual to mention the algorithmic speed being 
> dependent on the quality of the underlying m4 implementation, and 
> recommending 
> m4_append over m4_append_uniq when duplicates can be tolerated.

Thanks for doing that.

> But I'm hoping that for M4 1.6, I can make m4_append behave linearithmically 
> (O
> (n log n)), by implementing the same optimization as using bash's += operator 
> for growing the contents of variables.  The idea is that if the prefix of the 
> new definition is the same pointer as the prior definition, then m4 should 
> geometrically grow the (over-allocated) storage ...

With that implementation, m4_append behaves linearly: O(n) where n is the
number of m4_append plus the total length of the arguments. Or, if you
exclude m4_append calls that append the empty string, O(n) where n is the total
length of the arguments = length of the resulting string.

> But since we can append an arbitrary 
> amount of bytes in one insertion, we can end up with fewer than n insertions 
> between reallocs, so I think that the more pessamistic amortized O(n log n) 
> is 
> a better characterization of the overall growth.

Eh? Even in that case it's O(n). Proof: Let n be the sum of lengths of the
arguments. Let 1/q (0 < q < 1) be the growth factor. Then
  - the number of byte copies from the argument strings to working memory is
    exactly = n,
  - the number of zeroed bytes and of copied bytes spent in reallocation is:
      <= n*q + O(1) for the last reallocation,
      <= n*q^2 + O(1) for the second-to-last reallocation,
      ...
      (at most log n / log q + O(1) reallocations).
    So in total : <= n * (q + q^2 + ...) + O(log n) = n *q/(1-q) + O(log n)
    = O(n).

Bruno



Reply via email to