While I think the complexity piece is important, I feel like it's worth
pointing out just how much more expensive the allocation -- and its
ramifications, like the resulting GC pressure and CPU cache misses -- is
than one might think.  Here's a quadratic version of the code that
avoids allocations:

    #lang racket

    (require racket/flonum)

    (define (xy->string x y)
      (string-append
       (~r x #:precision 1) ","
       (~r y #:precision 1)))

    (define len 0)
    (define dst (make-string 5000000))
    (define (string-append! b)
      (string-copy! dst 0 dst 0 len) ;; intentionally performing pointless work 
to be n^2
      (string-copy! dst len b)
      (set! len (+ len (string-length b))))

    (define (xy-vectors->string x-vec y-vec)
      (for ((x (in-flvector x-vec))
            (y (in-flvector y-vec)))
        (string-append! (xy->string x y))
        (string-append! " ")))

    (time
     (let ([x (make-flvector 100000)]
           [y (make-flvector 100000)])
       (xy-vectors->string x y)))

On my machine (running Racket CS 8.1), this is faster than the
`call-with-output-string` version I posted earlier by about 50ms.


Jens Axel Søgaard writes:

> Den søn. 27. jun. 2021 kl. 18.58 skrev Alessandro Motta <[email protected]
>>:
>
>> I also like Jens' code for its pure functional elegance. I'm surprised
>> that building up a long list of short strings before joining them is so
>> much faster. After all, isn't the final `string-join` essentially doing
>> what the original code snipped did? (Clearly not! I will have a look at
>> the implementation.)
>>
>
> The following is the same answer as Robby's, but fills in some details.
> I wrote this mainly because I couldn't find a web-page with the explanation.
>
> Let's say we have two strings a and b of lengths m and n respectively.
>
> How much work must  (string-append a b) do?
>
> Conceptually the following steps need to be done:
>   1. Allocate a new string c of length m+n.
>   2. Copy the string a into c.
>   3. Copy the string b into c.
>
> This means that the time (work) needed to append the strings a and b are
> proportional to m+n.
>
> How much work is done here?
>
>     (string-append "x" (string-append "y" (string-append "z" "")))
>
> First (string-append "z" "w") takes time 1+0=1 with the result "z".
> Then (string-append "y" (string-append "z" "")) or  (string-append "y" "z")
> takes time 1+1=2 with the result "yz".
> Then (string-append "x" (string-append "y" (string-append "z" ""))) or
> (string-append "x" "yz") takes time 1+2 = 3 with result "xyz".
> In total we have used time 1+2+3 =6.
>
> We see that if we 3 times use string-append to add strings of length 1,
> then we use time 1+2+3=6.
> In general, if we n times use string-append to add strings of length 1,
> then we use time 1+2+3+...+n.
> The last sum is more or less n^2.  See [1].
>
> The same thing happens in your loop, where you repeatedly use string-append
> to append a small string.
> The length of the small string is no longer 1, but the same happens - and
> the time used by the
> loop is proportional to n^2.
>
>
> Now suppose we have a list of the strings to append
>     (list "x" "y" "z")
> and we need to append them. How much work is there now?
>
> Well, first we can calculate the length of the result 1+1+1 = 3.
> Then we allocate a new string of length 3. Then each individual
> string is copied and that takes time 1+1+1=3.
>
> Here, each string is only copied once. For comparison in the loop version
> the string z is copied multiple times.
>
>
> But wait - why does a loop that uses cons multiple times to build up
> a list not have the same problem?
>
> Because in (cons x xs) the existing list xs isn't copied.
> The steps are
>    1. a new pair with two cells  (the car and the cdr) are allocated
>    2. the contents of the car is set to x
>    3. the contents of the cdr is set to xs.
>
> This always takes the same amount of time, no matter how long the list xs
> is.
> This means that the time used by cons is constant (that is, proportional to
> 1).
>
> Note: This phenomenon that using string-append in a loop is not a Racket
> only problem.
>           It is a common pitfall in many languages.
>
>
>
> [1] Remember the famous story of Gauss as a child that calculated
> 1+2+...1000 ?
>      https://www.youtube.com/watch?v=Dd81F6-Ar_0
>
>
> /Jens Axel  -  https://racket-stories.com

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/m2pmw7cfza.fsf%40defn.io.

Reply via email to