Anyone got an implementation of a mutable StringBuilder-like object? I 
could use it in Rebellion's implementation of `into-string` which currently 
isn't quadratic, but it definitely has the allocation problem.

On Sunday, June 27, 2021 at 11:36:14 AM UTC-7 bogdan wrote:

> While I think the complexity piece is important, I feel like it's worth
> pointing out just how much more expensive the allocation -- and its
> ramifications, like the resulting GC pressure and CPU cache misses -- is
> than one might think. Here's a quadratic version of the code that
> avoids allocations:
>
> #lang racket
>
> (require racket/flonum)
>
> (define (xy->string x y)
> (string-append
> (~r x #:precision 1) ","
> (~r y #:precision 1)))
>
> (define len 0)
> (define dst (make-string 5000000))
> (define (string-append! b)
> (string-copy! dst 0 dst 0 len) ;; intentionally performing pointless work 
> to be n^2
> (string-copy! dst len b)
> (set! len (+ len (string-length b))))
>
> (define (xy-vectors->string x-vec y-vec)
> (for ((x (in-flvector x-vec))
> (y (in-flvector y-vec)))
> (string-append! (xy->string x y))
> (string-append! " ")))
>
> (time
> (let ([x (make-flvector 100000)]
> [y (make-flvector 100000)])
> (xy-vectors->string x y)))
>
> On my machine (running Racket CS 8.1), this is faster than the
> `call-with-output-string` version I posted earlier by about 50ms.
>
>
> Jens Axel Søgaard writes:
>
> > Den søn. 27. jun. 2021 kl. 18.58 skrev Alessandro Motta <
> [email protected]
> >>:
> >
> >> I also like Jens' code for its pure functional elegance. I'm surprised
> >> that building up a long list of short strings before joining them is so
> >> much faster. After all, isn't the final `string-join` essentially doing
> >> what the original code snipped did? (Clearly not! I will have a look at
> >> the implementation.)
> >>
> >
> > The following is the same answer as Robby's, but fills in some details.
> > I wrote this mainly because I couldn't find a web-page with the 
> explanation.
> >
> > Let's say we have two strings a and b of lengths m and n respectively.
> >
> > How much work must (string-append a b) do?
> >
> > Conceptually the following steps need to be done:
> > 1. Allocate a new string c of length m+n.
> > 2. Copy the string a into c.
> > 3. Copy the string b into c.
> >
> > This means that the time (work) needed to append the strings a and b are
> > proportional to m+n.
> >
> > How much work is done here?
> >
> > (string-append "x" (string-append "y" (string-append "z" "")))
> >
> > First (string-append "z" "w") takes time 1+0=1 with the result "z".
> > Then (string-append "y" (string-append "z" "")) or (string-append "y" 
> "z")
> > takes time 1+1=2 with the result "yz".
> > Then (string-append "x" (string-append "y" (string-append "z" ""))) or
> > (string-append "x" "yz") takes time 1+2 = 3 with result "xyz".
> > In total we have used time 1+2+3 =6.
> >
> > We see that if we 3 times use string-append to add strings of length 1,
> > then we use time 1+2+3=6.
> > In general, if we n times use string-append to add strings of length 1,
> > then we use time 1+2+3+...+n.
> > The last sum is more or less n^2. See [1].
> >
> > The same thing happens in your loop, where you repeatedly use 
> string-append
> > to append a small string.
> > The length of the small string is no longer 1, but the same happens - and
> > the time used by the
> > loop is proportional to n^2.
> >
> >
> > Now suppose we have a list of the strings to append
> > (list "x" "y" "z")
> > and we need to append them. How much work is there now?
> >
> > Well, first we can calculate the length of the result 1+1+1 = 3.
> > Then we allocate a new string of length 3. Then each individual
> > string is copied and that takes time 1+1+1=3.
> >
> > Here, each string is only copied once. For comparison in the loop version
> > the string z is copied multiple times.
> >
> >
> > But wait - why does a loop that uses cons multiple times to build up
> > a list not have the same problem?
> >
> > Because in (cons x xs) the existing list xs isn't copied.
> > The steps are
> > 1. a new pair with two cells (the car and the cdr) are allocated
> > 2. the contents of the car is set to x
> > 3. the contents of the cdr is set to xs.
> >
> > This always takes the same amount of time, no matter how long the list xs
> > is.
> > This means that the time used by cons is constant (that is, proportional 
> to
> > 1).
> >
> > Note: This phenomenon that using string-append in a loop is not a Racket
> > only problem.
> > It is a common pitfall in many languages.
> >
> >
> >
> > [1] Remember the famous story of Gauss as a child that calculated
> > 1+2+...1000 ?
> > https://www.youtube.com/watch?v=Dd81F6-Ar_0
> >
> >
> > /Jens Axel - https://racket-stories.com
>

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/61691331-226f-4708-bd2a-42e5c6e5c2bcn%40googlegroups.com.

Reply via email to