On Sun, Jun 28, 2020 at 06:01:27PM -0700, Alex Harsanyi wrote: > I tested the your string port version and I also wrote a "string-append" > version of the xml reader and they are both slower by about 10-15% on my > machine, when compared to the current read-xml implementation which uses > `list->string`. It looks like `list->string` is not the bottleneck here.
Odd -- to remove all that storage-allocaation overhead and to find it gets slower... Perhaps the overhead lies in the Scheme interpreter? Does it allocate lots of storage? Would using chez racket help any? -- hendrik > > There are some small improvements that can be made from micro > optimizations. For example, I changed `name-char?` to not use > `name-start?` but instead check for all chars, and I also changed > `lex-name` to construct the list in reverse and use `(list->string (reverse > chars))`, plus I reordered the cond condition to check the common case > first (that the next character is a name-char? and not a 'special one). > However, this resulted in only about 5-10% speed improvement, nowhere near > where the 4 time speedup when using `sxml`, as reported by John. > > In the end, it may well be that speeding up `read-xml` can only be done by > these types of micro-optimizations. Another thing I looked into was the > "pattern" used for reading: all the `read-xml` code will use the pattern of > "peeking" the next character, deciding if it is good, than reading it. > This is much slower than just reading the characters directly. These are > the results from just reading in a 14Mb XML file: > > read-char only: cpu time: 312 real time: 307 gc time: 0 > read-char-or-special only: cpu time: 750 real time: 741 gc time: 0 > peek-char than read-char: cpu time: 1234 real time: 1210 gc time: 0 > peek-char-or-special than read-char-or-special: cpu time: 1688 real > time: 1690 gc time: 0 > > Using this code: > > (define file-name "your-test-file-here.xml") > > (printf "read-char only~%") > (collect-garbage 'major) > (time > (call-with-input-file file-name > (lambda (in) > (let loop ([c (read-char in)]) > (if (eof-object? c) > (void) > (loop (read-char in))))))) > > (printf "read-char-or-special only~%") > (collect-garbage 'major) > (time > (call-with-input-file file-name > (lambda (in) > (let loop ([c (read-char-or-special in)]) > (if (eof-object? c) > (void) > (loop (read-char-or-special in))))))) > > (printf "peek-char than read-char~%") > (collect-garbage 'major) > (time > (call-with-input-file file-name > (lambda (in) > (let loop ([c (peek-char in)]) > (if (eof-object? c) > (void) > (begin > (void (read-char in)) > (loop (peek-char in)))))))) > > (printf "peek-char-or-special than read-char-or-special~%") > (collect-garbage 'major) > (time > (call-with-input-file file-name > (lambda (in) > (let loop ([c (peek-char-or-special in)]) > (if (eof-object? c) > (void) > (begin > (void (read-char-or-special in)) > (loop (peek-char-or-special in)))))))) > > Alex. > > On Monday, June 29, 2020 at 5:30:43 AM UTC+8 [email protected] wrote: > > > Thanks Alex for pointing out the use of list->string. I've created a PR ( > > https://github.com/racket/racket/pull/3275) that changes that code to use > > string ports instead (similar to Hendrik's suggestion, but the string port > > handles resizing automatically). Could someone (John?) with some large XML > > files lying around try the changes and see if they help? > > > > Ryan > > > > > > On Sun, Jun 28, 2020 at 9:56 PM Neil Van Dyke <[email protected]> > > wrote: > > > >> If anyone wants to optimize `read-xml` for particular classes of use, > >> without changing the interface, it might be very helpful to run your > >> representative tests using the statistical profiler. > >> > >> The profiler text report takes a little while of tracing through > >> manually to get a feel for how to read and use it, but it can be > >> tremendously useful, and is worth learning to do if you need performance. > >> > >> After a first pass with that, you might also want to look at how costly > >> allocations/GC are, and maybe do some controlled experiments around > >> that. For example, force a few GC cycles, run your workload under > >> profiler, check GC time during, and forced time after. If you're > >> dealing with very large graphs coming out of the parser, I don't know > >> whether those are enough to matter with the current GC mechanism, but > >> maybe also check GC time while you're holding onto large graphs, when > >> you release them, and after they've been collected. At some point, GC > >> gets hard for at least me to reason about, but some things make sense, > >> and other things you decide when to stop digging. :) If you record all > >> your measurements, you can compare empirically the how different changes > >> to the code affect things, hopefully in representative situations. > >> > >> I went through a lot of these exercises to optimize a large system, and > >> sped up dynamic Web page loads dramatically in the usual case (to the > >> point we were then mainly limited by PostgreSQL query cost, not much by > >> the application code in Scheme, nor our request&response network I/O), > >> and also greatly reduced the pain of intermittent request latency spikes > >> due to GC. > >> > >> One of the hotspots, I did half a dozen very different implementations, > >> including C extension, and found an old-school pure Scheme > >> implementation was fastest. I compared the performance of the > >> implementation using something like `shootout`, but there might be > >> better ways now in Racket. https://www.neilvandyke.org/racket/shootout/ > >> I also found we could be much faster if we made a change to what the > >> algorithm guarantees, since it was more of a consistency check that > >> turned out to be very expensive and very redundant, due to all the ways > >> that utility code ended up being used. > >> > >> In addition to contrived experiments, I also rigged up a runtime option > >> so that the server would save data from the statistical profiler for > >> each request a Web server handled in production. Which was tremendously > >> useful, since it gave us real-world examples that were also difficult to > >> synthesize (e.g., complex dynamic queries), and we could go from Web > >> logs and user feedback, to exactly what happened. > >> > >> (In that system I optimized, we used Oleg's SXML tools very heavily > >> throughout the system, plus some bespoke SXML tools for HTML and XML. > >> There was one case in which someone had accidentally used the `xml` > >> module, not knowing it was incompatible with the rest of the system, > >> which caused some strange failures (no static checking) before it was > >> discovered, and we changed that code to use SXML.) > >> > >> -- > >> You received this message because you are subscribed to the Google Groups > >> "Racket Users" group. > >> To unsubscribe from this group and stop receiving emails from it, send an > >> email to [email protected]. > >> To view this discussion on the web visit > >> https://groups.google.com/d/msgid/racket-users/68624c9a-df35-14a3-a912-df806799a7e0%40neilvandyke.org > >> . > >> > > > > -- > You received this message because you are subscribed to the Google Groups > "Racket Users" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To view this discussion on the web visit > https://groups.google.com/d/msgid/racket-users/b663b6e8-ac63-4ecd-8212-c0175db5afden%40googlegroups.com. -- You received this message because you are subscribed to the Google Groups "Racket Users" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/20200629120946.jh5tk4ospbo3cu3v%40topoi.pooq.com.

