On Tuesday, June 30, 2020 at 7:48:14 AM UTC+8 Neil Van Dyke wrote:
> Is even 2x speedup helpful for your purpose?

Yes it is, and for my purpose `read-xml` is fine even without any speed improvement. In the sports field, XML (via the TCX format) is a legacy technology. Typical TCX files are about 1 MB in size; the 14 MB one is very large. Setting `xml-count-bytes` to #t while calling `read-xml` gets me a speed improvement at low effort, but it is not worth adding another package dependency just to support a legacy technology.

> 3 seconds is one old magic number for user patience in HCI, so I suppose
> there's still a big difference between 4 seconds and almost 10 seconds?

I am not sure where you got the 3 seconds from, but even 3 seconds is too long to wait on a button callback. For large files, both `read-xml` and sxml would need a progress dialog with a cancel button, or some other form of user feedback, if one wants to make a "well behaved" GUI.

> For large (and absolutely massive) XML... SSAX can shine even better
> than in this comparison, since you can, say, populate a database *while
> you're parsing, without first constructing the intermediate
> representation* of xexpr or SXML. GC-wise, with the database-populating
> scenario, you'll probably end up with small, little-referencing, local,
> short-lived allocations. Besides GC costs, you'll also use less RAM
> (possibly lower AWS bill), and be less likely to push into swap (which
> would be bad for performance).

... if you are willing to deal with the complexity of a SAX interface, that is. I have written code for parsing documents (correctly!) using a SAX interface, and the resulting code was so complex that I had to use a code generator for it, but yes, the resulting code was very fast. Would I do it again? No. The complexity of SAX parsing is probably why most people use a DOM-style interface...

> In addition to SSAX's current performance characteristics and
> opportunities... There might also be opportunity to optimize SSAX
> significantly for Racket.
> Oleg is a famously capable Scheme programmer,
> but he was writing SSAX in fairly portable Scheme code, a couple decades
> ago, when he wrote SSAX. I did an initial packaging of SSAX for PLT
> Scheme, Kirill Lisovsky later did many packagings of various SXML-ish
> tools (including his own), and then John Clements did more work to
> package Oleg's SXML-ish tools for Racket... But I don't know that anyone
> has had motivation to try to optimize Racket's SSAX port, using current
> Racket features, and tuning for current performance characteristics.
>
> Side note regarding performance comparison... FWIW, SSAX might be doing
> some things `read-xml` doesn't, such as namespace resolution, entity
> reference resolution, and some validation.

You used the phrase "might be doing...", does that mean that it might not do those things?

Alex.
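
For readers who want to try the `xml-count-bytes` setting mentioned in the reply, it could be applied roughly as follows. This is a minimal sketch, not code from the thread: the helper name `read-xexpr/fast` and the file name in the usage comment are hypothetical, and any speedup will depend on the input and the Racket version.

```racket
#lang racket/base
;; Sketch only: `read-xexpr/fast` is a hypothetical helper name,
;; not something posted in this thread.
(require xml)

;; Parse one XML document from the input port `in` and return its
;; root element as an X-expression. `xml-count-bytes` is set to #t
;; only for the duration of this call via `parameterize`.
(define (read-xexpr/fast in)
  (parameterize ([xml-count-bytes #t])
    (xml->xexpr (document-element (read-xml in)))))

;; Hypothetical usage with a TCX file on disk:
;; (call-with-input-file "activity.tcx" read-xexpr/fast)
```

Using `parameterize` keeps the setting local to the one call, so other code that uses `read-xml` keeps the default location tracking.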

