Hi all,

As you probably know, I've written a large part of the PowerPoint filters of Calligra. A large part of that code is automatically generated. The files

  filters/libmso/generated/simpleParser.h
  filters/libmso/generated/simpleParser.cpp

are generated from a file mso.xml. The project msoscheme at gitorious

  git://gitorious.org/msoscheme/msoscheme.git

is where this file is hosted. There is a copy of the generator in the Calligra tree.

The code works, but it is inefficient: it uses way more memory than needed via many dynamic memory allocations. There is a version that is two times as fast and uses much less memory. This version can be created with the following commands:

  git clone git://gitorious.org/msoscheme/msoscheme.git
  cd msoscheme
  ant && mkdir build && cd build && cmake ../cpp && make

This will give two executables:

  apitest     <- new version
  simpletest  <- old version

When run on a set of 600 ppt files from, among others, kofficetests, this is the output from valgrind:

simpletest: (normal run time: 5.7 seconds)
  ==28930== total heap usage: 2,457,961 allocs, 2,457,954 frees, 218,241,950 bytes allocated

apitest: (normal run time: 2.9 seconds)
  ==28852== total heap usage: 254,832 allocs, 254,825 frees, 52,421,077 bytes allocated

That's almost 10x fewer memory allocations and about 4.2x fewer bytes allocated. All the memory allocations for apitest are from POLE, since api.h does not do any memory allocations while parsing. This is partially what makes it fast: all memory is either in the contiguous parsed stream or on the stack.

Also, the usage interface is more convenient. To parse a memory structure, you do not need to create a stream and feed it into the structure. This is how the API works for e.g. parsing a PowerPoint stream from an OLE container:

  MSO::PowerPointStructs pps(array.data(), array.size());
  if (!pps) {
      // error
  }

The parser that Calligra currently uses relies on QSharedPointer, QList, QVector and QByteArray. api.h does not use any of these.

To start using this code in Calligra, replace simpleParser.* with api.* and fix all compilation errors. This is actually quite some work, since this parser is used in many places in the filters right now. So before you embark on this effort, first do measurements to see how much time is currently spent in parsing these files and whether halving that time has a significant effect on the total loading time.

Cheers,
Jos

-- 
Jos van den Oever, software architect
+49 391 25 19 15 53 074 3491911
http://kogmbh.com/legal/
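
For readers who have not looked at api.h, the following is a minimal sketch of the general technique of parsing without allocations: a small "view" object that only stores a pointer into the caller's buffer and is tested with `if (!view)`, like the pps example above. It is not the generated code from api.h. RecordView, readU16 and readU32 are hypothetical names, and the 8-byte header layout used here (16-bit type at offset 2, 32-bit payload length at offset 4, little-endian) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers: read little-endian integers directly from the buffer.
inline std::uint16_t readU16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

inline std::uint32_t readU32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical view on one record: an assumed 8-byte header plus payload.
// It holds only a pointer and a size, so constructing it never copies
// data and never touches the heap.
class RecordView {
public:
    RecordView(const unsigned char* data, std::size_t size)
        : m_data(nullptr), m_size(0) {
        if (data == nullptr || size < 8) return;  // too short for a header
        const std::uint32_t len = readU32(data + 4);
        if (len > size - 8) return;               // payload runs past the buffer
        m_data = data;
        m_size = 8 + static_cast<std::size_t>(len);
    }

    // False when the buffer did not contain a complete record.
    explicit operator bool() const { return m_data != nullptr; }

    // The accessors below may only be called on a valid view.
    std::uint16_t recType() const { return readU16(m_data + 2); }
    std::uint32_t recLen() const { return readU32(m_data + 4); }
    const unsigned char* payload() const { return m_data + 8; }
    std::size_t totalSize() const { return m_size; }

private:
    const unsigned char* m_data;  // points into the caller's buffer
    std::size_t m_size;           // header plus payload, in bytes
};
```

The view is only valid as long as the buffer it points into stays alive and unchanged, which is the price for not copying anything. Fields are decoded on access instead of being stored, so the object itself fits comfortably on the stack.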