> On April 9, 2013, 9:05 a.m., Inge Wallin wrote: > > libs/odf/KoXmlStreamReader.cpp, line 229 > > <http://git.reviewboard.kde.org/r/109887/diff/2/?file=131690#file131690line229> > > > > This code is indeed slower than the normal QXmlStreamReader. Since it > > builds on top of it, it would be impossible not to be. But you shouldn't > > compare this class with the normal stream reader. You should compare > > QXmlStreamReader plus the code that calls it, i.e. the odf parser on top of > > it to this class plus the code that calls _it_. > > > > The purpose of this class is not primarily to make the code faster but > > to make it nicer. As I wrote in the description of the review request it > > will result in much easier to read code. See the description for more > > details. > > > > Regarding the plans, I have 3 terms: > > > > Short term, I want to use this for the docx export filter. After good > > feedback to the odf traversing library I realized that stream reading was a > > much better approach. I will port the odf traverser to use stream reading > > and use that for the filter. > > > > Medium term I think that we could port the KoXmlReader (DOM based) to > > KoXmlStreamReader. We would have to do extensive testing before doing that. > > This would let us remove the call to the extremely ugly fixNamespaces() > > function and therefore make it a lot faster. Since > > KoXmlReader.setDocument() uses around 15% of the total parsing time I think > > we could get a nice speedup there. > > > > Long term it would be very nice if we could start to use stream reading > > in our loading code. But for that to happen we need to change the structure > > of loading so that we can combine DOM based loading with stream based. I > > don't have any particular plans but I think it will mature when the current > > classes are used more.
> This code is indeed slower than the normal QXmlStreamReader. :( I still thing the overall idea may indeed not bad if done under performance pov. I mean there certainly is room for optimization, always :) > The purpose of this class is not primarily to make the code faster Then we may disagree about priorities. For me that code-path is very hot and I know that I am not the only one who spend *WEEKS* on profiling that to win as less as a single digit percent more performance. Same goes for memory-consumption which is the other hot topic in that code. > but to make it nicer. I am not even sure it is nicer. Sure it is at lease less verbose if you compare: if (e.ns() == KoXML::foo && e.tagName() == "tag") with if (e.name() == "foo:tag") but the first statement can be easily made nicer without any performance hit using a macro or a template function like: inline bool elementEquals(e, ns, tag) {} if (elementEquals(e, KoXML::foo, "tag")) or since we have a well known small number of namespaces we could even go with: inline bool elementEqualsFoo(e, tagname) {} if (elementEqualsFoo(e, "tag")) or if we are very funny we could inherit from QXMLStreamReader and add a == operator and restructure the code to actually use the namespace in a useful manner like: if (e == KoXML::foo) { if(e == "tag1") {} else if(e == "tag2") {} } else if (e == KoXML::bar) { if(e == "tag3") {} else if(e == "tag4") {} } or add proper specialized operators or even methods to use like: if (e.isFoo("tag")) > let us remove the call to the extremely ugly fixNamespaces() function and > therefore make it a lot faster. Well, it cannot be removed. Its clear defined how namespaces work and hard-coding the prefix violates XML and ODF. See https://tools.oasis-open.org/issues/browse/OFFICE-3712 "Instances of XML documents can employ any [xml-names]-admissable declarations of prefixes and default namespaces that accomplish binding of the listed namespaces." What we could do is indeed to make that if-conditions smaller/nicer since that's what this is about or? For that we not need to write our own StreamReader that introduces as side-effect performance and probably memory consumption regressions. Definitively not. - Sebastian ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/109887/#review30755 ----------------------------------------------------------- On April 6, 2013, 3:56 p.m., Inge Wallin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://git.reviewboard.kde.org/r/109887/ > ----------------------------------------------------------- > > (Updated April 6, 2013, 3:56 p.m.) > > > Review request for Calligra, Jarosław Staniek and Jos van den Oever. > > > Description > ------- > > This patch contains a new XML stream reader based on the QXmlStreamReader > that is better suited for ODF. > > Much ODF parsing code in Calligra looks like: > > if (el.namespaceUri() == KoXml::fo && el.name == "border") { ... } > > The reason for this complicated construction is that the prefix (the "fo" in > "fo:border") is not unique but is declared at the beginning of each Xml file. > Even though "fo" is the normal prefix there is no guarantee that it is the > same in every document. > > However, it is a very rare document where it is *not* the normal prefix, so > what we want to do is to be able to write code like this: > > If (el.qualifiedName() == "fo:border") { ... } > > and make the XML stream reader or dom tree rewrite the qualified name in the > very rare cases that the prefix does not match what we want. > > This is exactly what the KoXmlStreamReader does. It allows you to write > easier and faster code while still be correct in the few cases where the > prefixes are not the expected ones. It does this by letting the user enter > the expected namespace declarations and then compare those to the actual > namespace declarations in the document. If they match everything will be as > fast as possible. If they don't, it will be slower but still correct. > > As an extra feature I have allowed the user to declare some extra namespaces > (see fixNamespace() in KoXmlReader.cpp). This will let documents created by > old versions of OpenOffice.org be read even though they use other namespaces. > > I have code that uses this file but that is not yet ready for review. I > wanted to put this up early to get feedback while the rest of the yet > unfinished code is maturing. > > > Diffs > ----- > > libs/odf/CMakeLists.txt 3680486 > libs/odf/KoXmlStreamReader.h PRE-CREATION > libs/odf/KoXmlStreamReader.cpp PRE-CREATION > > Diff: http://git.reviewboard.kde.org/r/109887/diff/ > > > Testing > ------- > > Not much. I will do that when the code that uses this code is ready. This > review is for getting feedback on the ideas and implementation details. > > > Thanks, > > Inge Wallin > >
_______________________________________________ calligra-devel mailing list calligra-devel@kde.org https://mail.kde.org/mailman/listinfo/calligra-devel