Torsten Curdt wrote:

Hi, folks!

The numbers of the XMLByteStreamCompilerInterpreterTestCase and the SaxBufferTestCase gave me some RT
--
If you have a look at the testcases it's quite obvious that the SaxBuffer is *much* faster than the XMLByteStream classes. As a rule of thumb, just to get the dimensions, we could say:


 XMLC/XMLI is about 15 times faster than Xerces
 SaxBuffer is about 100 times faster than Xerces

Of course this depends heavily on the document. But it should be enough to grasp the magnitude, which was a bit of a surprise for me. I personally did not expect this *huge* difference, especially because the SaxBuffer creates many more objects than the XMLC.


I'm not very surprised by these numbers: XMLC does a pretty heavy job to serialize Strings to bytes.

Furthermore, I just looked at XMLByteStreamCompiler.write(), which shows that it spends most of its time resizing the byte buffer: each resize grows the buffer only by the number of bytes needed for the current write, and not by a larger growth increment.

It would be interesting to redo the test by introducing this growth increment. BTW, I don't understand the "this.buf.length << 1" in the write() method.
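A minimal sketch of the growth-increment idea, assuming a plain byte-array buffer. GrowableByteBuffer and its methods are hypothetical names for illustration and are not the actual XMLByteStreamCompiler code. In Java, "length << 1" is a left shift by one bit, i.e. a doubling of the length, so the sketch uses it as the growth step and counts how often the array is reallocated:

```java
import java.util.Arrays;

// Hypothetical illustration only; not the real XMLByteStreamCompiler.
final class GrowableByteBuffer {
    private byte[] buf;
    private int count;        // number of valid bytes in buf
    private int resizeCount;  // how many times the array was reallocated

    GrowableByteBuffer(int initialCapacity) {
        buf = new byte[Math.max(1, initialCapacity)];
    }

    void write(byte[] data, int off, int len) {
        int needed = count + len;
        if (needed > buf.length) {
            // Grow to at least twice the current capacity (buf.length << 1),
            // or to the exact size needed if a single write is larger than that.
            int newCapacity = Math.max(buf.length << 1, needed);
            buf = Arrays.copyOf(buf, newCapacity);
            resizeCount++;
        }
        System.arraycopy(data, off, buf, count, len);
        count += len;
    }

    int size() { return count; }

    int resizeCount() { return resizeCount; }

    byte[] toByteArray() { return Arrays.copyOf(buf, count); }
}
```

With doubling, the number of reallocations grows with the logarithm of the final size, whereas growing by exactly the bytes needed reallocates on almost every write once the buffer is full.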

But the huge difference between the SaxBuffer and the XMLC is that the XMLC serializes the SAX events on the fly. The SaxBuffer does not support serialization but keeps the events as objects.

IMO spending time on the serialization only makes sense if

 a) the memory consumption is too high otherwise
 b) the SAX stream is being saved to disk

Maybe we can extend the testcases to compare the memory consumption. For the question of the destination we could let the store decide.

Anyway, both classes make sense. But maybe they would make even more sense if they shared the same interface and became interchangeable.

SAX stream buffering is a vital component of Cocoon. Looking at the numbers, the impact on performance could be tremendous.

What do you think?


Can't we merge both: use SAXBuffer for in-memory storage, and use XMLC/XMLI to serialize it? This could even be done transparently by having SAXBuffer implement Serializable and use XMLC/XMLI to implement readObject() and writeObject().
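A rough sketch of that idea using only the JDK, under stated assumptions: EventBuffer, its Event type, and the three event kinds are hypothetical stand-ins for SAXBuffer, and the hand-written byte encoding in writeObject()/readObject() stands in for what XMLC/XMLI would do. Events stay as objects in memory and are turned into bytes only when Java serialization asks for them:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SAXBuffer; not the Cocoon class.
final class EventBuffer implements Serializable {
    private static final long serialVersionUID = 1L;

    static final int START = 0, END = 1, TEXT = 2;

    // In-memory event object; deliberately not Serializable itself.
    static final class Event {
        final int kind;
        final String value;
        Event(int kind, String value) { this.kind = kind; this.value = value; }
    }

    // transient: default serialization skips it, writeObject() encodes it by hand.
    private transient List<Event> events = new ArrayList<>();

    void startElement(String name) { events.add(new Event(START, name)); }
    void endElement(String name)   { events.add(new Event(END, name)); }
    void characters(String text)   { events.add(new Event(TEXT, text)); }

    List<Event> events() { return events; }

    // Compact byte encoding; stands in for the XMLC (compiler) side.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(events.size());
        for (Event e : events) {
            out.writeByte(e.kind);
            out.writeUTF(e.value); // writeUTF is limited to 65535 encoded bytes per string
        }
    }

    // Rebuilds the event objects; stands in for the XMLI (interpreter) side.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int n = in.readInt();
        events = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            int kind = in.readByte();
            events.add(new Event(kind, in.readUTF()));
        }
    }

    // Helper for trying it out: serialize to bytes and read back.
    static EventBuffer roundTrip(EventBuffer buffer) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(buffer);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (EventBuffer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Callers that never serialize the buffer pay nothing for the byte encoding, which matches the point that serialization only makes sense when memory is tight or the stream goes to disk.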

Sylvain

--
Sylvain Wallez                                  Anyware Technologies
http://www.apache.org/~sylvain           http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance  -  http://www.orixo.com



