From: Jason Feng <[email protected]>
> I am using XML::Parser::PerlSAX to parse a 300M XML file. I meet a
> strange issue with handler characters. This handler is supposed to
> return all the contents between start markup and end markup. But
> sometimes it just returns one part of the whole contents. On the
> second call, perhaps it returns the rest part of the contents.

That is to be expected. From the docs of XML::Parser:

    Char (Expat, String)

    This event is generated when non-markup is recognized. The
    non-markup sequence of characters is in String. A single
    non-markup sequence of characters may generate multiple calls
    to this handler. Whatever the encoding of the string in the
    original document, this is given to the handler in UTF-8.

Write your code so that it handles this: append each chunk to a
buffer and treat the text as complete only when the end tag arrives.
Or use a module that does this for you.

Jenda
===== [email protected] === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed to
get drunk and croon as much as they like.
	-- Terry Pratchett in Sourcery
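
A minimal sketch of the buffering approach, for illustration only. The class name TextCollector and its buffer/texts fields are made up for this example. The parser is simulated by calling the handler methods directly, so the snippet runs without XML::Parser::PerlSAX installed. It assumes the PerlSAX convention that each handler method receives a hash reference (Data for characters, Name for elements), and it only collects text of leaf elements, since the buffer is reset at every start tag.

```perl
use strict;
use warnings;

# Hypothetical handler class; the name is an illustration only.
package TextCollector;

sub new {
    my ($class) = @_;
    return bless { buffer => '', texts => [] }, $class;
}

# A new element opens: start with an empty buffer.
sub start_element {
    my ($self, $element) = @_;
    $self->{buffer} = '';
}

# May be called several times for one run of text,
# so append each chunk instead of assigning it.
sub characters {
    my ($self, $chars) = @_;
    $self->{buffer} .= $chars->{Data};
}

# Only at the end tag is the accumulated text known to be complete.
sub end_element {
    my ($self, $element) = @_;
    push @{ $self->{texts} }, [ $element->{Name}, $self->{buffer} ];
    $self->{buffer} = '';
}

package main;

my $handler = TextCollector->new;

# Simulate the parser delivering one text node in two pieces.
$handler->start_element({ Name => 'title', Attributes => {} });
$handler->characters({ Data => 'Hello, ' });
$handler->characters({ Data => 'world' });
$handler->end_element({ Name => 'title' });

print "$_->[0]: $_->[1]\n" for @{ $handler->{texts} };
```

With the real module, the same object would be passed as XML::Parser::PerlSAX->new(Handler => $handler), followed by $parser->parse(Source => { SystemId => 'big.xml' }), where big.xml is a placeholder file name. For a 300M file you would probably process each text in end_element rather than keep everything in the texts array.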
