On Wed, 2004-04-14 at 00:06, Jefferson Ogata wrote:

> I'm suggesting the pcap storage format be XML. A raw capture, without
> using protocol dissectors, would just be a sequence of base64-encoded
> (perhaps) frames and metadata.
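Purely for illustration, a minimal sketch of what one such raw frame record might look like; the element names (`frame`, `meta`, `ts`, `caplen`, `data`) are invented here and are not part of any proposed format:

```python
import base64
import xml.etree.ElementTree as ET

def frame_to_xml(ts_sec, ts_usec, raw_bytes):
    """Encode one undissected captured frame as an XML element
    (hypothetical element names, base64-encoded payload)."""
    frame = ET.Element("frame")
    meta = ET.SubElement(frame, "meta")
    ET.SubElement(meta, "ts").text = "%d.%06d" % (ts_sec, ts_usec)
    ET.SubElement(meta, "caplen").text = str(len(raw_bytes))
    ET.SubElement(frame, "data").text = base64.b64encode(raw_bytes).decode("ascii")
    return ET.tostring(frame, encoding="unicode")

print(frame_to_xml(1081893966, 123456, b"\x00\x01\x02\x03"))
# <frame><meta><ts>1081893966.123456</ts><caplen>4</caplen></meta><data>AAECAw==</data></frame>
```

A dissector stage would then annotate or replace the `data` element with structured children while leaving the metadata untouched.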
But once you're using raw base64-encoded (or whatever) payloads, you lose
the benefit of any XML-enabled app being able to understand what's
contained.

> Tools like the tcpdump protocol dissectors and tethereal could then just
> be XML filters that take a raw XML input frame and annotate it with
> protocol elements, as in the rough example I posted. Existing XML tools,
> e.g. xsltproc, could generate reports from the annotated XML using XSLT.
> The reports could as easily be HTML output as plain text or more XML.

I really doubt that a feature like HTML output is what the majority of
pcap users need ...

> Additional protocol dissectors for protocols unknown to tcpdump/tethereal
> could be written in any language with XML support (preferably
> event-based). In fact, many protocol analyzers could be written directly
> in XSLT/XPath and processed using xsltproc. Among other things, this
> provides many means to eliminate the continuing problem of buffer
> overflows. tcpdump could have a plugin architecture with an XML filter
> for each protocol/frame type.

Well, to be consistent I think you'd have to make those pcap plugins,
since pcap will be the component writing out the trace files. And if at
any point you want a plugin to convert base64-encoded raw data into
structured XML, then I don't see how that will prevent buffer overflows.
Sure, as long as you stay within the XML world, that problem is reduced.

> I'm suggesting that we use XML as the capture file format so that tcpdump
> becomes an extensible XML filter.

I also believe that the performance hit of parsing the packet data into
XML before writing the packets out will be too high for applications that
want to get the packets to disk as quickly as possible. And if, in that
case, you turn off all analyzers and output raw base64, you lose all the
benefits anyway.

> Or you can throw all that musing away.
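To put a rough number on that storage/serialization trade-off, here is a sketch (element names invented again, nothing from any real spec) comparing a classic 16-byte-header binary record against a base64-in-XML encoding of the same 64-byte frame:

```python
import base64
import struct
import xml.etree.ElementTree as ET

payload = bytes(range(64))              # a 64-byte captured frame
ts_sec, ts_usec = 1081893966, 123456

# pcap-style record: 16-byte little-endian header + raw payload.
binary_record = struct.pack("<IIII", ts_sec, ts_usec,
                            len(payload), len(payload)) + payload

# XML record: metadata as elements, payload base64-encoded.
frame = ET.Element("frame")
ET.SubElement(frame, "ts").text = "%d.%06d" % (ts_sec, ts_usec)
ET.SubElement(frame, "caplen").text = str(len(payload))
ET.SubElement(frame, "data").text = base64.b64encode(payload).decode("ascii")
xml_record = ET.tostring(frame)

print(len(binary_record))               # 80
print(len(xml_record))                  # 161 -- roughly twice the size
```

The binary record costs a flat 16 bytes of header per packet; the XML version pays base64's 4/3 payload expansion plus tag overhead on every field, and the gap widens for small frames.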
> Just pay attention to the discussion for a little while --

Hey, I am :)

> it revolves around timestamp and metadata formats, sizes of fields, and
> other esoterica that are sounding a bit archaic in today's computing
> environment. I think we should take a hard look at whether it's really
> appropriate to define yet another hard binary file format when XML can
> provide the same functionality with modest storage overhead, and has
> many added benefits.

Trust me, I am not one of the default hardcore XML haters, but I don't
see why a tagged binary format isn't enough in the case at hand. If
somebody finds that some hash-value bitfield isn't large enough, then
create another tag format -- I don't see a problem there; you can always
encode length values in those headers anyway to keep things flexible in
the first place.

I don't like the idea of XML as the lowest common denominator for a
capture format -- as a processing-stage output it sounds great to me.

Regards,
Christian.
-- 
________________________________________________________________________
http://www.cl.cam.ac.uk/~cpk25                    http://www.whoop.org
-
This is the tcpdump-workers list.
Visit https://lists.sandelman.ca/ to unsubscribe.