On Wed, 2004-04-14 at 00:06, Jefferson Ogata wrote:
>
> I'm suggesting the pcap storage format be XML. A raw capture, without using 
> protocol dissectors, would just be a sequence of base64-encoded (perhaps) frames 
> and metadata.

But once you're storing raw base64-encoded frames (or whatever encoding),
you lose the benefit of any XML-enabled app being able to understand
what's contained.
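Just to make sure we're talking about the same thing, here's roughly what such a raw record would look like (the element and attribute names are my invention, not a proposed schema):

```python
import base64
import xml.etree.ElementTree as ET

# A raw, undissected capture record as I understand the proposal:
# metadata in attributes, opaque base64 payload as text. Element and
# attribute names here are made up for illustration only.
payload = b"\xff" * 60                      # a fake 60-byte frame
frame = ET.Element("frame", ts="1081891565.123456", caplen="60")
frame.text = base64.b64encode(payload).decode("ascii")
print(ET.tostring(frame).decode("ascii"))
```

A generic XML tool can read the timestamp and length attributes, but the payload itself is just an opaque blob to it until some dissector decodes the base64.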

> Tools like the tcpdump protocol dissectors and tethereal could then just be XML 
> filters that take a raw XML input frame and annotate it with protocol elements, 
> as in the rough example I posted. Existing XML tools, e.g. xsltproc, could 
> generate reports from the annotated XML using XSLT. The reports could as easily 
> be HTML output as plain text or more XML.

I really doubt that a feature like HTML output is what the majority of
pcap users need ...

> Additional protocol dissectors for protocols unknown to tcpdump/tethereal could 
> be written in any language with XML support (preferably event-based). In fact, 
> many protocol analyzers could be written directly in XSLT/XPath and processed 
> using xsltproc. Among other things, this provides many means to eliminate the 
> continuing problem of buffer overflows. tcpdump could have a plugin architecture 
> with an XML filter for each protocol/frame type.

Well, I think to be consistent you'd have to make those pcap plugins (as
pcap will be the component writing out the trace files). And if at any
point you want a plugin that converts the base64-encoded raw data into
structured XML, I don't see how that will prevent buffer overflows: the
parsing of untrusted packet bytes still has to happen somewhere. Sure,
as long as you stay within the XML world, that problem will be reduced.

> I'm suggesting that we use XML as the capture file format so that tcpdump 
> becomes an extensible XML filter.

I also believe that the performance hit of parsing the packet data into
XML before writing the packets out will be too high for applications
that want to get the packets to disk as quickly as possible. And if, in
that case, you turn off the analyzers and output raw base64, then you
lose all the benefits anyway.
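For the storage side alone you can do the arithmetic quickly (a rough sketch; the wrapper element is made up and the CPU cost of serialization is ignored entirely):

```python
import base64

# Back-of-the-envelope storage overhead of base64-in-XML:
# base64 inflates the payload by 4/3 before any tag overhead.
# The <frame> wrapper is a hypothetical minimal element.
pkt = bytes(1500)                        # one full-size Ethernet payload
b64 = base64.b64encode(pkt)
xml = b"<frame>" + b64 + b"</frame>"     # minimal hypothetical wrapper
print(len(pkt), len(b64), len(xml))      # 1500 2000 2015
```

So even before timestamps and other metadata tags, you're writing a third more data per packet than a binary format would.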

> Or you can throw all that musing away. Just pay attention to the discussion for 
> a little while --

Hey I am :)

>  it revolves around timestamp and metadata formats, sizes of 
> fields, and other esoterica that are sounding a bit archaic in today's computing 
> environment. I think we should take a hard look at whether it's really 
> appropriate to define yet another hard binary file format when XML can provide 
> the same functionality with modest storage overhead, and has many added benefits.

Trust me, I am not one of the default hardcore XML haters, but I don't
see why a tagged binary format isn't enough in the case at hand. If
somebody finds that, say, some hash-value bitfield isn't large enough,
they can just define another tag format; I don't see a problem there,
since you can always encode length values in those headers anyway to
keep things flexible in the first place.
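Something like this is all I have in mind (the tag codes and field widths are invented for illustration, not a proposal for an actual header layout):

```python
import struct

# Minimal sketch of a tagged binary (TLV) record format:
# each record is <tag:u16> <length:u32> <value:length bytes>,
# so a reader can skip tags it doesn't understand by using the
# length field -- that's all the extensibility you need.
def write_record(tag, value):
    return struct.pack("<HI", tag, len(value)) + value

def read_records(buf):
    off = 0
    while off < len(buf):
        tag, length = struct.unpack_from("<HI", buf, off)
        off += 6
        yield tag, buf[off:off + length]
        off += length

blob = write_record(1, b"\xde\xad\xbe\xef") + write_record(2, b"metadata")
print(list(read_records(blob)))
```

A reader that only knows tag 1 can still walk past tag 2 without choking on it, which covers the "field isn't large enough" case without any schema machinery.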

I don't like the idea of XML as the lowest common denominator for a
capture format -- as a processing-stage output it sounds great to me.

Regards,
Christian.
-- 
________________________________________________________________________
                                          http://www.cl.cam.ac.uk/~cpk25
                                                    http://www.whoop.org


-
This is the tcpdump-workers list.
Visit https://lists.sandelman.ca/ to unsubscribe.