Folks,

I've been trying to tackle the measurement of an irregular statistic on an 
embedded platform: embedded JVM garbage collection. During any given "interval" 
for collectd, I may have no GC activity or I might have a dozen instances where 
the JVM performed garbage collection. I have a file wherein the GC numbers are 
stored (time of occurrence, JVM heap before, JVM heap after, time required to 
garbage-collect), so I can write a read plugin to simply read the file. It 
should be easy enough to keep track of the last time it ran, so I know exactly 
where in the file to start reading (and then read to the end). But the fact 
that I may have multiple values in any one interval is throwing me off.
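
To make the "start the file read at the right point" part concrete, here is a 
rough sketch of what I have in mind. It is plain C with no collectd calls; the 
gc_record_t struct, the gc_read_new() helper and the one-record-per-line, 
whitespace-separated file format are all assumptions for illustration, not 
anything collectd provides.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One GC event; field names and file format are assumptions. */
typedef struct {
  long timestamp;     /* time of occurrence (seconds since epoch) */
  double heap_before; /* JVM heap before collection */
  double heap_after;  /* JVM heap after collection */
  double gc_time;     /* time required to garbage-collect */
} gc_record_t;

/* Reads records appended since *offset into out (at most max of them) and
 * advances *offset past the last complete line consumed.
 * Returns the number of records stored, or -1 if the file can't be read. */
int gc_read_new(const char *path, long *offset, gc_record_t *out, int max)
{
  char line[256];
  FILE *fh;
  int n = 0;

  assert(path != NULL && offset != NULL && out != NULL);

  fh = fopen(path, "r");
  if (fh == NULL)
    return -1;
  if (fseek(fh, *offset, SEEK_SET) != 0) {
    fclose(fh);
    return -1;
  }

  while (n < max && fgets(line, sizeof(line), fh) != NULL) {
    if (strchr(line, '\n') == NULL)
      break; /* last line still being written: retry on the next read */
    if (sscanf(line, "%ld %lf %lf %lf", &out[n].timestamp,
               &out[n].heap_before, &out[n].heap_after,
               &out[n].gc_time) == 4)
      n++; /* malformed lines are skipped, but still consumed */
    *offset = ftell(fh);
  }

  fclose(fh);
  return n;
}
```

The read callback would keep the offset in a static variable between calls. 
This sketch does not handle the file being truncated or rotated.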

So if I have a set of N data points (each with its own timestamp), can I simply 
iterate through the list, calling plugin_dispatch_values( &vl ) for each one, 
where I've set not only the "standard" vl data elements but also the vl.time 
element to the appropriate timestamp?

E.g. (very pseudocode-ish):
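// Iterate through the N heap values recorded during this interval
value_t values[1];
for (iter = 0; iter < N; iter++)
{
  gcdata = dataArray[iter];
  values[0].gauge = gcdata.heap;   // assuming heap is reported as a gauge
  vl.values = values;
  vl.values_len = 1;
  vl.time = gcdata.timestamp;      // per-sample timestamp, not "now"
  sstrncpy( ... host, plugin, type, type_instance, etc ...);
  plugin_dispatch_values(&vl);
}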

Other obvious alternatives would be (1) to write the plugin so it averages 
all values of interest and reports just ONE set of data (and perhaps a metric 
for the number of GCs that occurred during that interval); (2) to report only 
the most RECENT set of data; or (3) to make the read plugin interval much 
shorter than the expected time between GCs. But if it's possible, I'd rather 
get all of the instances recorded. Has anyone had to deal with such irregular 
values before?
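
For reference, alternative (1) might look roughly like the sketch below: 
collapse the N records from one interval into a count plus averages, then 
dispatch those once. The gc_record_t and gc_summary_t structs and the 
gc_summarize() helper are made-up names for illustration, and averaging the 
"heap after" and "GC time" fields is just one possible choice of what to report.

```c
#include <assert.h>
#include <stddef.h>

/* One GC event; field names are assumptions. */
typedef struct {
  long timestamp;
  double heap_before;
  double heap_after;
  double gc_time;
} gc_record_t;

/* Per-interval summary: how many GCs ran, plus simple averages. */
typedef struct {
  size_t count;
  double mean_heap_after;
  double mean_gc_time;
} gc_summary_t;

/* Averages the n records seen during one interval.
 * With n == 0 the means are left at 0.0; a real plugin might prefer to
 * skip dispatching the averages entirely in that case. */
gc_summary_t gc_summarize(const gc_record_t *recs, size_t n)
{
  gc_summary_t s = {0, 0.0, 0.0};
  size_t i;

  assert(recs != NULL || n == 0);

  for (i = 0; i < n; i++) {
    s.mean_heap_after += recs[i].heap_after;
    s.mean_gc_time += recs[i].gc_time;
  }
  s.count = n;
  if (n > 0) {
    s.mean_heap_after /= (double)n;
    s.mean_gc_time /= (double)n;
  }
  return s;
}
```

The obvious downside is that individual GC pauses, the long ones in 
particular, get smoothed away, which is why I'd rather dispatch every 
instance if that works.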

Thanks,
Dave

