I could point at the "main text flow" (usually, a table cell) and say "this is ham", and I could point at a block of "sponsored links" or "text advertisements" and say "this is spam".
The granularity of content is usually a range of table cells or a frame. You can guess what's spam or not by page location and/or content.
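To make that concrete, here is a rough sketch of what such a filter might look like: a tiny naive Bayes classifier trained on page blocks, where each block is its text plus a location tag. Everything in it is an assumption for illustration (the `BlockFilter` class, the `tokens` helper, the "main"/"sidebar"/"footer" location names, the sample sentences); it is not how any existing browser or mail filter does it, and it leaves out the HTML parsing needed to pull table cells or frames out of a real page.

```python
import math
import re
from collections import Counter


def tokens(block_text, location):
    # Words from the block, plus one pseudo-token for where the block sits
    # on the page, so both content and location feed the guess.
    words = re.findall(r"[a-z']+", block_text.lower())
    return words + ["@loc:" + location]


class BlockFilter:
    """Hypothetical naive Bayes filter over page blocks (cells or frames)."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.blocks = {"ham": 0, "spam": 0}

    def train(self, label, block_text, location):
        # "Point at" a block and say what it is.
        self.counts[label].update(tokens(block_text, location))
        self.blocks[label] += 1

    def score(self, label, toks):
        total = sum(self.counts[label].values())
        vocab = len(set(self.counts["ham"]) | set(self.counts["spam"]))
        # Log prior plus log likelihoods, with add-one smoothing so unseen
        # tokens never produce log(0).
        logp = math.log((self.blocks[label] + 1) / (sum(self.blocks.values()) + 2))
        for t in toks:
            logp += math.log((self.counts[label][t] + 1) / (total + vocab + 1))
        return logp

    def classify(self, block_text, location):
        toks = tokens(block_text, location)
        return max(("ham", "spam"), key=lambda label: self.score(label, toks))


f = BlockFilter()
f.train("ham", "The committee met today to discuss the new budget proposal", "main")
f.train("spam", "Sponsored links: buy cheap pills now, click here", "sidebar")
f.train("spam", "Text advertisements: click here for a free offer", "footer")
print(f.classify("Click here to buy now", "sidebar"))
```

With only three training blocks this is a toy; the point is just that "point at it and say ham or spam" maps directly onto `train`, and that location can be treated as one more token.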
Get started people! :-)
Done.
http://www.mozilla.org/mailnews/spam.html
-- 
         Steve C. Lamb         | I'm your priest, I'm your shrink, I'm your
       PGP Key: 8B6E99C5       | main connection to the switchboard of souls.
-------------------------------+---------------------------------------------