Hi Steve,

One thing to keep in mind is that counters in Hadoop are passed via heartbeats, so you'll see updates only every 2 seconds or so. I have seen implementations with thousands of counters without a noticeable performance impact: since they are pre-aggregated in the reducer and passed via heartbeats, the increment frequency does not matter.
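
If you still want to try the local-counter approach you describe, here is a minimal sketch. BatchedCounter and flushEvery are made-up names for illustration, and the LongConsumer is a stand-in for the Hadoop counter so the snippet compiles without Hadoop on the classpath.

```java
import java.util.function.LongConsumer;

/** Accumulates increments locally and forwards them to a sink in batches. */
class BatchedCounter {
    // Hypothetical stand-in for the real counter; in a job this could be
    // something like context.getCounter("group", "name")::increment.
    private final LongConsumer sink;
    private final long flushEvery;
    private long pending = 0;

    BatchedCounter(LongConsumer sink, long flushEvery) {
        if (flushEvery <= 0) {
            throw new IllegalArgumentException("flushEvery must be positive");
        }
        this.sink = sink;
        this.flushEvery = flushEvery;
    }

    /** Cheap local increment; forwards to the sink once per flushEvery calls. */
    void increment() {
        pending++;
        if (pending >= flushEvery) {
            flush();
        }
    }

    /** Pushes any pending count to the sink; call once more when the task ends. */
    void flush() {
        if (pending > 0) {
            sink.accept(pending);
            pending = 0;
        }
    }
}
```

In a reducer you would presumably create one of these in setup(), call increment() in the inner loop, and call flush() in cleanup() so the tail of the count is not lost. Keep flushEvery small enough that the task still reports progress well within the task timeout.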

--
Alex Kozlov PhD
Senior Solutions Architect
Cloudera, Inc

Cloudera in Open Source
http://www.cloudera.com/company/open-source/
<http://www.cloudera.com/company/press-center/hadoop-world-nyc/>

On Thu, Dec 8, 2011 at 10:48 AM, Steve Lewis <[email protected]> wrote:

> I have several counters that I maintain to allow me to keep statistics on
> critical operations. I have my code incrementing the counters in an inner
> loop partly to make sure my job is not killed for not making progress. It
> would be very easy to keep an internal counter and increment the Hadoop
> value less frequently. Assuming I am currently incrementing a counter
> several million times in a reduce task- is this costing me performance and
> would I be better off incrementing less frequently
>
> --
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
