Jay,

On Mon, Apr 23, 2012 at 6:43 PM, JAX <[email protected]> wrote:
> Curious: Seems like you could aggregate the results in the mapper as a local
> variable or list of strings. Is there a way to know that your mapper has
> just read the LAST line of an input split?

True, that can be one way to do it (unless the aggregation of records needs to happen live and you don't want to hold it all in memory).

> Is there a "cleanup" or "finalize" method in mappers that is run at the end
> of a whole stream read to support these sorts of chunked, in-memory map/reduce
> operations?

Yes, there is. See:

Old API: http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/Mapper.html (see Closeable's close())
New API: http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapreduce/Mapper.html#cleanup(org.apache.hadoop.mapreduce.Mapper.Context)

--
Harsh J
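
For illustration, here is a minimal sketch of the pattern discussed in this thread: aggregate in a local map while records are read, then emit once from the end-of-split hook. To keep it runnable without Hadoop on the classpath, it does not use the real API. SketchMapper, emit(), WordCountInMapper, and InMapperAggregationDemo are hypothetical stand-ins; in an actual job you would extend org.apache.hadoop.mapreduce.Mapper, override map() and cleanup(Context), and call context.write() where this sketch calls emit().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the new-API Mapper lifecycle:
// map() is called once per record, cleanup() once after the last record.
abstract class SketchMapper {
    // Collected output pairs (stand-in for Context.write()).
    final List<String> output = new ArrayList<>();

    void emit(String key, int value) {
        output.add(key + "\t" + value);
    }

    abstract void map(String line);

    // Default is a no-op, as in the real Mapper.
    void cleanup() {
    }

    // Mimics the framework driving the mapper over one input split.
    final void run(Iterable<String> split) {
        for (String line : split) {
            map(line);
        }
        cleanup();
    }
}

// Counts words per split in memory and emits the totals only at the end.
class WordCountInMapper extends SketchMapper {
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    @Override
    void map(String line) {
        // Aggregate locally; nothing is emitted per record.
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
    }

    @Override
    void cleanup() {
        // The whole split has been read, so flush the aggregated totals.
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            emit(e.getKey(), e.getValue());
        }
    }
}

class InMapperAggregationDemo {
    static List<String> runOnSplit(List<String> lines) {
        WordCountInMapper mapper = new WordCountInMapper();
        mapper.run(lines);
        return mapper.output;
    }

    public static void main(String[] args) {
        for (String pair : runOnSplit(Arrays.asList("a b a", "b c"))) {
            System.out.println(pair);
        }
    }
}
```

The caveat from the reply still applies: the local map grows with the number of distinct keys in the split, so this only fits when that state is small enough to hold in memory.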
