>> Next caliper allows to get a lot of diagnostics from the cpu (also because
>> ia64 supports all that while x86-64 does not AFAICT) like number of bubbles
>> in the pipeline, L2-cache misses, clock-cycles per line of C-code etc.
>
> these are just the performance-counting MSR's, which are available
> on Opterons as well as Xeons too.

Even back to the PIII processors (and more?). Check out PAPI
(http://icl.cs.utk.edu/papi/) for more details but, as an example, here is
the output from an old cluster node:

e...@thinkbig1 ~ $ papi_avail -a
Available events and hardware information.
-------------------------------------------------------------------------
Vendor string and code   : AuthenticAMD (2)
Model string and code    : AMD K7 (9)
CPU Revision             : 0.000000
CPU Megahertz            : 2083.157959
CPU's in this Node       : 1
Nodes in this System     : 1
Total CPU's              : 1
Number Hardware Counters : 4
Max Multiplex Counters   : 32
-------------------------------------------------------------------------
The following correspond to fields in the PAPI_event_info_t structure.

Name          Derived  Description (Mgr. Note)
PAPI_L1_DCM   Yes      Level 1 data cache misses
PAPI_L1_ICM   No       Level 1 instruction cache misses
PAPI_L2_DCM   No       Level 2 data cache misses
PAPI_L2_ICM   No       Level 2 instruction cache misses
PAPI_L1_TCM   Yes      Level 1 cache misses
PAPI_L2_TCM   Yes      Level 2 cache misses
PAPI_TLB_DM   No       Data translation lookaside buffer misses
PAPI_TLB_IM   No       Instruction translation lookaside buffer misses
PAPI_TLB_TL   Yes      Total translation lookaside buffer misses
PAPI_L1_LDM   No       Level 1 load misses
PAPI_L1_STM   No       Level 1 store misses
PAPI_L2_LDM   No       Level 2 load misses
PAPI_L2_STM   No       Level 2 store misses
PAPI_HW_INT   No       Hardware interrupts
PAPI_BR_UCN   No       Unconditional branch instructions
PAPI_BR_CN    No       Conditional branch instructions
PAPI_BR_TKN   No       Conditional branch instructions taken
PAPI_BR_NTK   Yes      Conditional branch instructions not taken
PAPI_BR_MSP   No       Conditional branch instructions mispredicted
PAPI_BR_PRC   Yes      Conditional branch instructions correctly predicted
PAPI_TOT_INS  No       Instructions completed
PAPI_BR_INS   No       Branch instructions
PAPI_RES_STL  No       Cycles stalled on any resource
PAPI_TOT_CYC  No       Total cycles
PAPI_L1_DCH   Yes      Level 1 data cache hits
PAPI_L2_DCH   No       Level 2 data cache hits
PAPI_L1_DCA   No       Level 1 data cache accesses
PAPI_L2_DCA   Yes      Level 2 data cache accesses
PAPI_L2_DCR   No       Level 2 data cache reads
PAPI_L2_DCW   No       Level 2 data cache writes
PAPI_L1_ICA   No       Level 1 instruction cache accesses
PAPI_L2_ICA   No       Level 2 instruction cache accesses
PAPI_L1_ICR   No       Level 1 instruction cache reads
PAPI_L1_TCA   Yes      Level 1 total cache accesses
-------------------------------------------------------------------------
avail.c PASSED

And from a newer cluster node. Note the addition of floating point metrics
now available:

e...@h2 ~ $ papi_avail -a
Available events and hardware information.
--------------------------------------------------------------------------------
Vendor string and code   : GenuineIntel (1)
Model string and code    : Intel Core 2 (18)
CPU Revision             : 11.000000
CPU Megahertz            : 2394.000000
CPU Clock Megahertz      : 2394
CPU's in this Node       : 4
Nodes in this System     : 1
Total CPU's              : 4
Number Hardware Counters : 5
Max Multiplex Counters   : 32
--------------------------------------------------------------------------------
The following correspond to fields in the PAPI_event_info_t structure.

Name          Code        Deriv  Description (Note)
PAPI_L1_DCM   0x80000000  No     Level 1 data cache misses
PAPI_L1_ICM   0x80000001  No     Level 1 instruction cache misses
PAPI_L2_DCM   0x80000002  Yes    Level 2 data cache misses
PAPI_L2_ICM   0x80000003  No     Level 2 instruction cache misses
PAPI_L1_TCM   0x80000006  No     Level 1 cache misses
PAPI_L2_TCM   0x80000007  No     Level 2 cache misses
PAPI_CA_SHR   0x8000000a  No     Requests for exclusive access to shared cache line
PAPI_CA_CLN   0x8000000b  No     Requests for exclusive access to clean cache line
PAPI_CA_ITV   0x8000000d  No     Requests for cache line intervention
PAPI_TLB_DM   0x80000014  No     Data translation lookaside buffer misses
PAPI_TLB_IM   0x80000015  No     Instruction translation lookaside buffer misses
PAPI_L1_LDM   0x80000017  No     Level 1 load misses
PAPI_L1_STM   0x80000018  No     Level 1 store misses
PAPI_L2_LDM   0x80000019  Yes    Level 2 load misses
PAPI_L2_STM   0x8000001a  No     Level 2 store misses
PAPI_HW_INT   0x80000029  No     Hardware interrupts
PAPI_BR_CN    0x8000002b  No     Conditional branch instructions
PAPI_BR_TKN   0x8000002c  No     Conditional branch instructions taken
PAPI_BR_NTK   0x8000002d  No     Conditional branch instructions not taken
PAPI_BR_MSP   0x8000002e  No     Conditional branch instructions mispredicted
PAPI_BR_PRC   0x8000002f  Yes    Conditional branch instructions correctly predicted
PAPI_TOT_IIS  0x80000031  No     Instructions issued
PAPI_TOT_INS  0x80000032  No     Instructions completed
PAPI_FP_INS   0x80000034  No     Floating point instructions
PAPI_BR_INS   0x80000037  No     Branch instructions
PAPI_VEC_INS  0x80000038  No     Vector/SIMD instructions
PAPI_RES_STL  0x80000039  No     Cycles stalled on any resource
PAPI_TOT_CYC  0x8000003b  No     Total cycles
PAPI_L1_DCH   0x8000003e  Yes    Level 1 data cache hits
PAPI_L1_DCA   0x80000040  No     Level 1 data cache accesses
PAPI_L2_DCA   0x80000041  Yes    Level 2 data cache accesses
PAPI_L2_DCR   0x80000044  No     Level 2 data cache reads
PAPI_L2_DCW   0x80000047  No     Level 2 data cache writes
PAPI_L1_ICH   0x80000049  Yes    Level 1 instruction cache hits
PAPI_L2_ICH   0x8000004a  Yes    Level 2 instruction cache hits
PAPI_L1_ICA   0x8000004c  No     Level 1 instruction cache accesses
PAPI_L2_ICA   0x8000004d  No     Level 2 instruction cache accesses
PAPI_L2_TCH   0x80000056  Yes    Level 2 total cache hits
PAPI_L1_TCA   0x80000058  Yes    Level 1 total cache accesses
PAPI_L2_TCA   0x80000059  No     Level 2 total cache accesses
PAPI_L2_TCR   0x8000005c  Yes    Level 2 total cache reads
PAPI_L2_TCW   0x8000005f  No     Level 2 total cache writes
PAPI_FML_INS  0x80000061  No     Floating point multiply instructions
PAPI_FDV_INS  0x80000063  No     Floating point divide instructions
PAPI_FP_OPS   0x80000066  No     Floating point operations
-------------------------------------------------------------------------
Of 45 available events, 10 are derived.
avail.c PASSED

The limiting factor here is the number of available hardware counters
(i.e. 5 for the Q6600), so only a handful of these events can be counted
at the same time. Check out Blue Gene's table ;) :

http://www.nic.uoregon.edu/mediawiki-tau/index.php?title=Guide:BlueGene_PAPI_Counter_Analysis&printable=yes#PAPI_Events_Available_on_Blue_Gene

Eric
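
P.S. To make the counter limit concrete, here is a rough sketch (plain Python, not the PAPI API) of splitting a wish list of events into separate runs. The function name plan_runs and the wish list are made up for illustration. It assumes each event occupies exactly one hardware counter and that any combination of events can share the counters; neither holds in general, since derived events may need more than one counter and some events are restricted to particular counters.

```python
def plan_runs(events, n_counters):
    """Split `events` into groups of at most `n_counters` events.

    Simplifying assumption: one event == one hardware counter, and any
    events may be counted together. Real hardware is more restrictive.
    """
    if n_counters < 1:
        raise ValueError("need at least one hardware counter")
    return [events[i:i + n_counters]
            for i in range(0, len(events), n_counters)]


# Hypothetical wish list, using non-derived presets from the Core 2 listing.
wanted = [
    "PAPI_TOT_CYC", "PAPI_TOT_INS", "PAPI_FP_OPS", "PAPI_L1_DCM",
    "PAPI_L2_TCM", "PAPI_BR_MSP", "PAPI_TLB_DM",
]

# 5 counters, as reported for the Core 2 node.
for run, group in enumerate(plan_runs(wanted, 5), start=1):
    print(f"run {run}: {', '.join(group)}")
```

With seven events and five counters this gives two runs. The alternative is PAPI's multiplexing (the "Max Multiplex Counters" line in the output), which time-slices the counters within one run at some cost in accuracy.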