Performance monitor

perf-monitor.tar.gz

Usage

perf-monitor accumulate-mode processor syscall-number filename program argument*

Description

The performance monitor is a small hack that uses the on-chip counters on UltraSPARC-I/II processors to gather statistics. Events in both user and system mode can be counted. Since everything that runs on the monitored processor is counted, some runs may be inaccurate due to activity by other processes.

The performance monitor uses a loadable kernel module to access privileged registers. The module installs a system call that is used by the perf-monitor program. If the module has not been loaded, the behavior of perf-monitor is undefined. The module is loaded as root with the command:

The output from modload is the module identifier and system call number.

If accumulate-mode is 1, data is accumulated, which is useful for graph generation. If it is 0, you get the raw counts.

Processor is the number of the processor that the program is to be run on. Note that processor numbers are not always sequential (schuetz has processors numbered 0, 1, 4, and 5).

Syscall-number is the number of the system call loaded from inst_sync, as reported by modload.

Filename is a file where perf-monitor writes info about every run it makes. There is one line for each run with the format:

<ticks> <clocks> <event> <count> <event> <count> <mode>
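The line format above can be read back with a few lines of Python. This is only a sketch, assuming the seven fields are separated by whitespace as in the example further down; the names RunRecord and parse_run_line are made up for illustration and are not part of perf-monitor.

```python
from typing import NamedTuple, Tuple


class RunRecord(NamedTuple):
    """One line of the run file (hypothetical container, not part of perf-monitor)."""
    ticks: int
    clocks: int
    events: Tuple[Tuple[str, int], ...]  # (event id, count) pairs, in file order
    mode: int


def parse_run_line(line: str) -> RunRecord:
    """Split one run line into its seven whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    ticks, clocks, ev0, cnt0, ev1, cnt1, mode = fields
    return RunRecord(
        ticks=int(ticks),
        clocks=int(clocks),
        events=((ev0, int(cnt0)), (ev1, int(cnt1))),
        mode=int(mode),
    )


if __name__ == "__main__":
    # Sample line taken from the Example section of this document.
    record = parse_run_line(
        "3356285792 13400000 Cycle_cnt 3016264966 Instr_cnt 2418782563 2"
    )
    print(record)
```

The events are kept as ordered pairs rather than a dictionary so that a line listing the same event twice would not lose a count.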

Program is the executable that is to be monitored. Note that if the program forks, the created processes may be scheduled on different processors and therefore not counted. The program is run with the arguments specified last on the perf-monitor command line.

Configuration

Perf-monitor reads the file perf-monitor.conf to determine what to count and how. The file format is:

<conf-file> ::= <conf-line>*

<conf-line> ::= <counter number> <event id> <mode>

Counter number is the column in the output that the event should be associated with. All events associated with one counter are accumulated in that counter. Counter numbers over 100 indicate that accumulation should be turned off for that counter (useful for Cycle_cnt).

Event id is a countable event as listed in the UltraSPARC user's manual (and in the table below).

Cycle_cnt Accumulated cycles
Instr_cnt The number of instructions completed
Dispatch0_IC_miss Cycles I-buffer empty from I-Cache miss
Dispatch0_mispred Cycles I-buffer empty from branch misprediction
Dispatch0_storeBuf Cycles store buffer full
Dispatch0_FP_use Cycles stalled waiting for fp dependency
Load_use Cycles stalled waiting for load
Load_use_RAW Cycles stalled on some weird internal condition
IC_ref I-Cache references
IC_hit I-Cache hits
DC_rd D-Cache read references
DC_rd_hit D-Cache read hits
DC_wr D-Cache write references
DC_wr_hit D-Cache write hits
EC_ref E-Cache references
EC_hit E-Cache hits
EC_write_hit_RDO See the UltraSPARC user's manual
EC_wb E-Cache misses that do writebacks
EC_snoop_inv E-Cache invalidates
EC_snoop_cb E-Cache snoop copy-backs
EC_rd_hit E-Cache read hits from D-Cache misses
EC_ic_hit E-Cache read hits from I-Cache misses

Mode selects user/system mode as seen in the table below.

0 Nothing recorded
1 System events only
2 User events only
3 System and user events
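The configuration rules above can be sketched as a small Python reader. This is an illustration under stated assumptions, not the parser perf-monitor itself uses: the names ConfEntry and parse_conf are hypothetical, skipping blank lines is an assumption, and event ids are not checked against the event table.

```python
from typing import List, NamedTuple

# Mode values as listed in the mode table.
MODES = {
    0: "nothing recorded",
    1: "system events only",
    2: "user events only",
    3: "system and user events",
}


class ConfEntry(NamedTuple):
    """One <conf-line> (hypothetical container, not part of perf-monitor)."""
    counter: int
    event: str
    mode: int
    accumulate: bool  # False when the counter number is over 100


def parse_conf(text: str) -> List[ConfEntry]:
    """Parse lines of the form: <counter number> <event id> <mode>."""
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        fields = raw.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        counter, event, mode = int(fields[0]), fields[1], int(fields[2])
        if mode not in MODES:
            raise ValueError(f"line {lineno}: unknown mode {mode}")
        # Counter numbers over 100 turn accumulation off for that counter.
        entries.append(ConfEntry(counter, event, mode, accumulate=counter <= 100))
    return entries


if __name__ == "__main__":
    for entry in parse_conf("1 Cycle_cnt 2\n2 Instr_cnt 2\n"):
        print(entry.counter, entry.event, MODES[entry.mode])
```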

Output

The output is sent to perf-stats.dat, which has a format suitable for building bar charts with gle.

Example

This example measures the CPI (Cycles Per Instruction) value for 147.vortex taken from the SPEC95 benchmark suite. For this metric we need to measure Cycle_cnt and Instr_cnt. That gives us the following perf-monitor.conf if we want to measure user mode alone as well as system and user mode combined.

1 Cycle_cnt 2
2 Instr_cnt 2
3 Cycle_cnt 3
4 Instr_cnt 3

Run

hal> perf-monitor 0 0 176 vortex-cpi ./vortex.sim1 vortex.in
Run 0 will measure: 0(1) and 1(2)
Run 1 will measure: 0(3) and 1(4)
Performing run 1.
Performing run 2.
hal> cat vortex-cpi
3356285792 13400000 Cycle_cnt 3016264966 Instr_cnt 2418782563 2
3347298939 13410000 Cycle_cnt 3347301488 Instr_cnt 2533223764 3

From this data we calculate CPI(user) = 1.25, and CPI(user+system) = 1.32. Not bad for a quad-issue machine!
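The arithmetic behind these two figures is just Cycle_cnt divided by Instr_cnt for each run. A minimal Python sketch, using the counts from the vortex-cpi file above (the function name cpi is only illustrative):

```python
def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction: Cycle_cnt divided by Instr_cnt."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return cycles / instructions


# Counts copied from the two lines of vortex-cpi.
user = cpi(3016264966, 2418782563)         # mode 2: user events only
user_system = cpi(3347301488, 2533223764)  # mode 3: system and user events

print(f"CPI(user) = {user:.2f}")
print(f"CPI(user+system) = {user_system:.2f}")
```

Note that the two figures come from separate runs of the program, so they should not be subtracted from each other to estimate a system-only CPI.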

Bugs

Does not work on UltraSPARC-I.

Magnus Christensson, mch@sics.se