'\" t .\" Title: perf-list .\" Author: [FIXME: author] [see http://www.docbook.org/tdg5/en/html/author] .\" Generator: DocBook XSL Stylesheets vsnapshot .\" Date: 2023-02-02 .\" Manual: perf Manual .\" Source: perf .\" Language: English .\" .TH "PERF\-LIST" "1" "2023\-02\-02" "perf" "perf Manual" .\" ----------------------------------------------------------------- .\" * Define some portability stuff .\" ----------------------------------------------------------------- .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .\" http://bugs.debian.org/507673 .\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" ----------------------------------------------------------------- .\" * set default formatting .\" ----------------------------------------------------------------- .\" disable hyphenation .nh .\" disable justification (adjust text to left margin only) .ad l .\" ----------------------------------------------------------------- .\" * MAIN CONTENT STARTS HERE * .\" ----------------------------------------------------------------- .SH "NAME" perf-list \- List all symbolic event types .SH "SYNOPSIS" .sp .nf \fIperf list\fR [\-\-no\-desc] [\-\-long\-desc] [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob] .fi .SH "DESCRIPTION" .sp This command displays the symbolic event types which can be selected in the various perf commands with the \-e option\&. .SH "OPTIONS" .PP \-d, \-\-desc .RS 4 Print extra event descriptions\&. (default) .RE .PP \-\-no\-desc .RS 4 Don\(cqt print descriptions\&. .RE .PP \-v, \-\-long\-desc .RS 4 Print longer event descriptions\&. .RE .PP \-\-debug .RS 4 Enable debugging output\&. .RE .PP \-\-details .RS 4 Print how named events are resolved internally into perf events, and also any extra expressions computed by perf stat\&. .RE .PP \-\-deprecated .RS 4 Print deprecated events\&. By default the deprecated events are hidden\&. .RE .PP \-\-unit .RS 4 Print PMU events and metrics limited to the specific PMU name\&. (e\&.g\&. \-\-unit cpu, \-\-unit msr, \-\-unit cpu_core, \-\-unit cpu_atom) .RE .PP \-j, \-\-json .RS 4 Output in JSON format\&. .RE .SH "EVENT MODIFIERS" .sp Events can optionally have a modifier by appending a colon and one or more modifiers\&. Modifiers allow the user to restrict the events to be counted\&. The following modifiers exist: .sp .if n \{\ .RS 4 .\} .nf u \- user\-space counting k \- kernel counting h \- hypervisor counting I \- non idle counting G \- guest counting (in KVM guests) H \- host counting (not in KVM guests) p \- precise level P \- use maximum detected precise level S \- read sample value (PERF_SAMPLE_READ) D \- pin the event to the PMU W \- group is weak and will fallback to non\-group if not schedulable, e \- group or event are exclusive and do not share the PMU .fi .if n \{\ .RE .\} .sp The \fIp\fR modifier can be used for specifying how precise the instruction address should be\&. The \fIp\fR modifier can be specified multiple times: .sp .if n \{\ .RS 4 .\} .nf 0 \- SAMPLE_IP can have arbitrary skid 1 \- SAMPLE_IP must have constant skid 2 \- SAMPLE_IP requested to have 0 skid 3 \- SAMPLE_IP must have 0 skid, or uses randomization to avoid sample shadowing effects\&. .fi .if n \{\ .RE .\} .sp For Intel systems precise event sampling is implemented with PEBS which supports up to precise\-level 2, and precise level 3 for some special cases .sp On AMD systems it is implemented using IBS (up to precise\-level 2)\&. The precise modifier works with event types 0x76 (cpu\-cycles, CPU clocks not halted) and 0xC1 (micro\-ops retired)\&. Both events map to IBS execution sampling (IBS op) with the IBS Op Counter Control bit (IbsOpCntCtl) set respectively (see the Core Complex (CCX) \(-> Processor x86 Core \(-> Instruction Based Sampling (IBS) section of the [AMD Processor Programming Reference (PPR)] relevant to the family, model and stepping of the processor being used)\&. .sp Manual Volume 2: System Programming, 13\&.3 Instruction\-Based Sampling)\&. Examples to use IBS: .sp .if n \{\ .RS 4 .\} .nf perf record \-a \-e cpu\-cycles:p \&.\&.\&. # use ibs op counting cycles perf record \-a \-e r076:p \&.\&.\&. # same as \-e cpu\-cycles:p perf record \-a \-e r0C1:p \&.\&.\&. # use ibs op counting micro\-ops .fi .if n \{\ .RE .\} .SH "RAW HARDWARE EVENT DESCRIPTOR" .sp Even when an event is not available in a symbolic form within perf right now, it can be encoded in a per processor specific way\&. .sp For instance on x86 CPUs, N is a hexadecimal value that represents the raw register encoding with the layout of IA32_PERFEVTSELx MSRs (see [Intel\(rg 64 and IA\-32 Architectures Software Developer\(cqs Manual Volume 3B: System Programming Guide] Figure 30\-1 Layout of IA32_PERFEVTSELx MSRs) or AMD\(cqs PERF_CTL MSRs (see the Core Complex (CCX) \(-> Processor x86 Core \(-> MSR Registers section of the [AMD Processor Programming Reference (PPR)] relevant to the family, model and stepping of the processor being used)\&. .sp Note: Only the following bit fields can be set in x86 counter registers: event, umask, edge, inv, cmask\&. Esp\&. guest/host only and OS/user mode flags must be setup using EVENT MODIFIERS\&. .sp Example: .sp If the Intel docs for a QM720 Core i7 describe an event as: .sp .if n \{\ .RS 4 .\} .nf Event Umask Event Mask Num\&. Value Mnemonic Description Comment .fi .if n \{\ .RE .\} .sp .if n \{\ .RS 4 .\} .nf A8H 01H LSD\&.UOPS Counts the number of micro\-ops Use cmask=1 and delivered by loop stream detector invert to count cycles .fi .if n \{\ .RE .\} .sp raw encoding of 0x1A8 can be used: .sp .if n \{\ .RS 4 .\} .nf perf stat \-e r1a8 \-a sleep 1 perf record \-e r1a8 \&.\&.\&. .fi .if n \{\ .RE .\} .sp It\(cqs also possible to use pmu syntax: .sp .if n \{\ .RS 4 .\} .nf perf record \-e r1a8 \-a sleep 1 perf record \-e cpu/r1a8/ \&.\&.\&. perf record \-e cpu/r0x1a8/ \&.\&.\&. .fi .if n \{\ .RE .\} .sp Some processors, like those from AMD, support event codes and unit masks larger than a byte\&. In such cases, the bits corresponding to the event configuration parameters can be seen with: .sp .if n \{\ .RS 4 .\} .nf cat /sys/bus/event_source/devices//format/ .fi .if n \{\ .RE .\} .sp Example: .sp If the AMD docs for an EPYC 7713 processor describe an event as: .sp .if n \{\ .RS 4 .\} .nf Event Umask Event Mask Num\&. Value Mnemonic Description .fi .if n \{\ .RE .\} .sp .if n \{\ .RS 4 .\} .nf 28FH 03H op_cache_hit_miss\&.op_cache_hit Counts Op Cache micro\-tag hit events\&. .fi .if n \{\ .RE .\} .sp raw encoding of 0x0328F cannot be used since the upper nibble of the EventSelect bits have to be specified via bits 32\-35 as can be seen with: .sp .if n \{\ .RS 4 .\} .nf cat /sys/bus/event_source/devices/cpu/format/event .fi .if n \{\ .RE .\} .sp raw encoding of 0x20000038F should be used instead: .sp .if n \{\ .RS 4 .\} .nf perf stat \-e r20000038f \-a sleep 1 perf record \-e r20000038f \&.\&.\&. .fi .if n \{\ .RE .\} .sp It\(cqs also possible to use pmu syntax: .sp .if n \{\ .RS 4 .\} .nf perf record \-e r20000038f \-a sleep 1 perf record \-e cpu/r20000038f/ \&.\&.\&. perf record \-e cpu/r0x20000038f/ \&.\&.\&. .fi .if n \{\ .RE .\} .sp You should refer to the processor specific documentation for getting these details\&. Some of them are referenced in the SEE ALSO section below\&. .SH "ARBITRARY PMUS" .sp perf also supports an extended syntax for specifying raw parameters to PMUs\&. Using this typically requires looking up the specific event in the CPU vendor specific documentation\&. .sp The available PMUs and their raw parameters can be listed with .sp .if n \{\ .RS 4 .\} .nf ls /sys/devices/*/format .fi .if n \{\ .RE .\} .sp For example the raw event "LSD\&.UOPS" core pmu event above could be specified as .sp .if n \{\ .RS 4 .\} .nf perf stat \-e cpu/event=0xa8,umask=0x1,name=LSD\&.UOPS_CYCLES,cmask=0x1/ \&.\&.\&. .fi .if n \{\ .RE .\} .sp .if n \{\ .RS 4 .\} .nf or using extended name syntax .fi .if n \{\ .RE .\} .sp .if n \{\ .RS 4 .\} .nf perf stat \-e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\e\*(AqLSD\&.UOPS_CYCLES:cmask=0x1\e\*(Aq/ \&.\&.\&. .fi .if n \{\ .RE .\} .SH "PER SOCKET PMUS" .sp Some PMUs are not associated with a core, but with a whole CPU socket\&. Events on these PMUs generally cannot be sampled, but only counted globally with perf stat \-a\&. They can be bound to one logical CPU, but will measure all the CPUs in the same socket\&. .sp This example measures memory bandwidth every second on the first memory controller on socket 0 of a Intel Xeon system .sp .if n \{\ .RS 4 .\} .nf perf stat \-C 0 \-a uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ \-I 1000 \&.\&.\&. .fi .if n \{\ .RE .\} .sp Each memory controller has its own PMU\&. Measuring the complete system bandwidth would require specifying all imc PMUs (see perf list output), and adding the values together\&. To simplify creation of multiple events, prefix and glob matching is supported in the PMU name, and the prefix \fIuncore_\fR is also ignored when performing the match\&. So the command above can be expanded to all memory controllers by using the syntaxes: .sp .if n \{\ .RS 4 .\} .nf perf stat \-C 0 \-a imc/cas_count_read/,imc/cas_count_write/ \-I 1000 \&.\&.\&. perf stat \-C 0 \-a *imc*/cas_count_read/,*imc*/cas_count_write/ \-I 1000 \&.\&.\&. .fi .if n \{\ .RE .\} .sp This example measures the combined core power every second .sp .if n \{\ .RS 4 .\} .nf perf stat \-I 1000 \-e power/energy\-cores/ \-a .fi .if n \{\ .RE .\} .SH "ACCESS RESTRICTIONS" .sp For non root users generally only context switched PMU events are available\&. This is normally only the events in the cpu PMU, the predefined events like cycles and instructions and some software events\&. .sp Other PMUs and global measurements are normally root only\&. Some event qualifiers, such as "any", are also root only\&. .sp This can be overridden by setting the kernel\&.perf_event_paranoid sysctl to \-1, which allows non root to use these events\&. .sp For accessing trace point events perf needs to have read access to /sys/kernel/tracing, even when perf_event_paranoid is in a relaxed setting\&. .SH "TRACING" .sp Some PMUs control advanced hardware tracing capabilities, such as Intel PT, that allows low overhead execution tracing\&. These are described in a separate intel\-pt\&.txt document\&. .SH "PARAMETERIZED EVENTS" .sp Some pmu events listed by \fIperf\-list\fR will be displayed with \fI?\fR in them\&. For example: .sp .if n \{\ .RS 4 .\} .nf hv_gpci/dtbp_ptitc,phys_processor_idx=?/ .fi .if n \{\ .RE .\} .sp This means that when provided as an event, a value for \fI?\fR must also be supplied\&. For example: .sp .if n \{\ .RS 4 .\} .nf perf stat \-C 0 \-e \*(Aqhv_gpci/dtbp_ptitc,phys_processor_idx=0x2/\*(Aq \&.\&.\&. .fi .if n \{\ .RE .\} .sp EVENT QUALIFIERS: .sp It is also possible to add extra qualifiers to an event: .sp percore: .sp Sums up the event counts for all hardware threads in a core, e\&.g\&.: .sp .if n \{\ .RS 4 .\} .nf perf stat \-e cpu/event=0,umask=0x3,percore=1/ .fi .if n \{\ .RE .\} .SH "EVENT GROUPS" .sp Perf supports time based multiplexing of events, when the number of events active exceeds the number of hardware performance counters\&. Multiplexing can cause measurement errors when the workload changes its execution profile\&. .sp When metrics are computed using formulas from event counts, it is useful to ensure some events are always measured together as a group to minimize multiplexing errors\&. Event groups can be specified using { }\&. .sp .if n \{\ .RS 4 .\} .nf perf stat \-e \*(Aq{instructions,cycles}\*(Aq \&.\&.\&. .fi .if n \{\ .RE .\} .sp The number of available performance counters depend on the CPU\&. A group cannot contain more events than available counters\&. For example Intel Core CPUs typically have four generic performance counters for the core, plus three fixed counters for instructions, cycles and ref\-cycles\&. Some special events have restrictions on which counter they can schedule, and may not support multiple instances in a single group\&. When too many events are specified in the group some of them will not be measured\&. .sp Globally pinned events can limit the number of counters available for other groups\&. On x86 systems, the NMI watchdog pins a counter by default\&. The nmi watchdog can be disabled as root with .sp .if n \{\ .RS 4 .\} .nf echo 0 > /proc/sys/kernel/nmi_watchdog .fi .if n \{\ .RE .\} .sp Events from multiple different PMUs cannot be mixed in a group, with some exceptions for software events\&. .SH "LEADER SAMPLING" .sp perf also supports group leader sampling using the :S specifier\&. .sp .if n \{\ .RS 4 .\} .nf perf record \-e \*(Aq{cycles,instructions}:S\*(Aq \&.\&.\&. perf report \-\-group .fi .if n \{\ .RE .\} .sp Normally all events in an event group sample, but with :S only the first event (the leader) samples, and it only reads the values of the other events in the group\&. .sp However, in the case AUX area events (e\&.g\&. Intel PT or CoreSight), the AUX area event must be the leader, so then the second event samples, not the first\&. .SH "OPTIONS" .sp Without options all known events will be listed\&. .sp To limit the list use: .sp .RS 4 .ie n \{\ \h'-04' 1.\h'+01'\c .\} .el \{\ .sp -1 .IP " 1." 4.2 .\} \fIhw\fR or \fIhardware\fR to list hardware events such as cache\-misses, etc\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 2.\h'+01'\c .\} .el \{\ .sp -1 .IP " 2." 4.2 .\} \fIsw\fR or \fIsoftware\fR to list software events such as context switches, etc\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 3.\h'+01'\c .\} .el \{\ .sp -1 .IP " 3." 4.2 .\} \fIcache\fR or \fIhwcache\fR to list hardware cache events such as L1\-dcache\-loads, etc\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 4.\h'+01'\c .\} .el \{\ .sp -1 .IP " 4." 4.2 .\} \fItracepoint\fR to list all tracepoint events, alternatively use \fIsubsys_glob:event_glob\fR to filter by tracepoint subsystems such as sched, block, etc\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 5.\h'+01'\c .\} .el \{\ .sp -1 .IP " 5." 4.2 .\} \fIpmu\fR to print the kernel supplied PMU events\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 6.\h'+01'\c .\} .el \{\ .sp -1 .IP " 6." 4.2 .\} \fIsdt\fR to list all Statically Defined Tracepoint events\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 7.\h'+01'\c .\} .el \{\ .sp -1 .IP " 7." 4.2 .\} \fImetric\fR to list metrics .RE .sp .RS 4 .ie n \{\ \h'-04' 8.\h'+01'\c .\} .el \{\ .sp -1 .IP " 8." 4.2 .\} \fImetricgroup\fR to list metricgroups with metrics\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 9.\h'+01'\c .\} .el \{\ .sp -1 .IP " 9." 4.2 .\} If none of the above is matched, it will apply the supplied glob to all events, printing the ones that match\&. .RE .sp .RS 4 .ie n \{\ \h'-04'10.\h'+01'\c .\} .el \{\ .sp -1 .IP "10." 4.2 .\} As a last resort, it will do a substring search in all event names\&. .RE .sp One or more types can be used at the same time, listing the events for the types specified\&. .sp Support raw format: .sp .RS 4 .ie n \{\ \h'-04' 1.\h'+01'\c .\} .el \{\ .sp -1 .IP " 1." 4.2 .\} \fI\-\-raw\-dump\fR, shows the raw\-dump of all the events\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 2.\h'+01'\c .\} .el \{\ .sp -1 .IP " 2." 4.2 .\} \fI\-\-raw\-dump [hw|sw|cache|tracepoint|pmu|event_glob]\fR, shows the raw\-dump of a certain kind of events\&. .RE .SH "SEE ALSO" .sp \fBperf-stat\fR(1), \fBperf-top\fR(1), \fBperf-record\fR(1), \m[blue]\fBIntel\(rg 64 and IA\-32 Architectures Software Developer\(cqs Manual Volume 3B: System Programming Guide\fR\m[]\&\s-2\u[1]\d\s+2, \m[blue]\fBAMD Processor Programming Reference (PPR)\fR\m[]\&\s-2\u[2]\d\s+2 .SH "NOTES" .IP " 1." 4 Intel\(rg 64 and IA-32 Architectures Software Developer\(cqs Manual Volume 3B: System Programming Guide .RS 4 \%http://www.intel.com/sdm/ .RE .IP " 2." 4 AMD Processor Programming Reference (PPR) .RS 4 \%https://bugzilla.kernel.org/show_bug.cgi?id=206537 .RE