.\" Automatically generated by Pandoc 3.1.3
.\"
.\" Define V font for inline verbatim, using C font in formats
.\" that render this, and otherwise B font.
.ie "\f[CB]x\f[]"x" \{\
. ftr V B
. ftr VI BI
. ftr VB B
. ftr VBI BI
.\}
.el \{\
. ftr V CR
. ftr VI CI
. ftr VB CB
. ftr VBI CBI
.\}
.TH "fi_mon_sampler" "1" "2025\-06\-06" "Libfabric Programmer\[cq]s Manual" "Libfabric v2.2.0"
.hy
.SH NAME
.PP
fi_mon_sampler \- Simple sampler for the ofi_hook_monitor provider.
.SH SYNOPSIS
.IP
.nf
\f[C]
fi_mon_sampler [OPTIONS] sample from file(s) at
\f[R]
.fi
.SH DESCRIPTION
.PP
Extract data from the ofi_hook_monitor provider via communication files.
The input path can be either a single communication file or a folder
of files.
Data is exported in the format selected with \f[V]-f\f[R] and either
printed to stdout (single files only) or stored per communication file
at the path given with \f[V]-o\f[R].
With \f[V]-w\f[R], the sampler watches the communication files for
changes and samples repeatedly.
.PP
The name format of the output files is based on the ofi_hook_monitor
provider and is as follows: \f[V]____\f[R].
\f[V]ppid\f[R] and \f[V]pid\f[R] are taken from the perspective of the
monitored application.
In a batched environment running SLURM, \f[V]job id\f[R] is set to the
SLURM job ID; otherwise it is set to 0.
.SH HOW TO RUN
.PP
Launch a libfabric application with \f[V]FI_HOOK=monitor\f[R] to enable
the ofi_hook_monitor provider.
Adjust the monitor provider settings according to \f[V]fi_hook\f[R](7).
.PP
Then launch \f[V]fi_mon_sampler\f[R], passing the output location with
\f[V]-o\f[R] and the path to the communication files.
By default, the ofi_hook_monitor provider stores data at
\f[V]/dev/shm/ofi//\f[R].
.PP
The sampler will generate output files in the directory specified with
\f[V]-o\f[R], one for each monitored provider.
.SH OPTIONS
.TP
\f[I]-w\f[R]
Watch files for changes, checking at the given interval in milliseconds.
.TP
\f[I]-f\f[R]
Output format.
Currently only CSV is supported.
.TP
\f[I]-o\f[R]
Output file path.
Uses stdout if unset.
.SH USAGE EXAMPLES
.PP
Launch a libfabric application and enable the ofi_hook_monitor provider:
.IP
.nf
\f[C]
FI_HOOK=monitor fi_pingpong [OPTIONS]
\f[R]
.fi
.PP
Launch another \f[V]fi_pingpong\f[R] with the respective settings.
.PP
Finally, launch the sampler:
.IP
.nf
\f[C]
fi_mon_sampler -o $HOME -w 1000 -f csv /dev/shm/ofi/$UID/$HOSTNAME
\f[R]
.fi
.SH OUTPUT
.PP
Output files will be generated in the folder specified at
\f[V]-o \f[R].
.PP
In \f[V]-f csv\f[R] mode, this will contain a CSV file with data for all
monitored libfabric functions.
For each function, both the \f[V]count\f[R] and \f[V]sum\f[R] counters
are exported, indicated by the column name suffix \f[V]_c\f[R] and
\f[V]_s\f[R] respectively.
In addition, each function is monitored for each data size bucket.
Refer to \f[V]fi_hook\f[R](7) for more details.
.PP
Example CSV output, first four columns, first three rows:
.IP
.nf
\f[C]
mon_recv_0_64_c,mon_recv_0_64_s,mon_recv_64_512_c,mon_recv_64_512_s
0,0,0,0
22529,0,0,0
113664,0,0,0
\f[R]
.fi
.SH SEE ALSO
.PP
\f[V]fi_hook\f[R](7)
.SH AUTHORS
OpenFabrics.
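.SH NOTES
.PP
The CSV column naming described under OUTPUT can be unpacked with a
short script.
The following Python sketch is not part of libfabric.
It assumes that every column name ends in a lower bucket bound, an upper
bucket bound, and the \f[V]c\f[R] or \f[V]s\f[R] suffix, as in the
example header; \f[V]split_column\f[R] and \f[V]last_counts\f[R] are
hypothetical helper names, and the sample rows are copied from OUTPUT.
.IP
.nf
\f[C]
import csv
import io

# Sample rows copied from the OUTPUT section of this page.
SAMPLE = """mon_recv_0_64_c,mon_recv_0_64_s,mon_recv_64_512_c,mon_recv_64_512_s
0,0,0,0
22529,0,0,0
113664,0,0,0
"""


def split_column(name):
    # Assumed layout: <function>_<bucket low>_<bucket high>_<c or s>.
    # Splitting from the right keeps underscores in the function name.
    # Bucket bounds stay strings because they need not be plain numbers.
    func, low, high, kind = name.rsplit("_", 3)
    return func, low, high, kind


def last_counts(text):
    # Map (function, low, high) to the count value in the final row.
    rows = list(csv.DictReader(io.StringIO(text)))
    last = rows.pop()
    result = {}
    for name, value in last.items():
        func, low, high, kind = split_column(name)
        if kind == "c":
            result[(func, low, high)] = int(value)
    return result


print(last_counts(SAMPLE))
\f[R]
.fi
.PP
For the sample rows, the sketch reports a count of 113664 for the
\f[V]mon_recv\f[R] bucket 0 to 64 and a count of 0 for the bucket 64 to
512.
To process a real output file, read its contents and pass them to
\f[V]last_counts\f[R] in place of \f[V]SAMPLE\f[R].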