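cpuset(7)              Miscellaneous Information Manual              cpuset(7)

NAME
       cpuset - confine processes to processor and memory node subsets

DESCRIPTION
       The cpuset filesystem is a pseudo-filesystem interface to the kernel cpuset mechanism, which is used to control the processor placement and memory placement of processes. It is commonly mounted at /dev/cpuset.

       On systems with kernels compiled with built-in support for cpusets, all processes are attached to a cpuset, and cpusets are always present. If a system supports cpusets, then it will have the entry nodev cpuset in the file /proc/filesystems. By mounting the cpuset filesystem (see the EXAMPLES section below), the administrator can configure the cpusets on a system to control the processor and memory placement of processes on that system. By default, if the cpuset configuration on a system is not modified or if the cpuset filesystem is not even mounted, then the cpuset mechanism, though present, has no effect on the system's behavior.

       A cpuset defines a list of CPUs and memory nodes.

       The CPUs of a system include all the logical processing units on which a process can execute, including, if present, multiple processor cores within a package and Hyper-Threads within a processor core. Memory nodes include all distinct banks of main memory; small and SMP systems typically have just one memory node that contains all the system's main memory, while NUMA (non-uniform memory access) systems have multiple memory nodes.

       Cpusets are represented as directories in a hierarchical pseudo-filesystem, where the top directory in the hierarchy (/dev/cpuset) represents the entire system (all online CPUs and memory nodes) and any cpuset that is the child (descendant) of another parent cpuset contains a subset of that parent's CPUs and memory nodes. The directories and files representing cpusets have normal filesystem permissions.

       Every process in the system belongs to exactly one cpuset. A process is confined to run only on the CPUs in the cpuset it belongs to, and to allocate memory only on the memory nodes in that cpuset. When a process fork(2)s, the child process is placed in the same cpuset as its parent. With sufficient privilege, a process may be moved from one cpuset to another, and the allowed CPUs and memory nodes of an existing cpuset may be changed.

       When the system begins booting, a single cpuset is defined that includes all CPUs and memory nodes on the system, and all processes are in that cpuset. During the boot process, or later during normal system operation, other cpusets may be created, as subdirectories of this top cpuset, under the control of the system administrator, and processes may be placed in these other cpusets.

       Cpusets are integrated with the sched_setaffinity(2) scheduling affinity mechanism and the mbind(2) and set_mempolicy(2) memory-placement mechanisms in the kernel. None of these mechanisms lets a process make use of a CPU or memory node that is not allowed by that process's cpuset. If changes to a process's cpuset placement conflict with these other mechanisms, then cpuset placement is enforced even if it means overriding these other mechanisms. The kernel accomplishes this overriding by silently restricting the CPUs and memory nodes requested by these other mechanisms to those allowed by the invoking process's cpuset. This can result in these other calls returning an error if, for example, such a call ends up requesting an empty set of CPUs or memory nodes after that request is restricted to the invoking process's cpuset.

       Typically, a cpuset is used to manage the CPU and memory-node confinement for a set of cooperating processes, such as a batch scheduler job, and these other mechanisms are used to manage the placement of individual processes or memory regions within that set or job.

FILES
       Each directory below /dev/cpuset represents a cpuset and contains a fixed set of pseudo-files describing the state of that cpuset.

       New cpusets are created using the mkdir(2) system call or the mkdir(1) command. The properties of a cpuset, such as its flags, allowed CPUs and memory nodes, and attached processes, are queried and modified by reading or writing to the appropriate file in that cpuset's directory, as listed below.

       The pseudo-files in each cpuset directory are automatically created when the cpuset is created, as a result of the mkdir(2) invocation. It is not possible to directly add or remove these pseudo-files.

       A cpuset directory that contains no child cpuset directories, and has no attached processes, can be removed using rmdir(2) or rmdir(1). It is not necessary, or possible, to remove the pseudo-files inside the directory before removing it.

       The pseudo-files in each cpuset directory are small text files that may be read and written using traditional shell utilities such as cat(1) and echo(1), or from a program by using file I/O library functions or system calls, such as open(2), read(2), write(2), and close(2).

       The pseudo-files in a cpuset directory represent internal kernel state and do not have any persistent image on disk. Each of these per-cpuset files is listed and described below.

       tasks  List of the process IDs (PIDs) of the processes in that cpuset. The list is formatted as a series of ASCII decimal numbers, each followed by a newline. A process may be added to a cpuset (automatically removing it from the cpuset that previously contained it) by writing its PID to that cpuset's tasks file (with or without a trailing newline).

              Warning: only one PID may be written to the tasks file at a time. If a string is written that contains more than one PID, only the first one will be used.

       notify_on_release
              Flag (0 or 1). If set (1), that cpuset will receive special handling after it is released, that is, after all processes cease using it (i.e., terminate or are moved to a different cpuset) and all child cpuset directories have been removed. See the Notify On Release section, below.

       cpuset.cpus
              List of the physical numbers of the CPUs on which processes in that cpuset are allowed to execute. See List Format below for a description of the format of cpus.

              The CPUs allowed to a cpuset may be changed by writing a new list to its cpus file.

       cpuset.cpu_exclusive
              Flag (0 or 1). If set (1), the cpuset has exclusive use of its CPUs (no sibling or cousin cpuset may overlap CPUs). By default, this is off (0). Newly created cpusets also initially default this to off (0).

              Two cpusets are sibling cpusets if they share the same parent cpuset in the /dev/cpuset hierarchy. Two cpusets are cousin cpusets if neither is the ancestor of the other. Regardless of the cpu_exclusive setting, if one cpuset is the ancestor of another, and if both of these cpusets have nonempty cpus, then their cpus must overlap, because the cpus of any cpuset are always a subset of the cpus of its parent cpuset.

       cpuset.mems
              List of memory nodes on which processes in this cpuset are allowed to allocate memory. See List Format below for a description of the format of mems.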
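
       cpuset.mem_exclusive
              Flag (0 or 1). If set (1), the cpuset has exclusive use of its memory nodes (no sibling or cousin may overlap). Also, if set (1), the cpuset is a Hardwall cpuset (see below). By default, this is off (0). Newly created cpusets also initially default this to off (0).

              Regardless of the mem_exclusive setting, if one cpuset is the ancestor of another, then their memory nodes must overlap, because the memory nodes of any cpuset are always a subset of the memory nodes of that cpuset's parent cpuset.

       cpuset.mem_hardwall (since Linux 2.6.26)
              Flag (0 or 1). If set (1), the cpuset is a Hardwall cpuset (see below). Unlike mem_exclusive, there is no constraint on whether cpusets marked mem_hardwall may have overlapping memory nodes with sibling or cousin cpusets. By default, this is off (0). Newly created cpusets also initially default this to off (0).

       cpuset.memory_migrate (since Linux 2.6.16)
              Flag (0 or 1). If set (1), then memory migration is enabled. By default, this is off (0). See the Memory Migration section, below.

       cpuset.memory_pressure (since Linux 2.6.16)
              A measure of how much memory pressure the processes in this cpuset are causing. See the Memory Pressure section, below. Unless memory_pressure_enabled is enabled, this file always has the value zero (0). This file is read-only. See the WARNINGS section, below.

       cpuset.memory_pressure_enabled (since Linux 2.6.16)
              Flag (0 or 1). This file is present only in the root cpuset, normally /dev/cpuset. If set (1), the memory_pressure calculations are enabled for all cpusets in the system. By default, this is off (0). See the Memory Pressure section, below.

       cpuset.memory_spread_page (since Linux 2.6.17)
              Flag (0 or 1). If set (1), pages in the kernel page cache (filesystem buffers) are uniformly spread across the cpuset. By default, this is off (0) in the top cpuset, and inherited from the parent cpuset in newly created cpusets. See the Memory Spread section, below.

       cpuset.memory_spread_slab (since Linux 2.6.17)
              Flag (0 or 1). If set (1), the kernel slab caches for file I/O (directory and inode structures) are uniformly spread across the cpuset. By default, this is off (0) in the top cpuset, and inherited from the parent cpuset in newly created cpusets. See the Memory Spread section, below.

       cpuset.sched_load_balance (since Linux 2.6.24)
              Flag (0 or 1). If set (1, the default), the kernel will automatically load balance processes in that cpuset over the allowed CPUs in that cpuset. If cleared (0), the kernel will avoid load balancing processes in this cpuset, unless some other cpuset with overlapping CPUs has its sched_load_balance flag set. See Scheduler Load Balancing, below, for further details.

       cpuset.sched_relax_domain_level (since Linux 2.6.26)
              Integer, between -1 and a small positive value. The sched_relax_domain_level controls the width of the range of CPUs over which the kernel scheduler performs immediate rebalancing of runnable tasks across CPUs. If sched_load_balance is disabled, then the setting of sched_relax_domain_level does not matter, as no such load balancing is done.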
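              If sched_load_balance is enabled, then the higher the value of sched_relax_domain_level, the wider the range of CPUs over which immediate load balancing is attempted. See Scheduler Relax Domain Level, below, for further details.

       In addition to the above pseudo-files in each directory below /dev/cpuset, each process has a pseudo-file, /proc/pid/cpuset, that displays the path of the process's cpuset directory relative to the root of the cpuset filesystem.

       Also, the /proc/pid/status file for each process has four added lines, displaying the process's Cpus_allowed (on which CPUs it may be scheduled) and Mems_allowed (on which memory nodes it may obtain memory), in the two formats Mask Format and List Format (see below), as shown in the following example:

           Cpus_allowed:   ffffffff,ffffffff,ffffffff,ffffffff
           Cpus_allowed_list:     0-127
           Mems_allowed:   ffffffff,ffffffff
           Mems_allowed_list:     0-63

       The "allowed" fields were added in Linux 2.6.24; the "allowed_list" fields were added in Linux 2.6.26.

EXTENDED CAPABILITIES
       In addition to controlling which cpus and mems a process is allowed to use, cpusets provide the following extended capabilities.

   Exclusive cpusets
       If a cpuset is marked cpu_exclusive or mem_exclusive, no other cpuset, other than a direct ancestor or descendant, may share any of the same CPUs or memory nodes.

       A cpuset that is mem_exclusive restricts kernel allocations for buffer cache pages and other internal kernel data pages commonly shared by the kernel across multiple users. All cpusets, whether mem_exclusive or not, restrict allocations of memory for user space. This enables configuring a system so that several independent jobs can share common kernel data, while isolating each job's user allocation in its own cpuset. To do this, construct a large mem_exclusive cpuset to hold all the jobs, and construct child, non-mem_exclusive cpusets for each individual job. Only a small amount of kernel memory, such as requests from interrupt handlers, is allowed to be placed on memory nodes outside even a mem_exclusive cpuset.

   Hardwall
       A cpuset that has mem_exclusive or mem_hardwall set is a hardwall cpuset. A hardwall cpuset restricts kernel allocations for page, buffer, and other data commonly shared by the kernel across multiple users. All cpusets, whether hardwall or not, restrict allocations of memory for user space.

       This enables configuring a system so that several independent jobs can share common kernel data, such as filesystem pages, while isolating each job's user allocation in its own cpuset. To do this, construct a large hardwall cpuset to hold all the jobs, and construct child cpusets for each individual job which are not hardwall cpusets.

       Only a small amount of kernel memory, such as requests from interrupt handlers, is allowed to be taken outside even a hardwall cpuset.

   Notify on release
       If the notify_on_release flag is enabled (1) in a cpuset, then whenever the last process in the cpuset leaves (exits or attaches to some other cpuset) and the last child cpuset of that cpuset is removed, the kernel will run the command /sbin/cpuset_release_agent, supplying the pathname (relative to the mount point of the cpuset filesystem) of the abandoned cpuset. This enables automatic removal of abandoned cpusets.

       The default value of notify_on_release in the root cpuset at system boot is disabled (0). The default value of other cpusets at creation is the current value of their parent's notify_on_release setting.

       The command /sbin/cpuset_release_agent is invoked with the name (/dev/cpuset relative path) of the to-be-released cpuset in argv[1].

       The usual contents of the command /sbin/cpuset_release_agent is simply the shell script:

           #!/bin/sh
           rmdir /dev/cpuset/$1

       As with other flag values below, this flag can be changed by writing an ASCII number 0 or 1 (with optional trailing newline) into the file, to clear or set the flag, respectively.

   Memory pressure
       The memory_pressure of a cpuset provides a simple per-cpuset running average of the rate at which the processes in a cpuset are attempting to free up in-use memory on the nodes of the cpuset to satisfy additional memory requests.

       This enables batch managers that are monitoring jobs running in dedicated cpusets to efficiently detect what level of memory pressure that job is causing.

       This is useful both on tightly managed systems running a wide mix of submitted jobs, which may choose to terminate or reprioritize jobs that are trying to use more memory than allowed on the nodes assigned to them, and with tightly coupled, long-running, massively parallel scientific computing jobs that will dramatically fail to meet required performance goals if they start to use more memory than allowed to them.

       This mechanism provides a very economical way for the batch manager to monitor a cpuset for signs of memory pressure. It is up to the batch manager or other user code to decide what action to take if it detects signs of memory pressure.

       Unless memory pressure calculation is enabled by setting the pseudo-file /dev/cpuset/cpuset.memory_pressure_enabled, it is not computed for any cpuset, and reads from any memory_pressure always return zero, as represented by the ASCII string "0\n". See the WARNINGS section, below.

       A per-cpuset running average is employed for the following reasons:

       o  Because this meter is per-cpuset rather than per-process or per virtual memory region, the system load imposed by a batch scheduler monitoring this metric is sharply reduced on large systems, because a scan of the tasklist can be avoided on each set of queries.

       o  Because this meter is a running average rather than an accumulating counter, a batch scheduler can detect memory pressure with a single read, instead of having to read and accumulate results for a period of time.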
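
       o  Because this meter is per-cpuset rather than per-process, the batch scheduler can obtain the key information--memory pressure in a cpuset--with a single read, rather than having to query and accumulate results over all the (dynamically changing) set of processes in the cpuset.

       The memory_pressure of a cpuset is calculated using a per-cpuset simple digital filter that is kept within the kernel. For each cpuset, this filter tracks the recent rate at which processes attached to that cpuset enter the kernel direct reclaim code.

       The kernel direct reclaim code is entered whenever a process has to satisfy a memory page request by first finding some other page to repurpose, due to lack of any readily available already free pages. Dirty filesystem pages are repurposed by first writing them to disk. Unmodified filesystem buffer pages are repurposed by simply dropping them, though if that page is needed again, it will have to be reread from disk.

       The cpuset.memory_pressure file provides an integer number representing the recent (half-life of 10 seconds) rate of entries to the direct reclaim code caused by any process in the cpuset, in units of reclaims attempted per second, times 1000.

   Memory spread
       There are two Boolean flag files per cpuset that control where the kernel allocates pages for the filesystem buffers and related in-kernel data structures. They are called cpuset.memory_spread_page and cpuset.memory_spread_slab.

       If the per-cpuset Boolean flag file cpuset.memory_spread_page is set, then the kernel will spread the filesystem buffers (page cache) evenly over all the nodes that the faulting process is allowed to use, instead of preferring to put those pages on the node where the process is running.

       If the per-cpuset Boolean flag file cpuset.memory_spread_slab is set, then the kernel will spread some filesystem-related slab caches, such as those for inodes and directory entries, evenly over all the nodes that the faulting process is allowed to use, instead of preferring to put those pages on the node where the process is running.

       The setting of these flags does not affect the data segment (see brk(2)) or stack segment pages of a process.

       By default, both kinds of memory spreading are off and the kernel prefers to allocate memory pages on the node local to where the requesting process is running. If that node is not allowed by the process's NUMA memory policy or cpuset configuration, or if there are insufficient free memory pages on that node, then the kernel looks for the nearest node that is allowed and has sufficient free memory.

       When new cpusets are created, they inherit the memory spread settings of their parent.

       Setting memory spreading causes allocations for the affected page or slab caches to ignore the process's NUMA memory policy and be spread instead. However, the effect of these changes in memory placement caused by cpuset-specified memory spreading is hidden from the mbind(2) or set_mempolicy(2) calls. These two NUMA memory policy calls always appear to behave as if no cpuset-specified memory spreading is in effect, even if it is. If cpuset memory spreading is subsequently turned off, the NUMA memory policy most recently specified by these calls is automatically reapplied.

       Both cpuset.memory_spread_page and cpuset.memory_spread_slab are Boolean flag files. By default, they contain "0", meaning that the feature is off for that cpuset. If a "1" is written to that file, that turns the named feature on.

       Cpuset-specified memory spreading behaves similarly to what is known (in other contexts) as round-robin or interleave memory placement.

       Cpuset-specified memory spreading can provide substantial performance improvements for jobs that:

       o  need to place thread-local data on memory nodes close to the CPUs which are running the threads that most frequently access that data; but also

       o  need to access large filesystem data sets that must be spread across the several nodes in the job's cpuset in order to fit.

       Without this policy, the memory allocation across the nodes in the job's cpuset can become very uneven, especially for jobs that might have just a single thread initializing or reading in the data set.

   Memory migration
       Normally, under the default setting (disabled) of cpuset.memory_migrate, once a page is allocated (given a physical page of main memory), that page stays on whatever node it was allocated on, so long as it remains allocated, even if the cpuset's memory-placement policy mems subsequently changes.

       When memory migration is enabled in a cpuset, if the mems setting of the cpuset is changed, then any memory page in use by any process in the cpuset that is on a memory node that is no longer allowed will be migrated to a memory node that is allowed.

       Furthermore, if a process is moved into a cpuset with memory_migrate enabled, any memory pages it uses that were on memory nodes allowed in its previous cpuset, but which are not allowed in its new cpuset, will be migrated to a memory node allowed in the new cpuset.

       The relative placement of a migrated page within the cpuset is preserved during these migration operations if possible. For example, if the page was on the second valid node of the prior cpuset, then the page will be placed on the second valid node of the new cpuset, if possible.

   Scheduler load balancing
       The kernel scheduler automatically load balances processes. If one CPU is underutilized, the kernel will look for processes on other more overloaded CPUs and move those processes to the underutilized CPU, within the constraints of such placement mechanisms as cpusets and sched_setaffinity(2).

       The algorithmic cost of load balancing and its impact on key shared kernel data structures such as the process list increases more than linearly with the number of CPUs being balanced. For example, it costs more to load balance across one large set of CPUs than it does to balance across two smaller sets of CPUs, each of half the size of the larger set. (The precise relationship between the number of CPUs being balanced and the cost of load balancing depends on implementation details of the kernel process scheduler, which is subject to change over time, as improved kernel scheduler algorithms are implemented.)

       The per-cpuset flag sched_load_balance provides a mechanism to suppress this automatic scheduler load balancing in cases where it is not needed and suppressing it would have worthwhile performance benefits.

       By default, load balancing is done across all CPUs, except those marked isolated using the kernel boot time "isolcpus=" argument. (See Scheduler Relax Domain Level, below, to change this default.)

       This default load balancing across all CPUs is not well suited to the following two situations:

       o  On large systems, load balancing across many CPUs is expensive. If the system is managed using cpusets to place independent jobs on separate sets of CPUs, full load balancing is unnecessary.

       o  Systems supporting real-time on some CPUs need to minimize system overhead on those CPUs, including avoiding process load balancing if that is not needed.

       When the per-cpuset flag sched_load_balance is enabled (the default setting), it requests load balancing across all the CPUs in that cpuset's allowed CPUs, ensuring that load balancing can move a process (not otherwise pinned, as by sched_setaffinity(2)) from any CPU in that cpuset to any other.

       When the per-cpuset flag sched_load_balance is disabled, then the scheduler will avoid load balancing across the CPUs in that cpuset, except in so far as is necessary because some overlapping cpuset has sched_load_balance enabled.

       So, for example, if the top cpuset has the flag sched_load_balance enabled, then the scheduler will load balance across all CPUs, and the setting of the sched_load_balance flag in other cpusets has no effect, as load balancing is already being done across all CPUs.

       Therefore, in the above two situations, the flag sched_load_balance should be disabled in the top cpuset, and only some of the smaller, child cpusets would have this flag enabled.

       When doing this, you don't usually want to leave any unpinned processes in the top cpuset that might use nontrivial amounts of CPU, as such processes may be artificially constrained to some subset of CPUs, depending on the particulars of this flag setting in descendant cpusets. Even if such a process could use spare CPU cycles in some other CPUs, the kernel scheduler might not consider the possibility of load balancing that process to the underused CPU.

       Of course, processes pinned to a particular CPU can be left in a cpuset that disables sched_load_balance, as those processes aren't going anywhere else anyway.

   Scheduler relax domain level
       The kernel scheduler performs immediate load balancing whenever a CPU becomes free or another task becomes runnable. This load balancing works to ensure that as many CPUs as possible are usefully employed running tasks. The kernel also performs periodic load balancing off the software clock described in time(7). The setting of sched_relax_domain_level applies only to immediate load balancing. Regardless of the sched_relax_domain_level setting, periodic load balancing is attempted over all CPUs (unless disabled by turning off sched_load_balance). In any case, of course, tasks will be scheduled to run only on CPUs allowed by their cpuset, as modified by sched_setaffinity(2) system calls.

       On small systems, such as those with just a few CPUs, immediate load balancing is useful to improve system interactivity and to minimize wasteful idle CPU cycles. But on large systems, attempting immediate load balancing across a large number of CPUs can be more costly than it is worth, depending on the particular performance characteristics of the job mix and the hardware.

       The exact meaning of the small integer values of sched_relax_domain_level will depend on internal implementation details of the kernel scheduler code and on the non-uniform architecture of the hardware. Both of these will evolve over time and vary by system architecture and kernel version.

       As of this writing, when this capability was introduced in Linux 2.6.26, on certain popular architectures, the positive values of sched_relax_domain_level have the following meanings.

       1  Perform immediate load balancing across Hyper-Thread siblings on the same core.
       2  Perform immediate load balancing across other cores in the same package.
       3  Perform immediate load balancing across other CPUs on the same node or blade.
       4  Perform immediate load balancing across several (implementation detail) nodes (on NUMA systems).
       5  Perform immediate load balancing across all CPUs in the system (on NUMA systems).

       The sched_relax_domain_level value of zero (0) always means don't perform immediate load balancing, hence load balancing is done only periodically, not immediately when a CPU becomes available or another task becomes runnable.

       The sched_relax_domain_level value of minus one (-1) always means use the system default value. The system default value can vary by architecture and kernel version. This system default value can be changed by the kernel boot-time "relax_domain_level=" argument.

       In the case of multiple overlapping cpusets which have conflicting sched_relax_domain_level values, the highest such value applies to all CPUs in any of the overlapping cpusets. In such cases, -1 is the lowest value, overridden by any other value, and 0 is the next lowest value.

FORMATS
       The following formats are used to represent sets of CPUs and memory nodes.

   Mask format
       The Mask Format is used to represent CPU and memory-node bit masks in the /proc/pid/status file.

       This format displays each 32-bit word in hexadecimal (using ASCII characters "0" - "9" and "a" - "f"); words are filled with leading zeros, if required. For masks longer than one word, a comma separator is used between words. Words are displayed in big-endian order, which has the most significant bit first. The hex digits within a word are also in big-endian order.

       The number of 32-bit words displayed is the minimum number needed to display all bits of the bit mask, based on the size of the bit mask.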
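
       As a rough illustration of the Mask Format, the following POSIX shell sketch expands a mask string into the numbers of the bits that are set. The function name mask_bits is hypothetical and is not part of any cpuset tooling. The sketch assumes well-formed input (comma-separated 32-bit hexadecimal words) and a shell whose arithmetic expansion accepts hexadecimal constants.

```shell
#!/bin/sh
# mask_bits MASK
# Hypothetical helper: print the set bit numbers of a Mask Format string
# in ascending order, separated by spaces.
mask_bits() {
    bits=""
    # The leftmost word is the most significant one. With N commas there
    # are N+1 words, so the leftmost word starts at bit N*32.
    base=$(( $(printf '%s' "$1" | tr -cd ',' | wc -c) * 32 ))
    old_ifs=$IFS
    IFS=,
    for word in $1; do
        # Walk the word from bit 31 down to bit 0, prepending each set
        # bit so that the final list ends up in ascending order.
        bit=31
        while [ "$bit" -ge 0 ]; do
            if [ $(( (0x$word >> bit) & 1 )) -eq 1 ]; then
                bits="$(( base + bit )) $bits"
            fi
            bit=$(( bit - 1 ))
        done
        base=$(( base - 32 ))
    done
    IFS=$old_ifs
    # Drop the trailing space left by the prepend step.
    printf '%s\n' "${bits% }"
}
```

       For instance, mask_bits 00000000,000e3862 prints "1 5 6 11 12 13 17 18 19", and mask_bits 40000000,00000000,00000000 prints "94". Walking the words from most to least significant mirrors the big-endian word order of the format.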
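
       Examples of the Mask Format:

           00000001                        # just bit 0 set
           40000000,00000000,00000000      # just bit 94 set
           00000001,00000000,00000000      # just bit 64 set
           000000ff,00000000               # bits 32-39 set
           00000000,000e3862               # bits 1,5,6,11-13,17-19 set

       A mask with bits 0, 1, 2, 4, 8, 16, 32, and 64 set displays as:

           00000001,00000001,00010117

       The first "1" is for bit 64, the second for bit 32, the third for bit 16, the fourth for bit 8, the fifth for bit 4, and the "7" is for bits 2, 1, and 0.

   List format
       The List Format for cpus and mems is a comma-separated list of CPU or memory-node numbers and ranges of numbers, in ASCII decimal.

       Examples of the List Format:

           0-4,9           # bits 0, 1, 2, 3, 4, and 9 set
           0-2,7,12-14     # bits 0, 1, 2, 7, 12, 13, and 14 set

RULES
       The following rules apply to each cpuset:

       o  Its CPUs and memory nodes must be a (possibly equal) subset of its parent's.

       o  It can be marked cpu_exclusive only if its parent is.

       o  It can be marked mem_exclusive only if its parent is.

       o  If it is cpu_exclusive, its CPUs may not overlap any sibling.

       o  If it is mem_exclusive, its memory nodes may not overlap any sibling.

PERMISSIONS
       The permissions of a cpuset are determined by the permissions of the directories and pseudo-files in the cpuset filesystem, normally mounted at /dev/cpuset.

       For instance, a process can put itself in some other cpuset (than its current one) if it can write the tasks file for that cpuset. This requires execute permission on the encompassing directories and write permission on the tasks file.

       An additional constraint is applied to requests to place some other process in a cpuset. One process may not attach another to a cpuset unless it would have permission to send that process a signal (see kill(2)).

       A process may create a child cpuset if it can access and write the parent cpuset directory. It can modify the CPUs or memory nodes in a cpuset if it can access that cpuset's directory (execute permissions on each of the parent directories) and write the corresponding cpus or mems file.

       There is one minor difference between the manner in which these permissions are evaluated and the manner in which normal filesystem operation permissions are evaluated. The kernel interprets relative pathnames starting at a process's current working directory. Even if one is operating on a cpuset file, relative pathnames are interpreted relative to the process's current working directory, not relative to the process's current cpuset. The only ways that cpuset paths relative to a process's current cpuset can be used are if either the process's current working directory is its cpuset (it first did a cd or chdir(2) to its cpuset directory beneath /dev/cpuset, which is a bit unusual) or if some user code converts the relative cpuset path to a full filesystem path.

       In theory, this means that user code should specify cpusets using absolute pathnames, which requires knowing the mount point of the cpuset filesystem (usually, but not necessarily, /dev/cpuset). In practice, all user-level code that this author is aware of simply assumes that if the cpuset filesystem is mounted, then it is mounted at /dev/cpuset. Furthermore, it is common practice for carefully written user code to verify the presence of the pseudo-file /dev/cpuset/tasks in order to verify that the cpuset pseudo-filesystem is currently mounted.

WARNINGS
   Enabling memory_pressure
       By default, the per-cpuset file cpuset.memory_pressure always contains zero (0). Unless this feature is enabled by writing "1" to the pseudo-file /dev/cpuset/cpuset.memory_pressure_enabled, the kernel does not compute per-cpuset memory_pressure.

   Using the echo command
       When using the echo command at the shell prompt to change the values of cpuset files, beware that the built-in echo command in some shells does not display an error message if the write(2) system call fails. For example, if the command:

           echo 19 > cpuset.mems

       failed because memory node 19 was not allowed (perhaps the current system does not have a memory node 19), then the echo command might not display any error. It is better to use the /bin/echo external command to change cpuset file settings, as this command will display write(2) errors, as in the example:

           /bin/echo 19 > cpuset.mems
           /bin/echo: write error: Invalid argument

EXCEPTIONS
   Memory placement
       Not all allocations of system memory are constrained by cpusets, for the following reasons.

       If hot-plug functionality is used to remove all the CPUs that are currently assigned to a cpuset, then the kernel will automatically update the cpus_allowed of all processes attached to CPUs in that cpuset to allow all CPUs. When memory hot-plug functionality for removing memory nodes is available, a similar exception is expected to apply there as well. In general, the kernel prefers to violate cpuset placement rather than starve a process that has had all its allowed CPUs or memory nodes taken offline. User code should reconfigure cpusets to refer only to online CPUs and memory nodes when using hot-plug to add or remove such resources.

       A few kernel-critical, internal memory-allocation requests, marked GFP_ATOMIC, must be satisfied immediately. The kernel may drop some request or malfunction if one of these allocations fails. If such a request cannot be satisfied within the current process's cpuset, then the kernel relaxes the cpuset and looks for memory anywhere it can find it. It is better to violate the cpuset than to stress the kernel.

       Allocations of memory requested by kernel drivers while processing an interrupt lack any relevant process context, and are not confined by cpusets.

   Renaming cpusets
       You can use the rename(2) system call to rename cpusets. Only simple renaming is supported; that is, changing the name of a cpuset directory is permitted, but moving a directory into a different directory is not permitted.

ERRORS
       The Linux kernel implementation of cpusets sets errno to specify the reason for a failed system call affecting cpusets.

       The possible errno settings and their meaning when set on a failed cpuset call are as listed below.

       E2BIG  Attempted a write(2) on a special cpuset file with a length larger than some kernel-determined upper limit on the length of such writes.

       EACCES Attempted to write(2) the process ID (PID) of a process to a cpuset tasks file when one lacks permission to move that process.

       EACCES Attempted to add, using write(2), a CPU or memory node to a cpuset, when that CPU or memory node was not already in its parent.

       EACCES Attempted to set, using write(2), cpuset.cpu_exclusive or cpuset.mem_exclusive on a cpuset whose parent lacks the same setting.

       EACCES Attempted to write(2) a cpuset.memory_pressure file.

       EACCES Attempted to create a file in a cpuset directory.

       EBUSY  Attempted to remove, using rmdir(2), a cpuset with attached processes.

       EBUSY  Attempted to remove, using rmdir(2), a cpuset with child cpusets.

       EBUSY  Attempted to remove a CPU or memory node from a cpuset that is also in a child of that cpuset.

       EEXIST Attempted to create, using mkdir(2), a cpuset that already exists.

       EEXIST Attempted to rename(2) a cpuset to a name that already exists.

       EFAULT Attempted to read(2) or write(2) a cpuset file using a buffer that is outside the writing process's accessible address space.

       EINVAL Attempted to change a cpuset, using write(2), in a way that would violate a cpu_exclusive or mem_exclusive attribute of that cpuset or any of its siblings.

       EINVAL Attempted to write(2) an empty cpuset.cpus or cpuset.mems list to a cpuset which has attached processes or child cpusets.

       EINVAL Attempted to write(2) a cpuset.cpus or cpuset.mems list which included a range with the second number smaller than the first number.

       EINVAL Attempted to write(2) a cpuset.cpus or cpuset.mems list which included an invalid character in the string.

       EINVAL Attempted to write(2) a list to a cpuset.cpus file that did not include any online CPUs.

       EINVAL Attempted to write(2) a list to a cpuset.mems file that did not include any online memory nodes.

       EINVAL Attempted to write(2) a list to a cpuset.mems file that included a node that held no memory.

       EIO    Attempted to write(2) a string to a cpuset tasks file that does not begin with an ASCII decimal integer.

       EIO    Attempted to rename(2) a cpuset into a different directory.

       ENAMETOOLONG
              Attempted to read(2) a /proc/pid/cpuset file for a cpuset path that is longer than the kernel page size.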
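
       ENAMETOOLONG
              Attempted to create, using mkdir(2), a cpuset whose base directory name is longer than 255 characters.

       ENAMETOOLONG
              Attempted to create, using mkdir(2), a cpuset whose full pathname, including the mount point (typically "/dev/cpuset/") prefix, is longer than 4095 characters.

       ENODEV The cpuset was removed by another process at the same time as a write(2) was attempted on one of the pseudo-files in the cpuset directory.

       ENOENT Attempted to create, using mkdir(2), a cpuset in a parent cpuset that doesn't exist.

       ENOENT Attempted to access(2) or open(2) a nonexistent file in a cpuset directory.

       ENOMEM Insufficient memory is available within the kernel; can occur on a variety of system calls affecting cpusets, but only if the system is extremely short of memory.

       ENOSPC Attempted to write(2) the process ID (PID) of a process to a cpuset tasks file when the cpuset had an empty cpuset.cpus or empty cpuset.mems setting.

       ENOSPC Attempted to write(2) an empty cpuset.cpus or cpuset.mems setting to a cpuset that has tasks attached.

       ENOTDIR
              Attempted to rename(2) a nonexistent cpuset.

       EPERM  Attempted to remove a file from a cpuset directory.

       ERANGE Specified a cpuset.cpus or cpuset.mems list to the kernel which included a number too large for the kernel to set in its bit masks.

       ESRCH  Attempted to write(2) the process ID (PID) of a nonexistent process to a cpuset tasks file.

HISTORY
       Cpusets appeared in Linux 2.6.12.

NOTES
       Despite its name, the pid parameter is actually a thread ID, and each thread in a threaded group can be attached to a different cpuset. The value returned from a call to gettid(2) can be passed in the argument pid.

BUGS
       cpuset.memory_pressure cpuset files can be opened for writing, creation, or truncation, but then the write(2) fails with errno set to EACCES, and the creation and truncation options on open(2) have no effect.

EXAMPLES
       The following examples demonstrate querying and setting cpuset options using shell commands.

   Creating and attaching to a cpuset.
       To create a new cpuset and attach the current command shell to it, the steps are:

       (1)  mkdir /dev/cpuset (if not already done)
       (2)  mount -t cpuset none /dev/cpuset (if not already done)
       (3)  Create the new cpuset using mkdir(1).
       (4)  Assign CPUs and memory nodes to the new cpuset.
       (5)  Attach the shell to the new cpuset.

       For example, the following sequence of commands will set up a cpuset named "Charlie", containing just CPUs 2 and 3, and memory node 1, and then attach the current shell to that cpuset.

           $ mkdir /dev/cpuset
           $ mount -t cpuset cpuset /dev/cpuset
           $ cd /dev/cpuset
           $ mkdir Charlie
           $ cd Charlie
           $ /bin/echo 2-3 > cpuset.cpus
           $ /bin/echo 1 > cpuset.mems
           $ /bin/echo $$ > tasks
           # The current shell is now running in cpuset Charlie
           # The next line should display '/Charlie'
           $ cat /proc/self/cpuset

   Migrating a job to different memory nodes.
       To migrate a job (the set of processes attached to a cpuset) to different CPUs and memory nodes in the system, including moving the memory pages currently allocated to that job, perform the following steps.

       (1)  Let's say we want to move the job in cpuset alpha (CPUs 4-7 and memory nodes 2-3) to a new cpuset beta (CPUs 16-19 and memory nodes 8-9).
       (2)  First create the new cpuset beta.
       (3)  Then allow CPUs 16-19 and memory nodes 8-9 in beta.
       (4)  Then enable memory migration (cpuset.memory_migrate) in beta.
       (5)  Then move each process from alpha to beta.

       The following sequence of commands accomplishes this.

           $ cd /dev/cpuset
           $ mkdir beta
           $ cd beta
           $ /bin/echo 16-19 > cpuset.cpus
           $ /bin/echo 8-9 > cpuset.mems
           $ /bin/echo 1 > cpuset.memory_migrate
           $ while read i; do /bin/echo $i; done < ../alpha/tasks > tasks

       The above should move any processes in alpha to beta, and any memory held by these processes on memory nodes 2-3 to memory nodes 8-9, respectively.

       Notice that the last step of the above sequence did not do:

           $ cp ../alpha/tasks tasks

       The while loop, rather than the seemingly easier use of the cp(1) command, was necessary because only one process PID at a time may be written to the tasks file.

       The same effect (writing one PID at a time) as the while loop can be accomplished more efficiently, in fewer keystrokes and in syntax that works on any shell, but alas more obscurely, by using the -u (unbuffered) option of sed(1):

           $ sed -un p < ../alpha/tasks > tasks

SEE ALSO
       taskset(1), get_mempolicy(2), getcpu(2), mbind(2), sched_getaffinity(2), sched_setaffinity(2), sched_setscheduler(2), set_mempolicy(2), CPU_SET(3), proc(5), cgroups(7), numa(7), sched(7), migratepages(8), numactl(8)

       Documentation/admin-guide/cgroup-v1/cpusets.rst in the Linux kernel source tree

TRANSLATION
       The Russian translation of this manual page was prepared by Azamat Hackimov, Dmitriy S.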
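       Seregin, Dmitry Bolkhovskikh, Katrin Kutepova, and Yuri Kozlov.

       This translation is free documentation; its copyright conditions are those of the GNU General Public License (GPL), version 3 or later. There is NO WARRANTY.

       If you find any errors in the translation of this manual page, please report them to the translator(s) at their email address(es) or to the mailing list of the Russian translators.

Linux man-pages 6.17               8 2026                            cpuset(7)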