| io_uring_setup_flags(7) | Linux Programmer's Manual | io_uring_setup_flags(7) |
NAME
io_uring_setup_flags - io_uring ring setup flags overview
DESCRIPTION
When creating an io_uring instance with io_uring_queue_init_params(3) or io_uring_setup(2), various flags control the ring's behavior. These flags are set in the flags field of struct io_uring_params.
Choosing the right flags can significantly impact performance. This page provides an overview of available flags, their purposes, and common combinations.
Polling flags
These flags control how I/O completion and submission polling works.
IORING_SETUP_IOPOLL
Using this flag requires:
- Files opened with O_DIRECT
- Hardware and drivers that support polling
- The application to call io_uring_enter(2) to reap completions (busy-polling)
- Storage device configuration for polling support
IOPOLL rings cannot use IRQ-driven completion; the application must poll. Only request types that support polling may be issued on an IOPOLL ring. This mode is commonly used by applications that do only polled I/O on storage devices such as NVMe.
Using IOPOLL generally requires storage device setup. For NVMe devices, the kernel parameter nvme.poll_queues=X must be set, where X is the number of completion queues on the NVMe device to set aside for polling operations.
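As a rough sketch of how an application might confirm that poll queues were set aside before creating an IOPOLL ring, the helper below reads a numeric module parameter from sysfs. The function name read_module_param() is hypothetical, and the path /sys/module/nvme/parameters/poll_queues is an assumption about where the NVMe driver exposes the parameter on a given system.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a single integer from a sysfs parameter
 * file. Returns the value, or -1 if the file is missing or unparsable.
 */
long read_module_param(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A caller could pass "/sys/module/nvme/parameters/poll_queues" (assumed path) and treat a result of 0 or -1 as a sign that polled I/O is probably not configured.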
IORING_SETUP_SQPOLL
IORING_SETUP_SQ_AFF
IORING_SETUP_HYBRID_IOPOLL
Task run flags
These flags control when and how completion processing runs.
IORING_SETUP_COOP_TASKRUN
This improves performance by not interrupting the task (via an inter-processor interrupt) when completions arrive, but requires the application to regularly enter the kernel to process completions. Recommended for most applications that have an event loop.
IORING_SETUP_TASKRUN_FLAG
IORING_SETUP_DEFER_TASKRUN
This flag should be considered the default mode for applications setting up a ring. It requires IORING_SETUP_SINGLE_ISSUER and a ring created per-thread. The application must regularly call io_uring_enter(2) (via io_uring_submit(3), io_uring_wait_cqe(3), or similar) to process deferred work; failing to do so will stall completions.
Some features require this flag:
- Ring resizing (io_uring_register_resize_rings(3))
- Zero-copy receive (IORING_OP_RECV_ZC)
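The requirement that IORING_SETUP_DEFER_TASKRUN be paired with IORING_SETUP_SINGLE_ISSUER can be sketched as a small parameter helper. The name deferred_ring_params() is hypothetical; the #ifndef fallbacks exist only so the sketch builds against older kernel headers.

```c
#include <string.h>
#include <linux/io_uring.h>

/* Fallbacks for older kernel headers that lack these flags. */
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER (1U << 12)
#endif
#ifndef IORING_SETUP_DEFER_TASKRUN
#define IORING_SETUP_DEFER_TASKRUN (1U << 13)
#endif

/* Hypothetical helper: prepare params for a ring owned by one thread. */
void deferred_ring_params(struct io_uring_params *p)
{
    /* Fields the application does not set should be zero. */
    memset(p, 0, sizeof(*p));
    /* DEFER_TASKRUN is only accepted together with SINGLE_ISSUER. */
    p->flags = IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
}
```

The filled structure would then be passed to io_uring_queue_init_params(3) from the thread that will own the ring.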
IORING_SETUP_SINGLE_ISSUER
Each thread or task having its own ring is the idiomatic use case for io_uring. Sharing a ring between multiple threads or tasks is discouraged, as it requires additional synchronization and prevents many optimizations.
Ring sizing flags
These flags control the size and layout of the submission and completion queues.
IORING_SETUP_CQSIZE
Larger CQ sizes are useful when the application may submit many requests before processing completions, avoiding CQ overflow.
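A minimal sketch of requesting a larger CQ: with IORING_SETUP_CQSIZE set, the kernel reads the desired size from the cq_entries field of struct io_uring_params. The helper name large_cq_params() is hypothetical.

```c
#include <string.h>
#include <linux/io_uring.h>

/* Fallback for very old kernel headers. */
#ifndef IORING_SETUP_CQSIZE
#define IORING_SETUP_CQSIZE (1U << 3)
#endif

/* Hypothetical helper: ask for an explicitly sized completion queue. */
void large_cq_params(struct io_uring_params *p, unsigned int cq_entries)
{
    memset(p, 0, sizeof(*p));
    p->flags = IORING_SETUP_CQSIZE;
    /* Only consulted by the kernel when IORING_SETUP_CQSIZE is set. */
    p->cq_entries = cq_entries;
}
```

For example, an application that submits in bursts might create a ring with 256 SQ entries and call large_cq_params(&p, 4096) before io_uring_queue_init_params(3).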
IORING_SETUP_CLAMP
IORING_SETUP_SQE128
IORING_SETUP_CQE32
IORING_SETUP_NO_SQARRAY
IORING_SETUP_SQ_REWIND
Requires IORING_SETUP_NO_SQARRAY. Not compatible with IORING_SETUP_SQPOLL.
This mode keeps SQEs hot in cache by always accessing the same memory locations at the start of the ring, improving performance for workloads that submit small batches frequently.
IORING_SETUP_CQE_MIXED
This is useful when certain operations require 32-byte CQEs (such as some passthrough commands) but most operations do not. Using mixed mode instead of IORING_SETUP_CQE32 alone provides efficiency benefits in terms of memory bandwidth and usage, since the smaller 16-byte CQEs are used for operations that do not need the extra space.
IORING_SETUP_SQE_MIXED
This is useful when certain operations require 128-byte SQEs (such as IORING_OP_URING_CMD) but most operations do not. Using mixed mode instead of IORING_SETUP_SQE128 alone provides efficiency benefits in terms of memory bandwidth and usage, since the smaller 64-byte SQEs are used for operations that do not need the extra space.
Memory and setup flags
These flags control memory allocation and ring initialization.
IORING_SETUP_NO_MMAP
This is useful for placing rings in specific memory (huge pages, shared memory, etc.) or for creating rings without mmap.
IORING_SETUP_REGISTERED_FD_ONLY
Requires IORING_SETUP_NO_MMAP. The application must use io_uring_register_ring_fd(3) to use the ring or access it via the registered index.
IORING_SETUP_R_DISABLED
Submission flags
These flags control submission behavior.
IORING_SETUP_SUBMIT_ALL
The failed SQE still generates a CQE with the error; this flag only affects whether subsequent SQEs are submitted. This is probably the behavior most applications expect, since CQEs are generated for failed submissions anyway and the application must handle them regardless.
Workqueue flags
These flags control the async worker threads.
IORING_SETUP_ATTACH_WQ
When combined with IORING_SETUP_SQPOLL, the SQPOLL thread is also shared.
Common flag combinations
High-performance single-threaded application:
IORING_SETUP_SINGLE_ISSUER |
IORING_SETUP_DEFER_TASKRUN
This combination provides the best latency and throughput for applications where each thread has its own ring and processes completions in a dedicated event loop.
Low-latency storage with polling:
IORING_SETUP_IOPOLL |
IORING_SETUP_SINGLE_ISSUER |
IORING_SETUP_DEFER_TASKRUN
For NVMe or other devices that support polling, IORING_SETUP_IOPOLL eliminates interrupt overhead. It is combined with IORING_SETUP_DEFER_TASKRUN for optimal completion handling.
System call-free submission:
IORING_SETUP_SQPOLL |
IORING_SETUP_SQ_AFF
For workloads that benefit from eliminating submission syscall overhead. See io_uring_sqpoll(7).
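A sketch of the parameters for a pinned submission-queue polling thread follows. IORING_SETUP_SQ_AFF only has an effect together with IORING_SETUP_SQPOLL, so both are set here. The helper name sqpoll_params() is hypothetical; sq_thread_cpu and sq_thread_idle are fields of struct io_uring_params.

```c
#include <string.h>
#include <linux/io_uring.h>

/* Hypothetical helper: configure a kernel SQ polling thread pinned to a CPU. */
void sqpoll_params(struct io_uring_params *p, unsigned int cpu,
                   unsigned int idle_ms)
{
    memset(p, 0, sizeof(*p));
    p->flags = IORING_SETUP_SQPOLL | IORING_SETUP_SQ_AFF;
    /* CPU the polling thread is bound to (used because SQ_AFF is set). */
    p->sq_thread_cpu = cpu;
    /* Idle time in milliseconds before the polling thread goes to sleep. */
    p->sq_thread_idle = idle_ms;
}
```

Once the polling thread has gone idle, the application still needs to wake it with io_uring_enter(2); io_uring_submit(3) handles this when liburing is used.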
Multiple rings sharing resources:
/* First ring */
p1.flags = IORING_SETUP_SQPOLL;

/* Subsequent rings */
p2.flags = IORING_SETUP_SQPOLL | IORING_SETUP_ATTACH_WQ;
p2.wq_fd = ring1_fd;
Reduces kernel thread and workqueue overhead when using multiple rings.
NOTES
- Not all flag combinations are valid. The kernel returns -EINVAL for incompatible combinations.
- Some flags require specific kernel versions. Check io_uring_setup(2) for version requirements.
- The io_uring_queue_init_params(3) function handles the complexity of ring setup. Using the raw io_uring_setup(2) syscall requires careful mmap setup.
- For most applications with a proper event loop, IORING_SETUP_DEFER_TASKRUN combined with IORING_SETUP_SINGLE_ISSUER is the recommended default. This provides the best control over when completion work runs and optimal cache locality.
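Because invalid combinations only surface as -EINVAL at setup time, an application may find a pre-flight check useful for clearer diagnostics. The sketch below is partial and not authoritative: the function check_setup_flags() is hypothetical, only the first rule is stated on this page, and the remaining rules are assumptions based on kernel behavior described in io_uring_setup(2). The kernel remains the final arbiter.

```c
#include <errno.h>
#include <linux/io_uring.h>

/* Fallbacks for older kernel headers that lack these flags. */
#ifndef IORING_SETUP_COOP_TASKRUN
#define IORING_SETUP_COOP_TASKRUN (1U << 8)
#endif
#ifndef IORING_SETUP_TASKRUN_FLAG
#define IORING_SETUP_TASKRUN_FLAG (1U << 9)
#endif
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER (1U << 12)
#endif
#ifndef IORING_SETUP_DEFER_TASKRUN
#define IORING_SETUP_DEFER_TASKRUN (1U << 13)
#endif

/* Hypothetical, non-exhaustive check. Returns 0 or -EINVAL. */
int check_setup_flags(unsigned int flags)
{
    const unsigned int taskrun = IORING_SETUP_COOP_TASKRUN |
                                 IORING_SETUP_TASKRUN_FLAG |
                                 IORING_SETUP_DEFER_TASKRUN;

    /* Stated on this page: DEFER_TASKRUN requires SINGLE_ISSUER. */
    if ((flags & IORING_SETUP_DEFER_TASKRUN) &&
        !(flags & IORING_SETUP_SINGLE_ISSUER))
        return -EINVAL;
    /* Assumption: SQ_AFF is meaningless without SQPOLL. */
    if ((flags & IORING_SETUP_SQ_AFF) && !(flags & IORING_SETUP_SQPOLL))
        return -EINVAL;
    /* Assumption: task run flags do not apply to SQPOLL rings. */
    if ((flags & IORING_SETUP_SQPOLL) && (flags & taskrun))
        return -EINVAL;
    /* Assumption: TASKRUN_FLAG needs COOP_TASKRUN or DEFER_TASKRUN. */
    if ((flags & IORING_SETUP_TASKRUN_FLAG) &&
        !(flags & (IORING_SETUP_COOP_TASKRUN | IORING_SETUP_DEFER_TASKRUN)))
        return -EINVAL;
    return 0;
}
```

A check like this does not replace handling the error from io_uring_queue_init_params(3), since flag support also depends on the running kernel version.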
SEE ALSO
io_uring(7), io_uring_sqpoll(7), io_uring_setup(2), io_uring_queue_init_params(3), io_uring_register_restrictions(3), io_uring_enable_rings(3)
| January 18, 2025 | Linux |