Version info:
- LINSTOR server: 1.33.1
- LINSTOR client: 1.27.1
- Kernel: 6.8.12 (Debian 12 / Proxmox VE)
- Transport: NVMe-oF over RDMA (RoCEv2)
Description:
When LINSTOR creates an nvmet subsystem, attr_qid_max defaults to 128. The nvme-rdma initiator creates one RDMA queue pair per I/O queue, and on hosts with many CPUs it requests all 128 queues. Each QP requires a full RDMA connection setup (rdma_cm resolve + QP creation), causing:
- Connect: ~32 seconds to establish 128 RDMA QPs
- Disconnect: 80+ seconds with fabrics command timeouts
Setting attr_qid_max=16 on the nvmet subsystem reduces this to ~5s connect / ~3s disconnect, and actually improves throughput:
|                                            | 128 queues (default)   | 16 queues  |
|--------------------------------------------|------------------------|------------|
| Connect time                               | ~32s                   | ~5s        |
| Disconnect time                            | 80+s (fabrics timeout) | ~3s        |
| Read throughput (fio, 16 jobs, iodepth=16) | 9.8 GiB/s              | 19.0 GiB/s |
128 RDMA QPs cause contention and halve throughput compared to 16 queues.
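For reference, a fio run along the following lines reproduces the read-throughput comparison; only the 16 jobs / iodepth=16 settings come from the table above, while the device path, block size, and runtime are assumptions:

# Sequential read benchmark against the connected NVMe-oF device (device path is an example)
fio --name=seqread --filename=/dev/nvme3n1 --rw=read --bs=1M \
    --ioengine=libaio --direct=1 --numjobs=16 --iodepth=16 \
    --runtime=60 --time_based --group_reporting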
Reproduction steps:
- Create an NVMe resource group and spawn a resource (target created on host B):
linstor resource-group create nvme-rg --storage-pool <pool> --layer-list nvme,storage --place-count 1
linstor volume-group create nvme-rg
linstor resource-group spawn-resources nvme-rg test-nvme 100G
- Create an NVMe initiator on host A (a machine with many CPU cores):
linstor resource create --nvme-initiator <hostA> test-nvme
- Observe ~30s delay before the command returns. dmesg on host A shows:
nvme nvme3: creating 128 I/O queues.
nvme nvme3: mapped 128/0/0 default/read/poll queues.
- Manually setting attr_qid_max on the target resolves the issue:
echo 16 > /sys/kernel/config/nvmet/subsystems/<subsystem>/attr_qid_max
Subsequent initiator connections create only 16 queues and complete in ~5 seconds. A scripted form of this workaround is sketched below.
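A minimal sketch of the workaround as applied on the target node (host B); the wildcard over all nvmet subsystems and the value 16 are assumptions, adjust to the actual LINSTOR-generated subsystem NQN:

# Lower the advertised maximum queue ID on every nvmet subsystem;
# only connections made after this point pick up the new value.
for subsys in /sys/kernel/config/nvmet/subsystems/*; do
    echo 16 > "$subsys/attr_qid_max"
done

# On the initiator (host A), the next connect should log 16 instead of 128 queues:
dmesg | grep -E 'nvme nvme[0-9]+: creating [0-9]+ I/O queues'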
Environment details:
The initiator host has 224 CPU cores, so nvme-rdma requests the maximum 128 I/O queues from the target. This is common in HPC/virtualization environments with high core counts.
Suggestion:
- Target-side: an NVMe/QidMax property (settable on controller, resource-definition, or resource-group) that writes to attr_qid_max after creating the nvmet subsystem (a hypothetical usage sketch follows this list).
- Initiator-side: pass --nr-io-queues=<N> to nvme connect when creating initiators, controlled by the same or a separate property.
- A sensible default (e.g., 32) would also help out of the box.
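A hypothetical usage sketch of the proposed property alongside the existing nvme-cli flag; the NVMe/QidMax property name is purely a proposal and does not exist today, while --nr-io-queues is a real nvme connect option (address, port, and NQN below are placeholders):

# Hypothetical: proposed property set cluster-wide or per resource group
linstor controller set-property NVMe/QidMax 16
linstor resource-group set-property nvme-rg NVMe/QidMax 32

# Initiator-side equivalent using plain nvme-cli (this flag exists today)
nvme connect -t rdma -a <target-ip> -s 4420 -n <subsystem-nqn> --nr-io-queues=16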