NVMe layer: support configuring attr_qid_max on nvmet subsystems #483

@Tydus

Version info:

  • LINSTOR server: 1.33.1
  • LINSTOR client: 1.27.1
  • Kernel: 6.8.12 (Debian 12 / Proxmox VE)
  • Transport: NVMe-oF over RDMA (RoCEv2)

Description:

When LINSTOR creates an nvmet subsystem, attr_qid_max defaults to 128. The nvme-rdma initiator creates one RDMA queue pair per I/O queue, and on hosts with many CPUs it requests all 128 queues. Each QP requires a full RDMA connection setup (rdma_cm resolve + QP creation), causing:

  • Connect: ~32 seconds to establish 128 RDMA QPs
  • Disconnect: 80+ seconds with fabrics command timeouts
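
The target-side limit behind this can be read directly from configfs; the default of 128 shows up there before any tuning (the subsystem name is a placeholder, as elsewhere in this report):

```shell
# Show the current queue-ID limit advertised by the nvmet subsystem
# (defaults to 128 on current kernels).
cat /sys/kernel/config/nvmet/subsystems/<subsystem>/attr_qid_max
```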

Setting attr_qid_max=16 on the nvmet subsystem reduces this to ~5s connect / ~3s disconnect, and actually improves throughput:

|  | 128 queues (default) | 16 queues |
| --- | --- | --- |
| Connect time | ~32s | ~5s |
| Disconnect time | 80+s (fabrics timeout) | ~3s |
| Read throughput (fio, 16 jobs, iodepth=16) | 9.8 GiB/s | 19.0 GiB/s |

128 RDMA QPs cause contention and halve throughput compared to 16 queues.
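
For reference, the throughput numbers above came from a read test along these lines (a sketch: only numjobs=16 and iodepth=16 are from the table; the device name, block size, engine, and runtime are assumptions):

```shell
# Sequential-read fio run against the connected NVMe-oF namespace.
# /dev/nvme3n1, bs, ioengine, and runtime are assumed values.
fio --name=nvmeof-read \
    --filename=/dev/nvme3n1 \
    --rw=read \
    --ioengine=libaio \
    --direct=1 \
    --bs=1M \
    --numjobs=16 \
    --iodepth=16 \
    --runtime=60 --time_based \
    --group_reporting
```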

Reproduction steps:

  1. Create an NVMe resource group and spawn a resource (target created on host B):
    linstor resource-group create nvme-rg --storage-pool <pool> --layer-list nvme,storage --place-count 1
    linstor volume-group create nvme-rg
    linstor resource-group spawn-resources nvme-rg test-nvme 100G
    
  2. Create an NVMe initiator on host A (a machine with many CPU cores):
    linstor resource create --nvme-initiator <hostA> test-nvme
    
  3. Observe ~30s delay before the command returns. dmesg on host A shows:
    nvme nvme3: creating 128 I/O queues.
    nvme nvme3: mapped 128/0/0 default/read/poll queues.
    
  4. Manually setting attr_qid_max on the target resolves the issue:
    echo 16 > /sys/kernel/config/nvmet/subsystems/<subsystem>/attr_qid_max
    
    Subsequent initiator connections create only 16 queues and complete in ~5 seconds.
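
For comparison, when connecting by hand with nvme-cli (outside LINSTOR), the initiator side can already be capped via the standard --nr-io-queues connect option; address and NQN are placeholders:

```shell
# Manually connect with a capped I/O queue count (nvme-cli).
# <target-ip> and <subsystem-nqn> are placeholders for the nvmet target.
nvme connect -t rdma -a <target-ip> -s 4420 \
    -n <subsystem-nqn> \
    --nr-io-queues=16
```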

Environment details:

The initiator host has 224 CPU cores, so nvme-rdma requests one I/O queue per CPU, capped at the target's advertised limit of 128. This is common in HPC/virtualization environments with high core counts.

Suggestion:

  • Target-side: An NVMe/QidMax property (settable on controller, resource-definition, or resource-group) that writes to attr_qid_max after creating the nvmet subsystem.
  • Initiator-side: Pass --nr-io-queues=<N> to nvme connect when creating initiators, controlled by the same or a separate property.
  • A sensible default (e.g., 32) would also help out of the box.
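
If implemented, target-side usage could look something like the following (NVMe/QidMax is the proposed, not-yet-existing property name; the set-property subcommands themselves exist in the LINSTOR client today):

```shell
# Hypothetical usage of the proposed NVMe/QidMax property,
# at the three suggested scopes:
linstor controller set-property NVMe/QidMax 32              # cluster-wide default
linstor resource-group set-property nvme-rg NVMe/QidMax 16
linstor resource-definition set-property test-nvme NVMe/QidMax 16
```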
