I. Environment Preparation
1. Hardware and Driver Installation
Confirm that the network card firmware supports RoCEv2 (supported by default).
Install the latest Mellanox WinOF-2 driver (including NDK driver).
Install Mellanox Firmware Tools (MFT) for firmware management.
2. Development Tool Installation
Visual Studio 2019/2022 (with C++17 support required).
Install the Mellanox Windows Software Development Kit (SDK). It includes header files (mlx4_win.h, mlx5_win.h, etc.), static libraries (.lib), and dynamic link libraries (.dll).
3. Network Configuration
Enable RoCEv2 mode: Configure through the Mellanox driver configuration tool.
Configure the switch to support PFC and ECN (to ensure a lossless network).
Set the Windows firewall to allow RoCEv2 traffic (UDP port 4791).
II. Core Development Process
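1. RDMA Initialization
#include <winverbs.h> // Header name as given in this guide; the ibv_* calls follow the libibverbs API (<infiniband/verbs.h> on Linux)
// Initialize device list (run inside your setup function)
int num_devices = 0;
ibv_device** dev_list = ibv_get_device_list(&num_devices);
if (!dev_list || num_devices == 0) return -1; // No RDMA device found
ibv_context* context = ibv_open_device(dev_list[0]); // Select the first device
ibv_free_device_list(dev_list); // The list is no longer needed once the device is open
if (!context) return -1;
// Allocate Protection Domain (PD)
ibv_pd* pd = ibv_alloc_pd(context);
// Create Completion Queue (CQ)
ibv_cq* cq = ibv_create_cq(context, CQ_DEPTH, nullptr, nullptr, 0);
if (!pd || !cq) return -1; // Both calls return nullptr on failure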
2. Configure Queue Pair (QP)
ibv_qp_init_attr qp_init_attr = {};
qp_init_attr.qp_type = IBV_QPT_RC; // Reliable Connection; RoCEv2 carries RC, UC and UD alike
qp_init_attr.send_cq = cq;
qp_init_attr.recv_cq = cq;
qp_init_attr.cap.max_send_wr = MAX_WR;
qp_init_attr.cap.max_recv_wr = MAX_WR;
qp_init_attr.cap.max_send_sge = 1; // At least one scatter/gather entry per WR
qp_init_attr.cap.max_recv_sge = 1;
ibv_qp* qp = ibv_create_qp(pd, &qp_init_attr);
// Transition QP state to INIT
ibv_qp_attr qp_attr = {};
qp_attr.qp_state = IBV_QPS_INIT;
qp_attr.pkey_index = 0;
qp_attr.port_num = PORT_NUM; // Physical port number
qp_attr.qp_access_flags = IBV_ACCESS_REMOTE_READ; // An RC QP must supply access flags at INIT
ibv_modify_qp(qp, &qp_attr,
    IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_ACCESS_FLAGS);
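For a reliable-connected (RC) queue pair, each ibv_modify_qp transition expects a specific set of attribute bits, and a call that omits one is rejected. The following is a minimal sketch of a pre-flight check for the three forward transitions. The ATTR_* constants, QpState, required_mask and missing_attrs are hypothetical names local to this sketch so that it compiles without the SDK; a real program would use the IBV_QP_* constants from the verbs header.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in attribute bits for illustration only. A real program would pass the
// IBV_QP_* constants from the verbs header, not these local values.
constexpr std::uint32_t ATTR_STATE              = 1u << 0;
constexpr std::uint32_t ATTR_PKEY_INDEX         = 1u << 1;
constexpr std::uint32_t ATTR_PORT               = 1u << 2;
constexpr std::uint32_t ATTR_ACCESS_FLAGS       = 1u << 3;
constexpr std::uint32_t ATTR_AV                 = 1u << 4;
constexpr std::uint32_t ATTR_PATH_MTU           = 1u << 5;
constexpr std::uint32_t ATTR_DEST_QPN           = 1u << 6;
constexpr std::uint32_t ATTR_RQ_PSN             = 1u << 7;
constexpr std::uint32_t ATTR_MAX_DEST_RD_ATOMIC = 1u << 8;
constexpr std::uint32_t ATTR_MIN_RNR_TIMER      = 1u << 9;
constexpr std::uint32_t ATTR_SQ_PSN             = 1u << 10;
constexpr std::uint32_t ATTR_TIMEOUT            = 1u << 11;
constexpr std::uint32_t ATTR_RETRY_CNT          = 1u << 12;
constexpr std::uint32_t ATTR_RNR_RETRY          = 1u << 13;
constexpr std::uint32_t ATTR_MAX_QP_RD_ATOMIC   = 1u << 14;

enum class QpState { Reset, Init, Rtr, Rts };

// Attributes an RC queue pair expects for each forward transition.
// Returns 0 for transitions this sketch does not cover.
constexpr std::uint32_t required_mask(QpState from, QpState to) {
    if (from == QpState::Reset && to == QpState::Init)
        return ATTR_STATE | ATTR_PKEY_INDEX | ATTR_PORT | ATTR_ACCESS_FLAGS;
    if (from == QpState::Init && to == QpState::Rtr)
        return ATTR_STATE | ATTR_AV | ATTR_PATH_MTU | ATTR_DEST_QPN |
               ATTR_RQ_PSN | ATTR_MAX_DEST_RD_ATOMIC | ATTR_MIN_RNR_TIMER;
    if (from == QpState::Rtr && to == QpState::Rts)
        return ATTR_STATE | ATTR_SQ_PSN | ATTR_TIMEOUT | ATTR_RETRY_CNT |
               ATTR_RNR_RETRY | ATTR_MAX_QP_RD_ATOMIC;
    return 0;
}

// Bits that the caller still has to supply before the transition can succeed.
constexpr std::uint32_t missing_attrs(QpState from, QpState to,
                                      std::uint32_t supplied) {
    return required_mask(from, to) & ~supplied;
}
```

Calling missing_attrs(QpState::Reset, QpState::Init, ATTR_STATE | ATTR_PKEY_INDEX | ATTR_PORT) returns ATTR_ACCESS_FLAGS, which flags an INIT call that forgot the access flags before the driver rejects it.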
3. Memory Registration
// Register memory buffer
ibv_mr* mr = ibv_reg_mr(pd, buffer, buffer_size,
                        IBV_ACCESS_LOCAL_WRITE |
                        IBV_ACCESS_REMOTE_READ);
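4. Connection Management
// Exchange QP information out of band, e.g. over a TCP socket (custom protocol required)
struct QPInfo {
    ibv_gid  gid; // RoCEv2 addresses peers by GID; LIDs apply to InfiniBand fabrics only
    uint32_t qpn;
    uint32_t psn;
} local_info, remote_info;
// Transition QP state to RTR (Ready to Receive); attributes shown are those an RC QP requires
qp_attr = {};
qp_attr.qp_state = IBV_QPS_RTR;
qp_attr.path_mtu = IBV_MTU_1024; // Must not exceed the port's active MTU
qp_attr.dest_qp_num = remote_info.qpn;
qp_attr.rq_psn = remote_info.psn;
qp_attr.max_dest_rd_atomic = 1; qp_attr.min_rnr_timer = 12; // Example values
qp_attr.ah_attr.is_global = 1; qp_attr.ah_attr.grh.hop_limit = 64; // RoCE always uses a GRH
qp_attr.ah_attr.grh.dgid = remote_info.gid;
qp_attr.ah_attr.grh.sgid_index = GID_INDEX; // Placeholder: index of the local RoCEv2 GID
qp_attr.ah_attr.port_num = PORT_NUM;
ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU | IBV_QP_DEST_QPN |
    IBV_QP_RQ_PSN | IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER);
// Transition to RTS (Ready to Send)
qp_attr.qp_state = IBV_QPS_RTS;
qp_attr.sq_psn = local_info.psn;
qp_attr.timeout = 14; qp_attr.retry_cnt = 7; qp_attr.rnr_retry = 7; qp_attr.max_rd_atomic = 1; // Example values
ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_SQ_PSN | IBV_QP_TIMEOUT |
    IBV_QP_RETRY_CNT | IBV_QP_RNR_RETRY | IBV_QP_MAX_QP_RD_ATOMIC);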
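The custom exchange protocol mentioned above has to produce the same bytes on Windows and Linux peers, so the fields should be serialized explicitly rather than sent as a raw struct. The following is a minimal sketch of one possible wire format. The layout (16-byte GID, then QPN and PSN as 32-bit big-endian integers) and the names PeerInfo, kWireSize, encode and decode are assumptions made for illustration, not part of any Mellanox SDK. It carries a GID because RoCEv2 identifies peers by GID.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record for the out-of-band exchange.
struct PeerInfo {
    std::array<std::uint8_t, 16> gid{};  // raw GID bytes, already in wire order
    std::uint32_t qpn = 0;               // queue pair number (24 significant bits)
    std::uint32_t psn = 0;               // initial packet sequence number (24 bits)
};

constexpr std::size_t kWireSize = 24;    // 16 (GID) + 4 (QPN) + 4 (PSN)

// Write a 32-bit value most significant byte first, independent of host order.
inline void put_be32(std::uint8_t* out, std::uint32_t v) {
    out[0] = static_cast<std::uint8_t>(v >> 24);
    out[1] = static_cast<std::uint8_t>(v >> 16);
    out[2] = static_cast<std::uint8_t>(v >> 8);
    out[3] = static_cast<std::uint8_t>(v);
}

// Read a 32-bit big-endian value back into host order.
inline std::uint32_t get_be32(const std::uint8_t* in) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8) |
            static_cast<std::uint32_t>(in[3]);
}

inline std::array<std::uint8_t, kWireSize> encode(const PeerInfo& info) {
    // QPN and PSN are 24-bit quantities; larger values indicate a caller bug.
    assert(info.qpn <= 0xFFFFFFu && info.psn <= 0xFFFFFFu);
    std::array<std::uint8_t, kWireSize> buf{};
    std::copy(info.gid.begin(), info.gid.end(), buf.begin());
    put_be32(buf.data() + 16, info.qpn);
    put_be32(buf.data() + 20, info.psn);
    return buf;
}

inline PeerInfo decode(const std::array<std::uint8_t, kWireSize>& buf) {
    PeerInfo info;
    std::copy(buf.begin(), buf.begin() + 16, info.gid.begin());
    info.qpn = get_be32(buf.data() + 16);
    info.psn = get_be32(buf.data() + 20);
    return info;
}
```

Each side would send the result of encode over its control channel (for example a TCP socket) and pass the received bytes to decode before moving the queue pair to RTR.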
5. Data Transfer
// Construct send request (the peer must already have posted a receive buffer)
ibv_sge sge = {};
sge.addr = reinterpret_cast<uintptr_t>(buffer);
sge.length = data_len;
sge.lkey = mr->lkey;
ibv_send_wr wr = {};
wr.wr_id = 0x1234; // Custom identifier
wr.opcode = IBV_WR_SEND;
wr.sg_list = &sge;
wr.num_sge = 1;
wr.send_flags = IBV_SEND_SIGNALED;
ibv_send_wr* bad_wr = nullptr;
if (ibv_post_send(qp, &wr, &bad_wr) != 0) return -1; // bad_wr points at the rejected request
// Poll Completion Queue
ibv_wc wc = {};
int ret;
do {
    ret = ibv_poll_cq(cq, 1, &wc);
} while (ret == 0);
if (ret < 0 || wc.status != IBV_WC_SUCCESS) return -1; // Poll error or failed work request
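A bare polling loop spins forever if the completion never arrives, for instance when the peer has not posted a receive. The following is a minimal sketch of a bounded variant. To keep it compilable without the SDK, the verbs call is hidden behind a callable; PollResult and poll_until are hypothetical names, and the callable is assumed to follow the ibv_poll_cq convention (positive count on completion, 0 when empty, negative on error).

```cpp
#include <cassert>
#include <chrono>

enum class PollResult { Completed, Error, TimedOut };

// poll_once stands in for a call such as ibv_poll_cq(cq, 1, &wc). It must
// return >0 when a completion was reaped, 0 when the queue is empty, and <0
// on failure. The loop busy-polls until one of those outcomes or the deadline.
template <typename PollOnce>
PollResult poll_until(PollOnce&& poll_once, std::chrono::milliseconds budget) {
    assert(budget.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + budget;
    for (;;) {
        const int n = poll_once();
        if (n > 0) return PollResult::Completed;
        if (n < 0) return PollResult::Error;
        if (std::chrono::steady_clock::now() >= deadline)
            return PollResult::TimedOut;
    }
}
```

With the real API, the callable would be a lambda such as [&] { return ibv_poll_cq(cq, 1, &wc); }, and the caller would still inspect wc.status after a Completed result, since a reaped completion can report a failed work request.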
III. Key Optimization Points
- Zero-Copy Technology
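Register application buffers with ibv_reg_mr once and post them directly. IBV_ACCESS_ZERO_BASED only selects zero-based addressing for the region and is not required for zero-copy.
The NIC then uses DMA to read and write the registered user-space memory directly, with no intermediate copy.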
- Batch Operations
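Chain multiple WRs through the next pointer of ibv_send_wr and submit the whole list with a single ibv_post_send call.
This reduces per-call and doorbell overhead. The data path already bypasses the kernel, so the saving is not in user-space to kernel-space transitions.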
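Batching amounts to building a linked list of work requests and handing its head to one post call. The following is a minimal sketch of that linking step, combined with selective signaling so that only some requests generate completions. SendWr is a stand-in for the few ibv_send_wr fields involved, and chain_batch and signal_every are hypothetical names, so the sketch compiles without the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the fields of ibv_send_wr that matter when chaining.
struct SendWr {
    std::uint64_t wr_id = 0;
    SendWr* next = nullptr;
    bool signaled = false;  // stands in for IBV_SEND_SIGNALED in send_flags
};

// Links wrs[0] -> wrs[1] -> ... -> wrs[n-1] and marks every signal_every-th
// request, plus the last one, as signaled. Returns the head of the chain, or
// nullptr for an empty batch. The vector must outlive the post call because
// the chain points into its storage.
inline SendWr* chain_batch(std::vector<SendWr>& wrs, std::size_t signal_every) {
    assert(signal_every > 0);
    if (wrs.empty()) return nullptr;
    for (std::size_t i = 0; i < wrs.size(); ++i) {
        const bool last = (i + 1 == wrs.size());
        wrs[i].wr_id = i;
        wrs[i].next = last ? nullptr : &wrs[i + 1];
        wrs[i].signaled = last || ((i + 1) % signal_every == 0);
    }
    return &wrs.front();
}
```

With the real structures, the returned head would be passed to ibv_post_send once. Selective signaling of this kind assumes the queue pair was created with sq_sig_all set to 0; signaling the last request keeps the send queue from filling with unreaped entries.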
- Asynchronous Event Handling
Bind Windows IOCP with the completion queue.
Use ibv_get_async_event to listen for hardware events.
IV. Debugging and Testing Tools
1. Performance Testing
Use the mlx5_win_perf tool to test throughput and latency.
Validate bandwidth with custom Benchmark tools.
2. Protocol Analysis
Install the RoCEv2 parsing plugin for Wireshark.
Filter on udp.port == 4791 to inspect data packets.
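When a capture needs to be checked programmatically, the first 12 bytes of the UDP payload on port 4791 are the InfiniBand Base Transport Header (BTH). The following is a minimal sketch that extracts the fields most useful for debugging. Bth and parse_bth are hypothetical names; the byte offsets follow the standard BTH layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Selected fields of the 12-byte Base Transport Header.
struct Bth {
    std::uint8_t  opcode;   // upper 3 bits: transport (0 = RC), lower 5: operation
    std::uint16_t pkey;     // partition key
    std::uint32_t dest_qp;  // destination queue pair number (24 bits)
    bool          ack_req;  // sender requests an acknowledgement
    std::uint32_t psn;      // packet sequence number (24 bits)
};

// Parses the BTH at the start of a RoCEv2 UDP payload.
// Returns std::nullopt if the payload is too short to contain one.
inline std::optional<Bth> parse_bth(const std::uint8_t* p, std::size_t len) {
    assert(p != nullptr || len == 0);
    if (len < 12) return std::nullopt;
    Bth h{};
    h.opcode  = p[0];
    h.pkey    = static_cast<std::uint16_t>((p[2] << 8) | p[3]);
    h.dest_qp = (static_cast<std::uint32_t>(p[5]) << 16) |
                (static_cast<std::uint32_t>(p[6]) << 8) |
                 static_cast<std::uint32_t>(p[7]);
    h.ack_req = (p[8] & 0x80) != 0;
    h.psn     = (static_cast<std::uint32_t>(p[9]) << 16) |
                (static_cast<std::uint32_t>(p[10]) << 8) |
                 static_cast<std::uint32_t>(p[11]);
    return h;
}
```

Comparing dest_qp and psn from a capture against the values exchanged during connection setup is a quick way to confirm that both peers agreed on the same queue pair numbers and starting sequence numbers.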
3. Mellanox Diagnostic Tools
Run mlx_fw_checker to verify firmware status.
Use mlxlink to check physical link quality.
V. Precautions
1. Windows-Specific Behaviors
Programs need to be run with administrative privileges.
Some APIs require dynamic invocation via MLX5_WIN.dll.
2. Compatibility Issues
Ensure byte order consistency when communicating with Linux endpoints.
Verify MTU configuration matches (4096 is recommended).
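A 4096-byte RDMA path MTU only fits when the Ethernet MTU is large enough to hold the payload plus the RoCEv2 headers, which in practice means jumbo frames on both endpoints and every switch in between. The following is a minimal sketch of that relationship. The 96-byte header allowance is an assumed, deliberately generous figure rather than a value taken from any driver, and path_mtu_bytes and mtu_enum_code are hypothetical helper names.

```cpp
#include <cassert>

// Assumed per-packet header allowance in bytes. This is an illustrative,
// conservative figure and is NOT taken from any driver or specification.
constexpr int kAssumedOverhead = 96;

// Largest RDMA path MTU (256..4096 bytes) whose payload plus the header
// allowance fits in the given Ethernet MTU. Returns 0 if none fits.
inline int path_mtu_bytes(int eth_mtu, int overhead = kAssumedOverhead) {
    assert(eth_mtu > 0 && overhead >= 0);
    const int sizes[] = {4096, 2048, 1024, 512, 256};
    for (int size : sizes) {
        if (size + overhead <= eth_mtu) return size;
    }
    return 0;
}

// The verbs enum ibv_mtu encodes these sizes as 1 (256) through 5 (4096).
// Returns 0 for any other input.
inline int mtu_enum_code(int bytes) {
    switch (bytes) {
        case 256:  return 1;
        case 512:  return 2;
        case 1024: return 3;
        case 2048: return 4;
        case 4096: return 5;
        default:   return 0;
    }
}
```

Under this assumption a standard 1500-byte Ethernet MTU yields a 1024-byte path MTU, while a 9000-byte jumbo MTU allows 4096. Both peers should arrive at the same value before it is written to path_mtu.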
3. Security Mechanisms
Enable CMA (Connection Manager Abstraction) for access control.
Use IPSEC to encrypt RoCEv2 traffic (hardware support required).
VI. Reference Resources
1. Official Documentation:
Mellanox WinOF-2 User Manual
RDMA Aware Networks Programming User Manual
2. Sample Code:
windows_examples branch on the Mellanox GitHub repository.
Windows Direct Access samples on MSDN.
3. Community Support:
Mellanox Developer Forum
Windows Hardware Dev Center
Following this plan allows RoCEv2 functionality to be integrated step by step. Start with a simple ping-pong test program and expand it gradually into a complete application.