3. Testing IB network
3.1. Discovering all devices on IB network
There are several utilities, installed by the @infiniband group, that allow you to learn more about your IB fabric. Here are a few:
ibnetdiscover
Generate list of all devices on the IB fabric
ibdiagnet
Simple IB network diagnostics utility
iblinkinfo
Display link information for all ports on your switches
ibqueryerrors
Check for error counters in the IB network, useful to detect broken HCAs or cables
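A quick way to exercise all four is to run them from the master, as in the sequence below. Each can be run with no arguments; the output is omitted here since it depends entirely on your fabric, and the topology file name is just an example:
[root@master ~]# ibnetdiscover > /root/topology.txt
[root@master ~]# iblinkinfo
[root@master ~]# ibqueryerrors
[root@master ~]# ibdiagnet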
3.2. IB RDMA Ping Pong
One way to check if the IB connection is working between two nodes is by using the ibv_rc_pingpong
utility.
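The -d mlx4_0 argument used below names the HCA, which depends on your hardware. If you are unsure what your device is called, ibv_devices lists the HCAs present on a node, and ibv_devinfo shows the details of a given device, including the port state, which should be PORT_ACTIVE:
[root@c01 ~]# ibv_devices
[root@c01 ~]# ibv_devinfo -d mlx4_0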
Open two terminals, one on the master and one on one of your compute nodes (e.g. c01). On the master, launch the server:
[root@master ~]# ibv_rc_pingpong -d mlx4_0 -g 0 -i 1
local address: LID 0x000a, QPN 0x00020f, PSN 0xd26d63, GID fe80::2:c903:b:86f9
On the compute nodes, launch the client:
[root@c01 ~]# ibv_rc_pingpong -g 0 -d mlx4_0 -i 1 192.168.16.1
local address: LID 0x0012, QPN 0x00020e, PSN 0x24731a, GID fe80::2:c903:b:8965
remote address: LID 0x000a, QPN 0x00020f, PSN 0xd26d63, GID fe80::2:c903:b:86f9
8192000 bytes in 0.01 seconds = 6926.96 Mbit/sec
1000 iters in 0.01 seconds = 9.46 usec/iter
As soon as the client connects, both sides complete the test, and the master prints its own statistics too:
[root@master ~]# ibv_rc_pingpong -d mlx4_0 -g 0 -i 1
local address: LID 0x000a, QPN 0x00020f, PSN 0xd26d63, GID fe80::2:c903:b:86f9
remote address: LID 0x0012, QPN 0x00020e, PSN 0x24731a, GID fe80::2:c903:b:8965
8192000 bytes in 0.01 seconds = 6911.62 Mbit/sec
1000 iters in 0.01 seconds = 9.48 usec/iter
3.3. IB Point to Point Bandwidth Test
Another test is the IB write bandwidth test using ib_write_bw. It will allow you to measure the actual bandwidth you can achieve with your IB links.
On the master node, launch the server:
[root@master ~]# ib_write_bw -F --report_gbits

************************************
* Waiting for client to connect... *
************************************
On the compute node, launch the client:
[root@c01 ~]# ib_write_bw -F --report_gbits master
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x12 QPN 0x0249 PSN 0xa71d8e RKey 0xc0001101 VAddr 0x002ae6f7a9f000
 remote address: LID 0x0a QPN 0x024a PSN 0x8c7893 RKey 0xc8001101 VAddr 0x007fa59d6e0000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]    MsgRate[Mpps]
 65536      5000             27.12              27.12              0.051731
---------------------------------------------------------------------------------------
This test can also be run with multiple message sizes:
On the master:
[root@master ~]# ib_write_bw -F --report_gbits -a

************************************
* Waiting for client to connect... *
************************************
On the compute node:
[root@c01 ~]# ib_write_bw -F --report_gbits -a master
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x12 QPN 0x024a PSN 0x36d14f RKey 0xc8001101 VAddr 0x002b361f755000
 remote address: LID 0x0a QPN 0x024b PSN 0xaa0428 RKey 0xd0001101 VAddr 0x007fd6af938000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]    MsgRate[Mpps]
 2          5000             0.060086           0.058591           3.661928
 4          5000             0.15               0.15               4.628613
 8          5000             0.30               0.29               4.577756
 16         5000             0.60               0.59               4.621985
 32         5000             1.20               1.18               4.621891
 64         5000             2.36               2.33               4.558384
 128        5000             4.81               4.20               4.103666
 256        5000             10.12              9.80               4.783870
 512        5000             21.59              20.70              5.054233
 1024       5000             24.39              24.06              2.937238
 2048       5000             25.90              25.89              1.580097
 4096       5000             26.54              26.54              0.809866
 8192       5000             26.85              26.79              0.408743
 16384      5000             27.01              27.00              0.206011
 32768      5000             27.09              27.09              0.103351
 65536      5000             27.13              27.13              0.051745
 131072     5000             27.15              27.15              0.025892
 262144     5000             27.16              27.16              0.012951
 524288     5000             27.17              27.17              0.006477
 1048576    5000             27.17              27.17              0.003239
 2097152    5000             27.17              27.17              0.001619
 4194304    5000             27.16              27.16              0.000810
 8388608    5000             27.16              27.16              0.000405
---------------------------------------------------------------------------------------
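If you want to repeat the bandwidth test against every compute node in turn, a small loop can drive it from the master. The sketch below is only illustrative: it assumes passwordless root ssh to the nodes, that the hypothetical hostnames c01 through c04 resolve, and that the compute nodes can reach the head node by the name master.

#!/bin/bash
# Illustrative sketch: sweep ib_write_bw across a (hypothetical) list of compute nodes.
for node in c01 c02 c03 c04; do
    # Start the server side in the background; it exits after serving one test.
    ib_write_bw -F --report_gbits >/dev/null 2>&1 &
    sleep 1                      # give the server a moment to start listening
    echo "=== ${node} ==="
    # Run the client on the compute node and keep only the result lines.
    ssh "${node}" ib_write_bw -F --report_gbits master | tail -n 4
    wait                         # reap the background server before the next node
done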
3.4. IB Point to Point Latency Test
To test the latency between your nodes, you can run the ib_write_lat utility. This is useful if you want to compare the latency between nodes on the same IB switch with the latency between nodes on different IB switches.
On the master:
[root@master ~]# ib_write_lat -F --report_gbits

************************************
* Waiting for client to connect... *
************************************
On the compute node:
[root@c01 ~]# ib_write_lat -F --report_gbits master
---------------------------------------------------------------------------------------
                    RDMA_Write Latency Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 1
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 220[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x12 QPN 0x024b PSN 0x40c6bf RKey 0xd0001101 VAddr 0x000000009a1000
 remote address: LID 0x0a QPN 0x024c PSN 0x39bf21 RKey 0xd8001101 VAddr 0x00000000983000
---------------------------------------------------------------------------------------
 #bytes #iterations t_min[usec] t_max[usec] t_typical[usec] t_avg[usec] t_stdev[usec] 99% percentile[usec] 99.9% percentile[usec]
 2       1000          1.49        3.93         1.50           1.51         0.07        1.57                 3.93
---------------------------------------------------------------------------------------
You can also observe the increase in latency as the message size grows:
On the master:
[root@master ~]# ib_write_lat -F --report_gbits -a

************************************
* Waiting for client to connect... *
************************************
On the compute node:
[root@c01 ~]# ib_write_lat -F --report_gbits -a master
---------------------------------------------------------------------------------------
                    RDMA_Write Latency Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 1
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 220[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x12 QPN 0x024d PSN 0x952a0c RKey 0xe0001101 VAddr 0x002ac9f3a7a000
 remote address: LID 0x0a QPN 0x024e PSN 0x24da19 RKey 0xe8001101 VAddr 0x007f12eaee3000
---------------------------------------------------------------------------------------
 #bytes #iterations t_min[usec] t_max[usec] t_typical[usec] t_avg[usec] t_stdev[usec] 99% percentile[usec] 99.9% percentile[usec]
 2       1000          1.49        5.46         1.50           1.51         0.10        1.62                 5.46
 4       1000          1.43        4.64         1.50           1.51         0.11        1.59                 4.64
 8       1000          1.49        4.02         1.51           1.51         0.06        1.59                 4.02
 16      1000          1.49        4.49         1.51           1.52         0.06        1.58                 4.49
 32      1000          1.54        4.06         1.55           1.56         0.08        2.00                 4.06
 64      1000          1.54        4.51         1.56           1.56         0.06        1.64                 4.51
 128     1000          1.66        4.46         1.68           1.69         0.06        1.77                 4.46
 256     1000          2.81        4.37         2.84           2.84         0.04        2.99                 4.37
 512     1000          3.05        5.10         3.07           3.08         0.05        3.23                 5.10
 1024    1000          3.47        6.36         3.50           3.51         0.09        3.74                 6.36
 2048    1000          4.31        4.87         4.34           4.35         0.03        4.56                 4.87
 4096    1000          5.98        6.45         6.01           6.02         0.03        6.13                 6.45
 8192    1000          7.16        9.17         7.20           7.20         0.08        7.44                 9.17
 16384   1000          9.49       10.50         9.53           9.55         0.06        9.76                10.50
 32768   1000         14.12       15.07        14.18          14.20         0.05       14.41                15.07
 65536   1000         24.21       25.69        24.27          24.28         0.06       24.55                25.69
 131072  1000         43.48       44.44        43.57          43.59         0.08       43.91                44.44
 262144  1000         82.04       83.06        82.15          82.18         0.09       82.55                83.06
 524288  1000        159.20      160.24       159.35         159.38         0.11      159.70               160.24
 1048576 1000        313.54      314.77       313.75         313.77         0.11      314.07               314.77
 2097152 1000        622.19      623.23       622.56         622.56         0.14      622.91               623.23
 4194304 1000       1239.97     1247.21      1240.87        1240.90         0.32     1241.73              1247.21
 8388608 1000       2622.22     2626.76      2624.55        2624.55         0.61     2625.92              2626.76
---------------------------------------------------------------------------------------
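To make the same-switch versus cross-switch comparison mentioned above, you can also run the test between two compute nodes directly, without involving the master. A minimal sketch, assuming passwordless ssh and hypothetical node placements (c02 on the same switch as c01, c13 on a different switch):
[root@master ~]# ssh c02 ib_write_lat -F &        # server on a node on the same switch as c01
[root@master ~]# sleep 1; ssh c01 ib_write_lat -F c02
[root@master ~]# ssh c13 ib_write_lat -F &        # server on a node on a different switch
[root@master ~]# sleep 1; ssh c01 ib_write_lat -F c13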