Route resolution failure on Client + Segfault on Mellanox NIC #48

Open
vsag96 opened this issue Oct 1, 2020 · 6 comments

@vsag96

vsag96 commented Oct 1, 2020

Hi,

OFED version 5.0.2
NIC Mellanox ConnectX-5
OS Ubuntu 18
A Tofino switch does simple packet forwarding for us.

If I try with -DTransport=infiniband and -DROCE=on, the build succeeds, but when I run the hello world app I get the following error on the client side:

Received connect response from [H: 192.168.1.4:31850, R: 0, S: XX] for session 0. Issue: Error [Routing resolution failure]

The server receives the initial connect packet from the client, then the client segfaults and the server prints the following error in a loop:

Received connect request from [H: 192.168.1.3:31850, R: 0, S: 0]. Issue: Unable to resolve routing info [LID: 0, QPN: 449, GID interface ID 16601820732604482458, GID subnet prefix 33022]. Sending response.

In the README, you mention using -Dtransport=raw for Mellanox NICs, but I was not able to build with that flag (error trace attached). We want to use eRPC over RoCEv2 + DCQCN; we are fine with IB unless you advise otherwise. The RDMA devices show up in rdma link and ibdev2netdev.
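
For anyone reproducing this, a minimal standalone check (not eRPC code) of what the verbs layer reports for the port can help confirm the RoCE setup: on an Ethernet link layer a LID of 0 is expected, so the "LID: 0" in the log above is not by itself an error. The device name mlx5_0 and port number 1 below are assumptions and may need adjusting.

#include <infiniband/verbs.h>
#include <cstdio>
#include <cstring>

int main() {
  int num_devices = 0;
  ibv_device **dev_list = ibv_get_device_list(&num_devices);
  if (dev_list == nullptr) return 1;
  for (int i = 0; i < num_devices; i++) {
    // Only look at the device assumed in this issue (mlx5_0).
    if (strcmp(ibv_get_device_name(dev_list[i]), "mlx5_0") != 0) continue;
    ibv_context *ctx = ibv_open_device(dev_list[i]);
    ibv_port_attr port_attr;
    if (ctx != nullptr && ibv_query_port(ctx, 1, &port_attr) == 0) {
      // On RoCE the link layer is Ethernet and the LID is 0 by design.
      printf("state=%d lid=%u link_layer=%s\n", port_attr.state, port_attr.lid,
             port_attr.link_layer == IBV_LINK_LAYER_ETHERNET ? "Ethernet"
                                                             : "InfiniBand");
    }
    if (ctx != nullptr) ibv_close_device(ctx);
  }
  ibv_free_device_list(dev_list);
  return 0;
}

(Build with g++ check_port.cc -libverbs; check_port.cc is just a placeholder file name.)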

vsag96 changed the title from "Route resolution failure on Client and Segfault on Mellanox NIC" to "Route resolution failure on Client + Segfault on Mellanox NIC" on Oct 1, 2020
@anujkaliaiitd
Collaborator

Hi. Thanks for reporting this issue.

Can you confirm whether ib_read_bw works over RoCE?

@vsag96
Author

vsag96 commented Oct 1, 2020

I started ib_read_bw on the server and ran ib_read_bw on the client against it. Going by the output, I believe it works. Nonetheless, I am attaching the traces, as I am just starting out with running RDMA applications.

The server-side trace:

                RDMA_Read BW Test

Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet

local address: LID 0000 QPN 0x02c9 PSN 0x2df5d7 OUT 0x10 RKey 0x003572 VAddr 0x007f65b20e9000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:01:04
remote address: LID 0000 QPN 0x0239 PSN 0x2d5540 OUT 0x10 RKey 0x003582 VAddr 0x007f1a3666e000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:01:03

#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
65536 1000 10222.53 6831.89 0.109310

The client-side trace:

                RDMA_Read BW Test

Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON
TX depth : 128
CQ Moderation : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet

local address: LID 0000 QPN 0x0239 PSN 0x2d5540 OUT 0x10 RKey 0x003582 VAddr 0x007f1a3666e000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:01:03
remote address: LID 0000 QPN 0x02c9 PSN 0x2df5d7 OUT 0x10 RKey 0x003572 VAddr 0x007f65b20e9000
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:01:04

#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
Conflicting CPU frequency values detected: 1178.116000 != 800.186000. CPU Frequency is not max.
65536 1000 10222.53 6831.89 0.109310

@anujkaliaiitd
Collaborator

Thanks for the details.

Could you please test if eRPC works on your setup with older Mellanox drivers (e.g., Mellanox OFED 4.4)? There have been a lot of recent NIC driver changes and I've not kept the code up to date.

I am aware that eRPC doesn't build anymore with the Raw transport with new Mellanox OFED versions (or rdma_core) because the ibverbs API has changed. I plan to fix this eventually but I'm not sure when I'll have the time.
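
For comparison only (this is not eRPC's Raw transport code, and not necessarily the fix): a sketch of what a UDP flow-steering rule looks like with the plain rdma-core API that current OFED ships, directing UDP traffic for one destination port to a raw-Ethernet QP. The QP, port number, and UDP port are assumed to already exist in the application.

#include <infiniband/verbs.h>
#include <arpa/inet.h>
#include <cstring>

// Sketch only: steer UDP packets with destination port udp_port to `qp`.
ibv_flow *steer_udp_to_qp(ibv_qp *qp, uint8_t port_num, uint16_t udp_port) {
  struct {
    ibv_flow_attr attr;
    ibv_flow_spec_eth eth;
    ibv_flow_spec_ipv4 ipv4;
    ibv_flow_spec_tcp_udp udp;
  } __attribute__((packed)) rule;
  memset(&rule, 0, sizeof(rule));

  rule.attr.type = IBV_FLOW_ATTR_NORMAL;
  rule.attr.size = sizeof(rule);       // total size: header plus all specs
  rule.attr.num_of_specs = 3;
  rule.attr.port = port_num;

  rule.eth.type = IBV_FLOW_SPEC_ETH;   // any Ethernet frame (mask left zero)
  rule.eth.size = sizeof(rule.eth);
  rule.ipv4.type = IBV_FLOW_SPEC_IPV4; // carrying IPv4
  rule.ipv4.size = sizeof(rule.ipv4);
  rule.udp.type = IBV_FLOW_SPEC_UDP;   // and UDP with this destination port
  rule.udp.size = sizeof(rule.udp);
  rule.udp.val.dst_port = htons(udp_port);
  rule.udp.mask.dst_port = 0xffff;

  return ibv_create_flow(qp, &rule.attr);  // returns nullptr on failure
}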

@Dicridon

Hi Dr. Kalia,
I encountered the same issue. I have one 4-node cluster and one 2-node cluster; the former is equipped with CX5 NICs and the latter with CX4 NICs. eRPC runs well within each cluster but not between them.

I use the 4-node cluster as servers and the 2-node cluster as clients. On the client side I see:

96:964338 WARNG: Rpc 0: Received connect response from [H: 10.0.0.40:31851, R: 0, S: XX] for session 0. Issue: Error [Routing resolution failure].

First I thought it was due to invalid LIDs (ibv_devinfo shows all ports' LIDs are 0, which is invalid), but since eRPC worked within each cluster, maybe the zero LIDs were fine. Then I checked eRPC's source code and noticed that eRPC seemed unable to create an AH in IBTransport::create_ah. So I thought maybe the two clusters couldn't communicate using UD, but ib_send_bw -c UD and ib_read_bw both worked.

Could you give any advice for further troubleshooting?

@anujkaliaiitd
Collaborator

Hi! The verbs address handle creation process is a bit complex, so it's likely I missed something in my implementation of create_ah. The implementation is different for RoCE and InfiniBand (see struct ibv_ah *IBTransport::create_ah(const ib_routing_info_t *ib_rinfo) const), so I assume you're passing -DROCE=on if you're using RoCE.

My suggestion to fix this would be to see how the perftest package implements address handle resolution, and use that information to try fixing eRPC's create_ah.
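
For reference, a rough sketch of RoCE-style address handle creation along the lines of what perftest does. This is not eRPC's create_ah; pd, remote_gid, port_num, and sgid_index are placeholders for values the application already has.

#include <infiniband/verbs.h>
#include <cstdio>
#include <cstring>

// Sketch only: build an AH for a RoCE peer identified by its GID.
ibv_ah *create_roce_ah(ibv_pd *pd, const ibv_gid &remote_gid,
                       uint8_t port_num, uint8_t sgid_index) {
  ibv_ah_attr attr;
  memset(&attr, 0, sizeof(attr));
  attr.is_global = 1;               // RoCE packets always carry a GRH
  attr.dlid = 0;                    // LIDs are unused on an Ethernet link layer
  attr.port_num = port_num;
  attr.grh.dgid = remote_gid;       // peer's GID, exchanged out of band
  attr.grh.sgid_index = sgid_index; // local GID index: must select a GID of the
                                    // same RoCE version/address family as the peer
  attr.grh.hop_limit = 1;           // single L2 hop; raise for routed RoCEv2
  ibv_ah *ah = ibv_create_ah(pd, &attr);
  if (ah == nullptr) perror("ibv_create_ah");
  return ah;
}

If sgid_index points at the wrong entry in the local GID table, AH creation or the subsequent communication fails even though perftest works, since perftest takes an explicit GID index on the command line (its -x flag).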

@Dicridon

Dicridon commented Apr 5, 2022

Hi Dr. Kalia,
Sorry for my late reply; I was on holiday and then spent some time tracking down the create_ah issue.

Thanks to your precise analysis, I was able to find that the routing resolution failure is caused by mismatched GIDs. eRPC hard-codes the GID index (static constexpr size_t kDefaultGIDIndex = 1;). That default works most of the time, but unluckily my two clusters have different NIC configurations, so with RoCE enabled the default picks the wrong GID on one cluster and the two clusters fail to communicate.

The reason ib_send_bw -c UD works is that it requires users to supply a GID index and device ID, so it always gets the correct GID. Perhaps it would also be a good idea for eRPC to let users supply an optional GID index?
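
As a sketch of what such an option could back (not eRPC code; ctx and port_num stand for an open verbs device context and port number), dumping the GID table shows which index holds the RoCE v2 / IPv4-mapped GID on each cluster:

#include <infiniband/verbs.h>
#include <cstdio>

void print_gid_table(ibv_context *ctx, uint8_t port_num) {
  ibv_port_attr port_attr;
  if (ibv_query_port(ctx, port_num, &port_attr) != 0) return;
  for (int i = 0; i < port_attr.gid_tbl_len; i++) {
    ibv_gid gid;
    if (ibv_query_gid(ctx, port_num, i, &gid) != 0) continue;
    // An all-zero GID means the slot is unused. For RoCE with IPv4, the
    // usable GID is the IPv4-mapped one (the last 6 bytes are ff:ff
    // followed by the IPv4 address), and its index can differ per NIC
    // configuration, which matches the mismatch described above.
    printf("gid index %d: ", i);
    for (int b = 0; b < 16; b++) printf("%02x%c", gid.raw[b], b == 15 ? '\n' : ':');
  }
}

A user-supplied index could then simply replace kDefaultGIDIndex wherever eRPC reads it.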
