rpc latency suddenly increases when client sleeps for a while after rpc call #104

jiangxiaosheng · 2023-10-04T20:41:32Z

Hi,

When I ported eRPC to my system, I measured the latency of the RPC call and surprisingly found it to be very high (up to 10s of ms). To reproduce the results, here is the simplified program based on the hello_world example.

#include <thread>
#include <chrono>
erpc::Rpc<erpc::CTransport> *rpc;
erpc::MsgBuffer req;
erpc::MsgBuffer resp;

#define enable_timer std::chrono::time_point<std::chrono::system_clock> _st, _et
#define start_timing  _st = std::chrono::system_clock::now()
#define end_timing(msg, args...)  do {  \
    _et = std::chrono::system_clock::now(); \
    double _elapsed = (double) std::chrono::duration_cast<std::chrono::nanoseconds>(_et - _st).count() / 1000000; \
    printf("[%.3fms] " msg "\n", _elapsed, ##args);  \
  } while (0)

void cont_func(void *, void *t) { 
  // printf("%s\n", resp.buf_);
  auto done = static_cast<bool *>(t);
  *done = true;
}

void sm_handler(int, erpc::SmEventType, erpc::SmErrType, void *) {}

int main() {
  std::string client_uri = kClientHostname + ":" + std::to_string(kUDPPort);
  erpc::Nexus nexus(client_uri);

  rpc = new erpc::Rpc<erpc::CTransport>(&nexus, nullptr, 0, sm_handler);

  std::string server_uri = kServerHostname + ":" + std::to_string(kUDPPort);
  int session_num = rpc->create_session(server_uri, 0);

  while (!rpc->is_connected(session_num)) rpc->run_event_loop_once();

  req = rpc->alloc_msg_buffer_or_die(kMsgSize);
  resp = rpc->alloc_msg_buffer_or_die(kMsgSize);

  enable_timer;
  for (int i = 0; i < 100; i++) {
    start_timing;
    bool done = false;
    rpc->enqueue_request(session_num, kReqType, &req, &resp, cont_func, static_cast<void *>(&done));
    while (!done) rpc->run_event_loop_once();
    end_timing("rpc latency");
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }

  delete rpc;
}

So if you comment out the sleep function after the rpc call, the results are good - ~10us avg latency. However, if you do not, the latency rises up to 370us. And furthermore, if you sleep longer, like 100ms, the latency then increases to 4.5ms.

I've done some debugging on eRPC and found out that if I do sleep, there are re-transmissions for each request and it seems likely to be the reason for the latency issue.

I think this pattern should be very common though: user calls an RPC, does some processing (which may take some time as I simulated by a sleep), and repeats. Is it a bug or a feature? If it's a feature, how would you suggest I handle my use case? Thank you！

The text was updated successfully, but these errors were encountered:

ankalia · 2023-10-12T17:47:09Z

This is interesting, it's likely a bug in how eRPC implements packet retransmissions (

eRPC/src/rpc_impl/rpc_pkt_loss.cc

Line 12 in 094c17c

void Rpc<TTr>::pkt_loss_scan_st() {

).

Because the time between event loop invocations can be larger than eRPC's retransmission timeout, the code wrongly assumes that packets have been lost. This can be fixed. If you have a patch, I can merge it.

jiangxiaosheng · 2023-10-25T04:50:32Z

I'm afraid this is not the root cause - even after I commented out this if clause (

eRPC/src/rpc_impl/rpc_ev_loop.cc

Line 32 in 094c17c

if (unlikely(ev_loop_tsc_ - pkt_loss_scan_tsc_ > rpc_pkt_loss_scan_cycles_)) {

) for pkt loss scanning, which is the only place where pkt_loss_scan_st() is invoked, the latency is still very high. Could you please confirm? Thanks in advance.

jiangxiaosheng · 2023-10-31T04:15:31Z

@ankalia Do you think it might be related to rdtsc()? I found eRPC heavily relies on rdtsc() for timing. Since user applications can't guarantee to keep the CPU 100% busy, I suspect timing based on TSC would be inaccurate and may cause strange issues.

ankalia · 2023-10-31T16:36:30Z

rdtsc always ticks at the same speed regardless of whether the core is active/asleep.

JFYI: I currently don't have the cycles to debug this in detail.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rpc latency suddenly increases when client sleeps for a while after rpc call #104

rpc latency suddenly increases when client sleeps for a while after rpc call #104

jiangxiaosheng commented Oct 4, 2023 •

edited

Loading

ankalia commented Oct 12, 2023

jiangxiaosheng commented Oct 25, 2023 •

edited

Loading

jiangxiaosheng commented Oct 31, 2023

ankalia commented Oct 31, 2023

rpc latency suddenly increases when client sleeps for a while after rpc call #104

rpc latency suddenly increases when client sleeps for a while after rpc call #104

Comments

jiangxiaosheng commented Oct 4, 2023 • edited Loading

ankalia commented Oct 12, 2023

jiangxiaosheng commented Oct 25, 2023 • edited Loading

jiangxiaosheng commented Oct 31, 2023

ankalia commented Oct 31, 2023

jiangxiaosheng commented Oct 4, 2023 •

edited

Loading

jiangxiaosheng commented Oct 25, 2023 •

edited

Loading