
Check DataPlane latencies after Traffic Shaping at the host level #48

Open
alebre opened this issue Jan 19, 2017 · 3 comments
alebre commented Jan 19, 2017

Once TC network constraints have been defined at the NICs level (whatever the number of NICs), what is the latency we can expect at the level of the DataPlane (i.e. from the VMs that are executed on the hosts).
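
For context, "TC network constraints at the NICs level" boils down to standard tc rules installed on the physical interfaces of each host. A minimal sketch of such a rule set (interface name and values are illustrative, not the exact commands generated by Enos):

# illustrative only: shape egress of eth0 to 50 mbit and add 10 ms of delay
tc qdisc add dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 50mbit
tc qdisc add dev eth0 parent 1:10 handle 10: netem delay 10ms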

msimonin commented Jan 25, 2017

Setup:

topology:
  grp1:
    parasilo:
      control: 1
      network: 1
      util: 1
  grp2:
    parasilo:
      compute: 1
  grp3:
    parasilo:
      compute: 1

network_constraints:
  default_delay: 10ms
  default_rate: 50mbit
  enable: true
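
As a baseline for the results below (assuming the 10 ms delay is applied on the egress of each constrained host, which is what the measurements are consistent with):

  10 ms (host A egress) + 10 ms (host B egress)      => ~20 ms RTT between two constrained hosts
  one extra constrained hop (e.g. the network node)  => ~40 ms RTT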

There are several different cases to consider:

VM to VM traffic using private IPs (same private network)

Here are the results of one ping between two VMs hosted on different compute nodes, using their private IPs.

ping 192.168.0.11 -c 5
PING 192.168.0.11 (192.168.0.11): 56 data bytes
64 bytes from 192.168.0.11: seq=0 ttl=64 time=20.952 ms
64 bytes from 192.168.0.11: seq=1 ttl=64 time=20.412 ms
64 bytes from 192.168.0.11: seq=2 ttl=64 time=20.447 ms
64 bytes from 192.168.0.11: seq=3 ttl=64 time=20.524 ms
64 bytes from 192.168.0.11: seq=4 ttl=64 time=20.425 ms

--- 192.168.0.11 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 20.412/20.552/20.952 ms

The average overhead of 0.552 ms is probably due to the encapsulation/decapsulation of the packets. Packets are encapsulated and use the IPs of the relevant compute nodes while traversing the VXLAN tunnels, so traffic shaping rules are applied as expected.
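
One way to confirm the encapsulation hypothesis is to capture on the physical NIC of one of the compute nodes while pinging; with the default VXLAN UDP port, the tunnelled traffic shows up as UDP/4789 between the compute nodes' IPs (interface name is illustrative):

# illustrative check, assuming the default VXLAN port 4789
tcpdump -n -i eth0 udp port 4789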

VM to VM using private IPs (different private networks - no DVR)

DVR = Distributed Virtual Routing

$ ping -c 5  10.0.0.11
PING 10.0.0.11 (10.0.0.11): 56 data bytes
64 bytes from 10.0.0.11: seq=0 ttl=63 time=40.801 ms
64 bytes from 10.0.0.11: seq=1 ttl=63 time=40.755 ms
64 bytes from 10.0.0.11: seq=2 ttl=63 time=40.714 ms
64 bytes from 10.0.0.11: seq=3 ttl=63 time=40.715 ms
64 bytes from 10.0.0.11: seq=4 ttl=63 time=40.672 ms

--- 10.0.0.11 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 40.672/40.731/40.801 ms

This is expected since all the traffic goes through the centralized L3 agent hosted on the network node.
Note that even if the VMs are on the same physical host, the latency will be 40 ms.
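
An illustrative way to see the centralized routing: on the network node, the Neutron L3 agent creates a qrouter-<router-id> namespace, and the routing between the two private networks happens there (the router id below is a placeholder):

ip netns | grep qrouter
ip netns exec qrouter-<router-id> ip route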

VM to VM using private IPs (different private networks - with DVR)

  • VMs not on the same host.
$ ping -c 5 10.0.0.6
PING 10.0.0.6 (10.0.0.6): 56 data bytes
64 bytes from 10.0.0.6: seq=0 ttl=63 time=20.562 ms
64 bytes from 10.0.0.6: seq=1 ttl=63 time=20.547 ms
64 bytes from 10.0.0.6: seq=2 ttl=63 time=20.517 ms
64 bytes from 10.0.0.6: seq=3 ttl=63 time=20.512 ms
64 bytes from 10.0.0.6: seq=4 ttl=63 time=20.496 ms

--- 10.0.0.6 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 20.496/20.526/20.562 ms

Traffic is encapsulated and goes from compute node 1 to compute node 2.

  • VMs on the same host
$ ping -c 5 10.0.0.6
PING 10.0.0.6 (10.0.0.6): 56 data bytes
64 bytes from 10.0.0.6: seq=0 ttl=63 time=0.682 ms
64 bytes from 10.0.0.6: seq=1 ttl=63 time=0.298 ms
64 bytes from 10.0.0.6: seq=2 ttl=63 time=0.290 ms
64 bytes from 10.0.0.6: seq=3 ttl=63 time=0.231 ms
64 bytes from 10.0.0.6: seq=4 ttl=63 time=0.295 ms

--- 10.0.0.6 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.231/0.359/0.682 ms

Traffic does not go through the network node (DVR is activated), so it stays local to the host.
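
As an illustrative check: with DVR the L3 agent also runs on the compute nodes, so a local qrouter-<router-id> namespace should be visible directly on the compute node hosting the VMs, which is consistent with the traffic never leaving the host:

ip netns | grep qrouter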

VM to VM traffic using floating IPs and no DVR

$ ping -c 5 10.24.90.1
PING 10.24.90.1 (10.24.90.1): 56 data bytes
64 bytes from 10.24.90.1: seq=0 ttl=63 time=40.654 ms
64 bytes from 10.24.90.1: seq=1 ttl=63 time=40.764 ms
64 bytes from 10.24.90.1: seq=2 ttl=63 time=40.747 ms
64 bytes from 10.24.90.1: seq=3 ttl=63 time=40.732 ms
64 bytes from 10.24.90.1: seq=4 ttl=63 time=40.684 ms

--- 10.24.90.1 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 40.654/40.716/40.764 ms

Using the floating IP introduces an extra hop (the network node). There is a 20 ms delay between each pair of hosts, so we get a 40 ms delay when using the floating IP (DVR wasn't activated). Once again, packets leaving a compute node are encapsulated and are directed to the network node first, and then to the second compute node.
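
Making the arithmetic explicit (assuming the 10 ms delay on the egress of every constrained host):

  compute 1 -> network node : 10 ms
  network node -> compute 2 : 10 ms
  compute 2 -> network node : 10 ms
  network node -> compute 1 : 10 ms
  round trip                : ~40 ms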

VM to VM traffic using floating IPs and DVR activated

$ ping -c 5 10.24.90.6
PING 10.24.90.6 (10.24.90.6): 56 data bytes
64 bytes from 10.24.90.6: seq=0 ttl=60 time=0.452 ms
64 bytes from 10.24.90.6: seq=1 ttl=60 time=0.428 ms
64 bytes from 10.24.90.6: seq=2 ttl=60 time=0.428 ms
64 bytes from 10.24.90.6: seq=3 ttl=60 time=0.422 ms
64 bytes from 10.24.90.6: seq=4 ttl=60 time=0.430 ms

--- 10.24.90.6 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.422/0.432/0.452 ms

Here traffic shaping rules aren't applied. This is due to the way DVR handles packets and needs more investigation. It is probably because packets aren't encapsulated and the destination is directly the floating IP of the VM (and not the compute node's IP, as it would be if the packet were encapsulated).
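
The "no encapsulation" hypothesis could be checked with a capture on the physical NIC of the source compute node: with VXLAN we would see UDP/4789 between the compute node IPs, whereas direct routing would show ICMP packets addressed to the floating IP itself (interface name is illustrative):

tcpdump -n -i eth0 'udp port 4789 or (icmp and host 10.24.90.6)'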

VM to external

$ ping -c 5 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=53 time=25.229 ms
64 bytes from 8.8.8.8: seq=1 ttl=53 time=25.139 ms
64 bytes from 8.8.8.8: seq=2 ttl=53 time=25.117 ms
64 bytes from 8.8.8.8: seq=3 ttl=53 time=25.119 ms
64 bytes from 8.8.8.8: seq=4 ttl=53 time=25.145 ms

--- 8.8.8.8 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 25.117/25.149/25.229 ms

We get extra latency due to the traffic between the compute node and the network node. This was expected.
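
This is presumably the ~20 ms of emulated delay on the compute node <-> network node round trip added on top of the real RTT to 8.8.8.8 (the real RTT was not measured separately here, so this split is an assumption).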

External to VM (DVR)

root@parasilo-17-kavlan-5:~# ping -c 5 10.24.90.6
PING 10.24.90.6 (10.24.90.6) 56(84) bytes of data.
64 bytes from 10.24.90.6: icmp_seq=1 ttl=62 time=0.268 ms
64 bytes from 10.24.90.6: icmp_seq=2 ttl=62 time=0.210 ms
64 bytes from 10.24.90.6: icmp_seq=3 ttl=62 time=0.239 ms
64 bytes from 10.24.90.6: icmp_seq=4 ttl=62 time=0.215 ms
64 bytes from 10.24.90.6: icmp_seq=5 ttl=62 time=0.210 ms

--- 10.24.90.6 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3996ms
rtt min/avg/max/mdev = 0.210/0.228/0.268/0.026 ms

When DVR is activated, traffic doesn't follow the traffic shaping rules. This is somewhat expected given the previous results with DVR. It is probably because traffic goes directly to the compute node hosting the VM.
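
A traceroute from the external host towards the floating IP would make this visible, i.e. show whether the network node appears as an intermediate hop or not (with DVR, the hypothesis above is that it does not):

traceroute -n 10.24.90.6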

External to VM (No DVR)

$ ping -c 5 10.24.90.3
PING 10.24.90.3 (10.24.90.3) 56(84) bytes of data.
64 bytes from 10.24.90.3: icmp_seq=1 ttl=62 time=20.6 ms
64 bytes from 10.24.90.3: icmp_seq=2 ttl=62 time=20.5 ms
64 bytes from 10.24.90.3: icmp_seq=3 ttl=62 time=20.6 ms
64 bytes from 10.24.90.3: icmp_seq=4 ttl=62 time=20.6 ms
64 bytes from 10.24.90.3: icmp_seq=5 ttl=62 time=20.5 ms

--- 10.24.90.3 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 20.538/20.591/20.631/0.132 ms

Without DVR, this is consistent with the rules applied. It is explained by the fact that all the external traffic flows through the network node and is then encapsulated towards the compute node hosting the VM.

@msimonin

@alebre I think the short answer is:
The way Enos applies traffic shaping rules is:

  • OK without DVR;
  • OK in every case(1) with DVR, except one:
    • inter-VM communication isn't consistent with the traffic shaping rules when using floating IPs.

(1) Did I miss some other cases?

alebre commented Jan 30, 2017

Important: please note that if you do not enable DVR, all L3 communications go through the network controller.
If the controller has been deployed on a remote site, this means that all VMs will suffer from the emulated latency (i.e. if you have two compute nodes on a remote site and two VMs deployed, one on Compute 1 and one on Compute 2, then when the two VMs communicate with each other, the traffic goes through the network controller deployed remotely).
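
For the record, with a kolla-ansible based deployment DVR is toggled through the kolla-ansible globals; a hedged sketch (the exact wiring of this option in the Enos configuration used here is not verified in this thread):

# kolla-ansible globals.yml
enable_neutron_dvr: "yes"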
