Add a Stream Migration spec #406

Open
wants to merge 12 commits into
base: master
from
217 changes: 217 additions & 0 deletions connections/stream-migration.md
@@ -0,0 +1,217 @@
# Stream migration

| Lifecycle Stage | Maturity | Status | Latest Revision |
|-----------------|---------------|--------|-----------------|
| 1A | Working Draft | Active | r0, 2022-04-13 |

Authors: [@marcopolo]

Interest Group: TODO

## Introduction

A peer may have many connections open to another peer and may be transmitting
data over a less optimal one. For example, a peer could be connected to
another peer both directly and via a relay. In that case we'd like to move any
streams from the relay over to the better direct connection. A similar
argument can be made with QUIC and TCP.

A peer `A` may also open a connection to another peer `B`, and at roughly the
same time the other peer `B` may open a connection to peer `A`. In this case we
end up with two connections and no protocol to consolidate these connections.

This protocol describes how an abstract stream can be moved from one underlying
stream to another stream (possibly on a different connection). This protocol
enables the peer to prune excess connections since they will no longer be used.
It does not define how a node should pick which connection to keep and which
to prune; that is left as a topic for another spec.

## Requirements of this protocol

The design of this protocol is informed by the following requirements:
1. Transport agnostic. Really, this means migrating at the stream level.
1. Minimal overhead. Overhead should be at most a small per-stream cost (no
additional framing, etc.)
1. No interruption. Reading/writing should be continuous.
1. Transparent. Applications using migratable streams shouldn't notice anything.
1. Correct. There can't be any ambiguity (one side believing the migration
happened, the other side disagreeing, etc.).

## The Protocol

The goal of the protocol is to move traffic from one stream to another
seamlessly. The final state of the new stream should be the same as the initial
state of the old stream.

The protocol should only be used when the initiator knows the responder
understands the stream-migration protocol. Otherwise we waste 1 round trip.

Contributor

Suggested change
understands the stream-migration protocol. Otherwise we waste 1 round trip.
understands the stream-migration protocol. In that case the negotiation of the stream migration protocol can be pipelined with the negotiation of the application protocol, and therefore doesn't cost any additional round trips.


The protocol works as a prefix before another protocol. If we are creating a
stream for some user protocol `P`, we coordinate the stream-migration protocol
first, and then negotiate protocol `P` later. The stream-migration protocol
assigns an ID for the stream with the `Label` or `Migrate`
[message](#stream-migration-messages) so that both sides can know the ID for the
stream. This way when a peer decides to migrate the stream later on, it can
reference which stream it wants to migrate and both peers know which stream is
being referenced.

![stream-migration](./stream-migration/stream-migration.svg)

<details>
<summary>Instructions to reproduce diagram</summary>

``` plantuml
@startuml stream-migration
skinparam sequenceMessageAlign center
entity Initiator
entity Responder

note over Initiator, Responder
Assume both sides understand stream-migration.
end note

Initiator -> Responder: Open connection
Initiator -> Responder: Open multiplexed stream

Initiator -> Responder: Negotiate stream-migration protocol with ""<stream-migration protocol id>""

Initiator -> Responder: Send ""StreamMigration(type=Label(id=0))"" message

Initiator -> Responder: <i> continue negotiating underlying protocol </i>
... <i>Nodes use the stream as normal</i> ...

== Stream Migration ==

note over Initiator, Responder: Migrate <b>Stream 0</b> to <b>Stream 2</b>

Initiator -> Responder: Open new stream
Initiator -> Responder: Negotiate stream-migration protocol with ""<stream-migration protocol id>""

Initiator -> Responder: <b>Stream 2:</b> Send ""StreamMigration(type=Migrate(id=2, from=0))"" message

Initiator <- Responder: <b>Stream 2:</b> Send AckMigrate message

note over Responder
Treat any ""EOF"" on <b>stream 0</b> as a signal
that it should continue reading on <b>stream 2</b>
end note


note over Initiator
Close <b>stream 0</b> for writing.
Will only write to <b>stream 2</b> from now on.
end note

Initiator -> Responder: <b>Stream 0:</b> ""EOF""

note over Responder
When <i>Responder</i> reads ""EOF"" on <b>stream 0</b>
it will close <b>stream 0</b> for writing.
It will only write to <b>stream 2</b> from now on.
end note

Initiator <- Responder: <b>Stream 0:</b> ""EOF""

note over Initiator
Treat any ""EOF"" on <b>stream 0</b> as a signal
that it should continue reading on <b>stream 2</b>
end note

note over Initiator, Responder
At this point <b>stream 0</b> is closed for writing on
both sides, and both sides have read up to ""EOF"".
<b>stream 0</b> has been fully migrated to <b>stream 2</b>
end note

@enduml
```

To generate:
```bash
plantuml stream-migration.md -o stream-migration -tsvg
```
</details>

The responder may deny the migration by responding with an `AckMigrate`
message whose `deny_migrate` field is set to true. In that case, the new
stream should be closed and the initiator should not make future attempts to
migrate this stream. A stream reset received before an `AckMigrate` should be
interpreted by the initiator as a sign that it should try again later.

### Stream Migration Protocol ID
The protocol id should be `/libp2p/stream-migration`.

### Stream IDs

Stream IDs are uint64 values. They are defined by the stream initiator and
conveyed to the responder in the `StreamMigration` message. An ID should be
unique between the two nodes involved, even across connections. In other words,
for two peers A and B, every labelled stream between them should have a unique
uint64 ID. To ensure that this holds across implementations, IDs generated by
the peer with the lower peer ID should be even, and IDs generated by the peer
with the higher peer ID should be odd.

Here is one strategy that implementors could use when labelling a stream. It is
not required, as long as the above invariants hold.
1. Define an atomic uint64 as a counter.
1. Grab and increment the counter; call the grabbed value `ID`.
1. Bitshift-left the `ID` by one (i.e. multiply it by two).
1. Check if we are the lower peer ID.
   1. If yes, do nothing.
   1. If no, add one to the `ID`.
1. Use `ID` as the ID for the stream.

Note that this reduces the counter space to 63 bits, since the lowest bit
signals which node labelled the stream. This limit should be plenty in
practice, as no node will have more than `2^63` streams.
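
The labelling strategy above could look roughly like this in Go. It is a sketch under assumptions: the type `idAllocator` and the `isLowerPeer` flag are hypothetical names, and how two peer IDs are compared to decide which one is "lower" is left to the caller because this section does not define it.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// idAllocator hands out stream IDs for the streams that one peer labels
// towards one remote peer. (Hypothetical type, for illustration only.)
type idAllocator struct {
	counter     uint64 // only accessed through sync/atomic
	isLowerPeer bool   // assumption: the caller has already compared the peer IDs
}

// next grabs the current counter value, increments the counter, shifts
// the grabbed value left by one bit, and sets the lowest bit when the
// local peer has the higher peer ID.
func (a *idAllocator) next() uint64 {
	n := atomic.AddUint64(&a.counter, 1) - 1
	id := n << 1
	if !a.isLowerPeer {
		id |= 1
	}
	return id
}

func main() {
	lower := &idAllocator{isLowerPeer: true}
	higher := &idAllocator{isLowerPeer: false}

	fmt.Println(lower.next(), lower.next(), lower.next())    // even IDs
	fmt.Println(higher.next(), higher.next(), higher.next()) // odd IDs
}
```

Because the two peers draw from disjoint sets (even versus odd), they never need to coordinate to keep IDs unique between them.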

### Stream Migration Messages

Messages for stream migration are Protobuf messages defined in
[./stream-migration/streammigration.proto](./stream-migration/streammigration.proto).

The first StreamMigration message sent over the wire by the stream initiator can
be one of:
1. Label. This labels this stream with an ID (defined above).
1. Migrate. This starts the migration from a given ID to this stream. It also
labels this stream with an ID.

The responder should only respond to the `Migrate` message, with an `AckMigrate`
message that may optionally deny the migration.
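
To make the message flow concrete, here is a rough Go sketch of how a responder might handle the first message on a stream. The struct definitions are hypothetical stand-ins, since the actual Protobuf definitions in `streammigration.proto` are not reproduced here. The `allow` policy hook and the choice to return an error when `Migrate` references an unknown stream are assumptions; the spec text above does not say how that case is handled.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical Go mirrors of the messages described above.
type Label struct{ ID uint64 }
type Migrate struct{ ID, From uint64 }
type AckMigrate struct{ DenyMigrate bool }

// StreamMigration carries exactly one of Label or Migrate.
type StreamMigration struct {
	Label   *Label
	Migrate *Migrate
}

var (
	errMalformed     = errors.New("expected exactly one of Label or Migrate")
	errUnknownStream = errors.New("migrate references an unlabelled stream")
)

// responder tracks which stream IDs have been labelled so far.
type responder struct {
	labelled map[uint64]bool
	allow    func(from, to uint64) bool // assumed local policy hook
}

// handleFirst processes the first message on a new stream. It returns an
// AckMigrate only for Migrate; a Label message gets no response.
func (r *responder) handleFirst(msg StreamMigration) (*AckMigrate, error) {
	switch {
	case msg.Label != nil && msg.Migrate == nil:
		r.labelled[msg.Label.ID] = true
		return nil, nil
	case msg.Migrate != nil && msg.Label == nil:
		m := msg.Migrate
		if !r.labelled[m.From] {
			return nil, errUnknownStream // assumption, see lead-in
		}
		deny := !r.allow(m.From, m.ID)
		if !deny {
			r.labelled[m.ID] = true // Migrate also labels the new stream
		}
		return &AckMigrate{DenyMigrate: deny}, nil
	default:
		return nil, errMalformed
	}
}

func main() {
	r := &responder{
		labelled: map[uint64]bool{},
		allow:    func(from, to uint64) bool { return true },
	}
	_, _ = r.handleFirst(StreamMigration{Label: &Label{ID: 0}})
	ack, err := r.handleFirst(StreamMigration{Migrate: &Migrate{ID: 2, From: 0}})
	fmt.Println(ack.DenyMigrate, err)
}
```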

## Who moves the stream

This protocol makes no assumption about which node starts the migration. Either
node may start it. However, extensions to this protocol may want to designate a
single node to start the migration (e.g. by picking the node with the lowest
peer ID).

## Picking the best connection

Contributor

What happens if the peer misbehaves, e.g. when B just doesn’t send an EOF on stream B? While A can “consider” the stream as closed, it still needs to be closed explicitly, otherwise the stream multiplexer can’t garbage-collect the stream after it has been EOFed from both sides.

Contributor Author

I think if the Responder misbehaves and doesn't close B then the stream migration hasn't finished since the new stream isn't in the same state as the old stream. I'm not sure what else we can do besides consider it not spec-compliant.

Picking the best connection to migrate streams towards is outside the scope of
this protocol. However, it's important to note that only the initiator of the
stream migration is concerned with which connection to pick. The two sides do
not need to share the same notion of which connection is best.

### Resets

If either stream is "reset" before both ends are closed, both streams must be
reset and the stream as a whole should be considered "aborted" (reset).
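
A minimal Go sketch of this rule, assuming each underlying stream exposes a `Reset` method. The `resetter` interface and the `migration` type are hypothetical and only serve to illustrate the propagation.

```go
package main

import "fmt"

// resetter is the only capability this sketch needs from a stream.
type resetter interface{ Reset() error }

// fakeStream records whether it has been reset; it stands in for a real
// multiplexed stream.
type fakeStream struct{ wasReset bool }

func (s *fakeStream) Reset() error { s.wasReset = true; return nil }

// migration tracks the two underlying streams of an in-progress migration.
type migration struct {
	oldStream, newStream resetter
	aborted              bool
}

// onReset is meant to be called when either underlying stream is observed
// to be reset before both ends are closed. It resets both streams and
// marks the abstract stream as aborted.
func (m *migration) onReset() error {
	if m.aborted {
		return nil // already handled
	}
	m.aborted = true
	errOld := m.oldStream.Reset()
	errNew := m.newStream.Reset()
	if errOld != nil {
		return errOld
	}
	return errNew
}

func main() {
	oldS, newS := &fakeStream{}, &fakeStream{}
	m := &migration{oldStream: oldS, newStream: newS}
	_ = m.onReset()
	fmt.Println(oldS.wasReset, newS.wasReset, m.aborted)
}
```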

### Half-closed streams

The final migrated stream should look the same as the initial stream. If the
initial stream `1` was half-closed, then the final migrated stream `2` should
also be half-closed. Note that this may involve an extra step by one of the
nodes. If a node had closed writes to its old stream before migration, it should
also close writes to the new stream after migration.
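
The extra step for half-closed streams could be sketched in Go like this. The `writeCloser` interface, the `migratableStream` wrapper and its `switchTo` method are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// writeCloser is the only capability this sketch needs from a stream.
type writeCloser interface{ CloseWrite() error }

// fakeStream records whether its write side has been closed.
type fakeStream struct{ writeClosed bool }

func (s *fakeStream) CloseWrite() error { s.writeClosed = true; return nil }

// migratableStream wraps the current underlying stream and remembers
// whether the application has already closed the write side.
type migratableStream struct {
	current     writeCloser
	writeClosed bool
}

// CloseWrite closes writes on the current underlying stream and records
// that fact so it can be replayed after a migration.
func (m *migratableStream) CloseWrite() error {
	m.writeClosed = true
	return m.current.CloseWrite()
}

// switchTo is called once the migration has completed. If writes had been
// closed on the old stream, they are closed on the new stream as well, so
// the abstract stream stays half-closed.
func (m *migratableStream) switchTo(next writeCloser) error {
	m.current = next
	if m.writeClosed {
		return next.CloseWrite()
	}
	return nil
}

func main() {
	oldS, newS := &fakeStream{}, &fakeStream{}
	m := &migratableStream{current: oldS}
	_ = m.CloseWrite()   // application half-closes before the migration
	_ = m.switchTo(newS) // the migration replays the half-close
	fmt.Println(newS.writeClosed)
}
```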

Contributor

We should probably call out what happens if a node sends new data if the stream was already closed in that direction: that is a connection error.


## Appendix

[Specs Issue](https://github.com/libp2p/specs/issues/328)

### Related Issues:

- <https://github.com/libp2p/go-libp2p/issues/634>

## Open Questions

Some questions that will probably be resolved when a PoC is implemented.