Liveness bug in VSR implementation: Client table and uncommitted ops after view change #16

Open
jorangreef opened this issue Sep 15, 2021 · 0 comments

I believe there may be a very subtle liveness bug in https://github.com/UWSysLab/tapir/blob/master/replication/vr/replica.cc#L450 where the client table (effectively a record of committed replies) is modified on both the prepare path and the commit path.

However, uncommitted ops differ from committed ops in that they may not survive a view change. The implementation does not appear to account for this: it does not fix up the client table after a view change if the table was modified by prepared ops that did not survive. Requests from the affected clients can then be permanently locked out, rejected as duplicates even though they were never actually committed.
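
To make the suspected failure concrete, here is a toy model in Python. It is not the TAPIR code: `BuggyReplica`, `on_request`, and `on_view_change` are hypothetical names, and the view change is assumed to drop every uncommitted op.

```python
# Toy model of the suspected failure mode; NOT the TAPIR implementation.
# All class and method names here are hypothetical.
class BuggyReplica:
    def __init__(self):
        self.client_table = {}  # client_id -> latest request number seen
        self.log = []           # entries: {"client", "request", "committed"}

    def on_request(self, client_id, request_no):
        # Dedupe against the client table.
        if request_no <= self.client_table.get(client_id, 0):
            return "duplicate"
        # Behaviour being modelled: the table is updated at prepare time,
        # before the op is known to be committed.
        self.client_table[client_id] = request_no
        self.log.append(
            {"client": client_id, "request": request_no, "committed": False}
        )
        return "prepared"

    def on_view_change(self):
        # Assumption for this scenario: no uncommitted op survives.
        # The client table is left untouched, which is the problem.
        self.log = [entry for entry in self.log if entry["committed"]]


replica = BuggyReplica()
first = replica.on_request("c1", 1)  # prepared, but never committed
replica.on_view_change()             # the op is dropped from the log
retry = replica.on_request("c1", 1)  # the client retries the same request
```

In this model the retry is rejected as a duplicate even though the log no longer contains the op, so the client can never make progress on that request.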

A cleaner approach might be to use the client table for a single purpose, i.e. only for committed data, and to use the inflight pipeline to dedupe any uncommitted ops. This way the client table never needs to be patched up after a view change.
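
A minimal sketch of that separation, again in Python and again with hypothetical names (`Replica`, `inflight`, `on_commit`), might look like the following. It assumes the inflight pipeline is simply discarded on a view change; a real implementation would presumably rebuild it from whichever ops survive in the new view's log.

```python
# Sketch of the proposed separation; names and structure are assumptions,
# not taken from the TAPIR code base.
class Replica:
    def __init__(self):
        self.client_table = {}  # client_id -> (request_no, reply); committed only
        self.inflight = {}      # (client_id, request_no) -> op; uncommitted only

    def on_request(self, client_id, request_no, op):
        committed = self.client_table.get(client_id)
        if committed is not None:
            if request_no == committed[0]:
                # Already committed: resend the cached reply.
                return ("reply", committed[1])
            if request_no < committed[0]:
                # Older than the latest committed request: ignore.
                return ("stale", None)
        if (client_id, request_no) in self.inflight:
            # Prepared but not yet committed: dedupe via the pipeline.
            return ("duplicate_inflight", None)
        self.inflight[(client_id, request_no)] = op
        return ("prepared", None)

    def on_commit(self, client_id, request_no):
        # Only the commit path writes to the client table.
        op = self.inflight.pop((client_id, request_no))
        reply = f"result of {op}"
        self.client_table[client_id] = (request_no, reply)
        return reply

    def on_view_change(self):
        # Assumption: uncommitted ops are dropped. The client table holds
        # committed data only, so it needs no repair.
        self.inflight.clear()
```

With this split, a request that was prepared but lost in a view change is accepted again when the client retries, because nothing in the client table refers to it.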
