Misc. questions from assertion flows #30

Open
lrettig opened this issue Mar 23, 2020 · 10 comments
Labels
flows (Questions that came up as part of assertion flows work), question (Further information is requested)

Comments

@lrettig
Member

lrettig commented Mar 23, 2020

Consensus

  • What would happen if we had a Hare but no Tortoise?
    • We could still achieve consensus but it would be a different kind of consensus. Tortoise gives us 1. irreversibility (the property that once something is valid it becomes harder and harder to reverse it with more blocks in the mesh, i.e., the way Bitcoin works), 2. self-healing, and we'd lose these properties. We'd also need to store Hare vote data on the mesh somehow so that future participants could verifiably calculate the global state.
  • Without the Hare, would pbase lag further behind? I.e., does the Hare help pbase "keep up"?
    • Yes, that's the main reason we have the Hare: it "helps convergence" since all honest parties vote the same way. All honest parties might vote the same way anyway (or very close to it), but the Hare makes a balancing attack harder to pull off.
  • How do inactive miners achieve Hare consensus with the active Hare committee participants? Do they just passively watch gossiped Hare protocol messages?
    • There is no single "committee"; the miners active in each round change and a miner must be ready to become active at any moment.
    • In theory you could run Hare entirely "off chain", without gossip, in private P2P channels, but then you'd need the participants (at least those in the final round) to sign and commit to the results and publish that (i.e., adding an extra round).
    • If this final, signed statement were published, any miner could just validate its signatures to validate the result.
    • As it stands, however, everyone needs to follow the messages from each round in order to validate the results.
  • How close do we expect pbase to stay to the current top layer? Is there some threshold that we expect it to stay within - let's say, assuming normal assumptions hold and there is no attack. Another way of phrasing the question: is there an N such that, if the difference between pbase and the current top layer exceeded N, we'd worry, or say there's an issue/some assumption had failed? Is there some formula we can use to calculate how large we expect N to be at any given time?
    • Barak: I don't have an exact formula, but one can be made. There should be some layer after which the probability that pbase does not advance decreases exponentially.
  • The Hare expects all of the blocks for a given layer to propagate throughout the network before the Hare kicks off, after the hare-wakeup-delta parameter, right? Doesn't this mean that it punishes blocks that arrive just a little bit late - i.e., blocks that could've arrived later if there were no Hare and were only a Tortoise?
    • yes, it does. that's the main point.
  • How, exactly, is the leader chosen in Hare round 2? Does each active participant in this round include the output of a VRF in their proposal? What does the Hare expected_leaders flag do?
    • All active participants send their VRF output, and the one with the lowest output is chosen as leader. We use a probabilistic threshold to maximize the likelihood that we have at least one leader.
  • When exactly, and how, does each node know its participation status (active/passive) in each Hare round, for each Hare execution cycle, in each layer?
    • Barak: Once a miner has the Hare beacon and the number of total eligible parties for some layer, it can calculate its role for all possible rounds in that layer. The computation is explained in the protocol paper. In practice, each miner computes its role at the beginning of each round.
  • Does Hare round two actually broadcast an "accept" message at the end of round two (as documented here)? This doesn't seem to line up with the ADDNR18 paper (and it may be my mistake).
    • No, fixed this
  • What happens if there's a problem in the Hare pre-round, e.g., there's no (or insufficient) overlap in the pre-round views? Would Hare terminate with an empty set?
    • Barak: yes, it should terminate with an empty set, according to the Hare validity properties.
  • from Write a wiki page explaining Spacemesh consensus in context #27 (comment): Tortoise doesn't involve message-passing among nodes, does it? So what does it mean to say that Hare is more efficient in terms of communication complexity?
    • The block votes are the messages
    • Barak: I am guessing that means that overall, fewer messages are needed in order to get (irreversible) consensus on a specific layer.
  • from Write a wiki page explaining Spacemesh consensus in context #27 (comment): Without Hare, is the network more vulnerable to attacks? If so, how? How would this play out in practice?
    • yes, a balancing attack
  • Blocks point to previous blocks, via the view and the Hare votes. Which layer should the block point to blocks in? Is there a limit? The white paper says “recent layer”, what does that mean?
    • Another way of asking the question: a miner would be incentivized to be “safe” by voting for older blocks, i.e., blocks before pbase - what incentivizes them to vote for more recent blocks? Don’t we want them to vote for the most recent ones they’ve seen? Due to latency won’t there be blocks that show up late and some miners consider contextually invalid? So I want to point at blocks where I’m certain there’s no chance someone would think it’s invalid
    • A block's view should contain all, and only, orphan blocks. View should contain all syntactically valid blocks that a miner knows about, regardless of contextual validity. Tortoise counts votes of all syntactically valid blocks - if a block arrived late, then other miners won't know about it and won't count its votes.
    • Explicit votes should be on layers above pbase.
  • Voting: Does a block have to vote for other blocks? What if it doesn’t? (And if not, why would I bother?) Why does a block need to include a view/“Visible Mesh”?
    • Barak: No. Nothing happens. It's part of the protocol and honest miners should follow it. We want to guarantee that all honest miners count votes the same.
  • Is there a limit to how many blocks a block can vote for? Is there a minimum? IOW, what exactly does the protocol say about voting?
    • Barak: No limit (possibly self healing needs unlimited voting distance?)
  • Is PoST used for anything other than generating an ATX/blocks? E.g., do you need an ATX (for this epoch) to be eligible for Hare?
    • ATX gives eligibility for blocks and Hare. PoST is only used for ATX generation.
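
To make the "all, and only, orphan blocks" rule for a block's view concrete, here is a minimal Python sketch. It assumes the mesh is given as a plain mapping from block ID to the set of block IDs in that block's view; the function name and the sample data are hypothetical and not taken from the Spacemesh code.

```python
from typing import Dict, Set


def orphan_blocks(views: Dict[str, Set[str]]) -> Set[str]:
    """Return the known blocks that no other known block points to.

    `views` maps each known block ID to the set of block IDs in its view.
    """
    pointed_to: Set[str] = set()
    for view in views.values():
        pointed_to |= view
    return set(views) - pointed_to


# Hypothetical mesh: b3 and b4 point to b1 and b2; b5 arrived late and
# nobody points to it yet. All blocks are assumed syntactically valid.
known = {
    "b1": set(),
    "b2": set(),
    "b3": {"b1", "b2"},
    "b4": {"b1", "b2"},
    "b5": {"b1"},
}

# The view of the miner's next block: every orphan it knows about,
# regardless of its opinion on contextual validity.
new_view = orphan_blocks(known)
print(sorted(new_view))
```

Note that the late block b5 is included: per the answer above, the view is about syntactic validity only, so that all miners end up counting the same set of votes.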

Other

  • Is the node ID keypair distinct from an ephemeral P2P identity keypair that's regenerated when a node restarts? Do nodes have no knowledge of the node ID public key of their neighbors?
    • yes, distinct
  • Fees
    • Are fee payments implicit (i.e., no corresponding transactions, they are just implied by the protocol) or explicit (i.e., transactions corresponding to each fee payment are inserted into a block)?
    • When, exactly, are fees paid? At the end of the layer? When can they be spent by the recipient miner?
    • Implicit, paid right away when a layer is finalized by the Hare
  • What happens if a miner creates and gossips an invalid block? (I.e., the miner is not eligible to produce a block in the current epoch, or layer)
    • The block is discarded and not gossiped any further, so it would not propagate. We'll likely also want to disconnect from and blacklist a peer that sends us a syntactically invalid block.
  • The number of blocks per layer is an average, right? (I.e., total blocks per epoch divided by number of layers per epoch gives us average blocks per layer in that epoch.) What's the probability distribution of the number of blocks per layer? Is it thus hypothetically possible that a given layer could have very few, or even zero blocks?
    • If there are more miners than blocks in an epoch, every miner is eligible to produce one block and there will be "more blocks than planned."
    • For each block eligibility a miner randomly draws a layer from the epoch, with replacement, so the distribution is binomial.
    • This is all in the honest/expected case. An attacker could produce more blocks than they're supposed to, in which case they'd receive a smaller reward per block. An eligible miner may also miss a block if it's e.g. offline.
  • What happens if a miner receives a second tx with the same nonce as a tx that’s already in its mempool? Does it compare fees and drop the tx with the lower fee?
  • When does a miner drop a tx from its mempool after that tx gets mined into a block (by another miner)? Right away when it sees a block that includes that tx, or only after the tx makes it into global state? Is there only one mempool, or is there a second pool for tx that have been mined but aren't finalized yet? What happens if a tx gets mined into a block but that block ends up getting dropped later?
  • Is the miner's submission/input/request to a PoET server tied to a specific epoch? Or could the PoET proof be used for any epoch?
    • Barak: PoET servers should be considered as external service - they have no notion of epochs. Yet, the miner's registration challenge contains (implicitly, and at the moment also explicitly) an epoch_id. Note, however, that there is no "correct" epoch for a PoET proof; 2 miners can register to the same PoET round for different epochs (for whatever reason), and both ATXs, that include the same PoET proof, should be valid.
  • Would gossip protocol gossip e.g. an ATX before it had seen the PoST proof that it references? Is it possible that a node receives an ATX without having seen the PoST proof it references? What happens then?
    • Barak: If a node cannot validate the ATX, for example since it does not hold the PoET proof (so it cannot validate the NIPoST), it won't propagate it. The miner should try and fetch the missing parts, and if it fails, it considers this ATX syntactically invalid. Note that PoST is sent as part of the ATX, but the PoET proof is sent separately.
  • Does an ATX have contextual validity?
    • Barak: Yes, for example if another ATX with the same sequence number exists (this means that the miner tries to re-use their space-time for different ATXs). The protocol paper has more details.
  • Blocks
    • The white paper says that a block includes a “Tick t” field. What is this for? Is this actually used? I don’t see it in the code.
    • Why does the block contain both the minerId and a Signature? Why can’t we just extract the minerId from the Signature?
      • I believe that in our implementation blocks do not contain the miner's id.
    • And why doesn’t the whitepaper include the Signature under block?
      • Barak: I don't know "why", but signatures are not really part of blocks (you sign the block, once finalised), and anyway it is clear (to the cryptographers among us) that one also signs their block.
    • What “happens” to a contextually invalid block? Does it just disappear?
      • If some miner does see it, it'll be part of that miner's views and that miner will consider its votes. Other than this, nothing happens. It does not affect the state.
    • Do blocks include transactions or just pointers to them? Just txid, right?
      • Yes, just txid
    • Why does a block have a timestamp? What’s it used for? Isn’t this unnecessary with the layer ID?
  • What was the exact bug that caused the testnet fork? How do we know it won’t happen again?
    • Barak: we don't know the exact bug, but a syntactically valid block was not propagated to all miners. The subsequent events that actually caused a fork (or maybe "mesh inconsistency" is better, since there was no fork in the global state) are not super important, because it was a block from Genesis (for which we have a "stupid" implementation) and there are several known open tasks that should have prevented this fork. As for whether it will happen again: we don't know that.
  • Sync
    • Is fully/weakly synced per layer, i.e., can I be fully synced for layer 100 and weakly for layer 101?
      • Barak: In our current implementation, the only situation in which a node may be weakly but not fully synced is for one layer after it finishes syncing (e.g. a miner finished syncing in layer 100, and during layer 101 it only listens, but does not actively participate).
    • What is the precise definition of weakly and strongly synced?
    • Do you ask one peer or multiple/all peers for a block, tx, atx, etc. as part of sync?
      • Barak: As far as I know you request each item from one peer at a time, but you switch between peers.
    • The syncer will never getLayerFromNeighbors for an older layer, right? Only the top/current layer? And everything else happens recursively
      • Barak: No, you may specify which layer you want from your peers. Doing things recursively is inefficient and will probably blow up your memory. There's no reason to do so when you know exactly which layers you are missing.
    • Why does a node need to wait until it’s fully synced to generate an ATX/begin to participate in mining? Or at least to submit to PoET?
      • Barak: The miner probably doesn't have to, but it is more hermetic. E.g. we would like blocks to contain in their view only orphan blocks and to vote correctly; ATXs need positioning ATX, so they have to be somewhat synced. Keep in mind that in practice, if it's the first time the miner syncs, it has to wait a full epoch before it can generate blocks or actively participate in the Hare protocol.
  • Params
    • Why are hare-wakeup-delta and sync-validation-delta different? Why do we set/tweak the first for the testnet but not the second?
      • Barak: Implementation decision, I guess to give more flexibility. Don't expect consistency; these were developed - and tested - at different times. We should probably set neither.

Related: #27
Re: #29
CC @ilans

@noamnelke
Member

  • Is the node ID keypair distinct from an ephemeral P2P identity keypair that's regenerated when a node restarts? Do nodes have no knowledge of the node ID public key of their neighbors?

Yes, they're distinct. Since P2P IDs are paired with IP addresses, any node on the network knows the real world identity (or pretty close) of any P2P ID. Node IDs are associated with a coinbase account via the ATX. If the two identities were the same, or easily associated, there would be zero privacy for miners. It could also help attackers perform targeted attacks against specific miners, ranging from a simple DDoS to a full-on eclipse attack.

While simply separating the IDs is not enough (network analysis can still be used to associate the two), it's an easy first step in the right direction. Dandelion and similar privacy features can be added on later.

  • Fees

    • Are fee payments implicit (i.e., no corresponding transactions, they are just implied by the protocol) or explicit (i.e., transactions corresponding to each fee payment are inserted into a block)?

They're implicit. The existence of a valid block in the mesh implies that the appropriate share of the layer reward and fees has been added to that miner's coinbase account, as declared in the matching ATX.

It's each node's responsibility to calculate the rewards and apply them to the global state.
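
As a rough illustration of what "calculate the rewards and apply them to the global state" could look like, here is a minimal Python sketch. It assumes an equal split of the layer reward plus fees among the layer's valid blocks; the actual definition of "appropriate share" is not given in this thread, and all names here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def apply_layer_rewards(
    balances: Dict[str, int],
    blocks: List[Tuple[str, str]],
    layer_reward: int,
    total_fees: int,
) -> Dict[str, int]:
    """Credit each block's coinbase with a share of the layer reward and fees.

    `blocks` is a list of (block_id, coinbase) pairs for one layer.
    ASSUMPTION: an equal split per block; any remainder from the integer
    division is simply dropped in this sketch.
    """
    updated = defaultdict(int, balances)
    if not blocks:
        return dict(updated)
    share = (layer_reward + total_fees) // len(blocks)
    for _block_id, coinbase in blocks:
        # No reward transaction exists: the credit is implied by the block.
        updated[coinbase] += share
    return dict(updated)


state = apply_layer_rewards(
    {"alice": 10},
    [("b1", "alice"), ("b2", "bob"), ("b3", "alice")],
    layer_reward=90,
    total_fees=9,
)
print(state)
```

Every node runs the same deterministic computation on the same set of valid blocks, which is why no explicit fee transactions are needed.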

  • When, exactly, are fees paid? At the end of the layer? When can they be spent by the recipient miner?

There's no maturation period. The reward (incl. fees) from a block can be spent as soon as any other transaction included in that block can be spent--once the hare completes for the relevant layer.

  • Consensus

    • What would happen if we had a Hare but no Tortoise?

      • Answer: We could still achieve consensus but it would be a different kind of consensus. Tortoise gives us 1. irreversibility (the property that once something is valid it becomes harder and harder to reverse it with more blocks in the mesh, i.e., the way Bitcoin works), 2. self-healing, and we'd lose these properties. We'd also need to store Hare vote data on the mesh somehow so that future participants could verifiably calculate the global state.
    • Without the Hare, would pbase lag further behind? I.e., does the Hare help pbase "keep up"?

Possibly. If everyone's honest, then everyone's blocks would vote the same (or close enough to the same), so the lack of hare would make little difference. Having the hare makes balancing attacks harder to pull off. In a balancing attack, the adversary looks for a block that has close to half of the votes in its favor and votes so that it stays that way, which can prevent the network from reaching consensus.

@barakshani can probably shed more light on this.

  • How do inactive miners achieve Hare consensus with the active Hare committee participants? Do they just passively watch gossiped Hare protocol messages?

What's an inactive miner? Not every miner participates in the hare at every round. Other miners just listen for hare messages and validate everything. The term "hare committee" makes it sound like there's a closed set of miners that participate in the hare protocol until it completes, but that's misleading. Each miner could be eligible to participate in any round of the hare, and different rounds have different sets of participants. Hare processes that take more rounds to complete also involve more miners, and since it's impossible to predict how many rounds will be needed, everyone has to be ready to participate at a moment's notice.

  • What happens if a miner creates and gossips an invalid block? (I.e., the miner is not eligible to produce a block in the current epoch, or layer)

If a node receives a syntactically invalid block, like you describe, it discards it and doesn't gossip it to neighbors. In the future we plan to also disconnect and blacklist any peer that passed a syntactically invalid message to us.

If the block is syntactically invalid and signed by the miner, we may want to blacklist the miner, as well, but that's trickier. If we want to do that, we'd need to publish some kind of proof of the wrongdoing on-mesh, so that there's consensus about that, and it's not clear that it's worth it.

  • The number of blocks per layer is an average, right? (I.e., total blocks per epoch divided by number of layers per epoch gives us average blocks per layer in that epoch.) What's the probability distribution of the number of blocks per layer? Is it thus hypothetically possible that a given layer could have very few, or even zero blocks?

If everyone's honest and everyone sees the whole network, the number of block eligibilities in an epoch is fixed, up to a rounding error, with one exception: if there are more miners than the number of blocks in an epoch - each miner will be eligible for one block and there will be more total blocks than planned.

After each miner calculates how many blocks they are eligible for in an epoch, they draw, with replacement, that many layers from the epoch, and those are the layers in which they are eligible for blocks. Because it's with replacement, a miner could be eligible for multiple blocks per layer.

Because each eligibility has the same probability to fall in each layer, the number of blocks per layer is distributed binomially.

All of this is in the honest case, since attackers could produce more blocks than they are supposed to (they'd receive a smaller share of the rewards per block in that case--pending implementation).

It also ignores the benign case where miners could publish an ATX and then not publish the blocks they're eligible for, e.g. because they're offline.
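
To put numbers on the binomial claim, here is a small Python sketch of the honest case. If an epoch has N block eligibilities in total and L layers, and each eligibility lands in a given layer with probability 1/L, the number of blocks in that layer is Binomial(N, 1/L). The values of N and L below are made up for illustration and are not network parameters.

```python
from math import comb


def blocks_in_layer_pmf(k: int, eligibilities: int, layers: int) -> float:
    """P(exactly k blocks in one given layer) in the honest case."""
    p = 1.0 / layers
    return comb(eligibilities, k) * p**k * (1.0 - p) ** (eligibilities - k)


# Hypothetical epoch: 40 eligibilities spread over 10 layers (mean 4 per layer).
N, L = 40, 10

p_empty = blocks_in_layer_pmf(0, N, L)  # equals (1 - 1/L) ** N
mean = sum(k * blocks_in_layer_pmf(k, N, L) for k in range(N + 1))

print(f"P(layer is empty) = {p_empty:.4f}")
print(f"mean blocks per layer = {mean:.2f}")
```

So an empty layer is possible in principle, with probability (1 - 1/L)^N, which shrinks quickly as the average number of blocks per layer grows.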

@barakshani please correct me if I'm wrong about anything here.

@lrettig lrettig added the flows Questions that came up as part of assertion flows work label Mar 26, 2020
@lrettig
Member Author

lrettig commented Mar 26, 2020

Thanks @noamnelke. One followup question:

What happens if a miner creates and gossips an invalid block? (I.e., the miner is not eligible to produce a block in the current epoch, or layer)

If a node receives a syntactically invalid block, like you describe, it discards it and doesn't gossip it to neighbors.

I understand the case where a block references a non-existent ATX, or an invalid ATX, and is thus a syntactically invalid block. What if the block references a valid ATX and was produced in the right epoch but the wrong layer (i.e., a layer that the miner wasn't eligible to produce a block in)? How can you determine that the block is syntactically invalid?

@lrettig
Member Author

lrettig commented Mar 26, 2020

(Moved the list of questions up to the top to maintain one canonical list)

Some more recent questions - @noamnelke @barakshani would appreciate your help with these. Thanks!

@noamnelke
Member

I understand the case where a block references a non-existent ATX, or an invalid ATX, and is thus a syntactically invalid block. What if the block references a valid ATX and was produced in the right epoch but the wrong layer (i.e., a layer that the miner wasn't eligible to produce a block in)? How can you determine that the block is syntactically invalid?

The way eligibility for a block is proven is by signing a message which includes a counter value. The value must be less than the number of blocks the miner is eligible for in the epoch and each value can only be used once. The layer is derived from the signature.
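
A rough Python sketch of the two checks described above: the counter must be below the miner's eligibility count and unused, and the layer is a deterministic function of the signature. The exact mapping from signature to layer is not specified in this thread, so the SHA-256-then-modulo step is an assumption, signature verification itself is omitted, and the "signature" bytes in the demo are a placeholder rather than a real signature.

```python
import hashlib
from typing import Set


def counter_is_acceptable(counter: int, num_eligible: int, used: Set[int]) -> bool:
    """The counter must be below the eligibility count and not reused."""
    return 0 <= counter < num_eligible and counter not in used


def derive_layer(signature: bytes, epoch_first_layer: int, layers_per_epoch: int) -> int:
    """Map a signature to a layer within the epoch.

    ASSUMPTION: hash the signature and reduce it modulo the epoch length.
    """
    digest = hashlib.sha256(signature).digest()
    return epoch_first_layer + int.from_bytes(digest, "big") % layers_per_epoch


# Placeholder bytes standing in for the miner's signature over
# (epoch, counter). A real node would verify an actual signature first.
fake_signature = hashlib.sha256(b"epoch=7|counter=2").digest()

used_counters = {0, 1}
ok = counter_is_acceptable(2, num_eligible=3, used=used_counters)
layer = derive_layer(fake_signature, epoch_first_layer=700, layers_per_epoch=100)
print(ok, layer)
```

Because anyone holding the signature can recompute the layer, a block that claims a different layer than the one derived from its proof can be rejected as syntactically invalid.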

@barakshani

Keep in mind that a layer is defined to be "a set of blocks". Hence, time-wise you cannot publish a block in the "wrong layer". If it's not the "right" time, then the block would be considered early/late, but it is still syntactically valid.

What Noam refers to has more to do with the layer definition: a block explicitly specifies which layer it belongs to, and this is part of the eligibility proof, so it can be verified (note: it could also be implicitly derived from the proof).
[forgot about the counter, it is irrelevant to your question]

@lrettig lrettig added the question Further information is requested label Mar 30, 2020
@barakshani

Consensus

  1. already answered.
  2. was answered on slack.
  3. not exactly sure what you mean by inactive. Everyone participates in the entire protocol as a listener (passive role); only a subset of them participate actively by sending messages.
  4. pbase is not expected to be at the current top layer, as the pbase layer needs at least 401 votes on it (and we don't run layers with that many blocks). I don't have an exact formula, but one can be made. There should be some layer after which the probability that pbase does not advance decreases exponentially.
  5. yes, it does. that's the main point.
  6. In the proposal round there is a different threshold for activeness (than in the other rounds) - it is set according to the expected leaders parameter. Yes, each active participant needs to send their VRF output to prove eligibility (that they are indeed active), as in other rounds. Among the active participants, the one with the lowest VRF is the leader.
  7. Once a miner has the Hare beacon and the number of total eligible parties for some layer, it can calculate its role for all possible rounds in that layer. The computation is explained in the protocol paper. In practice, each miner computes its role at the beginning of each round.
  8. no. however, committing (on the proposed set) means willingness to accept it, so it happens in round 3.
  9. yes, it should terminate with an empty set, according to the Hare validity properties.
  10. technically, the blocks are the messages, but yes, once the tortoise starts, it is non-interactive. I am guessing that means that overall, fewer messages are needed in order to get (irreversible) consensus on a specific layer.
  11. yes, a balancing attack, where half of the honest parties think that a block is (contextually) valid and the other half do not, for example because the block was sent right at the borderline for being accepted as valid (say, according to some time frame). Then it is very easy, theoretically, for the adversary to maintain the uncertainty about this block. I believe that the protocol paper also describes this.
  12. a block's view should contain all, and only, orphan blocks. Explicit votes should be on layers above pbase. Votes in the Spacemesh 2 protocol are different.
    If a block is not part of the Hare results of its layer, it should be considered invalid by all miners. A late block is no different in that regard.
    A miner needs to have in its view all syntactically valid blocks, regardless of the miner's opinion about their contextual validity. This is important, for example, since the tortoise counts votes of all syntactically valid blocks, and we'd like all the miners to have the same block tally. That is, if a miner knows about a block B that isn't contextually valid and doesn't point to it (in its block's view), then that miner counts B's votes, but others - who are not aware of B - do not.
  13. No. Nothing happens. (it is part of the protocol - honest miners should follow it).
    we want to guarantee that all honest miners count votes the same, see previous answer.
  14. No limit (possibly self healing needs unlimited voting distance?). See above.
  15. ATX gives eligibility for blocks and Hare. PoST is only used for ATX generation.
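
For item 6, a minimal Python sketch of the proposal-round leader choice: participants whose VRF output falls below a threshold are active, and the active participant with the lowest output is the leader. The threshold formula (expected leaders divided by total eligible parties) is an assumption about how the expected leaders parameter is used, and `mock_vrf` is a hash-based stand-in, not a real VRF.

```python
import hashlib
from typing import Dict, Optional

MAX_VRF = 2**256  # size of the output space of the mock VRF below


def mock_vrf(node_id: str, layer: int, round_no: int) -> int:
    """Stand-in for a VRF output. NOT a real VRF: anyone can compute it."""
    data = f"{node_id}|{layer}|{round_no}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def pick_leader(
    vrf_outputs: Dict[str, int], expected_leaders: int, total_eligible: int
) -> Optional[str]:
    """Return the active participant with the lowest VRF output, if any."""
    # ASSUMPTION: a node is active when its output is below this threshold.
    threshold = MAX_VRF * expected_leaders // total_eligible
    active = {node: out for node, out in vrf_outputs.items() if out < threshold}
    if not active:
        return None  # nobody was eligible to propose in this round
    return min(active, key=active.get)


def prob_at_least_one_leader(expected_leaders: int, total_eligible: int) -> float:
    """Chance that at least one of the eligible parties is active."""
    p = expected_leaders / total_eligible
    return 1.0 - (1.0 - p) ** total_eligible


outputs = {name: mock_vrf(name, layer=100, round_no=2) for name in ("n1", "n2", "n3")}
print(pick_leader(outputs, expected_leaders=5, total_eligible=10))
print(prob_at_least_one_leader(expected_leaders=5, total_eligible=800))
```

Raising the expected leaders parameter makes a round with no leader less likely, at the cost of more proposal messages.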

@barakshani

Other
1-6. skipped.
7. PoET servers should be considered as external service - they have no notion of epochs. Yet, the miner's registration challenge contains (implicitly, and at the moment also explicitly) an epoch_id. Note, however, that there is no "correct" epoch for a PoET proof; 2 miners can register to the same PoET round for different epochs (for whatever reason), and both ATXs, that include the same PoET proof, should be valid.
8. While on the gossip level, we may break the ATX into parts, from the protocol level, NIPoST (which contains the PoST proof) is an inherent part of the ATX. If a node cannot validate the ATX, for example since it does not hold the PoET proof (so it cannot validate the NIPoST), it won't propagate it. The miner should try and fetch the missing parts, and if it fails, it considers this ATX syntactically invalid. Note that PoST is sent as part of the ATX, but the PoET proof is sent separately.
9. Yes, for example if another ATX with the same sequence number exists (this means that the miner tries to re-use their space-time for different ATXs). The protocol paper has more details.
10. I think that this is explained in the protocol paper under section "Neutral and postponed votes" (see "1. If the work tick of P’s ATX..."). Ticks have not been implemented yet.
    I believe that in our implementation blocks do not contain the miner's id.
    I don't know "why", but signatures are not really part of blocks (you sign the block, once finalised), and anyway it is clear (to the cryptographers among us) that one also signs their block.
    Nothing happens. Things happen only with contextually valid blocks (e.g. the state is updated with their txs).
    Yes, tx_id.
    I'm not familiar with blocks having timestamps.
11. we don't know the exact bug, but a syntactically valid block was not propagated to all miners. The subsequent events that actually caused a fork (or maybe "mesh inconsistency" is better, since there was no fork in the global state) are not super important, because it was a block from Genesis (for which we have a "stupid" implementation) and there are several known open tasks that should have prevented this fork. As for whether it will happen again: we don't know that.
12. If you have a block from layer 100 and you "listen" for new blocks from layer 101, you are weakly synced. You are considered fully synced if you also (try to) produce blocks and participate in the Hare. In our current implementation, the only situation in which a node may be weakly but not fully synced is for one layer after it finishes syncing (e.g. a miner finished syncing in layer 100, and during layer 101 it only listens, but does not actively participate).
    As far as I know you request each item from one peer at a time, but you switch between peers.
    No, you may specify which layer you want from your peers. Doing things recursively is inefficient and will probably blow up your memory. There's no reason to do so when you know exactly which layers you are missing.
    The miner probably doesn't have to, but it is more hermetic. E.g. we would like blocks to contain in their view only orphan blocks and to vote correctly; ATXs need a positioning ATX, so they have to be somewhat synced. Keep in mind that in practice, if it's the first time the miner syncs, it has to wait a full epoch before it can generate blocks or actively participate in the Hare protocol.
13. Implementation decision, I guess to give more flexibility. Don't expect consistency; these were developed - and tested - at different times. We should probably set neither.
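
For the sync answers under item 12, a small Python sketch of requesting exactly the missing layers, one peer per request, while rotating through peers. The function names and peer labels are hypothetical and this is not the go-spacemesh syncer; it only illustrates the non-recursive approach described above.

```python
from itertools import cycle
from typing import Dict, Iterable, List


def missing_layers(latest_local: int, current: int) -> List[int]:
    """Layers the node still needs, oldest first."""
    return list(range(latest_local + 1, current + 1))


def assign_requests(layers: Iterable[int], peers: List[str]) -> Dict[int, str]:
    """Ask one peer per layer, switching to the next peer each time."""
    if not peers:
        raise ValueError("no peers to sync from")
    rotation = cycle(peers)
    return {layer: next(rotation) for layer in layers}


todo = missing_layers(latest_local=97, current=100)
plan = assign_requests(todo, ["peer-a", "peer-b"])
print(plan)
```

Since the node knows which layers it lacks, the work list is a flat range and memory use stays bounded, unlike walking back recursively from the top layer.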

@lrettig
Member Author

lrettig commented Apr 7, 2020

@noamnelke:

All of this is in the honest case, since attackers could produce more blocks than they are supposed to (they'd receive a smaller share of the rewards per block in that case--pending implementation).

So there's no punishment to prevent an attacker from producing more blocks than they're supposed to, say, lots more? What prevents someone from doing this? I understand that they won't receive any extra reward for doing so - is that the only thing?

@noamnelke
Member

So there's no punishment to prevent an attacker from producing more blocks than they're supposed to, say, lots more? What prevents someone from doing this? I understand that they won't receive any extra reward for doing so - is that the only thing?

We plan to implement bounds for the number of seen active miners. This means that a spammer (a type of attacker) is limited in how many spam blocks they can create, given their investment. So a small space-time investment will allow some spam, but nothing crazy. Limited damage, combined with no rewards, should discourage spammers from choosing this route.

@lrettig
Member Author

lrettig commented May 6, 2020

The "limited damage" part is important to prevent severe griefing attacks :)


3 participants