3d Range-Based Set Reconciliation
When two peers wish to synchronise data, they typically first exchange which Areas in which namespaces they are interested in. Intersecting these Areas yields the sets of Entries for which they then need to bring each other up to speed. In this document, we present a strategy for doing so efficiently.
Given the Entries that the peers have available, there can be two cases that necessitate data exchange. First, one peer might have an Entry that the other does not have, and second, the peers might hold nonequal parts of the Payload of some common Entry.
As a first step to solving the problem, we simplify it. If Entries contained information about locally available Payload bytes, then both cases would merge into a single case: one peer might have a datum that the other lacks. Hence, we do not synchronise Entries directly, but LengthyEntries:
struct LengthyEntry
The task of the two peers then becomes conceptually simple: they each have a set of LengthyEntries, and they need to inform each other about all LengthyEntries the other party does not have, that is, they each need to compute the union of their two sets. In the scientific literature, this problem is known as set reconciliation11Minsky, Yaron, Ari Trachtenberg, and Richard Zippel. "Set reconciliation with nearly optimal communication complexity." IEEE Transactions on Information Theory 49.9 (2003): 2213-2218..
Once the two peers have reconciled their sets, they can filter out Entries that overwrite each other, and they can separately request any missing (suffixes of) Payloads. Going forward, we thus concentrate on the set reconciliation part only.
To perform set reconciliation, we adapt the approach of range-based set reconciliation22https://github.com/AljoschaMeyer/rbsr_short/blob/main/main.pdf (this peer-reviewed paper was presented at SRDS2023, the proceedings have not been published yet)For a more accessible introduction to the technique, see also this webpage..
Range-based set reconciliation solves the problem recursively. To reconcile two sets, one peer first computes a hash over all items in its set, and sends this fingerprint to the other peer. That peer then computes the fingerprint over its items as well. If the fingerprints match, they are done reconciling.
If they do not match, there are two options. First, the peer can split its set in half and then initiate set reconciliation for each half concurrently (by transmitting its hashes for both halves). Second, if the set is sufficiently small, the peer can instead simply transmit its items in the set. The other peer responds to this with all other items that it held in the set, completing the process of reconciliation.
Overall, the peers collaboratively drill down to the differences between their two sets in a logarithmic number of communication rounds, spending only little bandwidth on those regions of the original sets where they hold the same items. Note that peers can actually split sets into arbitrarily many subsets in each step. Splitting into more subsets per step decreases the total number of communication rounds.
3d range-based set reconciliation takes these ideas and applies them to Willow. The core design decision is to delimit sets of LengthyEntries via 3dRanges. When a peer splits its 3dRanges, it is crucial for overall efficiency to not split based on volume (for example, by splitting the times in half numerically), but to split into subranges in which the peer holds roughly the same number of Entries.
Let Fingerprint denote the type of hashes of LengthyEntries that the peers exchange. Then the precise pieces of information that peers exchange are the following:
struct 3dRangeFingerprintThe 3dRange in question.The set of LengthyEntries a peer holds in some 3dRange.struct 3dRangeEntrySetThe 3dRange in question.The LengthyEntries that the sender holds in the 3d_range.A boolean flag to indicate whether the sender wishes to receive the other peer’s 3dRangeEntrySet for the same 3d_range in return.
To initiate reconciliation of a 3dRange, a peer sends its 3dRangeFingerprint. Upon receiving a 3dRangeFingerprint, a peer computes the Fingerprint over its local LengthyEntries in the same range.
If it does not match, the peer either sends a number of 3dRangeFingerprints whose 3dRanges cover the 3dRange for which it received the mismatching Fingerprint. Or it replies with its 3dRangeEntrySet for that 3dRange, with the want_response flag set to true
.
To any such 3dRangeEntrySet, a peer replies with its own 3dRangeEntrySet, setting the want_response flag to false
, and omitting all LengthyEntries it had just received in the other peer’s 3dRangeEntrySet.
When a peer receives a 3dRangeFingerprint that matches the Fingerprint over its local LengthyEntries in the same 3dRange, the peer should reply with an empty 3dRangeEntrySet for that 3dRange, setting the want_response flag to false
. This notifies the sender of the 3dRangeFingerprint that reconciliation has successfully concluded for the 3dRange.
Fingerprinting
3d range-based set reconciliation requires the ability to hash arbitrary sets of LengthyEntries into values of some type Fingerprint. To quickly compute Fingerprints, it helps if the Fingerprint for a 3dRange can be assembled from precomputed Fingerprints of other, smaller 3dRange. For this reason, we define the fingerprinting function in terms of some building blocks:
First, we require a function fingerprint_singleton that hashes individual LengthyEntries into the set Fingerprint. This hash function should take into account all aspects of the LengthyEntry: modifying its namespace_id, subspace_id, path, timestamp, payload_digest, payload_length, or its number of available bytes, should result in a completely different Fingerprint.
Second, we require an associative and commutative33Classic range-based set reconciliation does not require commutativity. We require it because we do not wish to prescribe how to linearise three-dimensional data into a single order. function fingerprint_combine that maps two Fingerprints to a single new Fingerprint. The fingerprint_combine function must further have a neutral element fingerprint_neutral.
Set | Fingerprint |
---|---|
{ } | { } | { } |
Given these building blocks, we define the function fingerprint from sets of LengthyEntries to Fingerprint:
- applying fingerprint to the empty set yields fingerprint_neutral,
- applying fingerprint to a set containing exactly one LengthyEntry yields the same result as applying fingerprint_singleton to that LengthyEntry, and
- applying fingerprint to any other set of LengthyEntries yields the result of applying fingerprint_singleton to all members of the set individually and then combining the resulting Fingerprints with fingerprint_combine (grouping and ordering do not matter because of associativity and commutativity).
For 3d range-based set reconciliation to work correctly, fingerprint must map distinct sets of LengthyEntries to distinct Fingerprints with high probability, even when facing maliciously crafted input sets. The range-based set reconciliation paper surveys suitable, cryptographically secure hash functions in section 5B. All but the Cayley hashes use commutative fingerprint_combine functions, and are thus suitable for 3d range-based set reconciliation.