Random peer sampling (literature thread)

We’ve had a couple conversations about random peer sampling problems, both around improving mix network PKIs, as well as directly for usage in libp2p. I’ll start a thread for discussing the literature:

Brahms provides a basic system for actively gossiping nodes, while restricting access to the node list from which you sample.

Julius Bunger implemented and improved Brahms for GNUNet, which sounds quite promising, but the improvements seem under documented so far.

There is an interesting measure of randomness in sampling proposed in Uniform Node Sampling Service Robust against Collusions of Malicious Nodes by Emmanuelle Anceaume, Yann Busnel, and Bruno Sericola.

Newscast and Spray (github) seemingly worry only about performance, not protecting past or future samples, and perhaps not uniformity either.

As I wrote elsewhere, we could temporarily go with a Tor-like global PKI as “accounts” associated to some parachain, which sounds semi-viable long-term if we’re paying all the nodes anyways. We’d require “lite clients” to sample this PKI parachain without downloading everything.

If relays know all “accounts” then we might sample using epsilon-PIR backed by some space efficient data structure. I’ll ask/check about the space efficency of existing data structures used for storing accounts, but recently a couple papers had PIR backed by cuckoo tables, even without fully exploiting recent cuckoo table improvements, but that’s not compatible with being distributed. If relays do not know all accounts, then one could adapt the epsilon-PIR results to incomplete databases, but maybe this leaks excessively.

Recently tomaka has started implementing Brahms for libp2p:

Aside from fun, tomaka found serious problems with the current DHT based mechanism for peer discovery

And notes that Ethereum has similar problems: