wallet/docs/sybil_attack.md
reaction.la 362b7e653c
Updated to current pandoc format
Which affected all documentation files.
2022-05-07 12:49:33 +10:00

113 lines
5.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: Sybil Attack
# katex
...
We are able to pretend that bank transactions are instant, because they are
reversible. And then they get reversed for the wrong reasons, often very bad
reasons.
But in general, with irreversible transactions, hard to pretend they go
through in an instant. Bitcoin times are pretty good, but not good enough to
pay at the cash register.
For this reason, all rhocoin transactions are made with an identity. You can
have as many identities as you please, and these identities are not visible
to third parties unless one of the parties to the transaction chooses to make
them visible.
To support transactions at the cash register, going to need credit rating
agencies. If you make the payment as an identity that has an adequate credit
rating, the cash register trusts that the transaction will go through.
So, how do we stop someone from generating a credit rating by issuing a lot
of sham transactions from his right hand to his left hand, and then buying
something and not paying? The credit card problem.
And how do we stop someone from generating a pile of fake sales, and then
making a real sale and not delivering? The Ebay problem.
There is a solution to this problem. Each rating agency will treat payments
and reviews as a directed graph, and measure graph connectivity. each
subgraph of fake payments, and fake reviews will have good connectivity to
itself and poor connectivity to the rest of the graph.
# SybilGuard
The [SybilGuard paper] provides graph theory solutions against sybils.
[SybilGuard paper]:http://www.math.cmu.edu/~adf/research/SybilGuard.pdf
According to SybilGuard, social networks are empirically determined to be
“fast mixing”
“social networks tend to be fast mixing (defined in the next section), which
necessarily means that subsets of honest nodes have good connectivity to the
rest of the social network”
So the legitimate musician with legitimate fans will be connected to the rest
of the social network his legitimate fans will have interactions with
legitimate fans of other legitimate musicians, and will purchase stuff from
other legitimate musicians, while the fake fans making fake purchases will
not.
“in a fast mixing graph, after a small number of hops a random walk is
equally likely to be traversing any edge in a given hop.”
The paper argues that Sybil subsets of the network are substantially less
likely to be entered after a small number of hops.
“With a length $w$ random walk, clearly the distribution of the ending point
of the walk depends on the starting point. However, for connected and
nonbipartite graphs, the ending point distribution becomes independent of the
starting point when $w → ∞$. This distribution is called the stationary
distribution of the graph. The mixing time T of a graph quantifies how fast
the ending point of a random walk approach the stationary distribution. In
other words, after $Θ(T)$ steps, the node on the random walk becomes roughly
independent of the starting point. If $T = Θ(log(n))$, the graph is called
fast mixing.”
$Θ(f)$ means that the space or running time is always proportional to $f$,
$O(f)$ means that the common worst case space or running time is proportional
to $f$. Engineers generally do not care about such differences, and just use
$O$ everywhere.
If we mix the graph for a suitable time, legitimate nodes will be well mixed
with each other, and Sybil nodes will not be well mixed with legitimate nodes.
Legitimate nodes will have positions that are close to each other in the
space of distributions of end probabilities, and Sybil nodes will be far from
legitimate nodes in that space.
This algorithm is $O(n^2)$ in data size (the distribution for each starting
point), and $Θ(n^2{log(n)})$ in computational time, which is a lot better
than non polynomial, though still apt to be inconveniently large. It can
probably be speeded up considerably by [random dimensional reduction],
wherein a vector of very large dimension (the probability of reaching each
end point, the distribution of end points) is randomly mapped to a vector of
dimension $Θ(log(n))$ log of the original dimension.
[random dimensional reduction]:recognizing_categories_and_instances_as_members_of_a_category.html#dimensional-reduction
"recognizing instances as members of a category"
In this case, the vector space of large dimension is of vectors whose
elements are the probability of arriving at a given node after a random walk
of distance $w$, “the distribution of the ending point of the walk”
Which will, I think, speed it up to $O(n×{{(log (n))}^2})$ which is
acceptable.
Mixing for a suitable time, a suitable path distance, divides the graph into
well mixed regions, one of which is the legitimate nodes. We identify which
of the regions is non Sybil by checking the region of a small number of known
legitimate nodes. If they all belong to the same region, our mixing distance
is sufficient, and chances are the Sybil nodes are not in that region.
During mixing, the points draw closer to each other in the space of
distributions. The fan group of a particular musician will coalesce rapidly,
and then the fan groups of all legitimate coalitions will, more slowly draw
together, while the fake fan group drifts very slowly indeed. We halt the
mixing when the known legitimate nodes are all close to each other, because
by then everyone who is a legitimate node is close to them.
We then downrank reviews or page ranks that are distant from the coalescence.