forked from cheng/wallet
Deleted DHT design from social networking preparatory to writing
up the new design. Added nfs to setup documentation.
This commit is contained in:
parent
a856d438e7
commit
495f667c6f
@ -455,185 +455,9 @@ way hash, so are not easily linked to who is posting in the feed.
### Replacing Kademlia

[social distance metric]:recognizing_categories_and_instances.html#Kademlia
{target="_blank"}

This design was deleted because its scaling properties turned out to be unexpectedly bad.

I will describe the Kademlia distributed hash table algorithm not in the
way that it is normally described and defined, but in such a way that we
can easily replace its metric with a [social distance metric], assuming that
we can construct a suitable one. Such a metric would reflect what feeds a
given host is following, what network addresses it knows, and what feeds
those addresses are following. It is a quantity over which a distance can be
found that reflects how close a peer is to an unstable network address, or
whether it knows a peer that is likely to know a peer that is likely to know
an unstable network address.

A distributed hash table works by each peer on the network maintaining a
large number of live and active connections to other computers, such that
the distribution of those connections is approximately uniform by distance
under the distributed hash table metric. For Kademlia, that distance is the
$\log_2$ of the exclusive-or between his hash and your hash.

When you want to connect to an arbitrary computer, you ask the computers
you know that are nearest in the space to the target for their connections
that are closest to the target. Then you connect to those, and ask the
same question again.
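The lookup loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `connections` dictionary is a toy stand-in for the network, and the function names and the choice of `k` are hypothetical, not part of any real Kademlia implementation.

```python
def xor_distance(a: int, b: int) -> int:
    """Raw Kademlia distance: the exclusive-or of two hashes."""
    return a ^ b


def log2_distance(a: int, b: int) -> int:
    """Floor of log2 of the XOR distance (index of the highest differing bit).

    Returns -1 when the two hashes are identical.
    """
    return (a ^ b).bit_length() - 1


def closest_known(known: list[int], target: int, k: int = 3) -> list[int]:
    """The k known peers nearest to the target by XOR distance."""
    return sorted(known, key=lambda peer: xor_distance(peer, target))[:k]


def lookup(me: int, target: int, connections: dict[int, list[int]], k: int = 3) -> int:
    """Ask the nearest known peers for their nearest connections, and repeat.

    `connections` maps a peer to the peers it is connected to (toy model).
    Assumes `me` has at least one connection.
    Returns the closest peer found once a round brings no improvement.
    """
    candidates = closest_known(connections[me], target, k)
    while True:
        best = candidates[0]
        # "Ask" each candidate for the connections it holds.
        learned = [p for c in candidates for p in connections.get(c, [])]
        merged = closest_known(list(set(candidates) | set(learned)), target, k)
        if xor_distance(merged[0], target) >= xor_distance(best, target):
            return best
        candidates = merged
```

The loop terminates because the distance to the best candidate strictly decreases each round.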
This works if each computer has approximately as many connections to
computers close to it by the metric as to computers distant from it by the
metric. Since few computers are nearby and many are distant, it will be
connected to almost all of the computers that are near it by that metric.

In the course of this operation, you acquire more and more active
connections, which you purge from time to time to keep the total number
of connections reasonable and the distribution approximately uniform by the
metric of distance used.

The reason that the Kademlia distributed hash table cannot work in the
face of enemy action is that the shills who want to prevent something
from being found create a hundred entries with a hash close to their target
by Kademlia distance, and then when your search brings you close to the
target, it brings you to a shill, who misdirects you. Using social network
distance resists this attack.

The messages of the people you are following are likely to be in a
relatively small number of repositories, even if the total number of
repositories out there is enormous and the number of hashes in each
repository is enormous, so this algorithm and data structure will scale.
The responses to a thread that they have approved, by people you are not
following, will be commits in that repository: by pushing their latest
response to that thread to a public repository, they did the equivalent of a
git commit and push to that repository.

Each repository contains all the material the poster has approved,
including approved links and reply-to links. This results in considerable,
but not enormous, duplication, and it means that not every spammer,
scammer, and shill in the world can fill your feed with garbage.
### Kademlia in social space

The vector of an identity is $+1$ for each one bit, and $-1$ for each zero bit.

We don't use the entire two hundred fifty six dimensional vector, just
enough of it that the truncated vector of every identity that anyone might
be tracking has a very high probability of being approximately orthogonal
to the truncated vector of every other identity.
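A minimal sketch of such an identity vector follows. It assumes, purely for illustration, that the two hundred fifty six bits come from the SHA-256 hash of the public key; the text does not say how the bits are obtained, and the function names and the default of 64 dimensions are hypothetical.

```python
import hashlib


def identity_vector(public_key: bytes, dims: int = 64) -> list[int]:
    """Map the first `dims` bits of a 256-bit hash to +1 (one bit) or -1 (zero bit).

    Assumption: the 256 bits are the SHA-256 digest of the public key.
    """
    if not 0 < dims <= 256:
        raise ValueError("dims must be between 1 and 256")
    bits = int.from_bytes(hashlib.sha256(public_key).digest(), "big")
    # Bit 255 is the most significant bit, so truncation keeps a prefix.
    return [1 if (bits >> (255 - i)) & 1 else -1 for i in range(dims)]


def dot(u: list[int], v: list[int]) -> int:
    """Dot product; near zero for approximately orthogonal vectors."""
    return sum(a * b for a, b in zip(u, v))
```

The dot product of a vector with itself equals the number of dimensions used, while the dot product of two unrelated identities is a sum of that many independent $\pm 1$ terms, so it is typically on the order of the square root of the number of dimensions.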
We do not have, and do not need, an exact consensus on how much of the
vector to actually use, but everyone needs to use roughly the same amount
as everyone else. The amount is adjusted according to what is, over time,
needed, by each identity adjusting according to circumstances, with the
result that over time the consensus adjusts to what is needed.

Each party indicates what entities he can provide a direct link to by
publishing the sum of the vectors of the parties he can link to, and also
the sum of their sums, and also the sum of their ... to as many levels deep
as turns out to be needed in practice, which is likely to be two or three
such vector sums, maybe four or five. What is needed will depend on the
pattern of tracking that people engage in in practice.

If everyone behind a firewall or with an unstable network address arranges
to notify a well known peer with stable network address whenever his
address changes, and that peer, as part of the arrangement, includes him in
that peer's sum vector, and if the number of well known peers with stable
network address offering this service is not enormously large, they track
each other, and everyone tracks some of them, then we only need the sum and
the sum of sums.

When someone is looking to find how to connect to an identity, he goes
through the entities he can connect to, and looks at the dot product of
their sum vectors with the target identity vector.

He contacts the closest entity, or a close entity, and if that does not work
out, contacts another. The closest entity will likely be able to contact
the target, or contact an entity more likely to be able to contact the target.

* the identity vector represents the public key of a peer
* the sum vector represents what identities a peer thinks he has valid connection information for.
* the sum of sums vector indicates what identities the peers that he thinks he can connect to think that they can connect to.
* the sum of the sum of the sum vectors ...
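The sum vectors and the dot product test can be illustrated with a toy example. The four identity vectors below are hand-picked to be exactly orthogonal in four dimensions so that the arithmetic is easy to follow; real identity vectors would only be approximately orthogonal. The peer names `X` and `Y` and the function names are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def vector_sum(vectors):
    """Component-wise sum, used for sum vectors and sums of sums alike."""
    return [sum(column) for column in zip(*vectors)]


def rank_peers(target, published):
    """Peers ordered by how strongly their published sum vector matches the target."""
    return sorted(published, key=lambda peer: dot(published[peer], target), reverse=True)


# Toy identity vectors (exactly orthogonal, for clarity only).
alice, bob, carol, dave = [1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]

# Peer X can link to alice and bob; peer Y can link to carol and dave.
published = {"X": vector_sum([alice, bob]), "Y": vector_sum([carol, dave])}

# A peer that tracks both X and Y would publish this as its sum of sums.
sum_of_sums = vector_sum(list(published.values()))

ranking = rank_peers(carol, published)
print(ranking)  # ['Y', 'X']
```

Here the dot product of `carol` with Y's sum vector is 4 and with X's is 0, so the searcher contacts Y first. The sum of sums still has dot product 4 with `carol`, which is what lets a peer one step further away recognize that the target is reachable through it.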
A vector that provides the paths to connect to a billion entities, each of
them redundantly through a thousand different paths, is still sixty or so
thirty two bit signed integers, distributed in a normal distribution with a
variance of a million or so, but everyone has to store quite a lot of such
vectors. Small devices such as phones can get away with tracking a small
number of such integers, at the cost of needing more lookups, hence not being
very useful for other people to track for connection information.

To prevent hostile parties from jamming the network by registering
identities that closely approximate identities that they do not want people
to be able to look up, we need the system to work in such a way that
identities that lots of people want to look up tend to be heavily over
represented in sum of sums vectors relative to those that no one wants to
look up. If you repeatedly provide lookup services for a certain entity,
you should track the entity that had the last stable network address on the
path that proved successful to the target entity, so that peers that
provide useful tracking information are over represented, and entities that
provide useless tracking information are under represented.

If an entity makes publicly available network address information for an
identity whose vector is an improbably good approximation to an existing
widely looked up vector, a sybil attack is under way, and that information
needs to be ignored.
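One way to put a number on "improbably good" is a binomial tail: if the bits of an honest identity are independent and unbiased, the chance that it agrees with a given vector in at least $m$ of $n$ components is $\sum_{k \ge m} \binom{n}{k} / 2^n$. The sketch below is an assumption about how such a test might look, not something the design specifies; the threshold value and the function names are hypothetical.

```python
from math import comb


def agreement(u, v) -> int:
    """Number of components in which two +1/-1 vectors agree."""
    return sum(1 for a, b in zip(u, v) if a == b)


def chance_of_agreement(dims: int, matches: int) -> float:
    """Probability that an independent random identity agrees with a given
    vector in at least `matches` of `dims` components (binomial tail, p = 1/2)."""
    return sum(comb(dims, k) for k in range(matches, dims + 1)) / 2**dims


def looks_like_sybil(candidate, popular, threshold: float = 1e-9) -> bool:
    """Flag a candidate whose closeness to a popular vector is too unlikely
    to be chance. The threshold is a hypothetical tuning parameter.
    The caller is assumed to compare two distinct identities."""
    return chance_of_agreement(len(popular), agreement(candidate, popular)) < threshold
```

A real threshold would have to allow for the number of honest identities in existence, since with enough identities some close matches arise by chance.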
To be efficient at very large scale, the network should contain a relatively
small number of large well connected devices, each of which tracks the
tracking information of a large number of other such computers, and a large
number of smaller, less well connected devices, that track their friends and
acquaintances, and also track well connected devices. Big fanout on the
interior vertices, smaller fanout on the exterior vertices, stable identities
on all devices, moderately stable network addresses on the interior vertices,
possibly unstable network addresses on the exterior vertices.

If we have a thousand identities that are making public the information
needed to make connection to them, and everyone tracks all the peers that
provide third party look up service, we need only the first sum, and only
about twenty dimensions.

But if everyone attempts to track all the network connection information
for all peers that provide third party lookup services, there are soon going
to be a whole lot of shill, entryist, and spammer peers purporting to provide
such services, whereupon we will need white lists, grey lists, and human
judgement, and not everyone will track all peers who are providing third
party lookup services, whereupon we need the first two sums.

In that case a random peer searching for connection information to another
random peer first looks through those for which he has good connection
information, and does not find the target. Then he looks through them for
someone connected to the target, and may not find him. Then he looks for
someone connected to someone connected to the target and, assuming that most
genuine peers providing tracking information are tracking most other
peers providing genuine tracking information, and the peer doing the
search has the information for a fair number of peers providing genuine
tracking information, will find him.
Suppose there are a billion peers for which tracking information exists. In
that case, we need the first seventy or so dimensions, and possibly one
more level of indirection in the lookup (the sum of the sum of the sum of
vectors being tracked). Suppose a trillion peers, then about the first eighty
dimensions, and possibly one more level of indirection in the lookup.

That is quite a large amount of data, but if who is tracking whom is stable,
even if the network addresses are unstable, updates are infrequent and small.

If everyone tracks ten thousand identities, and we have a billion identities
whose network address is being made public, and a million always up peers
with fairly stable network addresses, each of whom tracks one thousand
unstable network addresses and several thousand other peers who also
track large numbers of unstable addresses, then we need about fifty
dimensions and two sum vectors for each entity being tracked, about a
million integers, total -- too big to be downloaded in full every time, but
not a problem if downloaded in small updates, or downloaded in full
infrequently.
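The "about a million integers" figure follows directly from the numbers in the paragraph above. The byte count below additionally assumes thirty two bit integers, as in the earlier description of a sum vector; the variable names are only for illustration.

```python
tracked_identities = 10_000       # identities each peer tracks
dimensions = 50                   # components used from each vector
sum_vectors_per_identity = 2      # the sum and the sum of sums

total_integers = tracked_identities * dimensions * sum_vectors_per_identity

bytes_per_integer = 4             # assumption: thirty two bit integers
total_bytes = total_integers * bytes_per_integer

print(total_integers)  # 1000000
print(total_bytes)     # 4000000
```

Under that assumption a full download is about four megabytes.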
But suppose no one specializes in tracking unstable network addresses.
If your network address is unstable, you only provide updates to those
following your feed, and if you have a lot of followers, you have to get a
stable network address with a stable open port so that you do not have to
update them all the time. Then our list of identities whose connection
information we track will be considerably smaller, but our level of
indirection considerably deeper, possibly needing to go six or so levels
deep in the sum of the sum of ... sum of identity vectors.

I am now writing up a better design.

## Private messaging
@ -1,3 +1,10 @@
body {
    max-width: 30em;
    margin-left: 1em;
    font-family:"DejaVu Serif", "Georgia", serif;
    font-style: normal;
    font-variant: normal;
    font-weight: normal;
    font-stretch: normal;
    font-size: 100%;
}
@ -3839,3 +3839,63 @@ Not much work has been done on this project recently, though development and mai
## Freenet

See [libraries](../libraries.html#freenet)

# Network file system

This is most useful when you have a lot of real and
virtual machines on your local network.

## Server

```bash
sudo apt update && sudo apt upgrade -qy
sudo apt install -qy nfs-kernel-server nfs-common
sudo nano /etc/default/nfs-common
```

In the configuration file `/etc/default/nfs-common`, change the parameter `NEED_STATD` to `no` and `NEED_IDMAPD` to `yes`. NFSv4 requires `NEED_IDMAPD`, which enables the ID mapping daemon that maps user and group IDs between the server and the client.
```terminal_image
NEED_STATD="no"
NEED_IDMAPD="yes"
```

Then, to disable NFSv2 and NFSv3: `sudo nano /etc/default/nfs-kernel-server`

```terminal_image
RPCNFSDOPTS="-N 2 -N 3"
RPCMOUNTDOPTS="--manage-gids -N 2 -N 3"
```

Then, to export the root of your nfs file system: `sudo nano /etc/exports`

```terminal_image
/nfs 192.168.1.0/24(rw,async,fsid=0,crossmnt,no_subtree_check,no_root_squash)
```

```bash
sudo systemctl restart nfs-server
sudo showmount -e
```
## Client

```bash
sudo apt update && sudo apt upgrade -qy
sudo apt install -qy nfs-common
sudo mkdir «mydirectory»
sudo nano /etc/fstab
```

```terminal_image
# <file system> <mount point> <type> <options> <dump> <pass>
«mynfsserver».local:/ «mydirectory» nfs4 _netdev 0 0
```

Where the «funny brackets», as always, indicate mutatis mutandis.

```bash
sudo systemctl daemon-reload
sudo mount -a
sudo df -h
```