Deleted DHT design from social networking preparatory to writing
up the new design. Added nfs to setup documentation.
parent a856d438e7
commit 495f667c6f
@ -455,185 +455,9 @@ way hash, so are not easily linked to who is posting in the feed.
|
|||||||
|
|
||||||
### Replacing Kademlia
|
### Replacing Kademlia
|
||||||
|
|
||||||
[social distance metric]:recognizing_categories_and_instances.html#Kademlia
|
This design has been deleted, because its scaling properties turned out to be unexpectedly bad.
|
||||||
{target="_blank"}
|
|
||||||
|
|
||||||
I will describe the Kademlia distributed hash table algorithm not in the
|
I am now writing up a better design.
|
||||||
way that it is normally described and defined, but in such a way that we
|
|
||||||
can easily replace its metric by [social distance metric], assuming that we
|
|
||||||
can construct a suitable metric, which reflects what feeds a given host is
|
|
||||||
following, and what network addresses it knows and the feeds they are
|
|
||||||
following, a quantity over which a distance can be found that reflects how
|
|
||||||
close a peer is to an unstable network address, or knows a peer that is
|
|
||||||
likely to know a peer that is likely to know an unstable network address.
|
|
||||||
|
|
||||||
A distributed hash table works by each peer on the network maintaining a
|
|
||||||
large number of live and active connections to computers such that the
|
|
||||||
distribution of connections to computers distant by the distributed hash
|
|
||||||
table metric is approximately uniform by distance, which distance is for
|
|
||||||
Kademlia the $\log_2$ of the exclusive-or between his hash and your hash.
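
As a minimal sketch of that distance, assuming the hashes are held as Python integers (the function names are illustrative and not taken from any Kademlia implementation):

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia distance: the two hashes XORed, read as an unsigned integer."""
    return a ^ b


def log2_distance(a: int, b: int) -> int:
    """Floor of log2 of the XOR distance, or -1 if the hashes are identical.

    This is the position of the highest bit in which the two hashes differ.
    """
    return (a ^ b).bit_length() - 1


print(xor_distance(0b1100, 0b1010))   # 0b0110, which is 6
print(log2_distance(0b1100, 0b1010))  # highest differing bit is bit 2
```

Peers whose hashes share a long common prefix with yours come out close, and each extra differing leading bit doubles the distance.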
|
|
||||||
|
|
||||||
And when you want to connect to an arbitrary computer, you ask the
|
|
||||||
computers that are nearest in the space to the target for their connections
|
|
||||||
that are closest to the target. And then you connect to those, and ask the
|
|
||||||
same question again.
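
The repeated question can be sketched as a loop over a toy network. This is only an illustration under stated assumptions: `contacts` is a hypothetical in-memory map from each node hash to the hashes it is connected to, standing in for real network queries, and `k` is an assumed limit on how many peers are asked per round.

```python
def iterative_lookup(start, target, contacts, k=2):
    """Return the number of rounds needed to learn of target, or None.

    Each round asks the k known, not yet queried peers nearest the target
    (by XOR distance) for their k connections nearest the target.
    """
    known = {start}
    queried = set()
    rounds = 0
    while target not in known:
        pending = sorted(known - queried, key=lambda n: n ^ target)[:k]
        if not pending:
            return None  # nobody left to ask
        for peer in pending:
            queried.add(peer)
            nearest = sorted(contacts.get(peer, ()), key=lambda n: n ^ target)[:k]
            known.update(nearest)
        rounds += 1
    return rounds


# Toy network of four bit hashes (hypothetical data).
contacts = {
    0b0000: [0b0100, 0b1000],
    0b0100: [0b0000],
    0b1000: [0b1100, 0b0000],
    0b1100: [0b1110, 0b1000],
    0b1110: [0b1111, 0b1100],
}
print(iterative_lookup(0b0000, 0b1111, contacts))
```

In this toy network each round roughly halves the remaining XOR distance to the target.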
|
|
||||||
|
|
||||||
This works if each computer has approximately the same number of connections
|
|
||||||
close to it by the metric as distant from it by the same metric. So it will be
|
|
||||||
connected to almost all of the computers that are nearby to it by that metric.
|
|
||||||
|
|
||||||
In the course of this operation, you acquire more and more active
|
|
||||||
connections, which you purge from time to time to keep the total number
|
|
||||||
of connections reasonable and the distribution approximately uniform by the
|
|
||||||
metric of distance used.
|
|
||||||
|
|
||||||
The reason that the Kademlia distributed hash table cannot work in the
|
|
||||||
face of enemy action is that the shills who want to prevent something
|
|
||||||
from being found create a hundred entries with a hash close to their target
|
|
||||||
by Kademlia distance, and then when your search brings you close to
|
|
||||||
target, it brings you to a shill, who misdirects you. Using social network
|
|
||||||
distance resists this attack.
|
|
||||||
|
|
||||||
The messages of the people you are following are likely to be in a
|
|
||||||
relatively small number of repositories, even if the total number of
|
|
||||||
repositories out there is enormous and the number of hashes in each
|
|
||||||
repository is enormous, so this algorithm and data structure will scale, and
|
|
||||||
the responses to that thread that they have approved, by people you are not
|
|
||||||
following, will be commits in that repository, that, by pushing their latest
|
|
||||||
response to that thread to a public repository, they did the equivalent of a
|
|
||||||
git commit and push to that repository.
|
|
||||||
|
|
||||||
Each repository contains all the material the poster has approved, resulting
|
|
||||||
in considerable duplication, but not enormous duplication, approved links and
|
|
||||||
reply-to links – but not every spammer, scammer, and
|
|
||||||
shill in the world can fill your feed with garbage.
|
|
||||||
|
|
||||||
|
|
||||||
### Kademlia in social space
|
|
||||||
|
|
||||||
The vector of an identity is $+1$ for each one bit, and $-1$ for each zero bit.
|
|
||||||
|
|
||||||
We don't use the entire two hundred fifty six dimensional vector, just
|
|
||||||
enough of it that the truncated vector of every identity that anyone might
|
|
||||||
be tracking has a very high probability of being approximately orthogonal
|
|
||||||
to the truncated vector of every other identity.
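
A minimal sketch of such a truncated vector. It assumes the two hundred fifty six bits come from a SHA-256 hash of the public key, which is an assumption made for illustration, and `identity_vector` and `dims` are hypothetical names:

```python
import hashlib


def identity_vector(public_key: bytes, dims: int = 64) -> list:
    """First `dims` bits of the key's hash, as +1 for a one bit and -1 for a zero bit."""
    if not 0 < dims <= 256:
        raise ValueError("dims must be between 1 and 256")
    digest = hashlib.sha256(public_key).digest()  # 32 bytes, 256 bits
    vector = []
    for i in range(dims):
        bit = (digest[i // 8] >> (7 - i % 8)) & 1
        vector.append(1 if bit else -1)
    return vector


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


v = identity_vector(b"example public key")
print(dot(v, v))  # a vector's dot product with itself equals its length
```

For two unrelated identities the dot product of their truncated vectors has mean zero and standard deviation equal to the square root of the number of dimensions, which is what approximately orthogonal means here.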
|
|
||||||
|
|
||||||
We do not have, and do not need, an exact consensus on how much of the
|
|
||||||
vector to actually use, but everyone needs to use roughly the same amount
|
|
||||||
as everyone else. The amount is adjusted according to what is, over time,
|
|
||||||
needed, by each identity adjusting according to circumstances, with the
|
|
||||||
result that over time the consensus adjusts to what is needed.
|
|
||||||
|
|
||||||
Each party indicates what entities he can provide a direct link to by
|
|
||||||
publishing the sum of the vectors of the parties he can link to - and also
|
|
||||||
the sum of their sums, and also the sum of their ... to as many levels deep as
|
|
||||||
turns out to be needed in practice, which is likely to be two or three such
|
|
||||||
vector sums, maybe four or five. What is needed will depend on the
|
|
||||||
pattern of tracking that people engage in in practice.
|
|
||||||
|
|
||||||
If everyone behind a firewall or with an unstable network address arranges
|
|
||||||
to notify a well known peer with stable network address whenever his
|
|
||||||
address changes, and that peer, as part of the arrangement, includes him in
|
|
||||||
that peer's sum vector, and if the number of well known peers with stable
|
|
||||||
network address offering this service is not enormously large, they track
|
|
||||||
each other, and everyone tracks some of them, then we only need the sum and
|
|
||||||
the sum of sums.
|
|
||||||
|
|
||||||
When someone is looking to find how to connect to an identity, he goes
|
|
||||||
through the entities he can connect to, and looks at the dot product of
|
|
||||||
their sum vectors with the target identity vector.
|
|
||||||
|
|
||||||
He contacts the closest entity, or a close entity, and if that does not work
|
|
||||||
out, contacts another. The closest entity will likely be able to contact
|
|
||||||
the target, or contact an entity more likely to be able to contact the target.
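
The choice of whom to contact first can be sketched as follows. The four dimensional vectors and the peer names are made up for illustration, and real vectors would be much longer, so unrelated identities would be only approximately orthogonal rather than exactly so:

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def vector_sum(vectors):
    """Componentwise sum of a list of equal length vectors."""
    return [sum(column) for column in zip(*vectors)]


def rank_peers(target, peer_sums):
    """Peer names ordered by dot product of their sum vector with the target, best first."""
    return sorted(peer_sums, key=lambda name: dot(peer_sums[name], target), reverse=True)


# Four mutually orthogonal identity vectors (hypothetical).
a, b, c, d = [1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]

peer_sums = {
    "alice": vector_sum([a, b]),  # alice can link to a and b
    "bob": vector_sum([c, d]),    # bob can link to c and d
}
print(rank_peers(b, peer_sums))  # alice scores 4, bob scores 0
print(rank_peers(d, peer_sums))  # bob scores 4, alice scores 0
```

A peer whose sum includes the target contributes the target's full length to the dot product, while the other summed identities contribute only noise around zero.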
|
|
||||||
|
|
||||||
* the identity vector represents the public key of a peer
|
|
||||||
* the sum vector represents what identities a peer thinks he has valid connection information for.
|
|
||||||
* the sum of sum vectors indicates which identities the peers he thinks he can connect to think that they can connect to.
|
|
||||||
* the sum of the sum of the sum vectors ...
|
|
||||||
|
|
||||||
A vector that provides the paths to connect to a billion entities, each of
|
|
||||||
them redundantly through a thousand different paths, is still sixty or so
|
|
||||||
thirty two bit signed integers, distributed in a normal distribution with a
|
|
||||||
variance of a million or so, but everyone has to store quite a lot of such
|
|
||||||
vectors. Small devices such as phones can get away with tracking a small
|
|
||||||
number of such integers, at the cost of needing more lookups, hence not being
|
|
||||||
very useful for other people to track for connection information.
|
|
||||||
|
|
||||||
To prevent hostile parties from jamming the network by registering
|
|
||||||
identities that closely approximate identities that they do not want people
|
|
||||||
to be able to look up, we need the system to work in such a way that
|
|
||||||
identities that lots of people want to look up tend to be heavily over
|
|
||||||
represented in sum of sums vectors relative to those that no one wants to
|
|
||||||
look up. If you repeatedly provide lookup services for a certain entity,
|
|
||||||
you should track the entity that had the last stable network address on the
|
|
||||||
path that proved successful to the target entity, so that peers that
|
|
||||||
provide useful tracking information are over represented, and entities that
|
|
||||||
provide useless tracking information are under represented.
|
|
||||||
|
|
||||||
If an entity makes publicly available network address information for an
|
|
||||||
identity whose vector is an improbably good approximation to an existing
|
|
||||||
widely looked up vector, a sybil attack is under way, and needs to be
|
|
||||||
ignored.
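
One way to make "improbably good approximation" concrete, as a sketch only: for unrelated vectors of $n$ dimensions the dot product has mean zero and standard deviation $\sqrt{n}$, so a candidate can be flagged when its dot product with a widely looked up vector is many standard deviations above zero. The threshold `z` and the function name are assumptions, not part of the design.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def looks_like_sybil(candidate, popular, z=4.0):
    """True if candidate agrees with popular far more than chance allows.

    The caller is assumed to have already excluded the popular identity itself.
    """
    n = len(popular)
    return dot(candidate, popular) > z * math.sqrt(n)


popular = [1, -1] * 32            # 64 dimensions (made up data)
near_copy = list(popular)
near_copy[0] = -near_copy[0]      # differs from popular in only two places
near_copy[1] = -near_copy[1]
unrelated = [1, 1, -1, -1] * 16   # exactly orthogonal to popular

print(looks_like_sybil(near_copy, popular))
print(looks_like_sybil(unrelated, popular))
```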
|
|
||||||
|
|
||||||
To be efficient at very large scale, the network should contain a relatively
|
|
||||||
small number of large well connected devices each of which tracks the
|
|
||||||
tracking information of a large number of other such computers, and a large
|
|
||||||
number of smaller, less well connected devices, that track their friends and
|
|
||||||
acquaintances, and also track well connected devices. Big fanout on the
|
|
||||||
interior vertices, smaller fanout on the exterior vertices, stable identities
|
|
||||||
on all devices, moderately stable network addresses on the interior vertices,
|
|
||||||
possibly unstable network addresses on the exterior vertices.
|
|
||||||
|
|
||||||
If we have a thousand identities that are making public the information
|
|
||||||
needed to make connection to them, and everyone tracks all the peers that
|
|
||||||
provide third party look up service, we need only the first sum, and only
|
|
||||||
about twenty dimensions.
|
|
||||||
|
|
||||||
But if everyone attempts to track all the network connection information
|
|
||||||
for all peers that provide third party lookup services, there are soon going
|
|
||||||
to be a whole lot of shill, entryist, and spammer peers purporting to provide
|
|
||||||
such services, whereupon we will need white lists, grey lists, and human
|
|
||||||
judgement, and not everyone will track all peers who are providing third
|
|
||||||
party lookup services, whereupon we need the first two sums.
|
|
||||||
|
|
||||||
In that case a random peer searching for connection information to another
|
|
||||||
random peer first looks through those for which he has good connection
|
|
||||||
information, and does not find the target. Then he looks for someone
|
|
||||||
connected to the target, may not find him, then looks for someone
|
|
||||||
connected to someone connected to the target and, assuming that most
|
|
||||||
genuine peers providing tracking information are tracking most other
|
|
||||||
peers providing genuine tracking information, and the peer doing the
|
|
||||||
search has the information for a fair number of peers providing genuine
|
|
||||||
tracking information, will find him.
|
|
||||||
|
|
||||||
Suppose there are a billion peers for which tracking information exists. In
|
|
||||||
that case, we need the first seventy or so dimensions, and possibly one
|
|
||||||
more level of indirection in the lookup (the sum of the sum of the sum of
|
|
||||||
vectors being tracked). Suppose a trillion peers, then about the first eighty
|
|
||||||
dimensions, and possibly one more level of indirection in the lookup.
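
The text does not say how these dimension counts were arrived at. One rule of thumb that reproduces the twenty and the eighty, offered here purely as an assumption, is the birthday bound: about $2\log_2 N$ bits are needed before $N$ random bit strings are likely to be pairwise distinct.

```python
import math


def dims_for(peers: int) -> int:
    """Birthday bound heuristic: about 2 * log2(peers) dimensions."""
    return math.ceil(2 * math.log2(peers))


for peers in (10**3, 10**9, 10**12):
    print(peers, dims_for(peers))
```

For a billion peers this heuristic gives sixty, which matches the "sixty or so" quoted earlier but is somewhat below the "seventy or so" quoted here, so the author may have been allowing extra margin.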
|
|
||||||
|
|
||||||
That is quite a large amount of data, but if who is tracking whom is stable,
|
|
||||||
even if the network addresses are unstable, updates are infrequent and small.
|
|
||||||
|
|
||||||
If everyone tracks ten thousand identities, and we have a billion identities
|
|
||||||
whose network address is being made public, and a million always-up peers
|
|
||||||
with fairly stable network addresses, each of whom tracks one thousand
|
|
||||||
unstable network addresses and several thousand other peers who also
|
|
||||||
track large numbers of unstable addresses, then we need about fifty
|
|
||||||
dimensions and two sum vectors for each entity being tracked, about a
|
|
||||||
million integers, total -- too big to be downloaded in full every time, but
|
|
||||||
not a problem if downloaded in small updates, or downloaded in full
|
|
||||||
infrequently.
|
|
||||||
|
|
||||||
But suppose no one specializes in tracking unstable network addresses.
|
|
||||||
If your network address is unstable, you only provide updates to those
|
|
||||||
following your feed, and if you have a lot of followers, you have to get a
|
|
||||||
stable network address with a stable open port so that you do not have to
|
|
||||||
update them all the time. Then our list of identities whose connection
|
|
||||||
information we track will be considerably smaller, but our level of
|
|
||||||
indirection considerably deeper - possibly needing six or so deep in sum of
|
|
||||||
the sum of ... sum of identity vectors.
|
|
||||||
|
|
||||||
## Private messaging
|
## Private messaging
|
||||||
|
|
||||||
|
@ -1,3 +1,10 @@
|
|||||||
body {
|
body {
|
||||||
|
max-width: 30em;
|
||||||
|
margin-left: 1em;
|
||||||
|
font-family:"DejaVu Serif", "Georgia", serif;
|
||||||
|
font-style: normal;
|
||||||
|
font-variant: normal;
|
||||||
|
font-weight: normal;
|
||||||
|
font-stretch: normal;
|
||||||
font-size: 100%;
|
font-size: 100%;
|
||||||
}
|
}
|
||||||
|
@ -3839,3 +3839,63 @@ Not much work has been done on this project recently, though development and mai
|
|||||||
## Freenet
|
## Freenet
|
||||||
|
|
||||||
See [libraries](../libraries.html#freenet)
|
See [libraries](../libraries.html#freenet)
|
||||||
|
|
||||||
|
# Network file system
|
||||||
|
|
||||||
|
This is most useful when you have a lot of real and
|
||||||
|
virtual machines on your local network.
|
||||||
|
|
||||||
|
## Server
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo apt update && sudo apt upgrade -qy
|
||||||
|
sudo apt install -qy nfs-kernel-server nfs-common
|
||||||
|
sudo nano /etc/default/nfs-common
|
||||||
|
```
|
||||||
|
|
||||||
|
In the configuration file `/etc/default/nfs-common`, change the parameter `NEED_STATD` to `no` and `NEED_IDMAPD` to `yes`. NFSv4 requires `NEED_IDMAPD`, which enables the ID mapping daemon that translates user and group names between the server and client.
|
||||||
|
|
||||||
|
```terminal_image
|
||||||
|
NEED_STATD="no"
|
||||||
|
NEED_IDMAPD="yes"
|
||||||
|
```
|
||||||
|
|
||||||
|
Then, to disable NFSv2 and NFSv3: `sudo nano /etc/default/nfs-kernel-server`
|
||||||
|
|
||||||
|
```terminal_image
|
||||||
|
RPCNFSDOPTS="-N 2 -N 3"
|
||||||
|
RPCMOUNTDOPTS="--manage-gids -N 2 -N 3"
|
||||||
|
```
|
||||||
|
|
||||||
|
Then, to export the root of your NFS file system: `sudo nano /etc/exports`
|
||||||
|
|
||||||
|
```terminal_image
|
||||||
|
/nfs 192.168.1.0/24(rw,async,fsid=0,crossmnt,no_subtree_check,no_root_squash)
|
||||||
|
```
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo systemctl restart nfs-server
|
||||||
|
sudo showmount -e
|
||||||
|
```
|
||||||
|
|
||||||
|
## Client
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo apt update && sudo apt upgrade -qy
|
||||||
|
sudo apt install -qy nfs-common
|
||||||
|
sudo mkdir «mydirectory»
|
||||||
|
sudo nano /etc/fstab
|
||||||
|
```
|
||||||
|
|
||||||
|
```terminal_image
|
||||||
|
# <file system> <mount point> <type> <options> <dump> <pass>
|
||||||
|
«mynfsserver».local:/ «mydirectory» nfs4 _netdev 0 0
|
||||||
|
```
|
||||||
|
|
||||||
|
Where the «funny brackets», as always, indicate mutatis mutandis.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo systemctl daemon-reload
|
||||||
|
sudo mount -a
|
||||||
|
sudo df -h
|
||||||
|
```
|
||||||
|