Deleted DHT design from social networking preparatory to writing up the new design

Added nfs to setup documentation
2024-02-12 00:12:58 +00:00
commit 495f667c6f
parent a856d438e7
3 changed files with 69 additions and 178 deletions
--- a/docs/manifesto/social_networking.md
+++ b/docs/manifesto/social_networking.md
@@ -455,185 +455,9 @@ way hash, so are not easily linked to who is posting in the feed.

 ### Replacing Kademlia

- [social distance metric]:recognizing_categories_and_instances.html#Kademlia
-{target="_blank"}
+This design has been deleted because its scaling properties turned out to be unexpectedly bad.

- I will describe the Kademlia distributed hash table algorithm not in the
- way that it is normally described and defined, but in such a way that we
- can easily replace its metric by [social distance metric], assuming that we
- can construct a suitable metric, which reflects what feeds a given host is
- following, and what network addresses it knows and the feeds they are
- following, a quantity over which a distance can be found that reflects how
- close a peer is to an unstable network address, or knows a peer that is
- likely to know a peer that is likely to know an unstable network address.
-
-A distributed hash table works by each peer on the network maintaining a
-large number of live and active connections to computers such that the
-distribution of connections to computers distant by the  distributed hash
-table metric is approximately uniform by distance, which distance is for
-Kademlia the $log_2$  of the exclusive-or between his hash and your hash.
-
- And when you want to connect to an arbitrary computer, you asked the
- computers that are nearest in the space to the target for their connections
- that are closest to the target.  And then you connect to those, and ask the
- same question again.
-
- This works if each computer has approximately the same number of connections 
- close to it by a metric as distant from it by some metric.  So it will be
- connected to almost all of the computers that are nearby to it by that metric.
-
- In the course of this operation, you acquire more and more active
- connections, which you purge from time to time to keep the total number
- of connections reasonable and the distribution approximately uniform by the
- metric of distance used.
- 
-  The reason that the Kademlia distributed hash table cannot work in the
- face of enemy action, is that the shills who want to prevent something
- from being found create a hundred entries with a hash close to their target
- by Kademlia distance, and then when your search brings you close to
- target, it brings you to a shill, who misdirects you.  Using social network
- distance resists this attack.
-
-The messages of the people you are following are likely to be in a
-relatively small number of repositories, even if the total number of
-repositories out there is enormous and the number of hashes in each
-repository is enormous, so this algorithm and data structure will scale, and
-the responses to that thread that they have approved, by people you are not
-following, will be commits in that repository, that, by pushing their latest
-response to that thread to  a public repository, they did the equivalent of a
-git commit and push to that repository.
-
-Each repository contains all the material the poster has approved, resulting
-in considerable duplication, but not enormous duplication, approved links and
-reply-to links – but not every spammer, scammer, and
- shill in the world can fill your feed with garbage.
-
-
-### Kademlia in social space
-
-The vector of an identity is $+1$ for each one bit, and $-1$ for each zero bit.
-
-We don't use the entire two hundred fifty six dimensional vector, just
-enough of it that the truncated vector of every identity that anyone might
-be tracking has a very high probability of being approximately orthogonal
-to the truncated vector of every other identity.
-
-We do not have, and do not need, an exact consensus on how much of the
-vector to actually use, but everyone needs to use roughly the same amount
-as everyone else.  The amount is adjusted according to what is, over time,
-needed, by each identity adjusting according to circumstances, with the
-result that over time the consensus adjusts to what is needed.
-
-Each party indicates what entities he can provide a direct link to by
-publishing the sum of the vectors of the parties he can link to - and also
-the sum of the their sums, and also the sum of their ...  to as many deep as
-turns out to be needed in practice, which is likely to  two or three such
-vector sums, maybe four or five.  What is needed will depend on the
-pattern of tracking that people engage in in practice.
-
-If everyone behind a firewall or with an unstable network address arranges
-to notify a well known peer with stable network address whenever his
-address changes, and that peer, as part of the arrangement, includes him in
-that peer's sum vector, the number of well known peers with stable
-network address offering this service is not enormously large, they track
-each other, and everyone tracks some of them, we only need the sum and
-the sum of sums.
-
-When someone is looking to find how to connect to an identity, he goes
-through the entities he can connect to, and looks at the dot product of
-their sum vectors with target identity vector.
-
-He contacts the closest entity, or a close entity, and if that does not work
-out, contacts another.  The closest entity will likely be able to contact
-the target, or contact an entity more likely to be able to contact the target.
-
-* the identity vector represents the public key of a peer
-* the sum vector represents what identities a peer thinks he has valid connection information for.
-* the sum of sum vectors indicate what identities that he thinks he can connect to think that they can connect to.
-* the sum of the sum of the sum vectors ...
-
-A vector that provides the paths to connect to a billion entities, each of
-them redundantly through a thousand different paths, is still sixty or so 
-thirty two bit signed integers, distributed in a normal distribution with a
-variance of a million or so, but everyone has to store quite a lot of such
-vectors.  Small devices such as phones can get away with tracking a small
-number of such integers, at the cost of needing more lookups, hence not being
-very useful for other people to track for connection information.
-
-To prevent hostile parties from jamming the network by registering
- identities that closely approximate identities that they do not want people
- to be able to look up, we need the system to work in such a way that
- identities that lots of people want to look up tend to heavily over
- represented in sum of sums vectors relative to those that no one wants to
- look up.  If you repeatedly provide lookup services for a certain entity,
- you should track that entity that had last stable network address on the
- path that proved successful to the target entity, so that peers that
- provide useful tracking information are over represented, and entities that
- provide useless tracking information are under represented.
-
- If an entity makes publicly available network address information for an
- identity whose vector is an improbably good approximation to an existing
- widely looked up vector, a sybil attack is under way, and needs to be
- ignored.
-
-To be efficient at very large scale, the network should contain a relatively
-small number of large well connected devices each of which tracks the
-tracking information of large number of other such computers, and a large
-number of smaller, less well connected devices, that track their friends and
-acquaintances, and also track well connected devices.  Big fanout on on the
-interior vertices, smaller fanout on the exterior vertices, stable identities
-on all devices, moderately stable network addresses on the interior vertices,
-possibly unstable network addresses on the exterior vertices.
-
-If we have a thousand identities that are making public the information
-needed to make connection to them, and everyone tracks all the peers that
-provide third party look up service, we need only the first sum, and only
-about twenty dimensions.
-
-But if everyone attempts to track all the connection information network
-for all peers that provide third party lookup services, there are soon going
-to be a whole lot shill, entryist, and spammer peers purporting to provide
-such services, whereupon we will need white lists, grey lists, and human
-judgement, and not everyone will track all peers who are providing third
-party lookup services, whereupon we need the first two sums.
-
-In that case random peer searching for connection information to another
-random peer first looks to through those for which has good connection
-information, does not find the target.  Then looks through for someone
-connected to the target, may not find him, then looks for someone
-connected to someone connected to the target and, assuming that most
-genuine peers providing tracking information are tracking most other
-peers providing genuine tracking information, and the peer doing the
-search has the information for a fair number of peers providing genuine
-tracking information, will find him.
-
-Suppose there are a billion peers for which tracking information exists.  In
-that case, we need the first seventy or so dimensions, and possibly one
-more level of indirection in the lookup (the sum of the sum of the sum of
-vectors being tracked).  Suppose a trillion peers, then about the first eighty
-dimensions, and possibly one more level of indirection in the lookup.
-
-That is a quite large amount of data, but if who is tracking whom is stable,
-even if the network addresses are unstable, updates are infrequent and small.
-
-If everyone tracks ten thousand identities, and we have a billion identities
-whose network address is being made public, and million always up peers 
-with fairly stable network addresses, each of whom tracks one thousand
-unstable network addresses and several thousand other peers who also
-track large numbers of unstable addresses, then we need about fifty
-dimensions and two sum vectors for each entity being tracked, about a
-million integers, total -- too big to be downloaded in full every time, but
-not a problem if downloaded in small updates, or downloaded in full
-infrequently. 
-
-But suppose no one specializes in tracking unstable network addresses. 
-If your network address is unstable, you only provide updates to those
-following your feed, and if you have a lot of followers, you have to get a
-stable network address with a stable open port so that you do not have to
-update them all the time.  Then our list of identities whose connection
-information we track will be considerably smaller, but our level of
-indirection considerably deeper - possibly needing six or so deep in sum of
-the sum of ... sum of identity vectors.
+I am now writing up a better design.

 ##   Private messaging

--- a/docs/pandoc_templates/vscode.css
+++ b/docs/pandoc_templates/vscode.css
@@ -1,3 +1,10 @@
 body {
+	max-width: 30em;
+	margin-left: 1em;
+	font-family:"DejaVu Serif", "Georgia", serif;
+	font-style: normal;
+	font-variant: normal;
+	font-weight: normal;
+	font-stretch: normal;
 	font-size: 100%;
 	}
--- a/docs/setup/set_up_build_environments.md
+++ b/docs/setup/set_up_build_environments.md
@@ -3839,3 +3839,63 @@ Not much work has been done on this project recently, though development and mai
 ## Freenet

 See [libraries](../libraries.html#freenet)
+
+# Network file system
+
+This is most useful when you have a lot of real and
+virtual machines on your local network.
+
+## Server
+
+```bash
+sudo apt update && sudo apt upgrade -qy
+sudo apt install -qy nfs-kernel-server nfs-common
+sudo nano /etc/default/nfs-common
+```
+
+In the configuration file `nfs-common`, change the parameter `NEED_STATD` to `no` and `NEED_IDMAPD` to `yes`. NFSv4 requires `NEED_IDMAPD`, which enables the ID mapping daemon that translates user and group names to numeric IDs between the server and client.
+
+```terminal_image
+NEED_STATD="no"
+NEED_IDMAPD="yes"
+```
+
+Then, to disable NFSv2 and NFSv3: `sudo nano /etc/default/nfs-kernel-server`
+
+```terminal_image
+RPCNFSDOPTS="-N 2 -N 3"
+RPCMOUNTDOPTS="--manage-gids -N 2 -N 3"
+```
+
+Then, to export the root of your NFS file system: `sudo nano /etc/exports`
+
+```terminal_image
+/nfs     192.168.1.0/24(rw,async,fsid=0,crossmnt,no_subtree_check,no_root_squash)
+```
+
+```bash
+sudo systemctl restart nfs-server
+sudo showmount -e
+```
+
+## Client
+
+```bash
+sudo apt update && sudo apt upgrade -qy
+sudo apt install -qy nfs-common
+sudo mkdir «mydirectory»
+sudo nano /etc/fstab
+```
+
+```terminal_image
+# <file system>       <mount point> <type> <options> <dump> <pass>
+«mynfsserver».local:/ «mydirectory» nfs4   _netdev   0      0
+```
+
+Where the «funny brackets», as always, indicate mutatis mutandis.
+
+```bash
+sudo systemctl daemon-reload
+sudo mount -a
+sudo df -h
+```