diff --git a/docs/blockdag_consensus.md b/docs/blockdag_consensus.md index 2f19616..a4fa579 100644 --- a/docs/blockdag_consensus.md +++ b/docs/blockdag_consensus.md @@ -77,11 +77,11 @@ from the same representative sample. For each peer that could be on the network, including those that have been sleeping in a cold wallet for years, each peer keeps a running cumulative -total of that peers stake. With every new block, the peers stake is added to +total of that peer's shares. With every new block, the peer's shares are added to its total. On each block of the chain, a peer’s rank is the bit position of the highest -bit of the running total that rolled over when its stake was added for that +bit of the running total that rolled over when its shares were added for that block. *edit note* @@ -94,13 +94,13 @@ Which gives the same outcome, that on average and over time, the total weight wi *end edit note* -So if Bob has a third of the stake of Carol, and $N$ is a rank that -corresponds to bit position higher than the stake of either of them, then +So if Bob has a third of the shares of Carol, and $R$ is a rank that +corresponds to a bit position higher than the shares of either of them, then Bob gets to be rank $R$ or higher one third as often as Carol. But even if -his stake is very low, he gets to be high rank every now and then. +he has a very small shareholding, he gets to be high rank every now and then. A small group of the highest ranking peers get to decide on the next block, -and the likelihood of being a high ranking peer depends on stake. +and the likelihood of being a high ranking peer depends on shares. They produce the next block by unanimous agreement and joint signature. The group is small enough that this is likely to succeed, and if they do not, @@ -239,7 +239,7 @@ this, and the system needs to be able to produce a sensible result even if some peers maliciously or through failure do not generate sequential signature sequence numbers. -Which, in the event of a fork, will on average reflect the total stake of +Which, in the event of a fork, will on average reflect the total shares of peers on that fork. If two prongs have the same weight, take the prong with the most transactions. If they have the same weight and the same number of transactions, hash all the public keys of the signatories that formed the @@ -247,7 +247,7 @@ blocks, their ranks, and the block height of the root of the fork and take the prong with the largest hash. This value, the weight of a prong of the fork will, over time for large deep -forks, approximate the stake of the peers online on that prong, without the +forks, approximate the shares of the peers online on that prong, without the painful cost taking a poll of all peers online, and without the considerable risk that that poll will be jammed by hostile parties. @@ -444,7 +444,7 @@ I have become inclined to believe that there is no way around making some peers special, but we need to distribute the specialness fairly and uniformly, so that every peer get his turn being special at a certain block height, with the proportion of block heights at which he is special being -proportional to his stake. +proportional to his shares.
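A minimal sketch of the bit-rollover rank rule described earlier in this document, assuming a 64-bit running total, nonzero shares, and that "rolled over" means the highest bit position that changed when the peer's shares were added; the names are illustrative, not from any existing codebase:

```c++
#include <bit>
#include <cstdint>

// Rank of a peer at this block: the position of the highest bit of its
// running cumulative total that changed when its shares were added.
// Assumes shares > 0. A peer with a third of Carol's shares crosses any
// given 2^N boundary about a third as often, so high ranks turn up in
// proportion to shares held.
inline unsigned rank(std::uint64_t running_total, std::uint64_t shares) {
    std::uint64_t new_total = running_total + shares;
    std::uint64_t changed = running_total ^ new_total;  // bits affected by this addition
    return std::bit_width(changed) - 1;                 // index of the highest changed bit
}
```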
If the number of peers that have a special role in forming the next block is very small, and the selection and organization of those peers is not @@ -554,7 +554,7 @@ while blocks that received the other branch first continue to work on that branch, until one branch gets ahead of the other branch, whereupon the leading branch spreads rapidly through the peers. With proof of share, that is not going work, one can lengthen a branch as fast as you please. Instead, -each branch has to be accompanied by evidence of the weight of stake of +each branch has to be accompanied by evidence of the weight of shares of peers on that branch. Which means the winning branch can start spreading immediately. @@ -681,17 +681,17 @@ limit, see: themselves can commit transactions through the peers, if the clients themselves hold the secret keys and do not need to trust the peers. -# Calculating the stake of a peer +# Calculating the shares represented by a peer We intend that peers will hold no valuable or lasting secrets, that all the value and the power will be in client wallets, and the client wallets with most of the value, who should have most of the power, will seldom be online. -I propose proof of share. The stake of a peer is not the stake it owns, but -the stake that it has injected into the blockchain on behalf of its clients -and that its clients have not spent yet, or stake that some client wallet +I propose proof of share. The shares of a peer are not the shares it owns, but +the shares that it has injected into the blockchain on behalf of its clients +and that its clients have not spent yet, or shares that some client wallet somewhere has chosen to be represented by that peer. Likely only the -whales will make a deliberate and conscious decision to have their stake +whales will make a deliberate and conscious decision to have their shares represented by a peer, and it will be a peer that they likely control, or that someone they have some relationship with controls, but not necessarily a peer that they use for transactions. diff --git a/docs/contracts_on_blockchain.md b/docs/contracts_on_blockchain.md index 4c41e41..5cff6fd 100644 --- a/docs/contracts_on_blockchain.md +++ b/docs/contracts_on_blockchain.md @@ -95,7 +95,7 @@ cryptographic mathematics, but by the fact that our blockchain, unlike the others, is organized around client wallets chatting privately with other client wallets. Every other blockchain has necessary cryptographic mathematics to do the equivalent, usually more powerfull and general -than anything on the rhocoin blockchain, and Monaro has immensely +than anything on the rhocoin blockchain, and Monero has immensely superior cryptographic capabilities, but somehow, they don’t, the difference being that rhocoin is designed to avoid uses of the internet that render a blockchain vulnerable to regulation, rather than to use diff --git a/docs/names/TCP.md b/docs/design/TCP.md similarity index 99% rename from docs/names/TCP.md rename to docs/design/TCP.md index a46a72e..dd5722a 100644 --- a/docs/names/TCP.md +++ b/docs/design/TCP.md @@ -102,6 +102,11 @@ upper bound. To find the actual MTU, have to have a don't fragment field (which is these days generally set by default on UDP) and empirically track the largest packet that makes it on this connection. Which TCP does.
+MTU (packet size) and MSS (data size, $MTU-40$) are a +[messy problem](https://www.cisco.com/c/en/us/support/docs/ip/generic-routing-encapsulation-gre/25885-pmtud-ipfrag.html), +which can be sidestepped by always sending packets +of size 576 containing 536 bytes of data. + ## first baby steps To try and puzzle this out, I need to build a client server that can listen on diff --git a/docs/design/mkdocs.sh b/docs/design/mkdocs.sh new file mode 100644 index 0000000..ea1e53c --- /dev/null +++ b/docs/design/mkdocs.sh @@ -0,0 +1,7 @@ +#!/bin/bash +set -e +cd "$(dirname "$0")" +docroot="../" +banner_height=banner_height:15ex +templates=$docroot"pandoc_templates" +. $templates"/mkdocs.cfg" diff --git a/docs/names/nat.md b/docs/design/nat.md similarity index 96% rename from docs/names/nat.md rename to docs/design/nat.md index b4d110a..571e209 100644 --- a/docs/names/nat.md +++ b/docs/design/nat.md @@ -8,8 +8,10 @@ name system, SSL, and email. This is covered at greater length in # Implementation issues -There is a great [pile of RFCs](TCP.html) on issues that arise with using udp and icmp +There is a great pile of RFCs on issues that arise with using udp and icmp to communicate. +[Peer-to-Peer Communication Across Network Address Translators](https://bford.info/pub/net/p2pnat/){target="_blank"} ## timeout @@ -30,7 +32,7 @@ needed. They never bothered with keep alive. They also found that a lot of the time, both parties were behind the same NAT, sometimes because of NATs on top of NATs -[hole punching]:http://www.mindcontrol.org/~hplus/nat-punch.html +[hole punching]:https://tailscale.com/blog/how-nat-traversal-works "How to communicate peer-to-peer through NAT firewalls" {target="_blank"} diff --git a/docs/design/navbar b/docs/design/navbar new file mode 100644 index 0000000..9f9009d --- /dev/null +++ b/docs/design/navbar @@ -0,0 +1,7 @@ +
+ \ No newline at end of file diff --git a/docs/design/peer_socket.md b/docs/design/peer_socket.md new file mode 100644 index 0000000..90c2eda --- /dev/null +++ b/docs/design/peer_socket.md @@ -0,0 +1,615 @@ +--- +# katex +title: >- + Peer Socket +sidebar: false +notmine: false +... + +::: myabstract +[abstract:]{.bigbold} +Most things follow the client server model, +so it makes sense to have a distinction between server sockets +and client sockets. But ultimately what we are doing is +passing messages between entities and the revolutionary +and subversive technologies, bittorrent, bitcoin, and +bitmessage are peer to peer, so it makes sense that all sockets, +however created, wind up with the same properties. +::: + +# factoring + +In order to pass messages, the socket has to know a whole lot of state. And +in order to handle messages, the entity handling the messages has to know a +whole lot of state. So a socket api is an answer to the question of how we +factor this big pile of state into two smaller piles of state. + +Each big bundle of state represents a concurrent communicating process. +Some of the state of this concurrent communicating process is on one side +of our socket division, and is transparent to one side of our division. The +application knows the internals of some of the state, but the internals +of socket state are opaque, while the socket knows the internals of the +socket state, but the internals of the application state are opaque to it. +The socket state machines think that they are passing messages of one class +or a very small number of classes, to one big state machine, which messages +contain an opaque block of bytes that the application class serializes and +deserializes. + +## layer responsibilities + +The sockets layer just sends and receives arbitrary size blocks +of opaque bytes over the wire between two machines. +They can be sent with or without flow control +and with or without reliability, +but if the block is too big to fit in this connection's maximum +packet size, the without flow control and without +reliability option is ignored. Flow control and reliability are +always applied to messages too big to fit in a packet. + +The despatch layer parses out the in-reply-to and the +in-regards-to values from the opaque block of bytes and despatches them +to the appropriate application layer state machine, which parses out +the message type field, deserializes the message, +and despatches it to the appropriate fully typed event handler +of that state machine. + +It is remarkable how much stuff can be done without +concurrent communicating processes. Nostr is entirely +implemented over request reply, except that a whole lot +of requests and replies have an integer representing state, +where the state likely winds up being a database rowid. + +The following discussion also applies if the reply-to field +or in-regards-to field is associated with a database index +rather than an instance of a class living in memory, and might +well be handled by an instance of a class containing only a database index. + + +# Representing concurrent communicating processes + +node.js represents them as continuations. Rust tokio represents them +as something like continuations. Go represents them as lightweight +threads, which is a far more natural and easier to use representation, +but under the hood they are something like continuations, and the abstraction +leaks a little.
The abstraction leaks a little in the case where you have one +concurrent process on one machine communicating with another concurrent +process on another machine. + +Well, in C++, we are going to make instances of a class that register +callbacks, and the callback is the event. Which has an instance +of a class registered with the callback. Which in C++ is a pointer +to a method of an object, which has no end of syntax that no one +ever manages to get their head around. + +So if `dog` is a method pointer with the argument `bark`, just say +`std::invoke(dog, bark)` and let the compiler figure out how +to do it. `bark` is, of course, the data supplied by the message +and `dog` is the concurrent communicating process plus its registered +callback. And since the process is sequential, it knows the data +for the message that this is a reply to. + +A message may contain a reply-to field and/or an in-regards-to field. + +In general, the in-regards-to field identifies the state machine +on the server and the client, and remains unchanged for the life +of the state machines. Therefore its handler function remains unchanged, +though it may do different things depending +on the state of the state machine and depending on the type of the message. + +If the message only has an in-regards-to field, then the callback function for it +will normally be registered for the life of the concurrent process (instance). + +If it is an in-reply-to, the dispatch mechanism will unregister the handler when it +dispatches the message. If you are going to receive multiple messages in response +to a single message, then you create a new instance. + +In C, one represents actions of concurrent processes by a +C function that takes a callback function, so in C++, +a member function that takes a member function callback +(warning, scary and counter intuitive syntax). + +Pointers to member functions are a huge mess containing +one hundred workarounds, and the best workaround is to not use them. +People have a whole lot of ingenious ways to not use them, for example +a base class that passes its primary function call to one of many +derived classes. Which solution does not seem applicable to our +problem. + +`std::invoke` is syntactic sugar for calling weird and wonderful +callable things - it figures out the syntax for you at compile +time according to the type, and is strongly recommended, because +with the wide variety of C++ callable things, no one can stretch +their brain around the differing syntaxes. + +The many, many, clever ways of not using member pointers +just do not cut it, for the return address on a message ultimately maps +to a function pointer, or something that is exactly equivalent to a function pointer. + +Of course, we very frequently do not have any state, and you just +cannot have a member function pointer to a static function. One way around +this problem is just to have one concurrent process whose state just +does not change, one concurrent process that cheerfully handles +messages from an unlimited number of correspondents, all using the same +`in-regards-to`, which may well be a well known named number, the functional +equivalent of a static web page. It is a concurrent process, +like all the others, and has its own data like all the others, but its +data does not change when it responds to a message, so never expects an +in-reply-to response, or if it does, creates a dynamic instance of another +type to handle that. Because it does not remember what messages it sent +out, the in-reply-to field is no use to it.
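A minimal sketch of the `dog` and `bark` idea above, assuming the concurrent process is an object and the registered callback is a pointer to one of its member functions; the names are illustrative:

```c++
#include <functional>
#include <string>

struct Dog {                                   // a concurrent process / state machine
    int messages_handled = 0;
    void on_bark(const std::string& data) {    // the registered callback
        ++messages_handled;                    // handle the message data
    }
};

int main() {
    Dog dog;
    auto callback = &Dog::on_bark;       // pointer to member, the scary syntax
    std::invoke(callback, dog, "woof");  // the compiler works out (dog.*callback)("woof")
}
```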
+ +Or, possibly our concurrent process, which is static and stateless +in memory, nonetheless keeps state in the database, in which case +it looks up the in-reply-to field in the database to find +the context. But a database lookup can hang a thread, +and we do not want to stall network facing threads. + +So we have a single database handling thread that sequentially handles a queue +of messages from network facing threads driving network facing concurrent +processes, drives database facing concurrent processes, +which dispatch the result into a queue that is handled by +network facing threads that drive network facing concurrent +processes. + +So, a single thread that handles the network card, despatching +messages out from a queue in memory, and in from a queue in memory, and does not +usually or routinely do memory allocation or release, or handles them itself +if they are standard, common, and known to be capable of being quickly handled, +a single thread that handles concurrent systems that are purely +memory to memory, but could involve dynamic allocation of memory, +and a single thread that handles concurrent state machines that do database +lookups and writes and possibly dynamic memory allocation, but do not +directly interact with the network, handing that task over to concurrent +state machines in the networking thread. + +So a message comes in through the wire, where it is handled +by a concurrent process, probably a state machine with per connection +state, though it might have substates, child concurrent processes, +for reassembling one multipart message without hanging the next. + +It then passes that message to a state machine in the application +layer, which is queued up in the queue for the thread or threads appropriate +to its destination concurrent process, and receives messages from those threads, +which it then despatches to the wire. + +A concurrent process is of course created by another +concurrent process, so when it completes, +it does a callback on the concurrent process that created it, +and any concurrent processes it has created +are abruptly discarded. So our external messages and events +involve a whole lot of purely internal messages and events. +And the event handler has to know what internal object this +message came from, +which for external messages is the in-regards-to field, +or is implicit in the in-reply-to field. + +If you could be receiving events from different kinds of +objects about different matters, well, you have to have +different kinds of handlers. And usually you are +receiving messages from only one such object, but in +irritatingly many special cases, several such objects. + +But it does not make sense to write for the fully general case +when the fully general case is so uncommon, so we handle this +case ad-hoc by a special field, which is defined only for this +message type, not defined as a general quality of all messages. + +It typically makes sense to assume we are handling only one kind +of message, possibly of variant type, from one object, and in +the other, special, cases, we address that case ad hoc by additional +message fields. + +But if we support `std::variant`, there is a whole lot of overlap +between handling things by a new variant, and handling things +by a new callback member. + +The recipient must have associated a handler, consisting of a +callback and an opaque pointer to the state of the concurrent process +on the recipient with the messages referenced by at least one of +these fields.
In the event of conflicting values, the reply-to takes +precedence, but the callback of the reply-to has access to both its +data structure and the in-regards-to data structure, a pointer to which +is normally in its state. The in-regards-to being the state machine, +and the in-reply-to the event that modifies the +state of the state machine. + +When we initialize a connection, we establish a state machine +at both ends, both the application factor of the state machine, +and the socket factor of the state machine. + +When I say we are using state machines, this is just the +message handling event oriented architecture familiar in +programming graphical user interfaces. + +Such a program consists of a pile of derived classes whose +base classes have all the machinery for handling messages. +Each instance of one of these classes is a state machine, +which contains member functions that are event handlers. + +So when I say "state machine", I mean a class for handling +events like the many window classes in wxWidgets. + +One big difference will be that we will be posting a lot of events +that we expect to trigger events back to us. And we will want +to know which posting the returning event came from. So we will +want to create some data that is associated with the fired event, +and when a resulting event is fired back to us, we can get the +correct associated data, because we might fire a lot of events, +and they might come back out of order. Gui code has this capability, +but it is rarely used. + +## Implementing concurrent state machines in C++ + +Most of this is me re-inventing Asio, which is part of the +immense collection of packages of MSYS2. Obviously I would be +better off integrating Asio than rebuilding it from the ground up, +but I need to figure out what needs to be done, so that I can +find the equivalent Asio functionality. + +Or maybe Asio is a bad idea. Boost Asio was horribly broken. +I am seeing lots of cool hot projects using Tokio, not seeing any cool +hot projects using Asio. +If the Bittorrent DHT library did its own +implementation of concurrent communicating processes, +maybe Asio is broken at the core. + +And for flow control, I am going to have to integrate Quic, +though I will have to fork it to change its security model +from certificate authorities to Zooko names. You can in theory +easily plug any kind of socket into Asio, +but I see a suspicious lack of people plugging Quic into it, +because Quic contains a huge amount of functionality that Asio +knows nothing of. But if I am forking it, I can probably ignore +or discard most of that functionality. + +Gui code is normally single threaded, because it is too much of +a bitch to lock an instance of a message handling class when one of +its member functions is handling a message (the re-entrancy problem). + +However the message plumbing has to know which class is going +to handle the message (unless the message is being handled by a +stateless state machine, which it often is) so there is no reason +the message handling machinery could not atomically lock the class +before calling its member function, and free it on return from +its member function. + +State machines (message handling classes, as for example +in a gui) are allocated in the heap and owned by the message +handling machinery. The base constructor of the object plugs it +into the message handling machinery.
(Well, wxWidgets uses the +base constructor with virtual classes to plug it in, but likely the +curiously recurring template pattern would be saner +as in ATL and WTL.) + +This means they have to be destroyed by sending a message to the message +handling machinery, which eventually results in +the destructor being called. The destructor does not have to worry +about cleanup in all the base classes, because the message handling +machinery is responsible for all that stuff. + +Our event despatcher does not call a function pointer, +because our event handlers are member functions. +We call an object of type `std::function`. We could also use pointer to member, +which is more efficient. + +All this complicated machinery is needed because we assume +our interaction is stateful. But suppose it is not. The request‑reply +pattern, where the request contains all information that +determines the reply, is very common, probably the great majority +of cases. This corresponds to an incoming message where the +in‑regards‑to field and in‑reply‑to field are empty, +because the incoming message initiates the conversation, +and its type and content suffice to determine the reply. Or the incoming message +causes the recipient to reply and also set up a state machine, +or a great big pile of state machines (instances of a message handling class), +which will handle the lengthy subsequent conversation, +which when it eventually ends results in those objects being destroyed, +while the connection continues to exist. + +In the case of an incoming message of that type, it is despatched to +a fully re-entrant static function on the basis of its type. +The message handling machinery calls a function pointer, +not a class member. +We don't use, should not use, and cannot use, all the +message handling infrastructure that keeps track of state. + + +## receive a message with no in‑regards‑to field, no in‑reply‑to field + +This is directed to a re-entrant function, not a functor, +because re‑entrant and stateless. +It is directed according to message type. + +### A message initiating a conversation + +It creates a state machine (instance of a message handling class), +sends the start event to the state machine, and the state machine +does whatever it does. The state machine records what message +caused it to be created, and for its first message, +uses it in the in‑reply‑to field, and for subsequent messages, +for its in‑regards‑to field. + +### A request-reply message. + +Which sends a message with the in-reply-to field set. +The recipient is expected to have a hash-map associating this field +with information as to what to do with the message. + +#### A request-reply message where counterparty matters. + +Suppose we want to read information about this entity from +the database, and then write that information. Counterparty +information is likely to need to be durable. Then we do +the read-modify-write as a single sql statement, +and let the database serialize it. + +## receive a message with no in‑regards‑to field, but with an in‑reply‑to field + +The dispatch layer looks up a hash-map table of functors, +by the id of the field and id of the sender, and despatches the message to +that functor to do whatever it does. +When this is the last message expected in‑reply‑to, the functor +frees itself and removes itself from the hash-map. If a message +arrives with no entry in the table, it is silently dropped. + +## receive a message with an in‑regards‑to field, with or without an in‑reply‑to field
+ +Just as before, the dispatch table looks up the hash-map of state machines +(instances of message handling classes) and dispatches +the message to the stateful message handler, which figures out what +to do with it according to its internal state. What to do with an +in‑reply‑to field, if there is one, is something the stateful +message handler will have to decide. It might have its own hashmap for +the in‑reply‑to field, but this would result in state management and state +transition of huge complexity. The expected usage is it has a small +number of static fields in its state that reference a limited number +of recently sent messages, and if the incoming message is not one +of them, it treats it as an error. Typically the state machine is +only capable of handling the +response to its most recent message, and merely wants to be sure +that this *is* a response to its most recent message. But it could +have shot off half a dozen messages with the in‑regards‑to field set, +and want to handle the response to each one differently. +Though likely such a scenario would be easier to handle by creating +half a dozen state machines, each handling its own conversation +separately. On the other hand, if it is only going to be a fixed +and finite set of conversations, it can put all ongoing state in +a fixed and finite set of fields, each of which tracks the most +recently sent message for which a reply is expected. + + +## A complex conversation. + +We want to avoid complex conversations, and stick to the +request‑reply pattern as much as possible. But this is apt to result +in the server putting a pile of opaque data (a cookie) its reply, +which it expects to have sent back with every request. +And these cookies can get rather large. + +Bob decides to initiate a complex conversation with Carol. + +He creates an instance of a state machine (instance of a message +handling class) and sends a message with no in‑regards‑to field +and no in‑reply‑to field but when he sends that initial message, +his state machine gets put in, and owned by, +the dispatch table according to the message id. + +Carol, on receiving the message, also creates a state machine, +associated with that same message id, albeit the counterparty is +Bob, rather than Carol, which state machine then sends a reply to +that message with the in‑reply‑to field set, and which therefore +Bob's dispatch layer dispatches to the appropriate state machine +(message handler) + +And then it is back and forth between the two stateful message handlers +both associated with the same message id until they shut themselves down. + +## factoring layers. + +A layer is code containing state machines that receive messages +on one machine, and then send messages on to other code on +*the same machine*. The sockets layer is the code that receives +messages from the application layer, and then sends them on the wire, +and the code that receives messages from the wire, +and sends messages to the application layer. + +But a single state machine at the application level could be +handling several connections, and a single connection could have +several state machines running independently, and the +socket code should not need to care. + +We have a socket layer that receives messages containing an +opaque block of bytes, and then sends a message to +the application layer message despatch machinery, for whom the +block is not opaque, but rather identifies a message type +meaningful for the despatch layer, but meaningless for the socket layer. 
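A minimal sketch of that despatch machinery, assuming 64-bit message ids and payloads that arrive as opaque byte spans; here every in-reply-to handler is treated as one-shot and lookup by sender id is left out, so this is only the shape of the thing, not a full implementation:

```c++
#include <cstddef>
#include <cstdint>
#include <functional>
#include <span>
#include <unordered_map>
#include <utility>

// Routes an incoming opaque payload to the right state machine.
// in-reply-to handlers are one-shot and removed on dispatch; in-regards-to
// handlers are long lived state machines; unknown replies are silently dropped.
struct Dispatcher {
    using Handler = std::function<void(std::span<const std::byte>)>;

    std::unordered_map<std::uint64_t, Handler> in_reply_to;    // one-shot
    std::unordered_map<std::uint64_t, Handler> in_regards_to;  // long lived

    void expect_reply(std::uint64_t msg_id, Handler h) { in_reply_to[msg_id] = std::move(h); }
    void register_state_machine(std::uint64_t regarding, Handler h) { in_regards_to[regarding] = std::move(h); }

    void dispatch(std::uint64_t reply_to, std::uint64_t regarding, std::span<const std::byte> payload) {
        if (auto it = in_reply_to.find(reply_to); it != in_reply_to.end()) {
            Handler h = std::move(it->second);
            in_reply_to.erase(it);            // reply handlers unregister on dispatch
            h(payload);                       // reply-to takes precedence
        } else if (auto it2 = in_regards_to.find(regarding); it2 != in_regards_to.end()) {
            it2->second(payload);             // the state machine decides what to do with it
        }                                     // else: no entry in either table, silently dropped
    }
};
```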
+ +The state machine terminates when its job is done, +freeing up any allocated memory, +but the connection endures for the life of the program, +and most of the data about a connection endures in +an sql database between reboots. + +The connection is a long lived state machine running in +the sockets layer, which sends and receives what are to it opaque +blocks of bytes to and from the dispatch layer, and the dispatch +layer interprets these blocks of bytes as having information +(message type, in‑regards‑to field and in‑reply‑to field) +that enables it to despatch the message to a particular method +of a particular instance of a message handling class in C++, +corresponding to a particular channel in Go. +And these message handling classes are apt to be short lived, +being destroyed when their task is complete. + +Because we can have many state machines on a connection, +most of our state machines can have very little state, +typically an infinite receive loop, an infinite send receive loop, +or an infinite receive send loop, which have no state at all, +are stateless. We factorize the state machine into many state machines +to keep each one manageable. + + +## Comparison with concurrent interacting processes in Go + +These concurrent communicating processes are going to +be sending messages to each other on the same machine. +We need to model Go's goroutines. + +A goroutine is a function, and functions always terminate -- +and in Go are unceremoniously and abruptly ended when their parent +function ends, because they are variables inside its dataspace, +as are their channels. +And, in Go, a channel is typically passed by the parent to its children, +though they can also be passed in a channel. +Obviously this structure is impossible and inapplicable +when processes may live, and usually do live, +in different machines. + +The equivalent of Go channel is not a connection. Rather, +one sends a message to the other to request it create a state machine, +which will correspond to the in-regards-to message, and the equivalent of a +Go channel is a message type, the in-regards-to message id, +and the connection id. Which we pack into a single class so that we +can use it the way Go uses channels. + +The sockets layer (or another state machine on the application layer) +calls the callback routine with the message and the state. + +The sockets layer treats the application layer as one big state +machine, and the information it sends up to the application +enables the application layer to despatch the event to the +correct factor of that state machine, which we have factored into +as many very small, and preferably stateless, state machines as possible. + +We factor the potentially ginormous state machine into +many small state machines (message handling classes), +in the same style as Go factors a potentially ginormous Goroutine into +many small goroutines. + +The socket code being a state machine composed of many +small state machines, which communicates with the application layer +over a very small number of channels, +these channels containing blocks of bytes that are +opaque to the socket code, +but are serialized and deserialized by the application layer code. +From the point of view of the application layer code, +it is many state machines, +and the socket layer is one big state machine. +From the point of view of the socket code, it is many state machines, +and the application layer is one big state machine. 
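A minimal sketch of that Go channel equivalent, assuming 64-bit connection and message ids and a 32-bit message type; the field names and the hash combine are illustrative assumptions:

```c++
#include <cstddef>
#include <cstdint>
#include <functional>

// The functional equivalent of a Go channel: which connection, which
// in-regards-to message id (i.e. which remote state machine), and which
// message type flows on it.
struct ChannelId {
    std::uint64_t connection_id;   // the long lived connection in the socket layer
    std::uint64_t in_regards_to;   // message id that created the counterparty state machine
    std::uint32_t message_type;    // what kind of message this "channel" carries

    bool operator==(const ChannelId&) const = default;
};

// Hash so that ChannelId can key the dispatch layer's hash-maps.
template<>
struct std::hash<ChannelId> {
    std::size_t operator()(const ChannelId& c) const noexcept {
        std::size_t h = std::hash<std::uint64_t>{}(c.connection_id);
        h ^= std::hash<std::uint64_t>{}(c.in_regards_to) + 0x9e3779b97f4a7c15ULL + (h << 6) + (h >> 2);
        h ^= std::hash<std::uint32_t>{}(c.message_type) + 0x9e3779b97f4a7c15ULL + (h << 6) + (h >> 2);
        return h;
    }
};
```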
+The application code, parsing the in-reply-to message id, +and the in-regards-to message id, figures out where to send +the opaque block of bytes, and the recipient deserializes, +and sends it to a routine that acts on an object of that +deserialized class. + +Since the sockets layer does not know the internals +of the message struct, the message has to be serialized and deserialized +into the corresponding class by the dispatch layer, +and thence to the application layer. + +Go code tends to consist of many concurrent processes +continually being spawned by a master concurrent process, +and themselves spawning more concurrent processes. + +# flow control and reliability + +If we want to transmit a big pile of data, a big message, well, +this is the hard problem, for the sender has to throttle according +to the recipient's readiness to handle it and the physical connection's capability to transmit it. + +Quic is a UDP protocol that provides flow control, and the obvious thing +to handle bulk data transfer is to fork it to use Zooko based keys. + +[Tailscale]:https://tailscale.com/blog/how-nat-traversal-works +"How to communicate peer-to-peer through NAT firewalls"{target="_blank"} + +[Tailscale] has solved a problem very similar to the one I am trying to solve, +albeit their solutions rely on a central human authority, +which authority they ask for money and they recommend: + +> If you’re reaching for TCP because you want a +> stream‑oriented connection when the NAT traversal is done, +> consider using QUIC instead. It builds on top of UDP, +> so we can focus on UDP for NAT traversal and still have a +> nice stream protocol at the end. + +But to interface QUIC to a system capable of handling a massive +number of state machines, going to need something like Tokio, +because we want the thread to service other state machines while +QUIC is stalling the output or waiting for input. Indeed, no +matter what, if we stall in the socket layer rather than the +application layer, which makes life a whole lot easier for the +application programmer, going to need something like Tokio. + +Or we could open up Quic, which we have to do anyway +to get it to use our keys rather than enemy controlled keys, +and plug it into our C++ message passing layer. + +On the application side, we have to lock each state machine +when it is active. It can only handle one message at a time. +So the despatch layer has to queue up messages and stash them somewhere, +and if it has too many messages stashed, +it wants to push back on the state machine at the application layer +at the other end of the wire. So the despatch layer at the receiving end +has to from time to time tell the despatch layer at the sending end +"I have `n` bytes in regard to message 'Y', and can receive `m` more." +And when the despatch layer at the other end, which unlike the socket +layer knows which state machine is communicating with which, +has more than that amount of data to send, it then blocks +and locks the state machine at its end in its send operation. + +The socket layer does not know about that and does not worry about that. +What it worries about is packets getting lost on the wire, and caches +piling up in the middle of the wire. +It adds to each message a send time and a receive time, +and if the despatch layer wants to send data faster +than it thinks is suitable, it has to push back on the despatch layer. +Which it does in the same style. +It tells it the connection can handle up to `m` further bytes.
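A minimal sketch of that push back accounting on the sending side, assuming 64-bit message ids and byte counts; the names are illustrative and the actual transmission is left to the socket layer:

```c++
#include <cstdint>
#include <unordered_map>

// Sending side book-keeping for "I have n bytes in regard to message Y,
// and can receive m more". When the window is exhausted, the dispatch
// layer blocks the sending state machine rather than queueing more.
struct SendFlowControl {
    struct Credit { std::uint64_t sent = 0, acknowledged = 0, window = 0; };
    std::unordered_map<std::uint64_t, Credit> by_message_id;   // keyed by in-regards-to id

    // Receiver's periodic report for message id y: it has n bytes, can take m more.
    void on_credit(std::uint64_t y, std::uint64_t n, std::uint64_t m) {
        auto& c = by_message_id[y];
        c.acknowledged = n;
        c.window = m;
    }

    // Called before handing bytes to the socket layer; false means
    // block the application state machine in its send operation.
    bool try_send(std::uint64_t y, std::uint64_t bytes) {
        auto& c = by_message_id[y];
        if (c.sent - c.acknowledged + bytes > c.window) return false;
        c.sent += bytes;      // the socket layer does the actual transmission
        return true;
    }
};
```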
+Or we might have two despatch layers, one for sending and one for +receiving, with the send state machine sending events to the receive state +machine, but not vice versa, in which case the socket layer +*can* block the send layer. + +# Tokio + +Most of this machinery seems like a re-implementation of Tokio-rust, +which is a huge project. I don't wanna learn Tokio-rust, but equally +I don't want to re-invent the wheel. + +Or perhaps we could just integrate QUIC's internal message +passing infrastructure into our message passing infrastructure. +It probably already supports a message passing interface. + +Instead of synchronously writing data, you send a message to it +to write some data, and when it is done, it calls a callback. + +# Minimal system + +Prototype. Limit global bandwidth at the application +state machine level -- they adjust their policy according to how much +data is moving, and they spread the non response outgoing +messages out to a constant rate (constant per counterparty, +and uniformly interleaved). + +Single threaded, hence no state machine locking. + +Tweet style limit on the size of messages, hence no fragmentation +and re-assembly issue. Our socket layer becomes trivial - it just +sends blobs like a zeromq socket. + +If you are trying to download a sackload of data, you request a counterparty to send a certain amount to you at a given rate, he immediately responds (without regard to global bandwidth limits) with the first instalment, and a promise of further instalments at a certain time. + +Each instalment records how much has been sent, and when, when the next instalment is coming, and the schedule for further instalments. + +If you miss an instalment, you nack it after a delay. If he receives +a nack, he replaces the promised instalments with the missing ones. + +The first thing we implement is everyone sharing a list of who they have successfully connected to, in recency order, and everyone keeps everyone else's list, which catastrophically fails to scale, and also how up to date their counter parties are with their own list, so that they do not have to +endlessly resend data (unless the counterparty has a catastrophic loss of data, and requests everything from the beginning). + +We assume everyone has an open port, which sucks intolerably, but once that is working we can handle ports behind firewalls, because we are doing UDP. Knowing whom the other guy is connected to that you are not, you can ask him to initiate a peer connection for the two of you, until you have +enough connections that the keep alive works. + +And once everyone can connect to everyone else by their public username, then we can implement bitmessage. diff --git a/docs/proof_of_stake.md b/docs/design/proof_of_share.md similarity index 86% rename from docs/proof_of_stake.md rename to docs/design/proof_of_share.md index faa3c4a..644210b 100644 --- a/docs/proof_of_stake.md +++ b/docs/design/proof_of_share.md @@ -1,20 +1,20 @@ --- title: proof of share +sidebar: true +notmine: false ... -::: {style="background-color : #ffdddd; font-size:120%"} -![run!](tealdeer.gif)[TL;DR Map a blockdag algorithm equivalent to the -Generalized MultiPaxos Byzantine -protocol to the corporate form:]{style="font-size:150%"} +::: myabstract +[abstract:]{.bigbold} +Map a blockdag algorithm to the corporate form. The proof of share crypto currency will work like shares. Crypto wallets, or the humans controlling the wallets, correspond to shareholders.
-Peer computers in good standing on the blockchain, or the humans -controlling them, correspond to company directors. -CEO. ::: +# the problem to be solved + We need proof of share because our state regulated system of notaries, bankers, accountants, and lawyers has gone off the rails, and because proof of work means that a tiny handful of people who are [burning a @@ -50,11 +50,12 @@ that in substantial part, it made such behavior compulsory. Which is why Gab is now doing an Initial Coin Offering (ICO) instead of an Initial Public Offering (IPO). -[Sarbanes-Oxley]:sox_accounting.html +[Sarbanes-Oxley]:../manifesto/sox_accounting.html "Sarbanes-Oxley accounting" +{target="_blank"} Because current blockchains are proof of work, rather than proof of -stake, they give coin holders no power. Thus an initial coin offering +shares, they give coin holders no power. Thus an initial coin offering (ICO) is not a promise of general authority over the assets of the proposed company, but a promise of future goods or services that will be provided by the company. A proof of share ICO could function as a more @@ -113,6 +114,10 @@ lawyers each drawing \$300 per hour, increasingly impractical. Proof of work is a terrible idea, and is failing disastrously, but we need to replace it with something better than bankers and government. +Proof of stake, as implemented in practice, is merely a +central bank digital currency with the consensus determined by a small +secretive insider group (hello Ethereum). + The gig economy represents the collapse of the corporate form under the burden of HR and accounting. @@ -120,6 +125,96 @@ The initial coin offering (in place of the initial public offering) represents an attempt to recreate the corporate form using crypto currency, to which existing crypto currencies are not well adapted. +# How proof of share works + +One way out of this is proof of share, plus evidence of good +connectivity, bandwidth, and disk speed. You have a crypto currency +that works like shares in a startup. Peers have a weight in +the consensus, a likelihood of their view of the past becoming the +consensus view, that is proportional to the amount of +crypto currency their client wallets possessed at a certain block height, +$\lfloor(h-1000)/4096\rfloor\times4096$, where $h$ is the current block height, +provided they maintain adequate data, disk access, +and connectivity. The trouble with this is that it reveals +which machines know where the whales are, and those machines +could be raided, and then the whales raided, so we have to have +a mechanism that can hide the ips of whales delegating weight +in the consensus to peers from the peers exercising that weight +in the consensus. And [in fact I intend to do that mechanism +before any crypto currency, because bitmessage is abandonware +and needs to be replaced](../manifesto/social_networking.html#monetization){target="_blank"}. + +Plus the peers consense over time on a signature that +represents a human board, which nominates another signature that represents +a human CEO, thus instead of informal secret centralisation with +the capability to do unknown and possibly illegitimate things +(hello Ethereum), you have a known formal centralisation with +the capability to do known and legitimate things. Dictating +the consensus and thus rewriting the past not being one of those +legitimate things.
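A minimal sketch of that snapshot rule, assuming block heights well past 1000 so the subtraction cannot underflow:

```c++
#include <cstdint>

// Shares are counted as of height floor((h - 1000) / 4096) * 4096, so the
// snapshot lags the tip and only moves every 4096 blocks. Assumes h >= 1000.
inline std::uint64_t share_snapshot_height(std::uint64_t h) {
    return ((h - 1000) / 4096) * 4096;   // integer division performs the floor
}
```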
+ +## consensus algorithm details + +Each peer gets a weight at each block height that is a +deterministically random function of the block height, +its public key, the hash of the previous block that it is building its block +on top of, and the amount of crypto currency (shares) +that it represents, with the likelihood of it getting a high weight +proportional to the amount of crypto currency it represents, such +that the likelihood of a peer having a decisive vote is proportional +to the amount of shares it represents. + +Each peer sees the weight of a proposed block as +the median weight of the three highest weighted peers +that it knows of that know or knew of the block and its contents, according to +their current weight at this block height, and that perceived it as highest +weighted at the time they synchronized on it, plus the weight of +the median weighted peer among up to three peers +that were recorded by the proposer +as knowing of the previous block that the proposed block +is being built on, at the previous block height, plus the +weight of the chain of preceding blocks similarly. + +When it synchronizes with another peer on a block, and the block is +at that time the highest weighted proposed block known to both +of them, +both record the other's signature as knowing that block +as the highest weighted known at that time. If one of them +knows of a higher weighted proposed block, then they +synchronize on whichever block will be the highest weighted block +when both have synchronized on it. + +If it has a record of fewer than three peers knowing that block, +or if the other has a higher weight than one of the three, +then they also synchronize their knowledge of the highest weighted three. + +This algorithm favors peers that represent a lot of shares, and also +favors peers with good bandwidth and data access, and peers that +are responsive to other peers, since they have more and better connections, +thus their proposed block is likely to become widely known faster. + +If only one peer, the proposer, knows of a block, then its weight is +the weight of the proposer, plus the previous blocks, but is lower than +the weight of any alternative block whose previous blocks have the +same weight, but two proposers. + +This rule biases the consensus to peers with good connections and +good bandwidth to other good peers. + +If comparing two proposed blocks, each of them known to two proposers, that +have chains of previous blocks that are the same weight, then the weight +is the lowest weighted of the two, but lower than any block known to +three. If known to three, the median weight of the three. If known to +a hundred, only the top three matter and only the top three are shared +around. + +It is thus inherently sybil proof, since if one machine is running +a thousand sybils, each sybil has one thousandth the share +representation, one thousandth the connectivity, one thousandth +the random disk access, and one thousandth the cpu. + +# Collapse of the corporate form + The corporate form is collapsing in part because of [Sarbanes-Oxley], which gives accountants far too much power, and those responsible for deliverables and closing deals far too little, and in part because HR @@ -194,7 +289,7 @@ intent was for buying drugs, buying guns, violating copyright, money laundering, and capital flight. These are all important and we need to support them all, especially -violating copyright, capital flight and buying guns under repressive +violating copyright, capital flight, and buying guns under repressive regimes.
But we now see big demand for crypto currencies to support a replacement for Charles the Second’s corporate form, which is being destroyed by HR, and to restore double entry accounting, which is being @@ -356,12 +451,12 @@ be controlled by private keys known only to client wallets, but most transactions or transaction outputs shall be registered with one specific peer. The blockchain will record a peer’s uptime, its provision of storage and bandwidth to the blockchain, and the amount of -stake registered with a peer. To be a peer in good standing, a peer has +shares registered with a peer. To be a peer in good standing, a peer has to have a certain amount of uptime, supply a certain amount of bandwidth -and storage to the blockchain, and have a certain amount of stake +and storage to the blockchain, and have a certain amount of shares registered to it. Anything it signed as being in accordance with the rules of the blockchain must have been in accordance with the rules of -the blockchain. Thus client wallets that control large amounts of stake +the blockchain. Thus client wallets that control large amounts of shares vote which peers matter, peers vote which peer is primus inter pares, and the primus inter pares settles double spending conflicts and suchlike. @@ -405,7 +500,7 @@ protocol where they share transactions around. During gossip, they also share opinions on the total past of the blockchain. If each peer tries to support past consensus, tries to support the opinion of -what looks like it might be the majority of peers by stake that it sees in +what looks like it might be the majority of peers by shares that it sees in past gossip events, then we get rapid convergence to a single view of the less recent past, though each peer initially has its own view of the very recent past. @@ -593,20 +688,20 @@ network, we need the one third plus one to reliably verify that there is no other one third plus one, by sampling geographically distant and network address distant groups of nodes. -So, we have fifty percent by weight of stake plus one determining policy, +So, we have fifty percent by weight of shares plus one determining policy, and one third of active peers on the network that have been nominated by -fifty percent plus one of weight of stake to give effect to policy +fifty percent plus one of weight of shares to give effect to policy selecting particular blocks, which become official when fifty percent plus one of active peers the network that have been nominated by fifty percent -plus one of weight of stake have acked the outcome selected by one third +plus one of weight of shares have acked the outcome selected by one third plus one of active peers. In the rare case where half the active peers see timeouts from the other half of the active peers, and vice versa, we could get two blocks, each endorsed by one third of the active peers, which case would need to be -resolved by a fifty one percent vote of weight of stake voting for the +resolved by a fifty one percent vote of weight of shares voting for the acceptable outcome that is endorsed by the largest group of active peers, -but the normal outcome is that half the weight of stake receives +but the normal outcome is that half the weight of shares receives notification (the host representing them receives notification) of one final block selected by one third of the active peers on the network, without receiving notification of a different final block. 
diff --git a/docs/names/server.md b/docs/design/server.md similarity index 100% rename from docs/names/server.md rename to docs/design/server.md diff --git a/docs/index.md b/docs/index.md index a56b3fb..dca5f2b 100644 --- a/docs/index.md +++ b/docs/index.md @@ -145,4 +145,4 @@ worth, probably several screens. - [How to do VPNs right](how_to_do_VPNs.html) - [How to prevent malware](safe_operating_system.html) - [The cypherpunk program](cypherpunk_program.html) -- [Replacing TCP and UDP](names/TCP.html) +- [Replacing TCP and UDP](design/TCP.html) diff --git a/docs/libraries.md b/docs/libraries.md index 8f42f76..3db30a8 100644 --- a/docs/libraries.md +++ b/docs/libraries.md @@ -1,5 +1,7 @@ --- title: Libraries +sidebar: true +notmine: false ... A review of potentially useful libraries and utilities. @@ -124,6 +126,47 @@ does not represent the good stuff. # Peer to Peer +## Freenet + +Freenet has long intended to be, and perhaps is, the social +application that you have long intended to write, +and has an enormous cold-start advantage over anything you could write, +no matter how great. + +It also relies on udp, to enable hole punching, and routinely does hole punching. + +So the only way to go, to compete, is to write a better freenet within +freenet. + +One big difference is that I think that we want to go after the visible net, +where network addresses are associated with public keys - that the backbone should be +ips that have a well known and stable relationship to public keys. + +Which backbone transports encrypted information authored by people whose +public key is well known, but the network address associated with that +public key cannot easily be found. + +Freenet, by design, chronically loses data. We need reliable backup, +paid for in services or crypto currency. +filecoin provides this, but is useless for frequent small incremental +backups. + +## Bittorrent DHT library + +This is a general purpose library, not all that married to bittorrent. + +It is available as an MSYS2 library, MSYS2 being a fork of +the semi-abandoned mingw library, with the result that the name of the +very dead project Mingw-w64 is all over it. + +Its pacman name is mingw-w64-dht, but it has repos all over the place under its own name. + +It is async, driven by being called on a timer, and called when +data arrives. It contains a simple example program that enables you to publish any data you like. + + +## libp2p + [p2p]:https://github.com/elenaf9/p2p {target="_blank"} @@ -324,6 +367,17 @@ Of course missing from this from Jim's long list of plans are DDoS protection, a The net is vast and deep. Maybe we need to start cobbling these pieces together. The era of centralized censorship needs to end. Musk will likely lose either way, and he's only one man against the might of so many paper tigers that happen to be winning the information war. +## Lightning node + +[`rust-lightning`]:https://github.com/lightningdevkit/rust-lightning +{target="_blank"} + + [`rust-lightning`] is a general purpose library for writing lightning nodes, running under Tokio, that is used in one actual lightning node implementation. + + It is intended to be integrated into on-chain wallets. + + It provides the channel state as "a binary blob that you can store any way you want" -- which is to say, ready to be backed up onto the social net. + # Consensus @@ -785,7 +839,17 @@ libraries, but I hear it cursed as a complex mess, and no one wants to get into it. They find the far from easy `cmake` easier.
And `cmake` runs on all systems, while autotools only runs on linux. -I believe `cmake` has a straightforward pipeline into `*.deb` files, but if it has, the autotools pipleline is far more common and widely used. +MSYS2, which runs on Windows, supports autotools. So, maybe it does run +on windows. + +[autotools documentation]:https://thoughtbot.com/blog/the-magic-behind-configure-make-make-install +{target="_blank"} + +Despite the complaints about autotools, there is [autotools documentation] +on the web that does not make it sound too bad. + +I believe `cmake` has a straightforward pipeline into `*.deb` files, +but if it has, the autotools pipleline is far more common and widely used. ## The standard windows installer @@ -818,6 +882,8 @@ NSIS can create msi files for windows, and is open source. [NSIS Open Source repository] +NSIS is also available as an MSYS package + People who know what they are doing seem to use this open source install system, and they write nice installs with it. @@ -1138,7 +1204,7 @@ which could receive a packet at any time. I need to look at the GameNetworkingSockets code and see how it listens on lots and lots of sockets. If it uses [overlapped IO], then it is golden. Get it up first, and it put inside a service later. -[Overlapped IO]:server.html#the-select-problem +[Overlapped IO]:design/server.html#the-select-problem {target="_blank"} The nearest equivalent Rust application gave up on congestion control, having programmed themselves into a blind alley. @@ -1357,7 +1423,7 @@ transaction affecting the payee factor state. A transaction has no immediate affect. The payer mutable substate changes in a way reflecting the transaction block at the next block boundary. And that change then has effect on product mutable state at a subsequent product state block -boundary, changing the stake possessed by the substate. +boundary, changing the shares possessed by the substate. Which then has effect on the payee mutable substate at its next block boundary when the payee substate links back to the previous @@ -1567,6 +1633,12 @@ You can create a pool of threads processing connection handlers (and waiting for finalizing database connection), by running `io_service::run()` from multiple threads. See Boost.Asio docs. +## Asio +I tried boost asio, and concluded it was broken, trying to do stuff that cannot be done, +and hide stuff that cannot be hidden in abstractions that leak horribly. + +But Asio by itself (comes with MSYS2) might work. + ## Asynch Database access MySQL 5.7 supports [X Plugin / X Protocol, which allows asynchronous query execution and NoSQL But X devapi was created to support node.js and stuff. The basic idea is that you send text messages to mysql on a certain port, and asynchronously get text messages back, in google protobuffs, in php, JavaScript, or sql. No one has bothered to create a C++ wrapper for this, it being primarily designed for php or node.js](https://dev.mysql.com/doc/refman/5.7/en/document-store-setting-up.html) @@ -1791,7 +1863,7 @@ Javascript is a great language, and has a vast ecosystem of tools, but it is controlled from top to bottom by our enemies, and using it is inherently insecure. 
-It consists of a string (which is implemented under the hood as a copy on +Tcl consists of a string (which is implemented under the hood as a copy on write rope, with some substrings of the rope actually being run time typed C++ types that can be serialized and deserialized to strings) and a name table, one name table per interpreter, and at least one interpreter per @@ -1869,14 +1941,50 @@ from wide character variants and locale variants. (We don't want locale variants, they are obsolete. The whole world is switching to UTF, but our software and operating environments lag) +Locales still matter in case insensitive compare, collation order, +canonicalization of utf-8 strings, and a rats nest of issues, +which linux and sqlite avoids by doing binary compares, and if it cannot +avoid capitalization issues, only considering A-Z to be capitals. + +If you tell sqlite to incorporate the ICU library, sqlite will attempt to +do case lowering and collation for all of utf-8 - which strikes me +as something that cannot really be done, and I am not at all sure how +it will interact with wxWidgets attempting to do the same thing. + +What happens is that operations become locale dependent. It will +have a different view of what characters are equivalent in different +places. And changing the locale on a database will break an index or +table that has a non binary collation order. Which probably will not +matter much because we are likely to have few entries that only differ +in capitalization. The sql results will be wrong, but the database will +not crash, and when we have a lot of entires that affected by non latin +capitalization rules, it is probably going to be viewed only in that +locale. But any collation order that is global to all parties on the blockchain +has to be latin or binary. + +wxWidgets does *not* include the full unicode library, so cannot do this +stuff. But sqlite provides some C string functions that are guaranteed to +do whatever it does, and if you include the ICU library it attempts +to handle capitalization on the entire unicode set\ +`int sqlite3_stricmp(const char *, const char *);`\ +`sqlite3_strlike(P,X,E)`\ +The ICU library also provides a real regex function on unicode +(`sqlite3_strlike` being the C equivalent of the SQL `LIKE`, +providing a rather truncated fragment of regex capability) +Pretty sure the wxWidgets regex does something unwanted on unicode + `wString::ToUTF8()` and `wString::FromUTF8()` do what you would expect. +`wxString::c_str()` does something too clever by half. + On visual studio, need to set your source files to have bom, so that Visual Studio knows that they are UTF‑8, need to set the compiler environment in Visual Studio to UTF‑8 with `/Zc:__cplusplus /utf-8 %(AdditionalOptions)` And you need to set the run time environment of the program to UTF‑8 -with a manifest. +with a manifest. Not at all sure how codelite will handle manifests, +but there is a codelite build that does handle utf-8, presumably with +a manifest. Does not do it in the standard build on windows. You will need to place all UTF‑8 string literals and string constants in a resource file, which you will use for translated versions. @@ -1907,9 +2015,6 @@ way of the future, but not the way of the present. Still a work in progress Does not build under Windows. Windows now provide UTF8 entries to all its system functions, which should make it easy. -wxWidgets provides `wxRegEx` which, because wxWidgets provides index -by entity, should just work. Eventually. Maybe the next release. 
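To pin down the earlier remarks about `wxString` UTF‑8 round tripping and sqlite's ASCII only case folding, here is a minimal sketch; it assumes wxWidgets and the sqlite3 headers are available, and the strings are invented purely for illustration:

```cpp
// Sketch: round-tripping UTF-8 through wxString, and comparing strings the
// way sqlite does by default (ASCII-only case folding, binary otherwise).
// Assumes wxWidgets and sqlite3 are on the include path; strings invented.
#include <wx/string.h>
#include <sqlite3.h>
#include <cstdio>

int main() {
    // UTF-8 in, UTF-8 out: wxString keeps its own internal representation,
    // but FromUTF8()/ToUTF8() convert losslessly at the boundary.
    // "Gr\xc3\xbc\xc3\x9f" "e" is the UTF-8 encoding of "Grüße".
    const char* utf8_in = "Gr\xc3\xbc\xc3\x9f" "e";
    wxString s = wxString::FromUTF8(utf8_in);
    const auto utf8_out = s.ToUTF8();
    std::printf("round trip equal: %d\n",
                sqlite3_stricmp(utf8_in, utf8_out.data()) == 0); // prints 1

    // sqlite3_stricmp only folds A-Z/a-z, so ASCII case differences are
    // ignored, but case differences in letters such as "ü" are not,
    // which is exactly the locale independent behaviour discussed above.
    std::printf("ascii fold: %d\n",
                sqlite3_stricmp("Hello", "hELLO") == 0); // prints 1
    return 0;
}
```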
- # [UTF8-CPP](http://utfcpp.sourceforge.net/ "UTF-8 with C++ in a Portable Way") A powerful library for handling UTF‑8. This somewhat duplicates the diff --git a/docs/libraries/cpp_automatic_memory_management.md b/docs/libraries/cpp_automatic_memory_management.md index ffad374..df9f41f 100644 --- a/docs/libraries/cpp_automatic_memory_management.md +++ b/docs/libraries/cpp_automatic_memory_management.md @@ -437,11 +437,23 @@ A class can be explicitly defined to take aggregate initialization } } -but that does not make it of aggregate type. Aggregate type has *no* -constructors except default and deleted constructors +but that does not make it of aggregate type. +Aggregate type has *no* constructors +except default and deleted constructors # functional programming +A lambda is a nameless value of a nameless class that is a +functor, which is to say, has `operator()` defined. + +But, of course you can get the class with `decltype` +and assign that nameless value to an `auto` variable, +or stash it on the heap with `new`, +or in preallocated memory with placement `new` + +But if you are doing all that, might as well explicitly define a +named functor class. + To construct a lambda in the heap: auto p = new auto([a,b,c](){}) @@ -475,8 +487,8 @@ going to have to introduce a compile time name, easier to do it as an old fashioned function, method, or functor, as a method of a class that is very possibly pod. -If we are sticking a lambda around to be called later, might copy it by -value into a templated class, or might put it on the heap. +If we are sticking a lambda around to be called later, might copy +it by value into a templated class, or might put it on the heap. auto bar = []() {return 5;}; @@ -522,7 +534,7 @@ lambdas and functors, but are slow because of dynamic allocation C++ does not play well with functional programming. Most of the time you can do what you want with lambdas and functors, using a pod class that -defines operator(\...) +defines `operator(...)` # auto and decltype(variable) diff --git a/docs/libraries/navbar b/docs/libraries/navbar index 710d180..bc0ed33 100644 --- a/docs/libraries/navbar +++ b/docs/libraries/navbar @@ -1,7 +1,8 @@ + \ No newline at end of file diff --git a/docs/lightning_layer.md b/docs/lightning_layer.md index 95dff4c..1bda8de 100644 --- a/docs/lightning_layer.md +++ b/docs/lightning_layer.md @@ -395,7 +395,7 @@ give the reliable broadcast channel any substantial information about the amount of the transaction, and who the parties to the transaction are, but the node of the channel sees IP addresses, and this could frequently be used to reconstruct a pretty good guess about who is transacting with whom and why. -As we see with Monaro, a partial information leak can be put together with +As we see with Monero, a partial information leak can be put together with lots of other sources of information to reconstruct a very large information leak. diff --git a/docs/manifesto/May_scale_of_monetary_hardness.md b/docs/manifesto/May_scale_of_monetary_hardness.md index 1489be1..0cee488 100644 --- a/docs/manifesto/May_scale_of_monetary_hardness.md +++ b/docs/manifesto/May_scale_of_monetary_hardness.md @@ -38,7 +38,7 @@ text-align:center;">May Scale of monetary hardness- -The primary function of the wallet file is to provide a secret key for -a public key, though eventually we stuff all sorts of user odds and ends -into it.
- -In Bitcoin, this is simply a pile of random secrets, randomly generated, but obviously it is frequently useful to have them not random, but merely seemingly random to outsiders. One important and valuable application of this is the paper wallet, where one can recreate the wallet from its master secret, because all the seemingly random secret keys are derived from a single master secret. But this does not cover all use cases.
- -We care very much about the case where a big important peer in a big central -node of the internet, and the owner skips out with the paper key that owns the -peer’s reputation in his pocket, and sets up the same peer in another -jurisdiction, and everyone else talks to the peer in the new jurisdiction, -and ignores everything the government seized.
- -The primary design principle of our wallet is to bypass the DNS and ICANN, so -that people can arrange transactions securely. The big security hole in -Bitcoin is not the one that Monaro fixes, but that you generally arrange -transactions on the totally insecure internet. The spies care about the -metadata more than they care about the data. We need to internalize and secure -the metadata.
- -We particularly want peers to be controlled by easily hidden and transported -secrets. The key design point around which all else is arranged is that if -some man owns a big peer on the blockchain, and the cops descend upon it, he -grabs a book of his bookshelf that looks like any airport book, except on -pencil in the margins of one page is his master secret, written in pencil, -and a list of all his nicknames, also written in pencil. Then he skips off -to the airport and sets up his peer on a new host, and no one notices that it -has moved except for the cops who grabbed his now useless hardware. (Because -the secrets on that hardware were only valuable because their public keys -were signed by his keys, and when he starts up the new hardware, his keys -will sign some new keys on the peer and on the slave wallet on that hardware.) -
- -We also want perfect forward secrecy, which is easy. Each new connection -initiation starts with a random transient public key, and any identifying -information is encrypted in the shared secret formed from that transient public -key, and the durable public key of the recipient.
- -We also want off the record messaging. So we want to prove to the recipient -that the sender knows the shared secret formed from the recipients public key -and the sender’s transient private key, and also the shared secret formed from -the recipient’s public key, the sender’s transient private key, and the -sender’s durable private key. But the recipient, though he then knows the -message came from someone who has the sender’s durable private key, cannot -prove that to a third party, because he could have constructed that shared -secret, even if the sender did not know it. The recipient could have forged -the message and sent it to himself.
- -Thus we get off the record messaging. The sender by proving knowledge of two -shared secrets, proves to the recipient knowledge of two secret keys -corresponding to the two public keys provided, but though the sender proves -it to the recipient, the recipient cannot prove it to anyone else.
- -The message consists of transient public key in the clear, which encrypts -durable public key. then durable public key plus transient public key encrypts -the rest of the message, which encryption proves knowledge of the secret key -underlying the durable public key, proves it to the recipient, but he cannot -prove it to anyone else.
- -The durable public key may be followed by the schema identifier 2, which -implies the nickname follows, which implies the twofitytwo bit hash of the -public key followed by nickname, which is the global identifier of the -pseudonym sending the message. But we could have a different schema, 3, -corresponding to a chain of signatures authenticating that public key, subject -to timeouts and indications of authority, which we will use in slave wallets -and identifiers that correspond to a corporate role in a large organization, -or in iff applications where new keys are frequently issued.
- -If we have a chain of signatures, the hash is of the root public key and the -data being signed (names, roles, times, and authorities) but not the signatures -themselves, nor the public keys that are being connected to that data.
- -When someone initiates a connection using such a complex identity, he sends -proof of shared knowledge of two shared secrets, and possibly a chain of -signatures, but does not sign anything with the transient public key, nor the -durable public key at the end of the chain of signatures, unless, as with a -review, he wants the message to be shared around, in which case he signs the -portion of the message that is to be shared around with that public key, but -not the part of the message that is private between sender and receiver.
- -To identify the globally unique twofitytwo bit id, the recipient hashes the -public key and the identifier that follows. Or he may already know, because -the sender has a client relationship with the recipient, that the public key -is associated with a given nickname and globally unique identifier.
- -If we send the schema identifier 0, we are not sending id information, either -because the recipient already has it, or because we don’t care, or because we -are only using this durable public key with this server and do not want a -global identity that can be shared between different recipients.
- -So, each master wallet will contain a strong human readable and human writeable -master secret, and the private key of each nickname will be generated by -hashing the nickname with the master secret, so that when, in some new -location, he types in the code on fresh hardware and blank software, he will -get the nickname’s public and private keys unchanged, even if he has to buy or -rent all fresh hardware, and is not carrying so much as a thumbdrive with him. -
- -People with high value secrets will likely use slave wallets, which perform -functions analogous to a POP or IMAP server, with high value secrets on their -master wallet, which chats to their slave wallet, which does not have high -value secrets. It has a signed message from the master wallet authorizing it -to receive value in the master wallets name, and another separately signed -message containing the obfuscation shared secret, which is based on the -master wallet secret and an,the first integer being the number of the -transaction with the largest transaction in that name known to the master -wallet, an integer that the slave wallet increments with every request for -value. The master wallet needs to provide a name for the slave walle, and to -recover payments, needs that name. The payments are made to a public key of -the master wallet, multiplied by a pseudo random scalar constructed from the -hash of a sequential integer with an obfuscation secret supplied by the master -wallet, so that the master wallet can recover the payments without further -communication, and so that the person making the payment knows that any -payment can only be respent by someone who has the master wallet secret key, -which may itself be the key at the end of chain of signatures identity, -which the master wallet posesses, but the slave wallet does not.
- -For slave wallets to exist, the globally unique id has to be not a public -key, but rather the twofiftytwo bit hash of a rule identifying the public key. -The simplest case being hash(2|public key|name), which is the identifier used -by a master wallet. A slave wallet would use the identifier hash(3|master -public key|chain of signed ids) with the signatures and public keys in the -chain omitted from the hash, so the durable public key and master secret of -the slave wallet does not need to be very durable at all. It can, and -probably should, have a short timeout and be frequently updated by messages -from the master wallet.
- -The intent is that a slave wallet can arrange payments on behalf of an identity -whose secret it does not possess, and the payer can prove that he made the -payment to that identity. So if the government grabs the slave wallet, which -is located in a big internet data center, thus eminently grabbable, it grabs -nothing, not reputation, not an identity, and not money. The slave wallet has -the short lived and regularly changed secret for the public key of an -unchanging identity identity authorized to make offers on behalf a long lived -identity, but not the secret for an identity authorized to receive money on -behalf of a long lived identity. The computer readable name of these -identities is a twofiftytwo bit hash, and the human readable name is something -like receivables@_@globally_unique_human_readable_name, or -sales@_@globally_unique_human_readable_name. The master secret for -_@globally_unique_human_readable_name is closely held, and the master secret -for globally_unique_human_readable_name written in pencil on a closely held -piece of paper and is seldom in any computer at all.
- -For a name that is not globally unique, the human readable name is -non_unique_human_readable_nickname zero_or_more_whitespace -42_characters_of_slash6_code zero_or_more_whitespace.
-
-Supposing Carol wants to talk to Dave, and the base point is B. Carols secret is the scalar c
, and her public key is elliptic point C=c*B
. Similarly Dave’s secret is the scalar d
, and his public key is elliptic point
-D=d*B
.
-
-His secret scalar d
is probably derived from the hash of his master secret with his nickname, which we presume is "Dave".
-
-They could establish a shared secret by Carol calculating c*D
, and
-Dave calculating d*C
. But if either party’s key is seized, their
-past communications become decryptable. Plus this unnecessarily leaks metadata
- o people watching the interaction.
-
-Better, Carol generates a random scalar r
, unpredictable to anyone
-else, calculates R=r*B
, and sends Dave R, C
encrypted
-using r*D,
and the rest of the message encrypted using
-(r+c)*D
.
- -This gives us perfect forward secrecy and off-the-record.
- -One flaw in this is that if Dave’s secret leaks, not only can he and the people communicating with him be spied upon, but he can receive falsified messages that appear to come from trusted associates.
- -One fix for this is the following protocol. When Carol initiates communication -with Dave, she encrypts only with the transient public and private keys, but -when Dave replies, he encrypts with another transient key. If Carol can -receive his reply, it really is Carol. This does not work with one way -messages, mail like messages, but every message that is a reply should -automatically contain a reference to the message that it replies to, and every -message should return an ack, so one way communication is a limited case.
- -Which case can be eliminated by handling one way messages as followed up by -mutual acks. For TCP, the minimum set up and tear down of a TCP connection -needs three messages to set up a connection, and three messages to shut it -down, so if we could multiplex encryption setup into communication setup, which -we should, we could bullet proof authentication. To leak the least metadata, -want to establish keys as early in the process as we can while avoiding the -possibility of DNS attacks against the connection setup process. If -communication only ensues when both sides know four shared secrets, then the -communication can only be forged or intercepted by a party that knows both -sides durable secrets.
- -We assume on public key per network address and port – that when she got the port -associated with the public key, she learned what which of his possibly numerous -public keys will be listening on that port.
-
-Carol calculates (c+r)*D
. Dave calculates d*(C+R)
, to arrive at a shared secret, used only once.
-
-That is assuming Carol needs to identify. If this is just a random anonymous client connection, in which Dave responds the same way to every client, all she needs to send is R
and D
. She does not need forward secrecy, since not re-using R
. But chances are Dave does not like this, since chances are he wants to generate a record of the distinct entities applying, so a better solution is for Carol to use as her public key when talking to Dave hash(c, D)*B
-
-Sometimes we are going to onion routing, where the general form of an onion routed message is [R, D,
encrypted] with the encryption key being d*R = r*D
. And inside that encryption may be another message of the same form, another layer of the onion.
- -But this means that Dave cannot contact Carol. But maybe Carol does not want to be contacted, in which case we need a flag to indicate that this is an uncontactable public key. Conversely, if she does want to be contactable, but her network address changes erratically, she may need to include contact info, a stable server that holds her most recent network address.
- -Likely all she needs is that Dave knows that she is the same entity as logged in with Dave before. (Perhaps because all he cares about is that this is the same entity that paid Dave for services to be rendered, whom he expects to keep on paying him for services he continues to render.)
- -The common practical situation, of course is that Carol@Name wants to talk to Dave@AnotherName, and Name knows the address of AnotherName, and AnotherName knows the name of Dave, and Carol wants to talk to Dave as Carol@Name. To make this secure, have to have have a a commitment to the keys of Carol@Name and Dave@AnotherName, so that Dave and Carol can see that Name and AnotherName are showing the same keys to everyone.
- -The situation we would like to have is a web of public documents rendered immutable by being linked into the blockchain, identifying keys, and a method of contacting the entity holding the key, the method of contacting the identity holding the key being widely available public data.
-
-In this case, Carol may well have a public key only for use with Name, and Dave a public key only for use with AnotherName.
-
-If using d*C
as the identifier, inadvisable to use a widely known C
, because this opens attacks via d
, so C
should be a per host public key that Carol keeps secret.
-
-The trouble with all these conversations is that they leak metadata – it is visible to third parties that C
is talking to D
.
- -Suppose it is obvious that you are talking to Dave, because you had to look up Dave’s network address, but you do not want third parties to know that Carol is talking to Dave. Assume that Dave’s public key is the only public key associated with this network address.
- -We would rather not make it too easy for the authorities to see the public key of the entity you are contacting, so would like to have D end to end encrypted in the message. You are probably contacting the target through a rendevous server, so you contact the server on its encrypted key, and it sets up the rendevous talking to the wallet you want to contact on its encrypted key, in which case the messages you send in the clear do not need, and should not have, the public key of the target.
-
-Carol sends R
to Dave in the clear, and encrypts C
and r
*D
, using the shared secret key r
*D
= d
*R
-
-Subsequent communications take place on the shared secret key (c+r)*D = d*(C+R)
-
-
-Suppose there are potentially many public keys associated with this network address, as is likely to be the case if it is a slave wallet performing Pop and Imap like functions. Then it has one primary public key for initiating communications. Call its low value primary keys p and P. Then Carol sends R
in the clear, followed by D
encrypted to r*P = p*R
, followed by C
and r*D
, encrypted to r*D = d*R
.
- -We can do this recursively, and Dave can return another, possibly transient, public key or public key chain that works like a network address. This implies a messaging system whose api is that you send an arbitrary length message with a public key as address, and possibly a public key as authority, and an event identifier, and then eventually you get an event call back, which may indicate merely that the message has gone through, or not, and may contain a reply message. The messaging system handles the path and the cached shared secrets. All shared secrets are in user space, and messages with different source authentication have different connections and different shared secrets.
- -Bitcoin just generates a key at random when you need one, and records it. A paper wallet is apt to generate a stack of them in advance. They offer the option of encrypting the keys with a lesser secret, but the master secret has to be stronger, because it has to be strong enough to resist attacks on the public keys.
- -On reflection, there are situations where this is difficult – we would like the customer, rather than the recipients, to generate the obfuscation keys, and pay the money to a public key that consists of the recipients deeply held secret, and a not very deeply held obfuscation shared secret.
- -This capability is redundant, because we also plan to attach to the transaction a Merkle hash that is the root of tree of Merkle hashes indexed by output, in Merkle-patricia order, each of which may be a random number or may identify arbitrary information, which could include any signed statements as to what the obligations the recipient of the payment has agreed to, plus information identifying the recipient of the payment, in that the signing key of the output is s*B+recipient_public key. But it would be nice to make s derived from the hash of the recipients offer, since this proves that when he spends the output, he spends value connected to obligations.
- -Such receipts are likely generated by a slave wallet, which may well do things differently to a master wallet. So let us just think about the master wallet at this point. We want the minimum capability to do only those things a master wallet can do, while still being future compatible with all the other things we want to do.
- -When we generate a public key and use it, want to generate records of to how it was generated, why, and how it was used. But this is additional tables, and an identifier in the table of public keys saying which additional table to look at.
- -One meaning per public key. If we use the public key of a name for many purposes, use it as a name. We don’t use it as a transaction key except for transactions selling, buying, or assigning that name. But we could have several possible generation methods for a single meaning.
- -And, likely, the generation method depends on the meaning. So, in the table of public keys we have a single small integer telling us the kind of thing the public key is used for, and if you look up that table indicated by the integer, maybe all the keys in that table are generated by one method, or maybe several methods, and we have a single small integer identifying the method, and we look up the key in that table which describes how to generate the key by that method.
- -Initially we have only use. Names. And the table for names can have only one method, hence does not need an index identifying the method.
- -So, a table of public key private key pairs, with a small integer identifying the use and whether the private key has been zeroed out, and a table of names, with the rowid of the public key. Later we introduce more uses, more possible values for the use key, more tables for the uses, and more tables for the generation methods.
- -A wallet is the data of a Zooko identity. When logged onto the interent is is a Zooko triangle identity or Zookos quandrangle identity, but is capable of logging onto the internet as a single short user identity term, or logging on with one peer identity per host. When that wallet is logged in, it is Zooko identity with its master secret and an unlimited pile of secrets and keys generated on demand from a single master secret.
- -We will want to implement the capability to receive value into a key that is not currently available. So how are we going to handle the identity that the wallet will present?
- -Well, keys that have a time limited signature that they can be used for certain purposes in the name of another key are a different usage, that will have other tables, that we can add later.
-
-A receive only wallet, when engaging in a transaction, identifies itself as Zooko identity for which it does not have the master key, merely a signed authorization allowing it to operate for that public key for some limited lifetime, and when receiving value, requests that value be placed in public keys that it does not currently possess, in the form of a scalar and the public key that will receive the value. The public key of the transaction output will be s*B+D
, where s is the obfuscating value that hides the beneficiery from the public blockchain, B is the base, and D
is the beneficiary. B is public, s*B+D
is public, but s and D
are known only to the parties to the transaction, unless they make them public.
- -And that is another usage, and another rule for generating secrets, which we can get around to another day.
- -The implementation can and will vary from one peer to the next. The canonical form is going to be a Merkle-patricia dac, but the Merkle-patricia dac has to be implemented on top of something, and our first implementation is going to implement the Merkle-patricia dac on top of a particular sqlite database with a particular name, and that name stored in a location global to all programs of a particular user on a particular machine, and thus that database global to all programs of that particular user on that particular machine.
- -Then we will implement a global consensus Merkle-patricia dac, so that everyone agrees on the one true mapping between human readable names, but the global consensus dac has to exist on top of particular implementations run by particular users on particular machines, whose implementation may well vary from one machine to the next.
- -A patricia tree contains, in itself, just a pile of bitstreams. To represent actual information, there has to be a schema. It has to be a representation of something equivalent to a pile of records in a pile of database tables. And so, before we represent data, before we create the patricia tree, have to first have a particular working database, a particular implementation, which represents actual data. Particular database first, then we construct universal canonical representation of that data second.
- -In every release version, starting with the very first release, we are going to have to install a database, if one is not already present, in order that the user can communicate with other users, so we will have no automagic creation of the database. I will manually create and install the first database, and will have a dump file for doing so, assuming that the dump file can create blobs.
- -A receive only wallet contains the public key and zooko name of its master wallet, optionally a thirty two byte secret, a numbered signed record by the master wallet authorizing it to use that name and receive value on the master wallet secret and its own public key.
- -A zooko quandrangle identifier consists of a public master key, a globally accepted human readable name globally accepted as identifying that key, a local human readable petname, normally identical to the global name, and an owner selected human readable nickname.
- -A contact consists of a public key, and network address information where you will likely find a person or program that knows the corresponding private key, or is authorized by that private key to deal with communications.
- -So, first thing we have to do is create a wxWidgets program that accesses the database, and have it create Zooko’s triangle identities, with the potential for becoming Zooko’s quandrangle identities.
- --Our protocol, unlike bitcoin, has proof of transaction. The proof is private, but can be made public, to support ebay type reputations.
- -Supposing you are sending value to someone, who has a big important reputation that he would not want damaged. And you want proof he got the value, and that he agreed to goods, services, or payment of some other form of money in return for the value.
- -Well, if he has a big important reputation, chances are you are not transacting with him personally and individually through a computer located behind a locked door in his basement, to which only he has the key, plus he has a shotgun on the wall near where he keeps the key. Rather, you are transacting with him through a computer somwhere in the cloud, in a big computing center to which far too many people have access, including the computer center management, police, the Russian mafia, the Russian spy agency, and the man who mops the floors.
- -So, when "he", which is to say the computer in the cloud, sends you public keys for you to put value into on the blockchain, you want to be sure that only he can control value you put into the blockchain. And you want to be able to prove it, but you do not want anyone else except you and he (and maybe everyone who has access to the data center on the cloud) can prove it except you make the data about your transaction public.
- -You are interacting with a program. And the program probably only has low value keys. It has a key signed by a key signed by a key signed by his high value and closely held key, with start times and timeouts on all of the intermediate keys.
- -So he should have a key that is not located in the datacenter, and you want proof that only the holder of that key can spend the money – maybe an intermediate key with a timeout on it that authorizes it to receive money, which signs an end key that authorizes the program to agree to deals, but not to receive money – the intermediate key authorized to receive money presumably not being in the highly insecure data center.
- -This document is licensed under the CreativeCommons Attribution-Share Alike 3.0 License
- - diff --git a/docs/wallet_implementation.md b/docs/wallet_implementation.md new file mode 100644 index 0000000..149cfc7 --- /dev/null +++ b/docs/wallet_implementation.md @@ -0,0 +1,280 @@ +--- +lang: en +title: Wallet Implementation +--- + +The primary function of the wallet file is to provide a secret key for +a public key, though eventually we stuff all sorts of user odds and ends +into it. + +In Bitcoin, this is simply a pile of random secrets, randomly generated, but obviously it is frequently useful to have them not random, but merely seemingly random to outsiders. One important and valuable application of this is the paper wallet, where one can recreate the wallet from its master secret, because all the seemingly random secret keys are derived from a single master secret. But this does not cover all use cases. + +We care very much about the case where a big important peer in a big central +node of the internet, and the owner skips out with the paper key that owns the +peer’s reputation in his pocket, and sets up the same peer in another +jurisdiction, and everyone else talks to the peer in the new jurisdiction, +and ignores everything the government seized. + +The primary design principle of our wallet is to bypass the DNS and ICANN, so +that people can arrange transactions securely. The big security hole in +Bitcoin is not the one that Monero fixes, but that you generally arrange +transactions on the totally insecure internet. The spies care about the +metadata more than they care about the data. We need to internalize and secure +the metadata. + +We particularly want peers to be controlled by easily hidden and transported +secrets. The key design point around which all else is arranged is that if +some man owns a big peer on the blockchain, and the cops descend upon it, he +grabs a book of his bookshelf that looks like any airport book, except on +pencil in the margins of one page is his master secret, written in pencil, +and a list of all his nicknames, also written in pencil. Then he skips off +to the airport and sets up his peer on a new host, and no one notices that it +has moved except for the cops who grabbed his now useless hardware. (Because +the secrets on that hardware were only valuable because their public keys +were signed by his keys, and when he starts up the new hardware, his keys +will sign some new keys on the peer and on the slave wallet on that hardware.) + +We also want perfect forward secrecy, which is easy. Each new connection +initiation starts with a random transient public key, and any identifying +information is encrypted in the shared secret formed from that transient public +key, and the durable public key of the recipient. + +We also want off the record messaging. So we want to prove to the recipient +that the sender knows the shared secret formed from the recipients public key +and the sender’s transient private key, and also the shared secret formed from +the recipient’s public key, the sender’s transient private key, and the +sender’s durable private key. But the recipient, though he then knows the +message came from someone who has the sender’s durable private key, cannot +prove that to a third party, because he could have constructed that shared +secret, even if the sender did not know it. The recipient could have forged +the message and sent it to himself. + +Thus we get off the record messaging. 
The sender by proving knowledge of two +shared secrets, proves to the recipient knowledge of two secret keys +corresponding to the two public keys provided, but though the sender proves +it to the recipient, the recipient cannot prove it to anyone else. + +The message consists of transient public key in the clear, which encrypts +durable public key. then durable public key plus transient public key encrypts +the rest of the message, which encryption proves knowledge of the secret key +underlying the durable public key, proves it to the recipient, but he cannot +prove it to anyone else. + +The durable public key may be followed by the schema identifier 2, which +implies the nickname follows, which implies the twofitytwo bit hash of the +public key followed by nickname, which is the global identifier of the +pseudonym sending the message. But we could have a different schema, 3, +corresponding to a chain of signatures authenticating that public key, subject +to timeouts and indications of authority, which we will use in slave wallets +and identifiers that correspond to a corporate role in a large organization, +or in iff applications where new keys are frequently issued. + +If we have a chain of signatures, the hash is of the root public key and the +data being signed (names, roles, times, and authorities) but not the signatures +themselves, nor the public keys that are being connected to that data. + +When someone initiates a connection using such a complex identity, he sends +proof of shared knowledge of two shared secrets, and possibly a chain of +signatures, but does not sign anything with the transient public key, nor the +durable public key at the end of the chain of signatures, unless, as with a +review, he wants the message to be shared around, in which case he signs the +portion of the message that is to be shared around with that public key, but +not the part of the message that is private between sender and receiver. + +To identify the globally unique twofitytwo bit id, the recipient hashes the +public key and the identifier that follows. Or he may already know, because +the sender has a client relationship with the recipient, that the public key +is associated with a given nickname and globally unique identifier. + +If we send the schema identifier 0, we are not sending id information, either +because the recipient already has it, or because we don’t care, or because we +are only using this durable public key with this server and do not want a +global identity that can be shared between different recipients. + +So, each master wallet will contain a strong human readable and human writeable +master secret, and the private key of each nickname will be generated by +hashing the nickname with the master secret, so that when, in some new +location, he types in the code on fresh hardware and blank software, he will +get the nickname’s public and private keys unchanged, even if he has to buy or +rent all fresh hardware, and is not carrying so much as a thumbdrive with him. + +People with high value secrets will likely use slave wallets, which perform +functions analogous to a POP or IMAP server, with high value secrets on their +master wallet, which chats to their slave wallet, which does not have high +value secrets. 
It has a signed message from the master wallet authorizing it to receive value in the master wallet’s name, and another separately signed message containing the obfuscation shared secret, which is based on the master wallet secret and an integer, the integer initially being the number of the largest transaction in that name known to the master wallet, an integer that the slave wallet increments with every request for value. The master wallet needs to provide a name for the slave wallet, and to recover payments, needs that name. The payments are made to a public key of the master wallet, multiplied by a pseudo random scalar constructed from the hash of a sequential integer with an obfuscation secret supplied by the master wallet, so that the master wallet can recover the payments without further communication, and so that the person making the payment knows that any payment can only be respent by someone who has the master wallet secret key, which may itself be the key at the end of a chain of signatures identity, which the master wallet possesses, but the slave wallet does not.

+For slave wallets to exist, the globally unique id has to be not a public key, but rather the twofiftytwo bit hash of a rule identifying the public key. The simplest case is hash(2\|public key\|name), which is the identifier used by a master wallet. A slave wallet would use the identifier hash(3\|master public key\|chain of signed ids) with the signatures and public keys in the chain omitted from the hash, so the durable public key and master secret of the slave wallet do not need to be very durable at all. It can, and probably should, have a short timeout and be frequently updated by messages from the master wallet.

+The intent is that a slave wallet can arrange payments on behalf of an identity whose secret it does not possess, and the payer can prove that he made the payment to that identity. So if the government grabs the slave wallet, which is located in a big internet data center, thus eminently grabbable, it grabs nothing, not reputation, not an identity, and not money. The slave wallet has the short lived and regularly changed secret for the public key of an unchanging identity authorized to make offers on behalf of a long lived identity, but not the secret for an identity authorized to receive money on behalf of a long lived identity. The computer readable name of these identities is a twofiftytwo bit hash, and the human readable name is something like receivables@\_@globally_unique_human_readable_name, or sales@\_@globally_unique_human_readable_name. The master secret for \_@globally_unique_human_readable_name is closely held, and the master secret for globally_unique_human_readable_name is written in pencil on a closely held piece of paper and is seldom in any computer at all.

+For a name that is not globally unique, the human readable name is non_unique_human_readable_nickname zero_or_more_whitespace 42_characters_of_slash6_code zero_or_more_whitespace.

+Supposing Carol wants to talk to Dave, and the base point is B. Carol’s secret is the scalar `c`, and her public key is elliptic point `C=c*B`. Similarly Dave’s secret is the scalar `d`, and his public key is elliptic point `D=d*B`.

+His secret scalar `d` is probably derived from the hash of his master secret with his nickname, which we presume is "Dave".

+They could establish a shared secret by Carol calculating `c*D`, and Dave calculating `d*C`.
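A minimal sketch of that naive static key agreement, using libsodium's ristretto255 group purely as an illustrative stand-in (the document does not commit to a particular curve or library):

```cpp
// Sketch only: Carol computes c*D, Dave computes d*C, both land on the same
// point, which would then be hashed into a symmetric key. ristretto255 is an
// assumption for illustration, not the project's chosen group.
#include <sodium.h>
#include <cassert>
#include <cstring>

int main() {
    if (sodium_init() < 0) return 1;

    // Carol: secret scalar c, public point C = c*B
    unsigned char c[crypto_core_ristretto255_SCALARBYTES];
    unsigned char C[crypto_core_ristretto255_BYTES];
    crypto_core_ristretto255_scalar_random(c);
    crypto_scalarmult_ristretto255_base(C, c);

    // Dave: secret scalar d, public point D = d*B
    unsigned char d[crypto_core_ristretto255_SCALARBYTES];
    unsigned char D[crypto_core_ristretto255_BYTES];
    crypto_core_ristretto255_scalar_random(d);
    crypto_scalarmult_ristretto255_base(D, d);

    // Carol computes c*D, Dave computes d*C: the same point, hence a shared secret.
    unsigned char carol_side[crypto_core_ristretto255_BYTES];
    unsigned char dave_side[crypto_core_ristretto255_BYTES];
    crypto_scalarmult_ristretto255(carol_side, c, D);
    crypto_scalarmult_ristretto255(dave_side, d, C);
    assert(std::memcmp(carol_side, dave_side, sizeof carol_side) == 0);
    return 0;
}
```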
But if either party’s key is seized, their past communications become decryptable. Plus this unnecessarily leaks metadata to people watching the interaction.

+Better, Carol generates a random scalar `r`, unpredictable to anyone else, calculates `R=r*B`, and sends Dave `R`, with `C` encrypted using `r*D`, and the rest of the message encrypted using `(r+c)*D`.

+This gives us perfect forward secrecy and off-the-record.

+One flaw in this is that if Dave’s secret leaks, not only can he and the people communicating with him be spied upon, but he can receive falsified messages that appear to come from trusted associates.

+One fix for this is the following protocol. When Carol initiates communication with Dave, she encrypts only with the transient public and private keys, but when Dave replies, he encrypts with another transient key. If Carol can receive his reply, it really is Carol. This does not work with one way messages, mail like messages, but every message that is a reply should automatically contain a reference to the message that it replies to, and every message should return an ack, so one way communication is a limited case.

+Which case can be eliminated by handling one way messages as followed up by mutual acks. For TCP, the minimum set up and tear down of a TCP connection needs three messages to set up a connection, and three messages to shut it down, so if we could multiplex encryption setup into communication setup, which we should, we could bulletproof authentication. To leak the least metadata, we want to establish keys as early in the process as we can while avoiding the possibility of DNS attacks against the connection setup process. If communication only ensues when both sides know four shared secrets, then the communication can only be forged or intercepted by a party that knows both sides’ durable secrets.

+We assume one public key per network address and port – that when she got the port associated with the public key, she learned which of his possibly numerous public keys will be listening on that port.

+Carol calculates `(c+r)*D`. Dave calculates `d*(C+R)`, to arrive at a shared secret, used only once.

+That is assuming Carol needs to identify herself. If this is just a random anonymous client connection, in which Dave responds the same way to every client, all she needs to send is `R` and `D`. She does not need forward secrecy, since she is not re-using `R`. But chances are Dave does not like this, since chances are he wants to generate a record of the distinct entities applying, so a better solution is for Carol to use `hash(c, D)*B` as her public key when talking to Dave.

+Sometimes we are going to be doing onion routing, where the general form of an onion routed message is \[`R`, `D`, encrypted\] with the encryption key being `d*R = r*D`. And inside that encryption may be another message of the same form, another layer of the onion.

+But this means that Dave cannot contact Carol. But maybe Carol does not want to be contacted, in which case we need a flag to indicate that this is an uncontactable public key. Conversely, if she does want to be contactable, but her network address changes erratically, she may need to include contact info, a stable server that holds her most recent network address.

+Likely all she needs is that Dave knows that she is the same entity that logged in with Dave before.
(Perhaps because all he cares about is that this is the same entity that paid Dave for services to be rendered, whom he expects to keep on paying him for services he continues to render.)

+The common practical situation, of course, is that Carol@Name wants to talk to Dave@AnotherName, and Name knows the address of AnotherName, and AnotherName knows the name of Dave, and Carol wants to talk to Dave as Carol@Name. To make this secure, we have to have a commitment to the keys of Carol@Name and Dave@AnotherName, so that Dave and Carol can see that Name and AnotherName are showing the same keys to everyone.

+The situation we would like to have is a web of public documents rendered immutable by being linked into the blockchain, identifying keys, and a method of contacting the entity holding the key, the method of contacting the identity holding the key being widely available public data.

+In this case, Carol may well have a public key only for use with Name, and Dave a public key only for use with AnotherName. If using `d*C` as the identifier, it is inadvisable to use a widely known `C`, because this opens attacks via `d`, so `C` should be a per host public key that Carol keeps secret.

+The trouble with all these conversations is that they leak metadata – it is visible to third parties that `C` is talking to `D`.

+Suppose it is obvious that you are talking to Dave, because you had to look up Dave’s network address, but you do not want third parties to know that *Carol* is talking to Dave. Assume that Dave’s public key is the only public key associated with this network address.

+We would rather not make it too easy for the authorities to see the public key of the entity you are contacting, so we would like to have `D` end to end encrypted in the message. You are probably contacting the target through a rendezvous server, so you contact the server on its encrypted key, and it sets up the rendezvous talking to the wallet you want to contact on its encrypted key, in which case the messages you send in the clear do not need, and should not have, the public key of the target.

+Carol sends `R` to Dave in the clear, and encrypts `C` and `r*D`, using the shared secret key `r*D = d*R`.

+Subsequent communications take place on the shared secret key `(c+r)*D = d*(C+R)`.

+Suppose there are potentially many public keys associated with this network address, as is likely to be the case if it is a slave wallet performing POP and IMAP like functions. Then it has one primary public key for initiating communications. Call its low value primary keys `p` and `P`. Then Carol sends `R` in the clear, followed by `D` encrypted to `r*P = p*R`, followed by `C` and `r*D`, encrypted to `r*D = d*R`.

+We can do this recursively, and Dave can return another, possibly transient, public key or public key chain that works like a network address. This implies a messaging system whose API is that you send an arbitrary length message with a public key as address, and possibly a public key as authority, and an event identifier, and then eventually you get an event callback, which may indicate merely that the message has gone through, or not, and may contain a reply message. The messaging system handles the path and the cached shared secrets. All shared secrets are in user space, and messages with different source authentication have different connections and different shared secrets.

+Bitcoin just generates a key at random when you need one, and records it.
A paper wallet is apt to generate a stack of them in advance. They offer the option of encrypting the keys with a lesser secret, but the master secret has to be stronger, because it has to be strong enough to resist attacks on the public keys.

+On reflection, there are situations where this is difficult – we would like the customer, rather than the recipient, to generate the obfuscation keys, and pay the money to a public key that consists of the recipient’s deeply held secret, and a not very deeply held obfuscation shared secret.

+This capability is redundant, because we also plan to attach to the transaction a Merkle hash that is the root of a tree of Merkle hashes indexed by output, in Merkle-patricia order, each of which may be a random number or may identify arbitrary information, which could include any signed statements as to what obligations the recipient of the payment has agreed to, plus information identifying the recipient of the payment, in that the signing key of the output is s\*B+recipient_public_key. But it would be nice to make s derived from the hash of the recipient’s offer, since this proves that when he spends the output, he spends value connected to obligations.

+Such receipts are likely generated by a slave wallet, which may well do things differently to a master wallet. So let us just think about the master wallet at this point. We want the minimum capability to do only those things a master wallet can do, while still being future compatible with all the other things we want to do.

+When we generate a public key and use it, we want to generate records of how it was generated, why, and how it was used. But this is additional tables, and an identifier in the table of public keys saying which additional table to look at.

+One meaning per public key. If we use the public key of a name for many purposes, use it as a name. We don’t use it as a transaction key except for transactions selling, buying, or assigning that name. But we could have several possible generation methods for a single meaning.

+And, likely, the generation method depends on the meaning. So, in the table of public keys we have a single small integer telling us the kind of thing the public key is used for, and if you look up the table indicated by that integer, maybe all the keys in that table are generated by one method, or maybe several methods, and we have a single small integer identifying the method, and we look up the key in the table which describes how to generate the key by that method.

+Initially we have only one use: names. And the table for names can have only one method, hence does not need an index identifying the method.

+So, a table of public key private key pairs, with a small integer identifying the use and whether the private key has been zeroed out, and a table of names, with the rowid of the public key. Later we introduce more uses, more possible values for the use key, more tables for the uses, and more tables for the generation methods.

+A wallet is the data of a Zooko identity. When logged onto the internet it *is* a Zooko triangle identity or Zooko’s quadrangle identity, but it is capable of logging onto the internet as a single short user identity term, or logging on with one peer identity per host. When that wallet is logged in, it is a Zooko identity with its master secret and an unlimited pile of secrets and keys generated on demand from a single master secret.

+We will want to implement the capability to receive value into a key that is not currently available.
So how are we going to handle the identity that the wallet will present?

+Well, keys that have a time limited signature saying that they can be used for certain purposes in the name of another key are a different usage, which will have other tables, which we can add later.

+A receive only wallet, when engaging in a transaction, identifies itself as a Zooko identity for which it does not have the master key, merely a signed authorization allowing it to operate for that public key for some limited lifetime, and when receiving value, requests that value be placed in public keys that it does not currently possess, in the form of a scalar and the public key that will receive the value. The public key of the transaction output will be s\*B+`D`, where s is the obfuscating value that hides the beneficiary from the public blockchain, B is the base, and `D` is the beneficiary. B is public, s\*B+`D` is public, but s and `D` are known only to the parties to the transaction, unless they make them public.

+And that is another usage, and another rule for generating secrets, which we can get around to another day.

+The implementation can and will vary from one peer to the next. The canonical form is going to be a Merkle-patricia dac, but the Merkle-patricia dac has to be implemented on top of something, and our first implementation is going to implement the Merkle-patricia dac on top of a particular sqlite database with a particular name, and that name stored in a location global to all programs of a particular user on a particular machine, and thus that database global to all programs of that particular user on that particular machine.

+Then we will implement a global consensus Merkle-patricia dac, so that everyone agrees on the one true mapping between human readable names and keys, but the global consensus dac has to exist on top of particular implementations run by particular users on particular machines, whose implementation may well vary from one machine to the next.

+A patricia tree contains, in itself, just a pile of bitstreams. To represent actual information, there has to be a schema. It has to be a representation of something equivalent to a pile of records in a pile of database tables. And so, before we represent data, before we create the patricia tree, we have to first have a particular working database, a particular implementation, which represents actual data. Particular database first, then we construct the universal canonical representation of that data second.

+In every release version, starting with the very first release, we are going to have to install a database, if one is not already present, in order that the user can communicate with other users, so we will have no automagic creation of the database. I will manually create and install the first database, and will have a dump file for doing so, assuming that the dump file can create blobs.

+A receive only wallet contains the public key and Zooko name of its master wallet, optionally a thirty two byte secret, and a numbered signed record by the master wallet authorizing it to use that name and receive value on the master wallet secret and its own public key.

+A Zooko quadrangle identifier consists of a public master key, a human readable name globally accepted as identifying that key, a local human readable petname, normally identical to the global name, and an owner selected human readable nickname.
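Before moving on, the output key construction described above for the receive only wallet, `s*B+D`, can be sketched concretely. As before, libsodium's ristretto255 is only an illustrative stand-in, the derivation of `s` from an obfuscation secret and a sequence number follows the earlier slave wallet description, and none of the names below are fixed by this document:

```cpp
// Sketch: pay to s*B + D, where D = d*B is the beneficiary's durable key and
// s is an obfuscating scalar derived from a shared obfuscation secret and a
// sequence number. Only the holder of d can form the spending scalar (s + d).
#include <sodium.h>
#include <cassert>
#include <cstring>
#include <cstdint>

// s = H(obfuscation_secret || counter), reduced mod the group order.
// (Counter is hashed as raw native-endian bytes; a real wire format would fix the encoding.)
static void derive_s(unsigned char s[crypto_core_ristretto255_SCALARBYTES],
                     const unsigned char secret[32], uint64_t counter) {
    unsigned char wide[64];
    crypto_generichash_state st;
    crypto_generichash_init(&st, nullptr, 0, sizeof wide);
    crypto_generichash_update(&st, secret, 32);
    crypto_generichash_update(&st, reinterpret_cast<const unsigned char*>(&counter), sizeof counter);
    crypto_generichash_final(&st, wide, sizeof wide);
    crypto_core_ristretto255_scalar_reduce(s, wide);
}

int main() {
    if (sodium_init() < 0) return 1;

    // Beneficiary's durable key pair: d, D = d*B.
    unsigned char d[crypto_core_ristretto255_SCALARBYTES], D[crypto_core_ristretto255_BYTES];
    crypto_core_ristretto255_scalar_random(d);
    crypto_scalarmult_ristretto255_base(D, d);

    // Payer side: knows D, the shared obfuscation secret, and the counter.
    unsigned char obfuscation_secret[32];
    randombytes_buf(obfuscation_secret, sizeof obfuscation_secret);
    unsigned char s[crypto_core_ristretto255_SCALARBYTES];
    derive_s(s, obfuscation_secret, 42);

    unsigned char sB[crypto_core_ristretto255_BYTES], output_key[crypto_core_ristretto255_BYTES];
    crypto_scalarmult_ristretto255_base(sB, s);       // s*B
    crypto_core_ristretto255_add(output_key, sB, D);  // s*B + D, what the blockchain sees

    // Master wallet recovery: the spending scalar is (s + d), since (s + d)*B = s*B + D.
    unsigned char spend[crypto_core_ristretto255_SCALARBYTES], check[crypto_core_ristretto255_BYTES];
    crypto_core_ristretto255_scalar_add(spend, s, d);
    crypto_scalarmult_ristretto255_base(check, spend);
    assert(std::memcmp(check, output_key, sizeof check) == 0);
    return 0;
}
```

The point of the construction is that the spending scalar `s + d` can only be formed where both halves are known, so the blockchain sees only a key that does not reveal the beneficiary.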
+
+A contact consists of a public key, and network address information where you will likely find a person or program that knows the corresponding private key, or is authorized by that private key to deal with communications.
+
+So, the first thing we have to do is create a wxWidgets program that accesses the database, and have it create Zooko’s triangle identities, with the potential for becoming Zooko’s quadrangle identities.
+
+# Proving knowledge of the secret key of a public key
+
+Our protocol, unlike bitcoin, has proof of transaction. The proof is private, but can be made public, to support ebay type reputations.
+
+Supposing you are sending value to someone who has a big important reputation that he would not want damaged. And you want proof he got the value, and that he agreed to goods, services, or payment of some other form of money in return for the value.
+
+Well, if he has a big important reputation, chances are you are not transacting with him personally and individually through a computer located behind a locked door in his basement, to which only he has the key, plus he has a shotgun on the wall near where he keeps the key. Rather, you are transacting with him through a computer somewhere in the cloud, in a big computing center to which far too many people have access, including the computer center management, police, the Russian mafia, the Russian spy agency, and the man who mops the floors.
+
+So, when "he", which is to say the computer in the cloud, sends you public keys for you to put value into on the blockchain, you want to be sure that only he can control the value you put into the blockchain. And you want to be able to prove it, but you do not want anyone other than you and him (and maybe everyone who has access to the data center in the cloud) to be able to prove it, unless you make the data about your transaction public.
+
+You are interacting with a program. And the program probably only has low value keys. It has a key signed by a key signed by a key signed by his high value and closely held key, with start times and timeouts on all of the intermediate keys.
+
+So he should have a key that is not located in the datacenter, and you want proof that only the holder of that key can spend the money – maybe an intermediate key with a timeout on it that authorizes it to receive money, which signs an end key that authorizes the program to agree to deals, but not to receive money – the intermediate key authorized to receive money presumably not being in the highly insecure data center.
+
+This document is licensed under the [CreativeCommons Attribution-Share Alike 3.0 License](http://creativecommons.org/licenses/by-sa/3.0/){rel="license"}
diff --git a/docs/writing_and_editing_documentation.md b/docs/writing_and_editing_documentation.md
index 7fe5f6e..5aa7bf2 100644
--- a/docs/writing_and_editing_documentation.md
+++ b/docs/writing_and_editing_documentation.md
@@ -94,6 +94,24 @@ Since markdown has no concept of a title, Pandoc expects to find the title in
a yaml inline, which is most conveniently put at the top, which renders it
somewhat legible as a title.

+Thus the markdown version of this document starts with:
+
+```markdown
+---
+title: >-
+  Writing and Editing Documentation
+# katex
+...
+```
+
+## Converting html source to markdown source
+
+In bash
+
+```bash
+fn=foobar
+git mv $fn.html $fn.md && cp $fn.md $fn.html && pandoc -s --to markdown-smart --eol=lf --wrap=preserve --verbose -o $fn.md $fn.html
+```

## Math expressions and katex

@@ -154,15 +172,19 @@ For it offends me to put unnecessary fat in html files.

### overly clever katex tricks

-$$k \approx \frac{m\,l\!n(2)}{n}%uses\, to increase spacing, uses \! to merge letters, uses % for comments $$
-$$k \approx\frac{m\>\ln(2)}{n}%uses\> for a marginally larger increase in spacing and uses \ln, the escape for the well known function ln $$
+spacing control
+: $$k \approx \frac{m\,l\!n(2)}{n}%uses\, to increase spacing, uses \! to merge letters, uses % for comments $$
+ $$k \approx\frac{m\>\ln(2)}{n}%uses\> for a marginally larger increase in spacing and uses \ln, the escape for the well known function ln $$

-$$ \exp\bigg(\frac{a+bt}{x}\bigg)=\huge e^{\bigg(\frac{a+bt}{x}\bigg)}%use the escape for well known functions, use text size sets$$
+size control
+: $$ \exp\bigg(\frac{a+bt}{x}\bigg)=\huge e^{\bigg(\frac{a+bt}{x}\bigg)}%use the escape for well known functions, use text size sets$$

-$$k\text{, the number of hashes} \approx \frac{m\ln(2)}{n}% \text{} for render as text$$
+text within maths
+: $$k\text{, the number of hashes,} \approx \frac{m\ln(2)}{n}% \text{} for render as text$$

-$$\def\mydef#1{\frac{#1}{1+#1}} \mydef{\mydef{\mydef{\mydef{y}}}}%katex macro $$
+katex macro used recursively
+: $$\def\mydef#1{\frac{#1}{1+#1}} \mydef{\mydef{\mydef{\mydef{y}}}}%katex macro $$

## Tables

@@ -385,10 +407,9 @@ You change a control point, the effect is entirely local, does not
propagate up and down the line.

If, however, you have a long move and a short move, your implied control
-point is likely to be in a pathological location, in which case you have to
-follow an S curve by a C curve, and manually calculate the first point of
-the C to be in line with the last two points of the prior curve.
-
+point is likely to be in a pathological location, so the last control point
+of the long curve needs to be close to the starting point of the following
+short move.
```
default M point q point point t point t point ... t point
```
@@ -434,7 +455,8 @@ choice of the initial control point and the position of the t points, but you ca
down the line, and changing any of the intermediate t points will change
the direction the curve takes through all subsequent t points,
sometimes pushing the curve into pathological territory where bezier
-curves give unexpected and nasty results.
+curves give unexpected and nasty results. This works OK if all your t
+curves are of approximately similar length.

Scalable vector graphics are dimensionless, and the `