diff --git a/docs/blockdag_consensus.md b/docs/blockdag_consensus.md index a4fa579..c19c443 100644 --- a/docs/blockdag_consensus.md +++ b/docs/blockdag_consensus.md @@ -732,7 +732,7 @@ read and write access costs a thousand times what tardigrade costs, will cost about twelve dollars. So, we should consider the blockdag as an immutable store of arbitrary -typed data, a reliable broadcast channel, where some types are executable, +typed data, a total order broadcast channel, where some types are executable, and, when executed, cause a change in mutable total state, typically that a new unspent coin record is added, and an old unspent coin record is deleted. diff --git a/docs/immutable_append_only_data_structure.md b/docs/immutable_append_only_data_structure.md index 49494c0..57bbbad 100644 --- a/docs/immutable_append_only_data_structure.md +++ b/docs/immutable_append_only_data_structure.md @@ -11,9 +11,9 @@ We want wallet chains hanging off the primary chain, and child chains. A wallet Each wallet will be a merkle chain,that links to the consensus chain and is linked by the consensus chain, and it consists of a linear chain of transactions each, with a sequence number, each of which commits a non dust amount to a transaction output owned by the wallet - which only owns one such output, because all outputs by other people have to be committed into the destination wallet in a reasonable time. -A wallet chain has a single authority, but we could also have chains hanging off it that represent the consensus of smaller groups, for example the shareholders of a company with their shares in a merkle child chain, and their own reliable broadcast channel. +A wallet chain has a single authority, but we could also have chains hanging off it that represent the consensus of smaller groups, for example the shareholders of a company with their shares in a merkle child chain, and their own total order broadcast channel. -The reliable broadcast channel has to keep the complete set of hashes of the most recent transaction outputs of a wallet that commits more than a dust amount of money to the continuing wallet state and keep the path to those states around forever, but with the merkle chain structure envisaged, the path to those states will only grow logarithmically. +The total order broadcast channel has to keep the complete set of hashes of the most recent transaction outputs of a wallet that commits more than a dust amount of money to the continuing wallet state and keep the path to those states around forever, but with the merkle chain structure envisaged, the path to those states will only grow logarithmically. I discarded the data on infix and postfix positioning of parents, and the tree depicted does not allow us to reach an arbitrary item from a leaf node, but the final additional merkle vertex produced along with each leaf node does reach every item. At the cost of including a third hash for backtracking in some of the merkle vertexes. @@ -207,4 +207,3 @@ Every time a fresh tiny file is created, a background process is started to chec So that files sort in the correct order, we name them in base 36, with a suffix indicating whether they are the fixed length index records, or the variable sized records that the fixed length records point into. 
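A minimal sketch of that naming scheme, assuming sequential numbering and a fixed padding width; the helper name and the `.idx`/`.rec` suffixes are illustrative, not anything the text specifies:

```C++
#include <algorithm>
#include <cstdint>
#include <string>

// Hypothetical helper: encode the file's sequence number in fixed width base 36
// so that plain lexicographic name order matches numeric order, with a suffix
// distinguishing the fixed length index files from the variable sized record files.
std::string base36_name(std::uint64_t n, bool fixed_length_index, std::size_t width = 8) {
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    std::string s;
    do {
        s.push_back(digits[n % 36]);
        n /= 36;
    } while (n != 0);
    while (s.size() < width) s.push_back('0');  // pad so shorter numbers sort first
    std::reverse(s.begin(), s.end());
    return s + (fixed_length_index ? ".idx" : ".rec");
}
// base36_name(1295, true) == "000000zz.idx", which sorts before "00000100.idx" (1296)
```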
Although we should make the files friendly for conventional backup for the sake of cold storage, we cannot rely on conventional backup mechanisms, because we always have to have the very latest state securely backed up before committing it into the blockchain. Which is best done by having a set of remote Sqlite files.
-
diff --git a/docs/libraries.md b/docs/libraries.md
index 31d71f6..df34a51 100644
--- a/docs/libraries.md
+++ b/docs/libraries.md
@@ -1335,6 +1335,11 @@ services.
Writing networking for a service with large numbers of clients is very different between Windows and Linux, and I expect Tokio to take care of the differences.
+Unfortunately Rust async is beyond human comprehension, because
+the borrow checker does not really know or understand what the runtime
+is up to, and the task of explaining the runtime to the borrow checker
+is dropped on the shoulders of the long suffering programmer.
+
There really is not any good C or C++ environment for writing services except Wt, which is completely specialized for the case of writing a web service whose client is the browser, and which runs only on Linux.
diff --git a/docs/libraries/cpp_multithreading.md b/docs/libraries/cpp_multithreading.md
index 7817c62..a446771 100644
--- a/docs/libraries/cpp_multithreading.md
+++ b/docs/libraries/cpp_multithreading.md
@@ -106,7 +106,162 @@ resembling that is going to be needed when we have to shard.
Dumb idea. We already have the node.js solution in a Rust library.
-Actix and Tokio are the (somewhat Cish) solutions.
+Actix and Tokio are the (somewhat Cish) solutions. But Rust async is infamously
+hard. The borrow checker goes mad trying to figure out lifetimes in async code.
+
+
+## callbacks
+
+In C, a callback is implemented as an ordinary function pointer, and a pointer to void,
+which is then cast to a data structure of the desired type.
+
+What the heavy C++ machinery of `std::function` does is bind the two together and then
+do memory management after the fashion of `std::string`.
+
+(but we probably need to do our own memory management, so we need to write
+our own equivalent of `std::function` supporting a C rather than a C++ API)
+
+[compiler explorer]:https://godbolt.org/ {target="_blank"}
+
+And `std::function`, used correctly, should compile to the identical code
+merely wrapping the function pointer and the void pointer in a single struct
+-- but you had better use [compiler explorer] to make sure
+that you are using it correctly.
+
+Write a callback in C, and an `std::function` in C++, and make sure that
+the compiler generates what it should.
+
+Ownership is going to be complicated -- since after creating and passing a callback, we
+probably do not want ownership any more -- the thread is going to return
+and be applied to some entirely different task. So the call that is passed the callback
+as an argument by reference uses `move` to ensure that when the `std::function`
+stack value in its caller pointing to the heap gets destroyed, it does not
+free the value on the heap, and then stashes the moved `std::function` in some
+safe place.
+
+Another issue is that Rust, Python, and all the rest cannot talk to C++, they can only
+talk C. On the other hand, the compiler will probably optimise the `std::function` that
+consists of a lambda that is a call through a function pointer and that captures a pointer to void.
+
+Again, since the compiler is designed around arcane optimization issues, you have to see what happens
+in [compiler explorer].
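For concreteness, a sketch of the two shapes being compared (all names here are illustrative, not an existing API); this is the sort of pair worth putting side by side in [compiler explorer]:

```C++
#include <functional>
#include <utility>

// C style: a bare function pointer plus a context pointer that the callee
// casts back to the type it knows it was given.
typedef void (*c_callback)(void* context, int event);

struct c_handler {
    c_callback fn;
    void*      context;
};

// C++ style: std::function bundles the code and the captured state together
// and does the memory management, after the fashion of std::string.
struct cpp_handler {
    std::function<void(int)> fn;
};

// The caller is handing the callback over for keeps, so the receiving code
// takes it by rvalue reference and stashes it with std::move rather than
// copying whatever heap allocation the std::function may own.
void register_handler(cpp_handler& slot, std::function<void(int)>&& cb) {
    slot.fn = std::move(cb);
}
```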
+
+But rather than relying on the compiler correctly guessing intent, make the
+callback a C union type implementing `std::variant` in C, being a union of `std::monostate`,
+a C callback taking no arguments, a C++ callback taking no arguments, C and C++ callbacks
+taking a void pointer argument, a C++ callback that is a pointer to member function, and
+a C++ callback that is an `std::function`.
+
+In the old Win32 APIs, which were designed for C, and then retrofitted for C++,
+they would have a static member function that took an LPARAM, which was a pointer
+to void pointing at the actual object, and then the static member function
+would directly call the appropriate, usually virtual, actual member function.
+
+Member function pointers have syntax that no one can wrap their brains around,
+so people wrap them in layers of typedefs.
+
+Sometimes you want to have indefinitely many data structures, which are dynamically allocated
+and then discarded.
+
+Sometimes you want to have a single data structure that gets overwritten frequently. The latter is
+preferable when it suffices, since it means that async callback code is more like sync code.
+
+In one case, you would allocate the object every time, and when done with it, discard it.
+
+In the other case it would be a member variable of a struct that hangs around and is continually
+re-used.
+
+### C compatibility
+
+Bind the two together in a way that C can understand:
+
+The code that calls the callback knows nothing about how the blob is structured.
+The event management code knows nothing about how the blob is structured.
+But the function pointer in the blob *does* know how the blob is structured.
+
+```C++
+// pp points to or into a blob of data containing a pointer to the callback,
+// and the data that the callback needs is in a position relative to the pointer
+// that is known to the callback function.
+enum event_type { monovalue, reply, timeout, unreachable, unexpected_error };
+
+typedef void* RegardsTo;  // opaque handle to the longer lived RegardsTo object
+
+struct ReplyTo;
+// C calling convention, so that C, Rust, Lua, and the rest can supply and call it.
+extern "C" typedef void (*ReplyTo_)(ReplyTo* pp, RegardsTo regards_to, event_type evtype, void* event);
+struct ReplyTo
+{
+    ReplyTo_ p;
+};
+
+ReplyTo* pp;        // the once-only handler blob registered for this event
+RegardsTo py;       // the object the event is in regard to
+event_type evtype;  // what kind of event arrived
+void* event;        // the event payload
+
+// Within the actual function in the event handling code,
+// one has to cast `ReplyTo* pp` from its base type to its actual type
+// that has the rest of the data, which the event despatcher code knows nothing of.
+// The event despatch code should not include the headers of the event handling code,
+// as this would make possible a breach of separation of responsibilities.
+try{
+(*((*pp).p))(pp, py, evtype, event );
+}
+catch(...){
+    // log error and release event handler.
+    // if an exception propagates from the event handling code into the event despatch code
+    // it is a programming error, a violation of separation of responsibilities,
+    // and the event despatch code cannot do anything with the error.
+}
+```
+
+`pp` points into a blob that contains the data needed for handling the event when it happens,
+and a pointer to the code that will handle it when it happens; `event` is a pointer to a
+struct containing the event.
+
+But, that C code will be quite happy if given a class whose first field is a pointer to a C calling
+convention static member function that calls the next field, the
+next field being a lambda whose unnameable type is known to the templated object when
+it is defined, or if it is given a class whose first field is a pointer to a C calling convention
+static function that does any esoteric C++, or Rust, or Lua thing, as sketched below.
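A sketch of the kind of class that last paragraph describes, reusing the hypothetical `ReplyTo`, `ReplyTo_`, `RegardsTo`, and `event_type` declarations from the fragment above: the first (and only base) field is the function pointer the C dispatch code calls, and the static trampoline it points at forwards to a stored lambda whose type only the template knows. (Strictly speaking a static member function does not have C language linkage, but on mainstream ABIs the calling convention matches; a fully conforming variant would use a free `extern "C"` function instead.)

```C++
#include <utility>

// Hypothetical C++ wrapper over the C blob layout sketched above.
template <typename F>
struct reply_to_lambda : ReplyTo {
    F handler;   // the lambda, stored right after the function pointer

    explicit reply_to_lambda(F f)
        : ReplyTo{&trampoline}, handler(std::move(f)) {}

    // The dispatcher only knows about ReplyTo; this trampoline knows the
    // real layout and downcasts to reach the lambda.
    static void trampoline(ReplyTo* pp, RegardsTo regards_to,
                           event_type evtype, void* event) {
        auto* self = static_cast<reply_to_lambda*>(pp);
        self->handler(regards_to, evtype, event);
    }
};

// usage sketch: the runtime invokes and then frees it exactly once,
// per the rules described below.
// auto* cb = new reply_to_lambda{
//     [](RegardsTo, event_type, void*) { /* handle the event */ }};
```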
+
+The runtime despatches an event to an object of type `ReplyTo` once and only
+once and then it is freed. Thus if, for example, the object is waiting for a packet that has a handle
+to it, or a timeout, and two such packets arrive, it is only called with the
+first such packet; the next packet is silently discarded, and the timeout
+event cancelled or ignored.
+
+The object of type RegardsTo has a longer lifetime, which the runtime
+does not manage. The runtime ensures that if two events reference
+the same RegardsTo object, they are handled serially, except for
+the case that the RegardsTo object is a null pointer.
+
+The next event referencing the same RegardsTo object goes into a queue
+waiting on completion of the previous message.
+
+If an event has a RegardsTo, but no ReplyTo, it goes to the static default ReplyTo handler of the RegardsTo object, which gets called many times
+and whose lifetime is not managed by the runtime.
+
+If a message should result in changes to many RegardsTo objects,
+one of them has to handle it, and then send messages to the others.
+
+Code called by the async runtime must refrain from updating or
+reading data that could be changed by other code called by the
+async runtime unless the data is atomically changed from one
+valid state to another valid state. (For example a pointer pointing
+to the previous valid state is atomically updated to a pointer to a newly
+created valid state.)
+
+In the normal case the ReplyTo callback sticks to data that
+is in its ReplyTo and RegardsTo objects.
+
+When a RegardsTo object tells the runtime it has finalized,
+the runtime will no longer do callbacks referencing it.
+
+The finalization is itself an event, and results in a callback whose
+event data is a pointer to the finalized object and whose
+ReplyTo and RegardsTo objects are the objects that created the
+now finalized object.
+
+Thus one RegardsTo object can spawn many such objects, and
+can read their contents when they finalise. (But not until they
+finalise.)

## Use Go
diff --git a/docs/libraries/cpp_automatic_memory_management.md b/docs/libraries/unobvious_cpp.md
similarity index 77%
rename from docs/libraries/cpp_automatic_memory_management.md
rename to docs/libraries/unobvious_cpp.md
index 3aad8e2..d38158e 100644
--- a/docs/libraries/cpp_automatic_memory_management.md
+++ b/docs/libraries/unobvious_cpp.md
@@ -1,6 +1,10 @@
---
-title:
- C++ Automatic Memory Management
+title: >-
+ Unobvious C++
+sidebar: true
+notmine: false
+abstract: >-
+ A collection of notes about some of the more esoteric and unobvious aspects of C++
...

# Memory Safety
@@ -29,11 +33,14 @@
std::make_unique, std::make_shared create pointers to memory managed objects.
(But single objects, not an array, use spans for pointer arithmetic)
+```C++
auto sp = std::make_shared<int>(42);
std::weak_ptr<int> wp{sp};
+```

# Array sizing and allocation
+```C++
/* This code creates a bunch of "brown dog" strings on the heap to test automatic memory management. */
char ca[]{ "red dog" }; //Automatic array sizing
std::array arr{"red dog"}; //Requires #include <array>
@@ -79,10 +86,11 @@ arithmetic)
MyClass()=default; // Otherwise unlikely to be POD
MyClass& operator=(const MyClass&) = default; // default assignment. Not actually needed, but just a reminder.
};
+```
- ### alignment
+### alignment
- ```c++
+ ```C++
// every object of type struct_float will be aligned to alignof(float) boundary
// (usually 4)
struct alignas(float) struct_float {
@@ -119,25 +127,46 @@ deleted.
Copy constructors

+```C++
A(const A& a)
+```

Copy assignment

+```C++
A& operator=(const A other)
+```

Move constructors

+```C++
class_name ( class_name && other)
A(A&& o)
D(D&&) = default;
+```

Move assignment operator

+```C++
V& operator=(V&& other)
+```

Move constructors

+```C++
class_name ( class_name && )
+```
+
+## delegating constructor
+
+```C++
+class Foo
+{
+public:
+ Foo(char x, int y) {}
+ Foo(int y) : Foo('a', y) {} // Foo(int) delegates to Foo(char, int)
+};
+```

## rvalue references

@@ -181,6 +210,22 @@ where `std::forward` is defined as follows:
in themselves, rather they cause the code referencing `t` to use the
intended copy and intended assignment.
+
+## delegating constructors
+
+calling one constructor from another.
+
+```C++
+example::example(... arguments ...):
+ example(...different arguments ...)
+ {
+ ...
+ code
+ ...
+ };
+```
+
+

## constructors and destructors

If you declare the destructor deleted that prevents the compiler from
@@ -219,12 +264,16 @@ in the source class instead of the destination class, hence most useful
when you are converting to a generic C type, or to the type of an
external library that you do not want to change.

- struct X {
- int y;
- operator int(){ return y; }
- operator const int&(){ return y; } /* C habits would lead you to incorrectly expect "return &y;", which is what is implied under the hood. */
- operator int*(){ return &y; } // Hood is opened.
- };
+```C++
+struct X {
+ int y;
+ operator int(){ return y; }
+ operator const int&(){ return y; } /* C habits would lead you to
+ incorrectly expect "return &y;", which is what is
+ implied under the hood. */
+ operator int*(){ return &y; } // Hood is opened.
+};
+```

Mpir, the Visual Studio skew of GMP infinite precision library, has
some useful and ingenious template code for converting C type functions of
@@ -255,6 +304,35 @@ stack instead of 128, hardly a cost worth worrying about. And in the
common bad case, (a+b)\*(c+d) clever coding would only save one stack
allocation and redundant copy.
+
+# Introspection and sfinae
+
+Almost all weird and horrific sfinae code [has been rendered unnecessary by concepts](https://www.cppstories.com/2016/02/notes-on-c-sfinae/).
+
+```c++
+template<typename T> concept HasToString
+= requires(T v) {
+ {v.toString()} -> std::convertible_to<std::string>;
+};
+```
+
+The `requires` clause is doing the sfinae behind your back to deliver a boolean.
+
+The concept name should be chosen to carry the meaning in an error message.
+
+And any time concepts cannot replace sfinae, sfinae can be done much better by `std::void_t`, which is syntactic sugar for "trigger substitution failure with the least possible distracting syntax"
+
+```c++
+// primary template:
+template< class , class = void >
+struct has_toString : std::false_type { };
+
+// specialized as has_toString< T , void > or discarded by sfinae
+template< class T>
+struct has_toString< T , std::void_t<decltype(std::declval<T>().toString())> > :
+std::is_same< std::string, decltype(std::declval<T>().toString()) >
+{ };
+```
+
# Template specialization

namespace N {
@@ -272,15 +350,6 @@ implementation will not be nearly so clever.
extern template int fun(int); /*prevents redundant instantiation of fun in this compilation unit – and thus renders the code for fun unnecessary in this compilation unit.*/
-# Template traits, introspection
-
-Template traits: C++ has no syntactic sugar to ensure that your template
-is only called using the classes you intend it to be called with.
-
-Often you want different templates for classes that implement similar functionality in different ways.
- -This is the entire very large topic of template time, compile time code, which is a whole new ball of wax that needs to be dealt with elsewhere - # Abstract and virtual An abstract base class is a base class that contains a pure virtual @@ -487,115 +556,6 @@ similarly placement `new`, and `unique_ptr`. Trouble is that an std::function object is a fixed sized object, like an `std::string`, typically sixteen bytes. which like an `std::string` points to a dynamically allocated object on the heap. -## callbacks - -In C, a callback is implemented as an ordinary function pointer, and a pointer to void, -which is then cast to a data structure of the desired type. - -What the heavy C++ machinery of `std::function` does is bind the two together and then -do memory management after the fashion of `std::string`. - -[compiler explorer]:https://godbolt.org/ {target="_blank"} - -And `std::function`, used correctly, should compile to the identical code -merely wrapping the function pointer and the void pointer in a single struct --- but you had better use [compiler explorer] to make sure -that you are using it correctly. - -Write a callback in C, an an std::function in c++, and make sure that -the compiler generates what it should. - -Ownership is going to be complicated -- since after createing and passing a callback, we -probably do not want ownership any more -- the thread is going to return -and be applied to some entirely different task. So the call that is passed the callback -as an argument by reference uses `move` to ensure that when the `std::function` -stack value in its caller pointing to the heap gets destroyed, it does not -free the value on the heap, and then stashes the moved `std::function` in some -safe place. - -Another issue is that rust, python, and all the rest cannot talk to C++, they can only -talk C. On the other hand, the compiler will probably optimise the `std::function` that -consists of a lamda that is a call to function pointer and that captures a pointer to void. - -Again, since compiler is designed for arcane optimization issues, have to see what happens -in [compiler explorer]. - -But rather than guessing about the compiler correctly guessing intent, make the -callback a C union type implementing std variant in C, being a union of `std:monostate` -a C callback taking no arguments, a C++ callback taking no arguments, C and C++ callbacks -taking a void pointer argument, a c++ callback that is a pointer to method, and -a C++ callback that is an `std::function` - -In the old win32 apis, which were designed for C, and then retrofitted for C++ -they would have a static member function that took an LPARAM, which was a pointer -to void pointing at the actual object, and then the static member function -would directly call the appropriate, usually virtual, actual member function. - -Member function pointers have syntax that no one can wrap their brains around -so people wrap them in layers of typedefs. - -Sometimes you want to have indefinitely many data structures, which are dynamically allocated -and then discarded. - -Sometimes you want to have a single data structure that gets overwritten frequently. The latter is -preferable when it suffices, since it means that asynch callback code is more like sync code. - -In one case, you would allocate the object every time, and when does with it, discard it. - -In the other case it would be a member variable of struct that hangs around and is continually -re-used. 
- -### C compatibility - -Bind the two together in a way that C can understand: - -The code that calls the callback knows nothing about how the blob is structured. -The event management code knows nothing about how the blob is structured. -But the function pointer in the blob *does* know how the blob is structured. - -```C -// p points to or into a blob of data containing a pointer to the callback -// and the data that the callback needs is in a position relative to the pointer -// that is known to the callback function. -enum event_type { monovalue, reply, timeout, unreachable, unexpected_error }; - - struct callback; - typedef extern "C" void (*callback_)(callback * pp, event_type evtype, void* event); - struct callback - { - callback_ p; - }; - - callback * pp; - -// Within the actual function in the event handling code, -// one has to cast `callback* pp` from its base type to its actual type -// that has the rest of the data, which the event despatcher code knows nothing of. -// The event despatch code should not include the headers of the event handling code, -// as this would make possible breach of separation of responsibilities. -try{ -(*((*pp).p))(pp, evtype, event ); -} -catch(...){ - // log error and release event handler. - // if an exception propagates from the event handling code into the event despatch code - // it is programming error, a violation of separation of responsibilities - // and the event despatch code cannot do anything with the error. -} -``` - -`pp` points into a blob that containst the data needed for handling the event when will happen, -and a pointer to the code that will handle it when it happens, ptrEvent is a ptr to a -struct containing an index that will tell us what kind of struct it is. It is a C union -of very different structs. Since it is dynamicaly allocated, we don't waste space, we create -the particular struct, cast it to the union, and then cast the union to the struct it actually is. - -But, that C code will be quite happy if given a class whose first field is a pointer to a C calling -convention static member function that calls the next field, the -next field being a lambda whose unnameable type is known to the templated object when -it was defined, or if it is given a class whose first field is a pointer to a C calling convention -static function that does any esoteric C++, or Rust, or Lua thing. - # auto and decltype(variable) In good c++, a tremendous amount of code behavior is specified by type diff --git a/docs/lightning_layer.md b/docs/lightning_layer.md index 1bda8de..e57e007 100644 --- a/docs/lightning_layer.md +++ b/docs/lightning_layer.md @@ -28,7 +28,7 @@ just actions that are cryptographically possible or impossible. But scriptless scripts cannot in themselves solve the hard problem, that all participants in a multilateral transaction need to know _in a short time_ that the whole multilateral transaction has definitely succeeded or definitely -failed. This inherently requires a reliable broadcast channel, though if +failed. This inherently requires a total order broadcast channel, though if everyone is cooperating, they don’t have to actually put anything on that channel. But they need the capability to resort to that channel if something funny happens, and that capability has to be used or lost within a time limit. @@ -48,18 +48,18 @@ broadcast channel if the information is available, yet certain time limits are nonetheless exceeded. 
My conclusion was that full circle unbreakability of lightning network -transactions within time limits needs a reliable broadcast, and I envisaged - a hierarchy of reliable broadcasters, (sidechains, with some sidechains +transactions within time limits needs a total order broadcast, and I envisaged + a hierarchy of total order broadcasters, (sidechains, with some sidechains representing a group of bilateral lightning network gateways that act as one multilateral lightning network gateway) But this conclusion may be wrong or overly simple – though we are still going to need sidechains and - hierarchical reliable broadcasting, because it can do no end of things that + hierarchical total order broadcasting, because it can do no end of things that are very difficult otherwise. - But reliable broadcast mechanism both supplies and requires a solution to + But total order broadcast mechanism both supplies and requires a solution to distributed Byzantine fault tolerant consensus, so the problem of getting a lock up and bringing it down is a general distributed Byzantine fault - tolerant consensus problem, and perhaps viewing it as a reliable broadcast + tolerant consensus problem, and perhaps viewing it as a total order broadcast problem is a misperception and misanalysis. Rather, the blockdag requires a mechanism to establish a total order of @@ -329,10 +329,10 @@ from Carol. Full circle transaction. We need to guarantee that either the full circle goes through, or none of the separate unilateral transactions in the circle go through. -## Reliable broadcast channel +## total order broadcast channel The solution to atomicity and maintaining consistency between different -entities on the lightning network is the reliable broadcast channel. +entities on the lightning network is the total order broadcast channel. Such as the blockchain itself. Create a special zero value transaction that has no outputs and carries its own signature, but can be a required input to @@ -349,49 +349,49 @@ time limit, the transaction succeeds, and each party could potentially play the transaction, and thus effectively owns the corresponding part of the gateway coin, regardless of whether they play it or not. -A reliable broadcast channel is something that somehow works like a +A total order broadcast channel is something that somehow works like a classified ad did back in the days of ink on paper newspapers. The physical process of producing the newspaper guaranteed that every single copy had the exact same classified ad in it, and that ad must have been made public on a certain date. Easy to do this with a printing press that puts ink on paper. Very hard to do this, with electronic point to point communications. -But let us assume we somehow have a reliable broadcast channel: +But let us assume we somehow have a total order broadcast channel: All the parties agree on a Merkle tree, which binds them if the joint -signature to that Merkle tree appears on the reliable broadcast channel +signature to that Merkle tree appears on the total order broadcast channel within a certain short time period. And, if some of them have the joint signature, then knowing that they could -upload it to the reliable broadcast channel, they each agree to superseding +upload it to the total order broadcast channel, they each agree to superseding unilateral transactions. 
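As a sketch of that binding rule (the types, field names, and the `verify_joint` placeholder are illustrative; the document does not fix a concrete format): the parties are bound exactly when a valid joint signature over the agreed Merkle root appears on the channel at or before the agreed block height.

```C++
#include <array>
#include <cstdint>
#include <optional>

using hash256   = std::array<std::uint8_t, 32>;
using signature = std::array<std::uint8_t, 64>;

// What the parties agreed to off-channel: a Merkle root committing to the whole
// multilateral transaction, and a deadline expressed as a channel block height.
struct circle_agreement {
    hash256       merkle_root;
    std::uint64_t deadline_height;
};

// What, if anything, the total order broadcast channel recorded, and at what height.
struct channel_record {
    signature     joint_signature;
    std::uint64_t recorded_height;
};

// verify_joint is a stand-in for whatever joint signature scheme is in use.
bool binds(const circle_agreement& a,
           const std::optional<channel_record>& rec,
           bool (*verify_joint)(const hash256&, const signature&)) {
    return rec.has_value()
        && rec->recorded_height <= a.deadline_height
        && verify_joint(a.merkle_root, rec->joint_signature);
}
```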
If Bob expects payment from Ann and expects to make payment to Carol, and he has the joint signature, and knows Carol has a copy of the authenticated joint signature, because Carol sent him the signature and he sent Ann the signature, of it, then he knows Carol can *make* him pay her, and knows he can *make* Ann pay him. So he just goes right ahead with unilateral transactions that supersede the transaction that -relies on the reliable broadcast channel. And if every party to the +relies on the total order broadcast channel. And if every party to the transaction does that, none of them actually broadcast the signature the -reliable broadcast channel. Which in consequence, by merely being available +total order broadcast channel. Which in consequence, by merely being available enforces correct behaviour, and is seldom likely to need to actually broadcast anything. And when something is actually broadcast on that channel, chances are that all the transactions that that broadcast enables will have been superseded. Each party, when receives a copy of the joint signature that he *could* upload -to the reliable broadcast channel, sends a copy to the counter party that he +to the total order broadcast channel, sends a copy to the counter party that he expects to pay him, and each party, when he receives a copy from the party he expects to pay, performs the unilateral payment to that party that supersedes -and the transaction using the reliable broadcast network. +and the transaction using the total order broadcast network. And if a party has a copy of the joint signature and the document that it signs for the full circle transaction, but finds himself unable to perform the superseding unilateral transactions with his counterparties, (perhaps their internet connection or their computer went down) then he uploads the -signature to the reliable broadcast channel. +signature to the total order broadcast channel. -When the signature is uploaded to reliable broadcast channel, this does not -give the reliable broadcast channel any substantial information about the +When the signature is uploaded to total order broadcast channel, this does not +give the total order broadcast channel any substantial information about the amount of the transaction, and who the parties to the transaction are, but the node of the channel sees IP addresses, and this could frequently be used to reconstruct a pretty good guess about who is transacting with whom and why. @@ -404,7 +404,7 @@ will have only small fragments of data, not enough to put together to form a meaningful picture, hence the privacy leak is unlikely to be very useful to those snooping on other people’s business. -### Other use cases for a reliable broadcast channel +### Other use cases for a total order broadcast channel The use case of joint signatures implies an immutable data structure of the tuple rowid, hash, public key, and two scalars. @@ -415,7 +415,7 @@ If you continually upload the latest version, you wind up uploading most of tree, or all of it, which does not add significantly to the cost of each interaction recorded. The simplest sql friendly data structure is (rowid of this item, public key, hash, your index of hash, oids of two child hashes) -with the reliable broadcast channel checking that the child hashes do in fact +with the total order broadcast channel checking that the child hashes do in fact generate the hash, and that the tuple (public key, index of hash) is unique. 
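A sketch of that record and of the two checks just described; the field names, the in-memory table, and the stand-in hash are illustrative (the text only fixes the shape of the tuple):

```C++
#include <array>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

using hash256 = std::array<std::uint8_t, 32>;
using pubkey  = std::array<std::uint8_t, 32>;

// One row per upload: (rowid, public key, hash, the uploader's own index of
// that hash, and the rowids of the two child hashes), as described above.
struct channel_row {
    std::int64_t  rowid;
    pubkey        key;
    hash256       hash;
    std::uint64_t index;
    std::int64_t  left_child;
    std::int64_t  right_child;
};

// Stand-in for the real hash; an actual channel would use a cryptographic hash
// over the two child hashes.
hash256 hash_pair(const hash256& l, const hash256& r) {
    hash256 out{};
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = l[i] ^ r[i];
    return out;
}

// The channel accepts a row only if the two child hashes really do generate the
// parent hash, and the tuple (public key, index) has never been seen before.
// For this sketch, rowids double as indexes into the rows table.
bool accept(const std::vector<channel_row>& rows,
            std::set<std::pair<pubkey, std::uint64_t>>& seen,
            const channel_row& row) {
    const hash256& l = rows.at(static_cast<std::size_t>(row.left_child)).hash;
    const hash256& r = rows.at(static_cast<std::size_t>(row.right_child)).hash;
    if (hash_pair(l, r) != row.hash) return false;
    return seen.insert({row.key, row.index}).second;
}
```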
If the data is aged out after say, three months, cannot directly check @@ -451,7 +451,7 @@ change, with the torch being passed from key to key. In which case the upload of total state needs to reference the used key, and an earlier, normally the earliest, signing key, with links in the chain of keys authorizing keys being renewed at less than the timeout interval for data to -be immutable, but unavailable from the reliable broadcast network. If the +be immutable, but unavailable from the total order broadcast network. If the witness asserts that key is authorized by a chain of keys going back to an earlier or the earliest keys, then it relies on its previous witness, rather than re-evaluating the entire, possibly very long, chain of keys every time. @@ -467,26 +467,26 @@ Because of Byzantine failure or network failure, such a chain may fork. The protocol has to be such that if a fork develops by network failure, it will be fixed, with one of the forks dying when the network functions better, and if it fails by Byzantine failure, -we get two sets of reliable broadcast channels, -each testifying that the other reliable broadcast channel is unreliable, +we get two sets of total order broadcast channels, +each testifying that the other total order broadcast channel is unreliable, and each testifying that a different fork is the valid fork, -and which fork you follow depends on which reliable broadcast channel you +and which fork you follow depends on which total order broadcast channel you subscribe to. Another use case is for wallet recovery, with mutable data structures encrypted by the private key whose primary key is the public key. -## implementing and funding a reliable broadcast channel +## implementing and funding a total order broadcast channel Tardigrade has a somewhat similar architecture to the proposed Reliable Broadcast network charges $120 per year for per TB of storage, $45 per terabyte of download. So for uploading a single signature, and downloading it six times, which one hash, one elliptic point, and two scalars, one hundred and twenty eight bytes, so the cost of doing what tardigrade does -with reliable broadcast network operated by a single operator would be +with total order broadcast network operated by a single operator would be $4 × 10^{-7}$ dollars. Which might as well be free, except we have to charge some tiny amount to prevent DDoS. -But, when the system is operating at scale, will want the reliable broadcast +But, when the system is operating at scale, will want the total order broadcast network to have many operators, who synchronize with each other so that the data is available to each of them and all of them and from each of them and all of them, and can testify when the @@ -498,11 +498,11 @@ transaction. Maybe we should charge for opening the account, and then every hundredth transaction. We also want the operators to be genuinely independent and separate from each -other. We don’t want a single inherently authorized reliable broadcast channel, +other. We don’t want a single inherently authorized total order broadcast channel, because it is inherently a low cost target for the fifty one percent attack. -I have been thinking about implementing a reliable broadcast channel as +I have been thinking about implementing a total order broadcast channel as byzantine Paxos protocol, but this gives us a massive low cost fifty one -percent attack vulnerability. If the reliable broadcast channel is cheap +percent attack vulnerability. 
If the total order broadcast channel is cheap enough to be useful, it is cheap enough for the fifty one percent attack. We want cheap testimony of valuable facts, which makes consensus mechanisms unlikely to work. diff --git a/docs/manifesto/SWIFT.md b/docs/manifesto/SWIFT.md index 922fd27..b9f3a14 100644 --- a/docs/manifesto/SWIFT.md +++ b/docs/manifesto/SWIFT.md @@ -56,14 +56,14 @@ but that is small potatoes compared to capturing a tiny sliver of SWIFT fees. SWIFT is a messaging system that handles about five hundred standardized structured messages per second (many messages of many types) between a few hundred banks, with certain special security guarantees, in particular reliable and provable delivery. To eat SWIFT's lunch, -need a sharded reliable broadcast channel with open entry, without centralization. +need a sharded total order broadcast channel with open entry, without centralization. -I am using “reliable broadcast channel” in the cryptographic sense. +I am using “total order broadcast channel” in the cryptographic sense. It will not be reliable in the ordinary sense, since you may attempt to put a message on it, and the message may not get on it, and you have to try again. It will not be broadcast in the ordinary sense, since most messages are end to end encrypted so that only the two parties can read them. -What makes it a reliable broadcast channel in the cryptographic sense, +What makes it a total order broadcast channel in the cryptographic sense, is that if Bob sends a message to Carol over it, as part of a protocol where Bob has to send a message, and Carol has to send a reply, then if the protocol fails because of Bob, Carol can prove it, @@ -71,9 +71,21 @@ and if the protocol fails because of Carol, Bob can prove it. And both can prove what messages they received, and what messages they sent that the counterparty should have received. +This more or less corresponds to the Celestia blockchain -- + atomic broadcast and data availability without a massively + replicated state machine. Transactions being somehow achieved somewhere else. + +Celestia is an Ethereum data availability layer, which is in some respects the +opposite of what we want to achieve -- we want a privacy layer so that people +can communicate and transact without revealing network addresses where valuable +secrets that could be stolen reside, but the underlying technological problems +that need to be solved are the same. + +Celestia uses erasure coding to achieve scaling. + Being sharded, can handle unlimited volume. And once that exists as a neutral protocol with open entry and no central control, -can put dexes on it, Daos on it, +can put dexes on it, daos on it, uncensored social media on it, web 3.0 on it, and coins on it. And the first thing that should go on it is a dex that can exchange @@ -142,7 +154,7 @@ BitcoinOS are addressing that. When last I looked their solution was far from r but it does not yet urgently need to be ready. To take over from SWIFT, lightning is unlikely to suffice. -Going to need Liquid. Since Liquid uses polynomail commits, it +Going to need Liquid. Since Liquid uses polynomial commits, it might be possible to shard it, but the path to that is unclear, in which case replacing SWIFT is going to need to need Liquid Lightning. @@ -295,10 +307,10 @@ and Bitmessage being out of support. 
> > > > Therefore, fund a privacy protocol that is an update to bitmessage, > > with additional capability of -> > zooko names and reliable broadcast, reliable in the cryptographic +> > zooko names and total order broadcast, reliable in the cryptographic > > sense. > > -> > Reliable broadcast in the cryptographic sense being that if one has +> > total order broadcast in the cryptographic sense being that if one has > > a transaction protocol in which Bob is supposed > > to send a message to Carol, and Carol supposed to send a > > corresponding response to Bob, the blockchain diff --git a/docs/manifesto/scalability.md b/docs/manifesto/scalability.md index 3eb18a7..337259b 100644 --- a/docs/manifesto/scalability.md +++ b/docs/manifesto/scalability.md @@ -6,7 +6,7 @@ sidebar: true notmine: false abstract: >- Bitcoin does not scale to the required size. - The Bitcoin reliable broadcast channel is a massively replicated public ledger of every transaction that ever there was, + The Bitcoin total order broadcast channel is a massively replicated public ledger of every transaction that ever there was, each of which has to be evaluated for correctness by every full peer. With recursive snarks, we can now instead have a massively replicated public SQL index of private ledgers. Such a blockchain with as many transactions as bitcoin, will, after running for as long as Bitcoin, @@ -27,7 +27,7 @@ Which means either centralization, a central bank digital currency, which is the path Ethereum is walking, or privacy. You cure both blockchain bloat and blockchain analysis by *not* -putting the data on the reliable broadcast channel in the first +putting the data on the total order broadcast channel in the first place, rather than doing what Monero does, putting it on the blockchain in cleverly encrypted form, bloating the blockchain with chaff intended to obfuscate against blockchain analysis. @@ -132,19 +132,19 @@ is valid, given that rest of the chain was valid, and produce a recursive snark that the new block, which chains to the previous block, is valid. -## reliable broadcast channel +## total order broadcast channel -If you publish information on a reliable broadcast channel, +If you publish information on a total order broadcast channel, everyone who looks at the channel is guaranteed to see it and to see the same thing, and if someone did not get the information that you were supposed to send over the channel, it is his fault, not yours. You can prove you performed the protocol correctly. -A blockchain is a merkle chain *and* a reliable broadcast channel. -In Bitcoin, the reliable broadcast channel contains the entire +A blockchain is a merkle chain *and* a total order broadcast channel. +In Bitcoin, the total order broadcast channel contains the entire merkle chain, which obviously does not scale, and suffers from a massive lack of privacy, so we have to introduce the obscure -cryptographic terminology "reliable broadcast channel" to draw a +cryptographic terminology "total order broadcast channel" to draw a distinction that does not exist in Bitcoin. In Bitcoin the merkle vertices are very large, each block is a single huge merkle vertex, and each block lives forever on an ever growing public broadcast @@ -155,18 +155,18 @@ which is what is happening with Ethereum's use of recursive snarks. 
So we need to structure the data as a large dag of small merkle
vertices, with all the paths through the dag for which we need to
generate proofs being logarithmic in the size of the contents of
-the reliable broadcast channel and the height of the blockchain.
+the total order broadcast channel and the height of the blockchain.

-### scaling the reliable broadcast channel to billions of peers,exabytes of data, and terabytes per second of bandwidth
+### scaling the total order broadcast channel to billions of peers, exabytes of data, and terabytes per second of bandwidth

At scale, which is not going to happen for a long time,
-the reliable broadcast channel will work much like bittorrent.
+the total order broadcast channel will work much like bittorrent.
There are millions of torrents in bittorrent, each torrent is
shared between many bittorrent peers, each bittorrent peer shares many
different torrents, but it only shares a handful of all of the millions
of torrents.

-Each shard of the entire enormous reliable broadcast channel will be
+Each shard of the entire enormous total order broadcast channel will be
something like one torrent of many in bittorrent, except that
torrents in bittorrent are immutable, while shards will continually
agree on a new state, which is a valid state given the previous
@@ -380,7 +380,7 @@
Those of them that control the inputs to the transaction commit unspent
transactions outputs to that transaction, making them spent transaction
outputs. But does not reveal that transaction, or that they are spent to the same transaction –
-though his peer can probably guess quite accurately that they are. The client creates a proof that this an output from a transaction with valid inputs, and his peer creates a proof that the peer verified the client's proof and that output being committed was not already committed to another different transaction, and registers the commitment on the blockchain. The output is now valid for that transaction, and not for any other, without the reliable broadcast channel containing any information about the transaction of which it is an output, nor the transaction of which it will become an input.
+though his peer can probably guess quite accurately that they are. The client creates a proof that this is an output from a transaction with valid inputs, and his peer creates a proof that the peer verified the client's proof and that output being committed was not already committed to another different transaction, and registers the commitment on the blockchain. The output is now valid for that transaction, and not for any other, without the total order broadcast channel containing any information about the transaction of which it is an output, nor the transaction of which it will become an input.

In the next block that is a descendant of that block the parties to
the transaction prove that the new transaction outputs are
@@ -388,7 +388,7 @@ valid, and being new are unspent transaction outputs, without revealing
the transaction outputs, nor the transaction, nor the inputs to that
transaction.

You have to register the unspent transaction outputs on the public
-index, the reliable broadcast channel, within some reasonable
+index, the total order broadcast channel, within some reasonable
time, say perhaps below block height $(\lfloor h/32\rfloor+2)*32$,
where h is the block height on which the first commit of an output to the
transaction was registered.
If not all the inputs to @@ -520,10 +520,10 @@ have download and validate every single transaction, which validation is quite costly, and more costly with Monero than Bitcoin. Once all the necessary commits have been registered on the -reliable broadcast channel, only the client wallets of the parties to +total order broadcast channel, only the client wallets of the parties to the transaction can produce a proof for each of the outputs from that transaction that the transaction is valid. They do not need to -publish on the reliable broadcast channel what transaction that +publish on the total order broadcast channel what transaction that was, and what the inputs to that transaction were. So we end up with the blockchain carrying only $\bigcirc\ln(h)$ @@ -578,7 +578,7 @@ blockchain to buy in first. The rest of the blockchain does not have to know how to verify valid authority, does not need to know the preimage of the hash of the method of verification, just verify that the party committing did the correct verification, -whatever it was. Rather than showing the reliable broadcast +whatever it was. Rather than showing the total order broadcast channel a snark for the verification of authority, which other people might not be able to check, the party committing a transaction shows it a recursive snark that shows that he verified diff --git a/docs/manifesto/social_networking.md b/docs/manifesto/social_networking.md index 01991d6..b6762a5 100644 --- a/docs/manifesto/social_networking.md +++ b/docs/manifesto/social_networking.md @@ -535,7 +535,7 @@ implement private rooms on that model. Indeed we have to, in order to implement a gateway between our crypto currency and bitcoin lightning, without which we cannot have a liquidity event for our startup. -In order to do money over a private room, you need a `reliable broadcast channel`, +In order to do money over a private room, you need a `total order broadcast channel`, so that Bob cannot organize a private room with Ann and Carol, and make Ann see one three party transaction, and Carol see a different three party transaction. @@ -550,9 +550,9 @@ increasing the likelihood of leaking metadata. But if we want to get paid, need to address this concern, possibly at the expense of the usability and security for other uses of our private rooms. -Existing software does not implement reliable broadcast in private rooms, +Existing software does not implement total order broadcast in private rooms, and is generally used in ways and for purposes that do not need it. And -anything that does implement reliable broadcast channel is apt to leak +anything that does implement total order broadcast channel is apt to leak more metadata than software that does not implement it. So for most existing uses, existing software is better than this software will be. But to do business securely and privately over the internet, you need a reliable @@ -640,7 +640,7 @@ unshared secrets. The signed contents of the room are flood filled around the participants, rather than each participant being required to communicate his content to each of the others. -To provide a reliable broadcast channel, a hash of the total state of the room is continually constructed +To provide a total order broadcast channel, a hash of the total state of the room is continually constructed # signing and immutability @@ -793,7 +793,7 @@ private room system. 
The elegant cryptography of [Scriptless Scripts] using adaptive Schnorr signatures and of [Anonymous Multi-Hop Locks] assumes and presupposes -a lightning network that has a private room with a reliable broadcast +a lightning network that has a private room with a total order broadcast channel (everyone in the room can know that everyone else in the room sees the same conversation he is seeing, and a reliable anonymous broadcast channel within the private room. "We also assume the existence diff --git a/docs/names/multisignature.md b/docs/names/multisignature.md index 29b2ec3..3d7fb42 100644 --- a/docs/names/multisignature.md +++ b/docs/names/multisignature.md @@ -283,12 +283,12 @@ a subset of the key recipients can generate a Schnorr signature. Cryptosystems](./threshold_shnorr.pdf) gives reasonably detailed instructions for implementing threshold Schnorr, without any trusted central authority but assumes various cryptographic primitives that we do not in fact have, among -them a reliable broadcast channel. +them a total order broadcast channel. -## Reliable Broadcast Channel +## total order broadcast Channel A key cryptographic primitive in threshold signatures, and indeed in almost -every group cryptographic protocol, is the reliable broadcast channel – that +every group cryptographic protocol, is the total order broadcast channel – that any participant can reliably send a message that is available to all participants. @@ -310,7 +310,7 @@ honest, so one can construct a broadcast channel using the Paxos protocol. Every distributed cryptographic protocol needs a secure broadcast channel, and every blockchain is a secure broadcast channel. -One of the requirements of secure reliable broadcast channel is that _it +One of the requirements of secure total order broadcast channel is that _it stays up_. But a secure broadcast channel for a lightning type transaction is going to be created and shut down. And if it can be legitimately shut down, it can be shut down at exactly the wrong moment @@ -318,7 +318,7 @@ for some of the participants and exactly the right time for some of the participants. Hence the use of a “trusted” broadcast authority, who stays up. -We could attain the same effect by a hierarchy of secure reliable broadcast +We could attain the same effect by a hierarchy of secure total order broadcast channels, in which a narrow subchannel involving a narrower set of participants can be set up on the broader channel, and shutdown, with its final shutdown signature available in the broader channel, such that someone @@ -347,11 +347,11 @@ The Byzantine Paxos protocol is designed for a large group and is intended to keep going permanently in the face of the hardware or software failure of some participants, and Byzantine defection by a small conspiracy of participants. -For a reliable broadcast channel to be reliable, you are relying on it to +For a total order broadcast channel to be reliable, you are relying on it to stay up, because if it goes down and stays down, its state for transactions near the time it went down cannot be clearly defined. -For a reliable broadcast channel to be in a well defined state on shutdown, +For a total order broadcast channel to be in a well defined state on shutdown, it has to have continued broadcasting its final state to anyone interested for some considerable time after it reached its final state. So you are trusting someone to keep it going and available. 
In this sense, no group @@ -379,7 +379,7 @@ corporations, and capable of electing producing a chain of signatures officer identity, CEO organizes the election of the board), capable of being evaluated by everyone interacting with the business over the net. -Obviously the reliable broadcast protocol of such a very large scale key +Obviously the total order broadcast protocol of such a very large scale key generation will look more like a regular blockchain, since many entities will drop out or fail to complete directly. diff --git a/docs/paxos_protocol.md b/docs/paxos_protocol.md index 3fd7390..dc98928 100644 --- a/docs/paxos_protocol.md +++ b/docs/paxos_protocol.md @@ -22,12 +22,12 @@ senators, so Paxos is not worried about people gaming the system to exclude voters they do not want, nor worried about people gaming the somewhat arbitrary preselection of the agenda to be voted up and down. -# Analysing [Paxos Made Simple] in terms of Arrow and Reliable Broadcast +# Analysing [Paxos Made Simple] in terms of Arrow and total order broadcast [Paxos Made Simple]:./paxos-simple.pdf The trouble with Lamport’s proposal, described in [Paxos Made Simple] is that -it assumes no byzantine failure, and that therefore the reliable broadcast +it assumes no byzantine failure, and that therefore the total order broadcast channel is trivial, and it assumes that any proposal will be acceptable, that all anyone cares about is converging on one proposal, therefore it always converges on the first proposal accepted by one acceptor. @@ -61,7 +61,7 @@ how everything fits together before critiquing that. If a majority of acceptors accept some proposal, then we have a result. But we do not yet have everyone, or indeed anyone, knowing the result. -Whereupon we have the reliable broadcast channel problem, which Lamport hand +Whereupon we have the total order broadcast channel problem, which Lamport hand waves away. The learners are going to learn it. Somehow. And once we have a result accepted, we then happily go on to the next round. @@ -105,11 +105,11 @@ prepare, and accept, instead of two phases, prepare and accept. Pre-prepare is the “primary” (leader, CEO, primus inter pares) notifying the “replicas” (peers) of the total order of a client message. -Prepare is the “replicas” (peers, reliable broadcast channel) notifying each +Prepare is the “replicas” (peers, total order broadcast channel) notifying each other of association between total order and message digest. Accept is the “replicas” and the client learning that $33\%+1$ of the -“replicas” (peers, reliable broadcast channel) agree on the total order of the +“replicas” (peers, total order broadcast channel) agree on the total order of the client’s message. # Analysing Raft Protocol