---
title: Libraries
---

# Wireguard, Tailwind, and identity

Wireguard is a secure VPN.

Tailwind is peer to peer, built on OAuth2 and Wireguard.

Lacks peering through NAT facilities, though this is probably not hard to fix or add. The Tailwind server just tells both peers to start pinging each other simultaneously, and tells each peer when the other peer has acked a meeting time. With most NATs, the first ping to arrive after a ping has been sent will get through.

Wireguard has a provision to keep on pinging. If a peer has a stable IP and port accessible from the internet, it does not ping. The other guy has to ping. If he does not ping for a while, the connection may stop working. If the peer with the stable IP and port finds it cannot get through, it discards the connection information, but if it never attempts to use that information, it may hang around for a very long time.

OAuth is a generic interface to identity protocols. Anyone can implement OAuth in any way.

The Zooko identity model is that each party has his own mapping between non unique Zooko human readable, typeable, and memorable names, and globally unique non human memorable, non human typeable, public keys.

We put on top of that a consensus mapping, which is mutable.

The end user sees his local name for an identity, that identity's name for itself, and a recent consensus human readable, human typeable name for that identity. For major and important widely known identities these should all be the same, and the end user should see a single short human readable name. If the end user sees two or three different human readable names for a counterparty, there is likely to be an issue. If he sees three different human readable names, plus the public key, there is definitely an issue.

The end user's mapping from local petnames to global keys is locally unique, and mutable at the end user's discretion. The consensus mapping is mutable by consensus.
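
A minimal sketch of the display rule just described, in plain C++. Every name in it (`IdentityNames`, `Display`, `display_for`) is hypothetical, and the choice to show the public key only when all three names differ is my reading of the rule, not a settled design.

```cpp
#include <optional>
#include <string>
#include <vector>

// The three human readable names an end user may have for one public key.
// All type, field and function names here are hypothetical.
struct IdentityNames {
    std::optional<std::string> petname;    // the end user's own local name, if any
    std::string self_name;                 // the name the identity gives itself
    std::optional<std::string> consensus;  // recent consensus name, if any
};

struct Display {
    std::vector<std::string> names;  // distinct names to show, petname first
    bool show_public_key;            // assumption: only when all three disagree
};

// Collapse identical names into one. One name is the happy case,
// two or three distinct names signal a likely issue.
inline Display display_for(const IdentityNames& id) {
    Display d{{}, false};
    auto add = [&d](const std::string& s) {
        for (const auto& n : d.names)
            if (n == s) return;  // already shown, do not repeat it
        d.names.push_back(s);
    };
    if (id.petname) add(*id.petname);
    add(id.self_name);
    if (id.consensus) add(*id.consensus);
    d.show_public_key = d.names.size() >= 3;
    return d;
}
```

A well known identity for which all three names agree comes out as a single name with no key shown. Three different names come out as three names plus the key.
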
For friends who are well known to himself, but not well known to others, the global consensus name may merely be a distraction, and he may turn it off. For someone on his buddy list, someone he has whitelisted, the global consensus name is turned off, unless it is the same, in which case it is turned on, and if the consensus changes, the end user sees that change.

# Consensus

I have no end of smart ideas about how a blockchain should work, but no actual blockchain. Smart ideas are worth two cents a bale, but only if already baled.

Need to port in someone else's blockchain code, with a bridge to their blockchain. Then I can make a start on implementing my bright ideas as part of working code.

[Near]:https://near.org/papers/the-official-near-white-paper/#introduction {target="_blank"}

[Near] is actually implementing no end of things that I have been thinking about, so seems like a good fit.

# Git submodules

Libraries are best dealt with as [Git submodules].

[Git submodules]: https://github.com/psi4/psi4/wiki/External-subprojects-using-Git-and-CMake
[build libraries]:https://git-scm.com/book/en/v2/Git-Tools-Submodules

Git submodules leak complexity and surprising and inconvenient behavior all over the place if one is trying to make a change that affects multiple modules simultaneously. But having your libraries separate from your git repository results in non portable surprises and complexity. It makes it hard for anyone else to build your project, because they will have to, by hand, tell your project where the libraries are on their system.

You need an enormous pile of source code, the work of many people over a very long time, and Git submodules allow this to scale, because the local great big pile of source code references many independent and sovereign repositories in the cloud. If you have one enormous pile of source code in one enormous git repository, things get very very slow.
If you rely on someone else's compiled code, things break and you get accidental and deliberate backdoors, which is a big concern when you are doing money and cryptography.

Git submodules are hierarchical, but source code has strange loops. The Bob module uses the Alice module and the Carol module, but Alice uses Bob and Carol, and Carol uses Alice and Bob. How do you make sure that all your modules are using the same commit of Alice?

Well, if modules have strange loops you make one of them the master, and the rest of them direct submodules of that master, brother subs to each other, and they are all using the same commit of Alice as the master. And you should try to write or modify the source code so that they all call their brother submodules through the one parent module above them in the hierarchy, so that they use the source code of their brothers through the source code of their master, rather than directly incorporating the header files of their brothers at compile time. The header file of the master that they include may well include the header of their brother, so that they are indirectly, through the master header file, including the brother header file.

# Git subtrees

Git subtrees are an alternative to submodules, and many people recommend them because they do not break the git model the way submodules do.

But subtrees do not scale. If you have an enormous pile of stuff in your repository, Git has to check every file to see if it has changed every time, which rather rapidly becomes painfully slow if one is incorporating a lot of projects reflecting a lot of work by a lot of people. Git submodules mean you can incorporate unlimited amounts of stuff, and Git only has to check the particular module that you are actually working on.

Maybe subtrees would work better if one was working on a project where several parts were being developed at once, thus a project small enough that scaling is not an issue.
But such projects, if successful, grow into projects where scaling is an issue. And if you are a pure consumer of a library, you don't care that you are breaking the git model, because you are seldom making synchronized changes in module and submodule.

The submodule model works fine, provided the divisions between one submodule and the next are such that one is only likely to make changes in one module at a time.

# Passphrases

All wallets now use random words - but you cannot carry an eighteen word random phrase through an airport in your head.

Should use [grammatically correct passphrases](https://github.com/lungj/passphrase_generator).

Using those dictionaries, the phrase (adjective noun adverb verb adjective noun) can encode sixty eight bits of entropy. Two such phrases suffice, being stronger than the underlying elliptic curve. With password strengthening, we can randomly leave out one of the adjectives or adverbs from one of the passphrases.

# Polkadot, Near, Substrate and gitcoin

It has become painfully apparent that building a blockchain is a very large project.

Polkadot is a blockchain ecosystem, and Substrate a family of libraries for constructing blockchains.

It is a lot easier to refactor an existing blockchain than to start entirely from scratch. [Near] is way ahead of me, because not suffering from not invented here syndrome.

Polkadot is designed to make its ecosystem subordinate to the primary blockchain, which I do not want - but it also connects its ecosystem to bitcoin by De-Fi (or promises to do so, I don't know how well it works) so accepting that subordination is a liquidity event. We can fix things so that the tail will wag the dog once the tail gets big enough, as China licensed from ARM, then formed a joint venture with ARM, then hijacked the joint venture, once it felt it no longer needed to keep buying the latest ARM intellectual property.
Licensing was a fully subordinate relationship, the joint venture was cooperation between unequal parties, and now ARM China is a fully independent and competing technology, based on the old ARM technology, but advancing it separately, independently, and in its own direction. China forked the ARM architecture.

Accepting a fully subordinate relationship to get connected, and then defecting on subordination when strong enough, is a sound strategy.

[Gitcoin]:https://gitcoin.co/ "Build and Fund the Open Web Together"

And talking about connections: [Gitcoin]

Gitcoin promises connection to money, and connection to a community of open source developers. It is Polkadot's money funnel from VCs to developers. The amount of cash in play is rather meagre, but it provides a link to the real money, which is ICOs.

I suspect that its git hosting has been co-opted by the enemy, but that is OK, provided our primary repo is not co-opted by the enemy.

# Installers

Wine to run Windows 10 software under Linux is a bad idea, and Windows Subsystem for Linux to run Linux software under Windows 10 is a much worse idea – it is the usual “embrace and extend” evil plot by Microsoft against open source software, considerably less competently executed than in the past.

## The standard gnu installer

```bash
./configure && make && make install
```

## The standard windows installer

Wix creating an `*.msi` file.

Which `*.msi` file can be wrapped in an executable, but there is no sane reason for this and you are likely to wind up with installs that consist of an executable that wraps an msi that wraps an executable that wraps an msi.

To build an `*.msi`, you need to download the Wix toolset, which is referenced in the relevant Visual Studio extensions, but cannot be downloaded from within the Visual Studio extension manager.

The Wix Toolset, however, requires the .NET Framework in order to install it and use it, which is the cobbler’s children going barefoot.
You want a banana, and have to install a banana tree, a monkey, and a jungle.

There is a [good web page](https://stackoverflow.com/questions/1042566/how-can-i-create-an-msi-setup) on Wix resources.

There is an automatic Wix setup: Visual Studio -> Tools -> Extensions & Updates -> search for Visual Studio Installer Projects

Which is the Microsoft utility for building Wix files. It creates a quite adequate Wix setup by gui, in the spirit of the skeleton windows gui app.

## [NSIS](https://nsis.sourceforge.io/Download) Nullsoft Scriptable Install System.

People who know what they are doing seem to use this install system, and they write nice installs with it.

To build the setup program:

1. Build both x64 and Win32 Release configs.
1. When you construct wallet.nsi in nullsoft, add it to your project.
1. When building a deliverable, right click on the WalletSetup.nsi file in the Visual Studio project and select properties.
1. Set Excluded from Build to No.
1. OK Properties.
1. Right click the .nsi file again and choose Compile.
1. Set the .nsi file properties back to Excluded from Build.

This manual building of the setup is due to the fact that we need both x64 and Win32 exes for the setup program and Visual Studio doesn’t provide a way to do this easily.

# Package managers

Lately, however, package managers have appeared: Conan and [vcPkg](https://blog.kitware.com/vcpkg-a-tool-to-build-open-source-libraries-on-windows/). Conan lacks wxWidgets, and has far fewer packages than [vcpkg](https://libraries.io/github/Microsoft/vcpkg).

I have attempted to use package managers, and not found them very useful. It is easier to deal with each package as its own unique special case. The uniform abstraction that a package manager attempts to provide invariably leaks badly, while piling cruft on top of the library.
Rather than simplifying library use, it piles its own idiosyncratic complexification on top of the complexities of the library, often inducing multiplicative complexity, as one attempts to deal with the irregularities and particulars of a particular library through a package manager that is unaware of and incapable of dealing with the particularity of that particular package, and is unshakeably convinced that the library is organized in a way that is different from the way it is in fact organized.

# Multiprecision Arithmetic

I will need multiprecision arithmetic if I represent information in a base or dictionary that is not a power of two.

[MPIR]:http://mpir.org/ {target="_blank"}
[GMP]:https://gmplib.org {target="_blank"}

The best libraries are [GMP] for Linux and [MPIR] for windows. These are reasonably compatible, and generally only require very trivial changes to produce a Linux version and a windows version. Boost attempts to make the changes invisible, but adds needless complexity and overhead in doing so, and obstructs control.

MPIR has a Visual Studio repository on Github, and a separate Linux repository on Github. GMP builds on a lot of obscure platforms, but is not really supported on Windows.

For supporting Windows and Linux only, MPIR all the way is the way to go. For compatibility with little used and obscure environments, you might want to have your own custom thin layer that maps GMP integers and MPIR integers to your integers, but that can wait till we have conquered the world.

My most immediate need for MPIR is the extended Euclidean algorithm for modular multiplicative inverse, which it, of course, supports, `mpz_gcdext`, greatest common divisor extended, but which is deeply hidden in the [documentation](http://www.mpir.org/mpir-3.0.0.pdf).

# [wxWidgets](./libraries/building_and_using_libraries.html#instructions-for-wxwidgets)

# Networking

## notbit client

A bitmessage client written in C.
Designed to run on a linux mail server and interface bitmessage to mail. Has no UI, intended to be used with the linux mail UI.

Unfortunately, setting up a linux mail server is a pain in the ass. Needs the Zooko UI.

But its library contains everything you need to share data around a group of people, many of them behind NATs.

Does not implement NAT penetration. Participants behind a NAT are second class unless they implement port forwarding, but participants with unstable IPs are not second class.

## Game Networking sockets

[Game Networking Sockets](https://github.com/ValveSoftware/GameNetworkingSockets)

A reliable udp library with congestion control which has vastly more development work done on it than any other reliable udp networking library, but which is largely used to work with Steam gaming, and Steam's closed source code. Has no end of hooks to closed source built into it, but works fine without those hooks.

Written in C++. Architecture overly specific and married to Steam. Would have to be married to Tokio to have massive concurrency. But you don't need to support hundreds of clients right away.

Well, perhaps I do, because in the face of DDOS attack, you need to keep a lot of long lived inactive connections around for a long time, any of which could receive a packet at any time. I need to look at the GameNetworkingSockets code and see how it listens on lots and lots of sockets. If it uses [overlapped IO], then it is golden. Get it up first, and put it inside a service later.

[Overlapped IO]:client_server.html#the-select-problem {target="_blank"}

The nearest equivalent Rust application gave up on congestion control, having programmed themselves into a blind alley.

## Tokio

Tokio is a Rust framework for writing highly efficient, highly scalable services. Writing networking for a service with large numbers of clients is very different between Windows and Linux, and I expect Tokio to take care of the differences.
There really is not any good C or C++ environment for writing services except Wt, which is completely specialized for the case of writing a web service whose client is the browser, and which runs only on Linux.

## wxWidgets

wxWidgets has basic networking capability built in and integrated with its event loop, but it is a bit basic, and is designed for a gui app, not for a server – though probably more than adequate for initial release.

It only supports http, but not https and websockets.

[LibSourcey](https://sourcey.com/libsourcey) is a far more powerful networking library, which supports https and websockets, and is designed to interoperate with nginx and node.js. But integrating it with wxWidgets is likely to be nontrivial.

WxWidgets sample code for sockets is in %WXWIN%/samples/sockets. There is a [recently updated version on github]. Their example code supports TCP and UDP. But some people argue that the sampling is insufficiently responsive - you really need a second thread that damned well sits on the socket, rather than polling it. And that second thread cannot use wxSockets.

[recently updated version on github]:https://github.com/wxWidgets/wxWidgets/tree/master/samples/sockets

Programming sockets and networking in C is a mess. The [much praised guide to sockets](https://beej.us/guide/bgnet/html/single/bgnet.html) goes on for pages and pages describing a “simple” example client server. Trouble is that C, and old type Cish C++, exposes all the dangly bits. The [Qt client server example](https://stackoverflow.com/questions/5773390/c-network-programming), on the other hand, is elegant, short, and self explanatory.

The Code Project has [example code written in C++](https://www.codeproject.com/Articles/13071/Programming-Windows-TCP-Sockets-in-C-for-the-Begin), but it is still mighty intimidating compared to the Qt client server example.
I have yet to look at the wxWidgets client server examples – but looking for wxWidgets networking code has me worried that it is a casual afterthought, not adequately supported or adequately used.

ZeroMQ is Linux, C, and Cish C++.

Boost Asio is highly praised, but I tried it, and concluded its architecture is broken, trying to make simplicity and elegance where it cannot be made, resulting in leaky abstractions which leak incomprehensible complexity the moment you stray off the beaten path – I feel they have lost control of their design, and are just throwing crap at it trying to make something that cannot work, work. I similarly found the Boost time libraries failed, leaking complexity that they tried to hide, with the hiding merely adding complexity.

[cpp-httplib](https://github.com/yhirose/cpp-httplib) is wonderful in its elegance, simplicity, and ease of integration. You just include a single header. Unfortunately, it is strictly http/https, and we need something that can deal with the inherently messy lower levels.

[Poco](http://pocoproject.org/) does everything, and is C++, but hey, let us first see how far we can get with wxWidgets.

Further, the main reason for doing https is integration with the existing browser web ecosystem, whose security is fundamentally broken, due to the state’s capacity to seize names, and the capacity of lots of entities to intercept ssl. It might well be easier to fork Opera or embed Chromium. I notice that Chromium has features supporting payment built into it, a bunch of “PaymentMethod\*\*\*\*\*Event”.

The best open source browser, and best privacy browser, is Opera, in that it comes from an entity less evil than Google.
[Opera](https://bit.ly/2UpSTFy) needs to be configured with [a bunch of privacy add ons](https://gab.com/PatriotKracker80/posts/c3kvL3pBbE54NEFaRGVhK1ZiWCsxZz09): [HTTPS Everywhere Add-on](https://bit.ly/2ODbPeE), [uBlock](https://bit.ly/2nUJLqd), [DisconnectMe](https://bit.ly/2HXEEks), [Privacy-Badger](https://bit.ly/2K5d7R1), [AdBlock Plus](https://bit.ly/2U81ddo), [AdBlock for YouTube](https://bit.ly/2YBzqRh), two tracker blockers, and three ad blockers.

It would be great if we could make our software another addon, possibly chatting by websocket to the wallet.

The way it would work would be to add another protocol to the browser: `ro://name1.name2.name3/directory/directory/endpoint`. When you connect to such an endpoint, your wallet, possibly a wallet with no global name, connects to the named wallet, and gets an IP, a port, a virtual server name, a cookie unique for your wallet, and the hash of the valid ssl certificate for that name, and then the browser makes a connection to that server, ignoring the CA system and the DNS system. The name could be a DNS name and the certificate a CA certificate, in which case the connection looks to the server like any other, except for the cookie which enables it to send messages, typically a payment request, to the wallet.

# zk-snarks

[Aurora]:https://eprint.iacr.org/2018/828.pdf {target="_blank"}

Supposedly there is a language, R1CS, such that you can express a program that gives a true false answer, such that [Aurora] can execute the program and generate a prover and a verifier.

[starkware]:https://iacr.org/submit/files/slides/2021/rwc/rwc2021/1005/slides.pdf {target="_blank"}

According to [starkware], they have the fastest proving time, but their proofs are rather large, 138KiB. Groth16 snarks have the most compact proofs.

Not actually seeing it as a useful library yet that I could actually use, but more like a proof of principle that someone could build such a library.
To be actually useful, a zk-snark system needs to be a compiler, that compiles a program written in what Starkware calls R1CS, and other people are calling script, and generates two programs, a prover and a\ verifier. The prover operates on two blobs, the public blob and private blob, and produces a boolean result, true or false, pass or fail, and a proof that it\ did so.

The proof is approximately constant size, regardless of how much computation is required and regardless of how large the private blob was, but takes a very long time to produce.

The verifier operates on the public blob and the proof, takes a short and approximately constant time to do so, regardless of how big the computation was, and regardless of how big the private data was, and determines, with $2^{-126}$ likelihood of error, what result the prover got.

But at present I get the impression that neither script nor R1CS have any real existence, though I have seen a script language that operates on a stack, and, though it has no variables, can dupe any item on the stack to the top of the stack. It seems to have only been ever used to generate one prover and one verifier, because actually creating the prover and verifier still required some coding by hand. It also lacked certain control structures.

At present, people seem to be writing the prover and the verifier by hand, a very difficult operation with a very high likelihood of bugs. The prover and the verifier do very simple tasks like proving the encoded inputs to a transaction are greater than or equal to the encoded outputs and that no numeric underflow or overflow occurred.

Another problem is that we would really like the public data to be the root hash of a merkle tree, and no one seems to have a script language that contains a useful hash function. Starkware is built out of hash functions, but last time I looked, you could not call a hash function from R1CS.
We need a script language that can not merely add and subtract, but can also do hashes and elliptic point operations. zk-stark systems are built out of hashes and elliptic point operations, but it seems to be uphill trying to generate proofs that prove something about the results of hashes and elliptic point operations, making it very difficult to produce a proof that a pile of proofs in the pre-image of a merkle tree have been verified. I suspect that a prover might take a very very long time to produce such a proof.

The proofs are succinct, in that you can prove something about a gigantic pile of data and the size of the proof and the time taken to verify scarcely grows - about 128 KiB, for the smallest that anyone would care about, to utterly gigantic proofs.

But proof generation is not all that fast, and grows with the matter to be proven, so to be useful for utterly gigantic proofs, you would need to be able to distribute proof generation over an enormous multitude of untrusting shards. Which you can obviously do by proving a verification. Not sure how long it takes to produce a proof that a large number of proofs were verified.

What you want is to be able to prove that a final hash is the root of an enormous merkle tree, some generalization of a merkle-patricia tree, representing an immutable append only data structure consisting of a sequence of piles of transactions, and the state generated by these transactions, represents a valid branch of a chain of signatures, that the final state is correctly derived by applying the batch of transactions to the previous state.

And then you want to do this for states so enormous, and piles of transactions so enormous, that no one person has all of them.

And then you still have the problem of resolving forks.
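
To make "the public data is the root hash of a merkle tree" concrete, here is a minimal sketch of computing such a root over a pile of transactions. It says nothing about proving anything inside a snark. The hash is 64 bit FNV-1a, which is NOT cryptographic and is used only so the sketch has no dependencies. The names `toy_hash`, `toy_hash_pair` and `merkle_root`, the leaf and node tags, and the rule for an odd node are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// FNV-1a, 64 bit. NOT cryptographic: a stand-in for a real hash function.
inline std::uint64_t toy_hash(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Hash of an interior node. The "N" tag keeps interior nodes distinct
// from leaves, which are tagged "L" below.
inline std::uint64_t toy_hash_pair(std::uint64_t left, std::uint64_t right) {
    std::string buf = "N";
    for (int i = 0; i < 8; ++i) buf.push_back(static_cast<char>((left >> (8 * i)) & 0xff));
    for (int i = 0; i < 8; ++i) buf.push_back(static_cast<char>((right >> (8 * i)) & 0xff));
    return toy_hash(buf);
}

// Root of a binary merkle tree over the leaves, in order.
// Assumption: an unpaired node at the end of a level is promoted unchanged.
inline std::uint64_t merkle_root(const std::vector<std::string>& leaves) {
    if (leaves.empty()) return toy_hash("");
    std::vector<std::uint64_t> level;
    for (const auto& leaf : leaves) level.push_back(toy_hash("L" + leaf));
    while (level.size() > 1) {
        std::vector<std::uint64_t> next;
        for (std::size_t i = 0; i + 1 < level.size(); i += 2)
            next.push_back(toy_hash_pair(level[i], level[i + 1]));
        if (level.size() % 2 == 1) next.push_back(level.back());
        level.swap(next);
    }
    return level[0];
}
```

The verifier would see only the root. The transactions themselves stay in the private blob, which is why the script language needs to be able to call the hash.
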
You would like to have a blockchain of blockchains of blockchains, such that your state, and your transactions, are divided into a product of substates, with consensus on each substate advancing a bit ahead of the consensus on the combination of several substates, so that transactions within a substate finalize fast, but transactions between substates take longer (because the number of forks of the product state is the product of the number of forks of each substate).

Each of the substates very quickly comes up with a proof that a transaction within a substate is valid and quickly comes up with consensus as to which fork everyone is on, but the proof for a transaction between substates is finalized quickly in the paying substate, and quickly affects the paying substate, but the transaction does not get included in the state that is a product of the receiving and paying substate for a while, does not get proven valid in the product substate for a while, and does not get included in the receiving substate till a bit after it is included in the product substate, whereupon it is in due course quickly proven to be a valid addition of value to the receiving substate.

So that the consensus problem remains manageable, we need insulation and delay between the states, so that the product state has its own pile of state, representing the delay between a transaction affecting the payer factor state, and the transaction affecting the payee factor state.

A transaction has no immediate effect. The payer mutable substate changes in a way reflecting the transaction block at the next block boundary. And that change then has effect on the product mutable state at a subsequent product state block boundary, changing the stake possessed by the substate. Which then has effect on the payee mutable substate at its next block boundary when the payee substate links back to the previous product state.

# Safe maths

[Safeint]:https://github.com/dcleblanc/SafeInt {target="_blank"}

We could implement transaction outputs and inputs as a fixed amount of fungible tokens, limited to $2^{64}-1$ tokens, using [Safeint].

That will be future proof for a long time, but not forever.

Indeed, anything that does not use Zksnarks is not future proof for the indefinite future.

Or we could implement decimal floating point with unlimited exponents and mantissa implemented on top of [MPIR].

Or we could go ahead with the canonical representation being unlimited decimal exponent and unlimited mantissa, but the wallet initially only generates, and only can handle, transactions that can be represented by [Safeint], and always converts the mantissa plus decimal exponent to and from a safeint.

If we rely on safeint, and our smallest unit is the microrho, that is room for eighteen trillion rho. We can start actually using the unlimited precision of the exponent and the mantissa in times to come - not urgent, merely architect it into the canonical format.

From the point of view of the end user, this will merely be an upgrade that allows nanorho, picorho, femtorho, attorho, zeptorho, yoctorho, and allows a decimal point in yoctorho quantities. And then we go to a new unit, the jim, with one thousand yottajim equals one yoctorho, a billion yoctojim equals one attorho, a trillion exajim equals one attorho. To go all the way around to two byte exponents, for testing purposes, will need some additional new units after the jim.

(And we should impose a minimum unit size of $10^{-195}$ rho or $10^{-6}$ rho, thereby ensuring that transaction size is bounded while allowing compatibility for future expansion.)
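
A minimal sketch of the fixed width option, assuming amounts are held as whole microrho in an unsigned sixty four bit word. It uses plain C++ overflow checks as a stand-in for [Safeint], not the Safeint API itself, and the names `MicroRho`, `checked_sum` and `inputs_cover_outputs` are hypothetical. The `static_assert` is the "eighteen trillion rho" arithmetic: $(2^{64}-1)/10^{6}$ rounds down to 18,446,744,073,709 rho.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Assumed for illustration: an amount is a whole number of microrho.
using MicroRho = std::uint64_t;
constexpr MicroRho kMicroRhoPerRho = 1'000'000;

// Largest whole number of rho that fits: (2^64 - 1) / 10^6.
constexpr MicroRho kMaxWholeRho =
    std::numeric_limits<MicroRho>::max() / kMicroRhoPerRho;
static_assert(kMaxWholeRho == 18'446'744'073'709ULL, "about eighteen trillion rho");

// Sum amounts, returning nullopt instead of silently wrapping on overflow.
inline std::optional<MicroRho> checked_sum(const std::vector<MicroRho>& amounts) {
    MicroRho total = 0;
    for (MicroRho a : amounts) {
        if (a > std::numeric_limits<MicroRho>::max() - total) return std::nullopt;
        total += a;
    }
    return total;
}

// A transaction is acceptable only if neither side overflows
// and the inputs are at least as large as the outputs.
inline bool inputs_cover_outputs(const std::vector<MicroRho>& inputs,
                                 const std::vector<MicroRho>& outputs) {
    const auto in = checked_sum(inputs);
    const auto out = checked_sum(outputs);
    return in && out && *in >= *out;
}
```

The real library throws or reports failure in much the same places. The point is only that every sum of inputs or outputs is checked before it is trusted.
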
Except in test and development code, any attempt to form a transaction involving quantities with exponents less than $1000^{-2}$ will cause a gracefully handled exception, and in all code any attempt to display or perform calculations on transaction inputs and outputs for which no display units exist will cause an ungracefully handled exception.

In the first release configuration parameters, the lowest allowed exponent will be $1000^{-2}$, corresponding to microrho, and the highest allowed exponent $1000^4$, corresponding to terarho, and machines will be programmed to vote "incapable" and "no" on any proposal to change those parameters. However they will correctly handle transactions beyond those limits provided that when quantities are expressed in the smallest unit of any of the inputs and outputs, the sum of all the inputs and of all the outputs remains below $2^{64}$.

To ensure that all releases are future compatible, the blockchain should have some exajim transactions, and unspent transaction outputs, but the peers should refuse to form any more of them. The documentation will say that arbitrarily small and large new transaction outputs used to be allowed, but are currently not allowed, to reduce the user interface attack surface that needs to be security checked and to limit blockchain bloat, and since there is unlikely to be demand for this, this will probably not be fixed for a very long time.

Or perhaps it would be less work to support humungous transactions from the beginning, subject to some mighty large arbitrary limit to prevent denial of service attack, and eventually implementing native integer handling of normal sized transactions as an optimization, for transactions where all quantities fit within machine sized words, and rescaled intermediate outputs will be less than $64 - \lceil \log_2(\text{number of inputs and outputs}) \rceil$ bits.

Which leads me to digress into how we are going to handle protocol updates:

## handling protocol updates

1. Distribute software capable of handling the update.
1. A proposed protocol update transaction is placed on the blockchain.
1. Peers indicate capability to handle the protocol update. Or ignore it, or indicate that they cannot. If a significant number of peers indicate capability, peers that lack capability push their owners for an update.
1. A proposal to start emitting data that can only be handled by more recent peers is placed on the blockchain.
1. If a significant number of peers vote yes, older peers push more vigorously for an update.
1. If a substantial supermajority votes yes by a date specified in the proposal, then they start emitting data in the new format on a date shortly afterwards. If there is no supermajority by the due date, the proposal is dead.

# [Zlib compression libraries.](./libraries/zlib.html)

Built it, easy to use, easy to build, easy to link to. Useful for large amounts of text, provides, but does not use, CRC32.

[Cap\'n Proto](./libraries/capnproto.html)

[Crypto libraries](./libraries/crypto_library.html)

[Memory Safety](./libraries/memory_safety.html).

[C++ Automatic Memory Management](./libraries/cpp_automatic_memory_management.html)

[C++ Multithreading](./libraries/cpp_multithreading.html)

[Catch testing library](https://github.com/catchorg/Catch2)

[Boost](https://github.com/boostorg/boost)

------------------------------------------------------------------------

## Boost

My experience with Boost is that it is no damned good: They have an over elaborate pile of stuff on top of the underlying abstractions, which pile has high runtime cost, and specializes the underlying stuff in ways that only work with boost example programs and are not easily generalized to do what one actually wishes done.

Their abstractions leak.

[Boost high precision arithmetic `gmp_int`]:https://gmplib.org/

[Boost high precision arithmetic `gmp_int`]

A messy pile built on top of GMP. Its primary benefit is that it makes `gmp` look like `mpir`. Easier to use [MPIR] directly.
The major benefit of boost `gmp` is that it runs on some machines and operating systems that `mpir` does not, and is for the most part source code compatible with `mpir`.

A major difference is that boost `gmp` uses long integers, which on sixty four bit windows are `int32_t`, where `mpir` uses `mpir_ui` and `mpir_si`, which on sixty four bit windows are `uint64_t` and `int64_t`. This is apt to induce no end of major porting issues between operating systems. Boost `gmp` code running on windows is apt to produce radically different results to the same boost `gmp` code running on linux. `long int` is just not portable, and should never be used. This kind of issue is absolutely typical of boost.

In addition to the portability issue, it is also a typical example of boost abstractions denying you access to the full capability of the thing being abstracted away. It is silly to have a thirty two bit interface between sixty four bit hardware and unlimited arithmetic precision software.

------------------------------------------------------------------------

## Database

The blockchain is a massively distributed database built on top of a pile of single machine, single disk, databases communicating over the network. If you want a single machine, single disk, database, go with SQLite, which in WAL mode implements synch interaction on top of hidden asynch.

[SQLite](https://www.Sqlite.org/src/doc/trunk/README.md) has its own way of doing things, which does not play nice with Github.

The efficient and simple way to handle interaction with the network is via callbacks rather than multithreading, but you usually need to handle databases, and under the hood, all databases are multithreaded and blocking. If they implement callbacks, it is usually on top of a multithreaded layer, and the abstraction is apt to leak, apt to result in unexpected blocking on a supposedly asynchronous callback.
SQLite recommends at most one thread that writes to the database, and preferably only one thread that interacts with the database.

## The Invisible Internet Project (I2P)

[Comes](https://geti2p.net/en/) with an I2P webserver, and the full api for streaming stuff. These appear as local ports on your system. They are not tcp ports, but higher level protocols, *and* UDP. (Sort of UDP - obviously you have to create a durable tunnel, and one end is the server, the other the client.)

Inconveniently, written in java.

## Internet Protocol

[QUIC] UDP with flow control and reliability. Intimately married to http (http over QUIC became http/3), https, and google chrome. Cannot call as library, have to analyze code, extract their ideas, and rewrite. And, looking at their code, I think they have written their way into a blind alley. But QUIC is the transport under http/3, the successor to http/2, and there is a gigantic ecosystem supporting it. We really have no alternative but to somehow interface to that ecosystem.

[QUIC]: https://github.com/private-octopus/picoquic

[QUIC] is UDP with flow control, reliability, and TLS encryption, but no DDoS resistance, and total insecurity against CA attack.

## Boost Asynch

Boost implements event oriented multithreading in IO service, but I don’t like it because it fails to interface with Microsoft’s implementation of asynch internet protocol, WSAAsync and WSAEvent. Also because it is brittle and incomprehensible, and their example programs do not easily generalize to anything other than that particular example.

To the extent that you need to interact with a database, you need to process connections from clients in many concurrent threads. Connection handlers run in the thread that called `io_service::run()`. You can create a pool of threads processing connection handlers (and waiting for the database connection to finish) by running `io_service::run()` from multiple threads. See Boost.Asio docs.

## Asynch Database access

MySQL 5.7 supports [X Plugin / X Protocol](https://dev.mysql.com/doc/refman/5.7/en/document-store-setting-up.html), which allows asynchronous query execution and NoSQL. But X devapi was created to support node.js and stuff. The basic idea is that you send text messages to mysql on a certain port, and asynchronously get text messages back, in google protobufs, in php, JavaScript, or sql. No one has bothered to create a C++ wrapper for this, it being primarily designed for php or node.js.

SQLite nominally has synchronous access, and the use of one read/write thread, many read threads is recommended. But under the hood, if you enable WAL mode, access is asynchronous. The nominal synchrony sometimes leaks into the underlying asynchrony.

By default, each `INSERT` is its own transaction, and transactions are excruciatingly slow. WAL mode with `synchronous` set to `NORMAL` fixes this. All writes are writes to the writeahead file, which gets cleaned up later.

The authors of SQLite recommend against multithreading writes, but we do not want the network waiting on the disk, nor the disk waiting on the network. Therefore: one thread with asynch for the network, one purely synchronous thread for the SQLite database, and a few number crunching threads for encryption, decryption, and hashing. This implies shared nothing message passing between threads.

------------------------------------------------------------------------

[Facebook Folly library] provides many tools, with such documentation as exists amounting to “read the f\*\*\*\*\*g header files”. They are reputed to have the highest efficiency queuing for interthread communication, and it is plausible that they do, because facebook views efficiency as critical. Their [queuing header file](https://github.com/facebook/folly/blob/master/folly/MPMCQueue.h) gives us `MPMCQueue`.

[Facebook Folly library]:https://github.com/facebook/folly/blob/master/folly/

On the other hand, boost gives us a lockless interthread queue, which should be very efficient. Assuming each thread is an event handler, rather than pseudo synchronous, we queue up events in the boost queue, and handle all unhandled exceptions from the event handler before getting the next item from the queue. We keep enough threads going that we do not mind threads blocking sometimes. The queue owns objects not currently being handled by a particular thread. Objects are allocated in a particular thread, and freed in a particular thread, which process very likely blocks briefly. Graphic events are passed to the master thread by the wxWidgets event code, but we use our own multithreaded event code to handle everything else. Posting an event to the gui code will block briefly.

I was looking at boost’s queues and lockless mechanisms from the point of view of implementing my own thread pool, but this is kind of stupid, since boost already has a thread pool mechanism written to handle the asynch IO problem.

Thread pools are likely overkill. Node.js does not need them, because its single thread does memory to memory operations.

Boost provides us with an [`io_service` and `boost::thread` group], used to give effect to asynchronous IO with a thread pool. `io_service` was specifically written to perform io, but can be used for any thread pool activity whatsoever. You can “post” tasks to the `io_service`, which will get executed by one of the threads in the pool. Each such task has to be a functor.

[`io_service` and `boost::thread` group]:http://thisthread.blogspot.com/2011/04/multithreading-with-asio.html

Since supposedly nonblocking operations always leak and block, all we can do is try to keep blocking minimal. For example nonblocking database operations always block.
Thus our threadpool needs to be many times larger than our set of hardware threads, because we will always wind up doing blocking operations.

The C++11 multithreading model assumes you want to do some task in parallel, for example you are multiplying two enormous matrices, so you spawn a bunch of threads, then you wait for them all to complete using `join`, or all to deliver their payload using futures and promises. This does not seem all that useful, since the major practical issue is that you want your system to continue to be responsive while it is waiting for some external hardware to reply. When you are dealing with external events, rather than grinding a matrix in parallel, event oriented architecture, rather than futures, promises, and joins, is what you need.

Futures, promises, and joins are useful in the rather artificial case that responding to a remote procedure call requires you to make two or more remote procedure calls, and wait for them to complete, so that you then have the data to respond to the original remote procedure call.

Futures, promises, and joins are useful on a server that launches one thread per client, which is often a sensible way to do things, but does not fit that well to the request response pattern, where you don’t have a great deal of client state hanging around, and you may well have ten thousand clients. If you can be pretty sure you are only going to have a reasonably small number of clients at any one time, and significant interaction between clients, one thread per client may well make a lot of sense.

I was planning to use boost asynch, but upon reading the boost user threads, it sounds fragile: a great pile of complicated unintelligible code that does only one thing, and if you attempt to do something slightly different, everything falls apart, and you have to understand a lot of arcane details, and rewrite them.
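
The one case where futures do fit, a request that needs two sub-requests before it can reply, can be sketched with the standard library alone. The functions `fetch_balance` and `fetch_limit` are hypothetical stand-ins that compute a value locally; a real peer would be waiting on the network there. This is a minimal sketch, not a proposed design.

```cpp
#include <future>

// Hypothetical stand-ins for two remote procedure calls.
// A real implementation would block on the network here.
int fetch_balance(int account) { return account * 10; }
int fetch_limit(int account) { return account + 100; }

// Answer one request by issuing both sub-requests concurrently
// and waiting for both results before replying.
bool can_spend(int account, int amount) {
    std::future<int> balance =
        std::async(std::launch::async, fetch_balance, account);
    std::future<int> limit =
        std::async(std::launch::async, fetch_limit, account);
    // get() blocks until the corresponding call has finished.
    int b = balance.get();
    int l = limit.get();
    return amount <= b && amount <= l;
}
```

Calling `can_spend(7, 50)` launches both stand-in calls and joins on both. Note that the calling thread is blocked in `get()` for the duration, which is exactly why this pattern does not suit a thread that has to stay responsive to other events.
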

[Nanomsg](http://nanomsg.org/) is a socket library that provides a layer on top of everything that makes everything look like sockets, and provides sockets specialized to various communication patterns, avoiding the roll your own problem.

In the zeroMQ thread, people complained that [a simple hello world TCP-IP program tended to be disturbingly large and complex]. Looks to me that [Nanomsg](http://nanomsg.org/) wraps a lot of that complexity.

[a simple hello world TCP-IP program tended to be disturbingly large and complex]:http://250bpm.com/blog

# Sockets

A simple hello world TCP-IP program tends to be disturbingly large and complex, and windows TCP-IP is significantly different from posix TCP-IP.

Waiting on network events is deadly, because they can take arbitrarily large time, but multithreading always bites. People who succeed tend to go with single thread asynch, similar to, [or part of, the window event handling loop].

[or part of, the window event handling loop]:https://www.codeproject.com/Articles/13071/Programming-Windows-TCP-Sockets-in-C-for-the-Begin

Asynch code should take the form of calling a routine that returns immediately, but passing it a lambda callback, which gets executed in the most recently used thread.

Interthread communication bites – you don’t want several threads accessing one object, as synch will slow you down, so if you multithread, better to have a specialist thread for any one object, with lockless queues passing data between threads. One thread for all writes to SQLite, one thread for waiting on select.

Boost Asynch supposedly makes sockets all look alike, but I am frightened of their work guard stuff – looks to me fragile and incomprehensible. Looks to me that no one understands boost asynch work guard, not even the man who wrote it. And they should not be using boost bind, which has been obsolete since lambdas have been available, indicating bitrot.
Because work guard is incomprehensible and subject to change, I will just keep the boost io object busy with a polling timer.

And I am having trouble finding boost asynch documented as a sockets library. Maybe I am just looking in the wrong place.

[A nice clean tutorial depicting strictly synchronous tcp.](https://www.binarytides.com/winsock-socket-programming-tutorial/)

[Libpcap and Win10PCap](https://en.wikipedia.org/wiki/Pcap#Wrapper_libraries_for_libpcap) provide very low level, OS independent, access to packets, OS independent because they are below the OS, rather than above it. [Example code for visual studio.](https://www.csie.nuk.edu.tw/~wuch/course/csc521/lab/ex1-winpcap/)

[Simple sequential procedural socket programming for windows sockets.](https://www.binarytides.com/winsock-socket-programming-tutorial/)

If I program from the base upwards, the bottom most level would be a single thread sitting on a select statement. Whenever the select fired, it would execute a corresponding functor transferring data between userspace and system space.

One thread, and only one thread, is responsible for timer events and transferring network data between userspace and systemspace.

If further work is required in userspace that could take significant time (disk operations, database operations, cryptographic operations), the functor under that thread would stuff another functor into a waitless stack, and a bunch of threads would be waiting for that waitless stack to be signaled, and one of those other threads would execute that functor.

The reason we have a single userspace thread handling the select and transfers between userspace and systemspace is that this is a very fast and very common operation, and we don’t want to have unnecessary thread switches, wherein one thread does something, then immediately afterwards another thread does almost the same thing.

All quickie tasks should be handled sequentially by one thread that works a state machine of functors.
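
The hand-off from the select thread to the worker threads can be sketched as follows. This is a minimal sketch under stated assumptions: `WorkerPool`, `post`, and `sum_on_pool` are hypothetical names, and the queue is protected by an ordinary mutex and condition variable rather than the waitless stack described above, because that is the simplest thing that is obviously correct. A lockless container could be swapped in later without changing the interface.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical worker pool: the select thread calls post() with a functor
// for slow work, and one of the waiting worker threads executes it.
class WorkerPool {
public:
    explicit WorkerPool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    // Drains any queued tasks, then joins every worker.
    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }
    // Returns immediately; the task runs later on some worker thread.
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can proceed
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Usage sketch: add up 1..n with one posted task per number.
int sum_on_pool(int n) {
    std::atomic<int> sum{0};
    {
        WorkerPool pool(4);
        for (int i = 1; i <= n; ++i)
            pool.post([&sum, i] { sum += i; });
    }  // pool destructor waits for all tasks to finish
    return sum.load();
}
```

The select thread never waits on a task, it only pays for a brief lock while pushing. Results would come back the same way, as a functor posted to the select thread's own queue, which keeps each object owned by one thread at a time.
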

The way to do asynch is to wrap sockets in classes that reflect the intended use and function of the socket. Call each instance of such a class a connection. Each connection has its own state machine state and its own **message dispatcher, event handler, event pump, message pump**.

A single thread calls select and poll, and drives all connection instances in all transfers of data between userspace and systemspace. Connections also have access to a thread pool for doing operations (such as file, database and cryptography) that may involve waits.

The hello world program for this system is to create a derived server class that does a trivial transformation on input, and has a path in server name space, and a client class that sends a trivial input, and displays the result.

Microsoft WSAAsync\[Socketprocedure\] is a family of socket procedures designed to operate with, and be driven by, the Window ui system, wherein sockets are linked to windows, and driven by the windows message loop. Could benefit considerably by being wrapped in connection classes.

I am guessing that wxWidgets has a similar system for driving sockets, wherein a wxSocket is plugged in to the wxWidget message loop. On windows, wxWidget wraps `WSAAsyncSelect`, which is the behavior we need.

Microsoft has written the asynch sockets you need, and wxWidgets has wrapped them in an OS independent fashion.

`WSAAsyncSelect`, `WSAEventSelect`, `select`

Using wxSockets commits us to having a single thread managing everything. To get around the power limit inherent in that, have multiple peers under multiple names accessing the same database, and have a temporary and permanent redirect facility – so that if you access `peername`, your connection, and possibly your link, get rewritten to `p2.peername` by peers trying to balance load.

Microsoft tells us:

> receiving, applications use the WSARecv or WSARecvFrom functions to supply buffers into which data is to be received.
If one or more buffers are posted prior to the time when data has been received by the network, that data could be placed in the user’s buffers immediately as it arrives. Thus, it can avoid the copy operation that would otherwise occur at the time the recv or recvfrom function is invoked.

Moral is, we should use the sockets that wrap WSA.

# Tcl

Tcl is a really great language, and I wish it would become the language of my new web, as JavaScript is the language of the existing web.

But it has been semi abandoned for twenty years.

It consists of a string (which is implemented under the hood as a copy on write rope, with some substrings of the rope actually being run time typed C++ types that can be serialized and deserialized to strings) and a name table, one name table per interpreter, and at least one interpreter per thread. The entries in the name table can be strings, C++ functions, or run time typed C++ types, which may or may not be serializable or deserializable, but conceptually, it is all one big string, and the name table is used to find C and C++ functions which interpret the string following the command. Execution consists of executing commands found in the string, which transform it into a new string, which in turn gets transformed into a new string, until it gets transformed into the final result. All code is metacode.

If elements of the string need to be converted to or from a C++ run time type (because the command does not expect that run time type) but cannot be, because there is no serialization or deserialization for that run time type, you get a run time error. But most of the time you get, under the hood, C++ code executing C++ types – it is only conceptually a string being continually transformed into another string. The default integer is unlimited precision, because integers are conceptually arbitrary length strings of digits.
To sandbox third party code, including third party gui code, just restrict the nametable to have no dangerous commands, and to be unable to load c++ modules that could provide dangerous commands.

It is faster to bring up a UI in Tcl than in C. We get, for free, OS independence.

Tcl used to be the best high level language for attaching C programs to, and for testing C programs, or it would be if SWIG actually worked. The various C components of Tcl provide an OS independent layer on top of both Linux and Windows, and it has the best multithread and asynch system.

It is also a metaprogramming language. Every Tcl program is a metaprogram – you always write code that writes code.

The Gui is necessarily implemented as asynch, something like the JavaScript dom in html, but with explicit calls to the event/idle loop. Multithreading is implemented as multiple interpreters, at least one interpreter per thread, sending messages to each other.

# Time

After spending far too much time on this issue, which has sucked in far too many engineers and far too much thought, and generated far too many libraries, I found the solution was C++11 `chrono`:

For short durations, we use the steady time in milliseconds, where each machine has its own epoch, and no two machines have exactly the same milliseconds. For longer durations, we use the system time in seconds, where all machines are expected to be within a couple of seconds of each other. For the human readable system time in seconds to be displayed on a particular machine, we use the ISO style format 2012‑01‑14_15:39:34+10:00 (timezone with 10 hour offset, equivalent to Greenwich time 2012‑01‑14_05:39:34+00:00).

[For long durations, we use signed system time in seconds, for short durations unsigned steady time in milliseconds.](./libraries/rotime.cpp)

Windows and Unix both use time in seconds, but accessed and manipulated in incompatible ways.
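
The two clocks described above can be read with `<chrono>` in a few lines. This is a minimal sketch, not the contents of `rotime.cpp`: the names `steady_ms`, `system_s`, and `format_utc` are hypothetical, only the UTC form of the display format is produced (the local offset is left out), and it assumes the system clock counts from the Unix epoch, which is guaranteed from C++20 and true in practice before that.

```cpp
#include <chrono>
#include <cstdint>
#include <ctime>
#include <string>

// Unsigned steady time in milliseconds. The epoch is machine specific,
// so only differences between two readings on one machine are meaningful.
std::uint64_t steady_ms() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<milliseconds>(
            steady_clock::now().time_since_epoch()).count());
}

// Signed system time in seconds since the system clock's epoch
// (assumed here to be the Unix epoch).
std::int64_t system_s() {
    using namespace std::chrono;
    return duration_cast<seconds>(
        system_clock::now().time_since_epoch()).count();
}

// Format seconds since the Unix epoch as UTC in the style used above,
// with ASCII hyphens, for example 2012-01-14_05:39:34+00:00.
std::string format_utc(std::int64_t secs) {
    std::time_t t = static_cast<std::time_t>(secs);
    std::tm tm_utc{};
    // gmtime returns a pointer to shared static storage, so copy at once.
    if (const std::tm* p = std::gmtime(&t)) tm_utc = *p;
    char buf[32];
    std::strftime(buf, sizeof buf, "%Y-%m-%d_%H:%M:%S+00:00", &tm_utc);
    return buf;
}
```

For instance `format_utc(1326519574)` yields `2012-01-14_05:39:34+00:00`, the Greenwich time in the example above. Timeouts and retry intervals would be computed from differences of `steady_ms()`, and anything stored or sent to another machine would use `system_s()`.
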
Boost has numerous different and not altogether compatible time libraries, all of them overly clever and all of them overly complicated.

wxWidgets has OS independent time based on milliseconds past the epoch, which however fails to compress under Cap\'n Proto.

I was favourably impressed by the approach to time taken in tcp packets, that the time had to be approximately linear, and in milliseconds or larger, but they were entirely relaxed about the two ends of a tcp connection using different clocks with different, and variable, speeds.

It turns out you can go a mighty long way without a global time, and to the extent that you do need a global time, it should be equivalent to that used in email, which magically hides the leap seconds issue.

# UTF‑8 strings

Are supported by the wxWidgets wxString, which provides conversion to and from wide character variants and locale variants. (We don't want locale variants, they are obsolete. The whole world is switching to UTF, but our software and operating environments lag.)

`wxString::ToUTF8()` and `wxString::FromUTF8()` do what you would expect.

On visual studio, you need to set your source files to have a bom, so that Visual Studio knows that they are UTF‑8, and you need to set the compiler environment in Visual Studio to UTF‑8 with `/Zc:__cplusplus /utf-8 %(AdditionalOptions)`

And you need to set the run time environment of the program to UTF‑8 with a manifest.

You will need to place all UTF‑8 string literals and string constants in a resource file, which you will use for translated versions.

If you fail to set the compilation and run time environment to UTF‑8 then, for extra confusion, your debugger and compiler will *look* as if they are handling UTF‑8 characters correctly as single byte characters, while at least wxString alerts you that something bad is happening by run time translating to the null string.
Automatic string conversion in wxWidgets is *not* UTF‑8, and if you have any unusual symbols in your string, you get a run time error and the empty string. So wxString automagic conversions will bite you at runtime, and for double the confusion, your correctly translated UTF‑8 strings will look like errors. Hence the need to make sure that the whole environment from source code to run time execution is consistently UTF‑8, which has to be separately ensured in three separate places.

When wxWidgets is compiled using `#define wxUSE_UNICODE_UTF8 1`, it provides UTF‑8 iterators and caches a character index, so that accessing a character by index near a recently used character is fast. The usual iterators `wx.begin()`, `wx.end()`, const and reverse iterators are available. I assume something bad happens if you advance a reverse iterator after writing to it.

wxWidgets compiled with `#define wxUSE_UNICODE_UTF8 1` is the way of the future, but not the way of the present. Still a work in progress. Does not build under Windows. Windows now provides UTF8 entries to all its system functions, which should make it easy.

wxWidgets provides `wxRegEx` which, because wxWidgets provides index by entity, should just work. Eventually. Maybe the next release.

# [UTF8-CPP](http://utfcpp.sourceforge.net/ "UTF-8 with C++ in a Portable Way")

A powerful library for handling UTF‑8. This somewhat duplicates the facilities provided by wxWidgets with `wxUSE_UNICODE_UTF8==1`.

For most purposes, wxString should suffice, when it actually works with UTF8. Which it does not yet on windows. We shall see. wxWidgets recommends not using wxString except to communicate with wxWidgets, and not using it as a general UTF‑8 system. Which is certainly the current state of play with wxWidgets.

For regex to work correctly, you probably need to do it on wxString's native UTF‑16 (windows) or UTF‑32 (unix), but it supposedly works on `UTF8`, assuming you can successfully compile it, which you cannot.
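
Why a cached character index matters is easy to see from what index by character costs on raw UTF‑8. The sketch below uses only the standard library, not wxWidgets or UTF8-CPP; `utf8_length` and `utf8_offset` are hypothetical helper names, and both assume the input is already valid UTF‑8.

```cpp
#include <cstddef>
#include <string>

// Count code points by counting every byte that is not a continuation
// byte (continuation bytes have the bit pattern 10xxxxxx).
// Assumes valid UTF-8; no validation is attempted.
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

// Byte offset of the code point with the given index, or s.size() if the
// index is past the end. This is a linear scan from the start of the
// string, which is what a cached index avoids.
std::size_t utf8_offset(const std::string& s, std::size_t index) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
            if (seen == index) return i;
            ++seen;
        }
    }
    return s.size();
}
```

For the six byte string `"h\xC3\xA9llo"` (héllo), `utf8_length` gives 5 and `utf8_offset(s, 2)` gives 3, because the second character occupies two bytes. Every lookup walks the string from the beginning, so a loop over indexes is quadratic unless something remembers where the last lookup ended.
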

# Cap\'n Proto

[Designed for a download from github and run cmake install.](https://capnproto.org/install.html) As all software should be.

But for mere serialization of data to a form invariant between machine architectures, and between different compilers on the same machine, it is overkill for our purposes. Too much capability.

# Awesome C++

[Awesome C++] A curated list of awesome C/C++ frameworks, libraries, resources, and shiny things

[Awesome C++]:https://cpp.libhunt.com "A curated list of awesome C/C++ frameworks, libraries, resources, and shiny things" {target="_blank"}

I encountered this when looking at the Wt C++ Web framework, which seems to be mighty cool except I don't think I have any use for a web framework. But [Awesome C++] has a very large pile of things that I might use.

Wt has the interesting design principle that every open web page maps to a windows class, every widget on the web page maps to a windows class, and every row in the sql table maps to a windows class. Cool design.

# Opaque password protocol

[Opaque] is PAKE done right.

[Opaque]:https://blog.cryptographyengineering.com/2018/10/19/lets-talk-about-pake/ "Let’s talk about PAKE" {target="_blank"}

Server stores a per user salt, the user's public key, and the user's secret key encrypted with a secret that only the user ever learns.

Secret is generated by the user from the salt and his password by interaction with the server, without the user learning the salt or the hash of the salt, and without the server learning the password or the hash of the password. User then strengthens the secret generated from salt and password by applying a large work factor to it, and decrypts the private key with it. User and server then proceed with standard public key cryptography.

If the server is evil, or the bad guys seize the server, everything is still encrypted and they have to run, not a hundred million trial passwords against all users, but a hundred million passwords against *each* user.
And the user can make the process of trying a password far more costly and slow than just generating a hash.

Opaque zero knowledge is designed to be as unfriendly as possible to big organizations harvesting data on an industrial scale. The essential design principle of this password protocol is that breaking a hundred million passwords by password guessing should be a hundred million times as costly as breaking one password by password guessing. The protocol is primarily designed to obstruct the NSA's mass harvesting.

It has the enormous advantage that if you have one strong password which you use for many accounts, one evil server cannot easily attack your accounts on other servers. To do that, it has to try every password, which runs into your password strengthening.
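
The scaling claim can be put as a rough cost model. The symbols are assumptions introduced here for illustration, not part of the Opaque specification: $N$ users, $G$ guesses tried per user, $h$ the cost of one plain hash, and $W$ the work factor the user applies when strengthening. If every guess must be strengthened separately for each user's salt, then

$$C_{\text{one user}} = G \cdot W \cdot h, \qquad C_{\text{all users}} = N \cdot G \cdot W \cdot h = N \cdot C_{\text{one user}}$$

whereas against unsalted plain hashes the attacker pays roughly $G \cdot h$ once and compares the results against every user. With $N = 10^8$ users, the per user salt alone multiplies the cost of mass harvesting by $10^8$, and the work factor $W$ multiplies it again.
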