From e645c3b381ef8ac33b74c8e83e2f3a9f38ff473b Mon Sep 17 00:00:00 2001 From: "reaction.la" Date: Mon, 2 Oct 2023 21:40:37 +1000 Subject: [PATCH] Added the proposal for variable length quantities that sort as bitstrings --- docs/manifesto/scalability.md | 78 ++++++++++++-- docs/manifesto/social_networking.md | 45 +++++++- docs/rootDocs/README.md | 1 - docs/setup/contributor_code_of_conduct.md | 66 ++++++++++++ docs/variable-length-quantity.md | 120 ++++++++++++++++++++++ 5 files changed, 297 insertions(+), 13 deletions(-) create mode 100644 docs/variable-length-quantity.md diff --git a/docs/manifesto/scalability.md b/docs/manifesto/scalability.md index ccaffeb..81c51ff 100644 --- a/docs/manifesto/scalability.md +++ b/docs/manifesto/scalability.md @@ -47,7 +47,9 @@ This is in part malicious, the enemy pouring mud into the tech waters. So I need A zk-snark or a zk-stark proves that someone knows something, knows a pile of data that has certain properties, without revealing -that pile of data. Such that he has a preimage of a hash that has certain properties – such as the property of being a valid transaction. +that pile of data. Such that he has a preimage of a certain hash +and that this preimage has certain properties – +such as the property of being a valid transaction. You can prove an arbitrarily large amount of data with an approximately constant sized recursive snark. So you can verify in a quite short time that someone proved @@ -61,9 +63,11 @@ verified a zk-snark that proves that someone has verified … So every time you perform a transaction, you don't have to prove all the previous transactions and generate a zk-snark verifying that you proved it. You have to prove that you verified -the recursive snark that proved the validity of the unspent +the recursive snark that proved the validity of the inputs transaction outputs that you are spending. 
- +Which you do by proving that the inputs are part +of the merkle tree of unspent transaction outputs, +of which the current root of the blockchain is the root hash. ## structs A struct is simply some binary data laid out in well known and agreed format. @@ -94,7 +98,8 @@ So, you have a merkle chain of blocks, each block containing a merkle patricia tree of merkle dags. You have a recursive snark that proves the chain, and everything in it, is valid (no one created tokens out of thin air, each transaction merely moved -the ownership of tokens) And then you prove that the new block is valid, given that rest of the chain was valid, and produce a +the ownership of tokens) And then you prove that the new block +is valid, given that rest of the chain was valid, and produce a recursive snark that the new block, which chains to the previous block, is valid. @@ -123,7 +128,7 @@ vertices, with all the paths through the dag for which we need to generate proofs being logarithmic in the size of the contents of the reliable broadcast channel. -## Merkle patricia tree +## merkle patricia tree A merkle patricia tree is a representation of an sql index as a merkle tree. Each edge of a vertex is associated with a short @@ -135,7 +140,7 @@ that corresponds to path you took through the merkle tree, and to the leading bits of the bitstring that make that key unique in the index. Thus the sql operation of looking up a key in an index corresponds to a walk through the merkle patricia tree -guided by the key. +guided by the key. # Blockchain @@ -340,7 +345,7 @@ height is currently near a hundred thousand, at which height we will be keeping about fifty blocks around, instead of a hundred thousand blocks around. -## Bigger than Visa +# Bigger than Visa And when it gets so big that ordinary people cannot handle the bandwidth and storage, recursive snarks allow sharding the @@ -349,7 +354,7 @@ shard might lie, so every peer would have to evaluate every transaction of every shard. 
But with recursive snarks, a shard can prove it is not lying. -### sidechaining +## sidechaining One method of sharding is sidechaining @@ -378,7 +383,9 @@ channel a snark for the verification of authority, which other people might not be able to check, the party committing a transaction shows it a recursive snark that shows that he verified the verification of authority using the verification method -specified by the output. And what that method was, outsiders do +specified by the output, +without bloating the public broadcast channel by revealing +what method the output specified. What that method was, outsiders do not need to know, reducing the burden of getting everyone playing by the same complex rules. If a contract or a sidechain looks indistinguishable from any other transaction, it not only @@ -387,3 +394,56 @@ on the blockchain have to handle and know how to handle, it also radically simplifies blockchain governance, bringing us closer to the ideal of transactions over distance being governed by mathematics, rather than men. + +# Private ledger + +An enterprise derives its collective existence from its ledger. +The enterprise as a collective entity is a thirteenth century accounting fiction +that fourteenth century businessmen imagined into reality. + +For sovereign corporations, a great deal of corporate governance +can be done by the laws of mathematics, +rather than the laws of men + +The commits form a directed acyclic graph. +Each particular individual who knows the preimage of *some* +of the hashes of outputs and commits committed to the public broadcast channel +knows *some* paths through the directed acyclic graph. + +One of those paths corresponds to his private ledger, for which +eventually we should write database and bookkeeping software. +And that path can prove the ledgers immutable and append only. 
+ +But we would like him to be able to prove to a counterparty +that his ledger is immutable and append only, +and that the information he is showing the counterparty is +consistent with the information he shows every other counterparty. + +To accomplish this, an output needs to be able to own a name +and the associated public key, thus the name identifies a +single path through the merkle dag, and it is possible to prove +the ledger consistent along this named path. + +*And* we want him to be able to prove that he is showing facts +about his ledger that are consistent with everyone else's ledgers. +To do that, we need triple entry accounting, where a journal entry +that lists an obligation of a counterparty as an asset, +or an obligation to a counterparty as a liability, +references a jointly signed row that must exist in both parties' ledgers, +jointly signed by the non fungible name tokens of both parties. + +Double entry accounting shows the books balance. +Triple entry accounting shows that obligations between parties +recorded on their books balance. + +Thus for sovereign corporations, a great deal of corporate governance +can be done by the laws of mathematics, +rather than the laws of men, +which was one of the original cypherpunk goals and slogans +that Satoshi was attempting to fulfil. +We always intended from the very beginning +to destroy postmodern capitalism and restore +the modern capitalism of Charles the Second. + +Such non fungible name tokens would also be necessary +for a reputation system, if we want to eat Amazon's and Ebay's lunch. diff --git a/docs/manifesto/social_networking.md b/docs/manifesto/social_networking.md index 1506be2..9a1b411 100644 --- a/docs/manifesto/social_networking.md +++ b/docs/manifesto/social_networking.md @@ -1091,6 +1091,11 @@ Discussion groups are a necessary first step. ## Cold Start Problem and Metcalf's law. 
+Metcalf's law: The usefulness of a network to any one person +considering joining it depends on how many people have already joined. + +Cold Start Problem: If no one is on the network, no one wants to be on it. + The value of joining a network depends on the number of other people already using that network. So if there are big competing networks, no one wants to join the new network. @@ -1102,7 +1107,25 @@ in the door, then we can get rolling. ### Bitmessage -The lowest hanging fruit of all, (because, unfortunately, there is no money or prospect for money in it) is to replace Bitmessage. +The lowest hanging fruit of all, (because, unfortunately, there is +no money or prospect for money in it) is to replace Bitmessage. +Which is currently abandonware, which has ceased working on most platforms, +has never supported humanly intelligible names, and lacks the +moderation capability to grey list, blacklist, and whitelist names +on discussion groups, but is widely used +because it does not reveal the sender or recipient's IP address. + +In particular, it is widely used for crypto currency payments. +So the next step is to integrate payment mechanisms, which brings us +a little closer to the faint smell of money. + +### Integrating money. + +So we create a currency. But because it will be created on the sovcorp model, +people cannot obtain it by proof of work - they have to buy it. Which +will require gateways between bitcoin lightning and the currency supported +by the network, and gateways between the conversations on the network and +nostr. # Development sequence @@ -1210,7 +1233,9 @@ Increasingly, the value of shares is not physical things, but "goodwill" Domino’s does not sell pizzas, and Apple does not sell computers. It sets standards, and sells the expectation that stuff sold with its brand name will conform to expectations. Domino's does not make the -pizza dough, does not make the pizzas. It sells the brand. +pizza dough, does not make the pizzas. 
It sells the brand. It also +organizes supply and delivery of flour, cheese, and so on and so forth +without itself producing or delivering the materials. The latest, and one of the biggest, jewels in Apple’s tech crown, at the time of writing, is the M1 chip. Which is *designed* by Apple. It is @@ -1220,7 +1245,7 @@ ingredients. But it was not cooked in a Domino’s owned oven, was not cooked by a Domino’s employee, and it is unlikely that any of the ingredients where ever anywhere near Domino’s owned physical property or a Domino’s direct employee. Domino's does -not cook pizzas, and Apple does not build computers it. It designs +not cook pizzas, and Apple does not build computers. It designs computers and set standards. Most businesses are in practice distributed over a network, and @@ -1232,6 +1257,20 @@ suppliers, and customer and supplier expectations of employee roles enforced by the corporation. *This*, we can move to the blockchain and protect from governments. +An enterprise is not a physical thing. It is not buildings +and machines and all that. It is a thirteenth century accounting fiction +that fourteenth century businessmen imagined into reality, in that +the group of people the accounting fiction represented acted as +one person in reality. In the seventeenth century +Charles the Second created modern capitalism, by merging this accounting +fiction with the Roman legal fiction of the corporation, +to create the publicly traded joint stock limited liability for profit corporation. +We now, however have postmodern capitalism, in that a multitude of "stakeholders" +are subverting its corporateness, because the legal, accounting, +and human resources have power derived from the state, +rather than the corporation, pulling the corporation apart. +With blockchains, we can return from postmodern capitalism to modern capitalism. + A huge amount of what matters, a major proportion of the value represented by shares, is in the social network. 
Which is increasingly, like Apple and Google, scarcely attached to anything diff --git a/docs/rootDocs/README.md b/docs/rootDocs/README.md index ee258a0..05dbd67 100644 --- a/docs/rootDocs/README.md +++ b/docs/rootDocs/README.md @@ -127,7 +127,6 @@ of the pull request process is getting the puller to trust your public key, and you will not be able to pull updates unless you tell `gpg` to trust the key that is in the root directory as `public_key.gpg`. - Never use any email address on a gpg key related to this project unless it is only used for project purposes, or a fake email, or the email of an enemy. We don't want Gpg used to link different email diff --git a/docs/setup/contributor_code_of_conduct.md b/docs/setup/contributor_code_of_conduct.md index 6ffacf4..de40a9c 100644 --- a/docs/setup/contributor_code_of_conduct.md +++ b/docs/setup/contributor_code_of_conduct.md @@ -182,3 +182,69 @@ Unless, when a female contributor unnecessarily and irrelevantly informs everyone she is female, she is told that she is seeking special treatment on account of sex, and is not going to get it, no organization or group that attempts to develop software is going to survive. Linux is a dead man walking. + +# Style + +Contributions should be gpg signed. + +Never use any email address on a gpg key related to this project +unless it is only used for project purposes, or a fake email, or the +email of an enemy. We don't want Gpg used to link different email +addresses as owned by the same entity, and we don't want email +addresses used to link people to the project, because those +identities would then come under state and quasi state pressure. 
+ +You can add the recommended repository configuration defaults to your local repository configuration: + +```bash +git config --local include.path ../.gitconfig +``` + +This will implement signed commits and will insist that you have `gpg` on your path, and that you have configured a signing key in your local config, and will refuse to pull updates that are signed by a gpg key that you have not locally trusted. + +This may be inconvenient if you do not have `gpg` installed and set up. + +It also means that subsequent pulls and merges will require you to have `gpg` locally trust the key `public_key.gpg`, and if you submit a pull request, the puller will need to locally trust your `gpg` public key. + +`.gitconfig` adds several git aliases: + +1. `git utcmt` to do a commit without recording your timezone in the git history +1. `git lg` to display the gpg trust information for the last few commits. + For this to be useful you need to import the repository public key + `public_key.gpg` into gpg, and locally sign that key. +1. `git graph` to graph the commit tree with signing status +1. `git alias` to display the git aliases. + +```bash +# To verify that the signature on future pulls is +# unchanged. +gpg --import public_key.gpg +gpg --lsign 096EAE16FB8D62E75D243199BC4482E49673711C +``` + +We ignore the Gpg Web of Trust model and instead use the Zooko +identity model. + +We use Gpg signatures to verify that remote repository code +is coming from an unchanging entity, not for Gpg Web of Trust. Web +of Trust is too complicated and too user hostile to be workable or safe. + +Never --sign any Gpg key related to this project. --lsign it. + +`gitconfig` disallows merges unless you have told `gpg` to trust the +public key corresponding to the private key that signed the tip of +the root. 
So part of the pull request process is getting the puller to +trust your public key, and you will not be able to pull updates +unless you tell `gpg` to trust the key that is in the root directory as +`public_key.gpg`. + +Never check any Gpg key related to this project against a public +gpg key repository. It should not be there. + +`.gitconfig` also imposes a whitespace style.
diff --git a/docs/variable-length-quantity.md b/docs/variable-length-quantity.md new file mode 100644 index 0000000..b03b207 --- /dev/null +++ b/docs/variable-length-quantity.md @@ -0,0 +1,120 @@ +--- +title: Variable Length Quantity +--- + +I originally implemented variable length quantities following the standard. + +And then I realized that an sql index represented as a merkle-patricia tree inherently sorts in byte string order. +Which is fine if we represent integers as fixed length integers in big endian format, +but does not correctly sort variable length quantities if we follow the standard. + +So: To represent variable signed numbers in byte string sortable order: + +# For positive signed integers + +If the leading bits are $10$, it represents a number in the range\ +$0$ ... $2^6-1$ So only one byte. + +If the leading bits are $110$, it represents a number in the range\ +$2^6$ ... $2^6+2^{13}-1$ So two bytes. + +if the leading bits are $1110$, it represents a number in the range\ +$2^6+2^{13}$ ... $2^6+2^{13}+2^{20}-1$ So three bytes long. + +if the leading bits are $1111\,0$, it represents a number in the range\ +$2^6+2^{13}+2^{20}$ ... $2^6+2^{13}+2^{20}+2^{27}-1$ So four bytes long +(five bits of header, twenty seven bits to represent $2^{27}$ different +values as the trailing twenty seven bits of an ordinary thirty two bit +positive integer in big endian format). + +if the leading bits are $1111\,10$, it represents a number in the range\ +$2^6+2^{13}+2^{20}+2^{27}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}-1$ +So five bytes long. + +if the leading bits are $1111\,110$, it represents a number in the range\ +$2^6+2^{13}+2^{20}+2^{27}+2^{34}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}-1$ +So six bytes long. + +if the leading bits are $1111\,1110$, it represents a number in the range\ +$2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}-1$ +So seven bytes long. + +if the leading bits are $1111\,1111\,0$, it represents a number in the range\ +$2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}-1$ +So eight bytes long. + +if the leading bits are $1111\,1111\,10$, it represents a number in the range\ +$2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}+2^{62}-1$ +So nine bytes long (ten bits of header, sixty two bits to represent $2^{62}$ +different values as the trailing sixty two bits of an ordinary sixty four bit positive integer in big endian format). + +if the leading bits are $1111\,1111\,110$, it represents a number in the range\ +$2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}+2^{62}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}+2^{62}+2^{69}-1$ +So ten bytes long. + +And so on and so forth in the same pattern for positive signed numbers of unlimited size. + +The reason for these complicated offsets is to ensure that the byte strings are strictly sequential. + +# For negative signed integers + +If the leading bits are $01$, it represents a number in the range\ +$-2^6$ ... 
$-1$ So only one byte (two bits of header, +six bits to represent $2^6$ different values as the +trailing six bits of an ordinary eight bit negative integer). + +If the leading bits are $001$, it represents a number in the range\ +$-2^6-2^{13}$ ... $-2^6-1$ So two bytes (three bits of header, +thirteen bits to represent $2^{13}$ different values as the trailing +thirteen bits of an ordinary sixteen bit negative integer in big endian format). + +if the leading bits are $0001$, it represents a number in the range\ +$-2^6-2^{13}-2^{20}$ ... $-2^6-2^{13}-1$ So three bytes long. + +if the leading bits are $0000\,1$, it represents a number in the range\ +$-2^6-2^{13}-2^{20}-2^{27}$ ... $-2^6-2^{13}-2^{20}-1$ +So four bytes long (five bits of header, twenty seven bits to represent +$2^{27}$ different values as the trailing twenty seven bits of +an ordinary thirty two bit negative integer in big endian format). + +if the leading bits are $0000\,01$, it represents a number in the range\ +$-2^6-2^{13}-2^{20}-2^{27}-2^{34}$ ... $-2^6-2^{13}-2^{20}-2^{27}-1$ +So five bytes long. + +if the leading bits are $0000\,001$, it represents a number in the range\ +$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-1$ +So six bytes long. + +if the leading bits are $0000\,0001$, it represents a number in the range\ +$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-1$ +So seven bytes long. + +if the leading bits are $0000\,0000\,1$, it represents a number in the range\ +$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-1$ +So eight bytes long. + +if the leading bits are $0000\,0000\,01$, it represents a number in the range\ +$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}-2^{62}$ ... 
$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}-1$ +So nine bytes long (ten bits of header, sixty two bits to represent $2^{62}$ +different values as the trailing sixty two bits of an ordinary sixty four bit +negative integer in big endian format). + +if the leading bits are $0000\,0000\,001$, it represents a number in the range\ +$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}-2^{62}-2^{69}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}-2^{62}-1$ +So ten bytes long. + +And so on and so forth in the same pattern for negative signed numbers of unlimited size. + +# bitstrings + +Bitstrings in a merkle patricia tree representing an sql index +are typically very short, so should be represented by a +variable length quantity, except for the leaf edge, +which is fixed size and large, so should not be +represented by variable length quantity. + +We use the integer zero to represent this special case, +the integer one to represent the zero length bit string, +integers two and three to represent the one bit bit strings, +integers four to seven to represent the two bit bit strings, +and so on and so forth. + +In other words, we represent it as the integer obtained +by prepending a leading one bit to the bit string.
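
The mapping from bit strings to integers, and the byte string sortable encoding of signed integers described above, can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function names are hypothetical, and it assumes that an $n$ byte quantity carries a header of $n+1$ bits (for positive numbers $n$ one bits then a zero bit, for negative numbers $n$ zero bits then a one bit) followed by $7n-1$ payload bits, which is the pattern the negative table spells out for one, two, four and nine bytes.

```python
def bitstring_to_int(bits: str) -> int:
    """Map a bit string such as "01" to the integer obtained by
    prepending a leading one bit ("" -> 1, "0" -> 2, "1" -> 3, ...)."""
    if any(c not in "01" for c in bits):
        raise ValueError("bits must contain only '0' and '1'")
    return int("1" + bits, 2)


def int_to_bitstring(value: int) -> str:
    """Inverse of bitstring_to_int. Zero is reserved for the leaf edge."""
    if value < 1:
        raise ValueError("zero marks the fixed size leaf edge, not a bit string")
    return bin(value)[3:]  # strip the '0b' prefix and the prepended one bit


def encode_sortable(value: int) -> bytes:
    """Encode a signed integer so that comparing the resulting byte strings
    lexicographically gives the same order as comparing the integers."""
    floor = 0  # how many magnitudes are already covered by shorter encodings
    n = 1      # candidate length in bytes
    while True:
        span = 1 << (7 * n - 1)  # number of values an n byte payload can hold
        if 0 <= value < floor + span:
            header = (1 << (n + 1)) - 2      # n one bits followed by a zero bit
            payload = value - floor
            break
        if -floor - span <= value < 0:
            header = 1                       # n zero bits followed by a one bit
            payload = value + floor + span   # most negative value maps to zero
            break
        floor += span
        n += 1
    return ((header << (7 * n - 1)) | payload).to_bytes(n, "big")
```

Under these assumptions zero encodes as the single byte `0x80` and minus one as `0x7f`, so the two sit next to each other in byte string order, and a bit string on an edge could be stored as `encode_sortable(bitstring_to_int(bits))`. Subtracting the running offset (`floor`) is what keeps every value at exactly one encoding, so that the byte strings are strictly sequential.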