Added the proposal for variable length quantities

that sort as bitstrings
reaction.la 2023-10-02 21:40:37 +10:00
parent 4b2ac88e44
commit e645c3b381
No known key found for this signature in database
GPG Key ID: 99914792148C8388
5 changed files with 297 additions and 13 deletions

View File

@ -47,7 +47,9 @@ This is in part malicious, the enemy pouring mud into the tech waters. So I need
A zk-snark or a zk-stark proves that someone knows something,
knows a pile of data that has certain properties, without revealing
that pile of data. Such that he has a preimage of a hash that has certain properties such as the property of being a valid transaction.
that pile of data. Such that he has a preimage of a certain hash
and that this preimage has certain properties
such as the property of being a valid transaction.
You can prove an arbitrarily large amount of data
with an approximately constant sized recursive snark.
So you can verify in a quite short time that someone proved
@ -61,9 +63,11 @@ verified a zk-snark that proves that someone has verified …
So every time you perform a transaction, you don't have to
prove all the previous transactions and generate a zk-snark
verifying that you proved it. You have to prove that you verified
the recursive snark that proved the validity of the unspent
the recursive snark that proved the validity of the inputs
transaction outputs that you are spending.
Which you do by proving that the inputs are part
of the merkle tree of unspent transaction outputs,
of which the current root of the blockchain is the root hash.
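The statement being proved here, that an input is part of a merkle tree with a known root hash, can be sketched outside of any snark as an ordinary inclusion proof. This is only an illustration: the use of SHA-256, the binary tree shape, the padding of odd levels by repeating the last hash, and the names `merkle_root_and_proof` and `verify` are all assumptions, not the format this project uses, and a real system would prove this check inside the recursive snark rather than reveal the path.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index]; leaves must be non-empty.

    The proof is a list of (sibling_hash, sibling_is_left) pairs,
    one per level, from the leaf up to the root.
    """
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumed padding rule for odd levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from the leaf to the root and compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

outputs = [b"utxo-%d" % i for i in range(5)]  # hypothetical unspent outputs
root, proof = merkle_root_and_proof(outputs, 4)
print(verify(outputs[4], proof, root))
```

The proof length grows with the logarithm of the number of outputs, which is what keeps the per transaction work small.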
## structs
A struct is simply some binary data laid out in well known and agreed format.
@ -94,7 +98,8 @@ So, you have a merkle chain of blocks, each block containing a
merkle patricia tree of merkle dags. You have a recursive snark
that proves the chain, and everything in it, is valid (no one
created tokens out of thin air, each transaction merely moved
the ownership of tokens) And then you prove that the new block is valid, given that rest of the chain was valid, and produce a
the ownership of tokens). And then you prove that the new block
is valid, given that the rest of the chain was valid, and produce a
recursive snark that the new block, which chains to the previous
block, is valid.
@ -123,7 +128,7 @@ vertices, with all the paths through the dag for which we need to
generate proofs being logarithmic in the size of the contents of
the reliable broadcast channel.
## Merkle patricia tree
## merkle patricia tree
A merkle patricia tree is a representation of an sql index as a
merkle tree. Each edge of a vertex is associated with a short
@ -135,7 +140,7 @@ that corresponds to path you took through the merkle tree, and to
the leading bits of the bitstring that make that key unique in the
index. Thus the sql operation of looking up a key in an index
corresponds to a walk through the merkle patricia tree
guided by the key.
guided by the key.
# Blockchain
@ -340,7 +345,7 @@ height is currently near a hundred thousand, at which height we will
be keeping about fifty blocks around, instead of a hundred thousand
blocks around.
## Bigger than Visa
# Bigger than Visa
And when it gets so big that ordinary people cannot handle the
bandwidth and storage, recursive snarks allow sharding the
@ -349,7 +354,7 @@ shard might lie, so every peer would have to evaluate every
transaction of every shard. But with recursive snarks, a shard can
prove it is not lying.
### sidechaining
## sidechaining
One method of sharding is sidechaining
@ -378,7 +383,9 @@ channel a snark for the verification of authority,
which other people might not be able to check, the party committing a
transaction shows it a recursive snark that shows that he verified
the verification of authority using the verification method
specified by the output. And what that method was, outsiders do
specified by the output,
without bloating the public broadcast channel by revealing
what method the output specified. What that method was, outsiders do
not need to know, reducing the burden of getting everyone
playing by the same complex rules. If a contract or a sidechain
looks indistinguishable from any other transaction, it not only
@ -387,3 +394,56 @@ on the blockchain have to handle and know how to handle, it also
radically simplifies blockchain governance, bringing us closer to
the ideal of transactions over distance being governed by
mathematics, rather than men.
# Private ledger
An enterprise derives its collective existence from its ledger.
The enterprise as a collective entity is a thirteenth century accounting fiction
that fourteenth century businessmen imagined into reality.
For sovereign corporations, a great deal of corporate governance
can be done by the laws of mathematics,
rather than the laws of men.
The commits form a directed acyclic graph.
Each particular individual who knows the preimage of *some*
of the hashes of outputs and commits committed to the public broadcast channel
knows *some* paths through the directed acyclic graph.
One of those paths corresponds to his private ledger, for which
eventually we should write database and bookkeeping software.
And that path can prove the ledger immutable and append only.
But we would like him to be able to prove to a counterparty
that his ledger is immutable and append only,
and that the information he is showing the counterparty is
consistent with the information he shows every other counterparty.
To accomplish this, an output needs to be able to own a name
and the associated public key, thus the name identifies a
single path through the merkle dag, and it is possible to prove
the ledger consistent along this named path.
*And* we want him to be able to prove that he is showing facts
about his ledger that are consistent with everyone else's ledgers.
To do that, we use triple entry accounting, where a journal entry
that lists an obligation of a counterparty as an asset,
or an obligation to a counterparty as a liability,
references a jointly signed row that must exist in both parties' ledgers,
jointly signed by the non fungible name tokens of both parties.
Double entry accounting shows the books balance.
Triple entry accounting shows that obligations between parties
recorded on their books balance.
Thus for sovereign corporations, a great deal of corporate governance
can be done by the laws of mathematics,
rather than the laws of men,
which was one of the original cypherpunk goals and slogans
that Satoshi was attempting to fulfil.
We always intended from the very beginning
to destroy postmodern capitalism and restore
the modern capitalism of Charles the Second.
Such non fungible name tokens would also be necessary
for a reputation system, if we want to eat Amazon's and Ebay's lunch.

View File

@ -1091,6 +1091,11 @@ Discussion groups are a necessary first step.
## Cold Start Problem and Metcalfe's law.
Metcalfe's law: The usefulness of a network to any one person
considering joining it depends on how many people have already joined.
Cold Start Problem: If no one is on the network, no one wants to be on it.
The value of joining a network depends on the number of other people already using that network.
So if there are big competing networks, no one wants to join the new network.
@ -1102,7 +1107,25 @@ in the door, then we can get rolling.
### Bitmessage
The lowest hanging fruit of all, (because, unfortunately, there is no money or prospect for money in it) is to replace Bitmessage.
The lowest hanging fruit of all, (because, unfortunately, there is
no money or prospect for money in it) is to replace Bitmessage.
Which is currently abandonware, which has ceased working on most platforms,
has never supported humanly intelligible names, and lacks the
moderation capability to grey list, blacklist, and whitelist names
on discussion groups, but is widely used
because it does not reveal the sender or recipient's IP address.
In particular, it is widely used for crypto currency payments.
So next step is to integrate payment mechanisms, which brings us
a little closer to the faint smell of money.
### Integrating money.
So we create a currency. But because it will be created on the sovcorp model
people cannot obtain it by proof of work - they have to buy it. Which
will require gateways between bitcoin lightning and the currency supported
by the network, and gateways between the conversations on the network and
nostr.
# Development sequence
@ -1210,7 +1233,9 @@ Increasingly, the value of shares is not physical things, but "goodwill"
Dominos does not sell pizzas, and Apple does not sell computers. It
sets standards, and sells the expectation that stuff sold with its
brand name will conform to expectations. Domino's does not make the
pizza dough, does not make the pizzas. It sells the brand.
pizza dough, does not make the pizzas. It sells the brand. It also
organizes supply and delivery of flour, cheese, and so on and so forth
without itself producing or delivering the materials.
The latest, and one of the biggest, jewels in Apple's tech crown, at
the time of writing, is the M1 chip. Which is *designed* by Apple. It is
@ -1220,7 +1245,7 @@ ingredients. But it was not cooked in a Dominos owned oven, was
not cooked by a Dominos employee, and it is unlikely that any of
the ingredients were ever anywhere near Dominos owned
physical property or a Dominos direct employee. Domino's does
not cook pizzas, and Apple does not build computers it. It designs
not cook pizzas, and Apple does not build computers. It designs
computers and sets standards.
Most businesses are in practice distributed over a network, and
@ -1232,6 +1257,20 @@ suppliers, and customer and supplier expectations of employee
roles enforced by the corporation. *This*, we can move to the
blockchain and protect from governments.
An enterprise is not a physical thing. It is not buildings
and machines and all that. It is a thirteenth century accounting fiction
that fourteenth century businessmen imagined into reality, in that
the group of people the accounting fiction represented acted as
one person in reality. In the seventeenth century
Charles the Second created modern capitalism, by merging this accounting
fiction with the Roman legal fiction of the corporation,
to create the publicly traded joint stock limited liability for profit corporation.
We now, however, have postmodern capitalism, in that a multitude of "stakeholders"
are subverting its corporateness, because the legal, accounting,
and human resources departments have power derived from the state,
rather than the corporation, pulling the corporation apart.
With blockchains, we can return from postmodern capitalism to modern capitalism.
A huge amount of what matters, a major proportion of the value
represented by shares, is in the social network. Which is
increasingly, like Apple and Google, scarcely attached to anything

View File

@ -127,7 +127,6 @@ of the pull request process is getting the puller to trust your public key, and
you will not be able to pull updates unless you tell `gpg` to trust the key that
is in the root directory as `public_key.gpg`.
Never use any email address on a gpg key related to this project
unless it is only used for project purposes, or a fake email, or the
email of an enemy. We don't want Gpg used to link different email

View File

@ -182,3 +182,69 @@ Unless, when a female contributor unnecessarily and irrelevantly informs
everyone she is female, she is told that she is seeking special treatment on
account of sex, and is not going to get it, no organization or group that
attempts to develop software is going to survive. Linux is a dead man walking.
# Style
Contributions should be gpg signed.
Never use any email address on a gpg key related to this project
unless it is only used for project purposes, or a fake email, or the
email of an enemy. We don't want Gpg used to link different email
addresses as owned by the same entity, and we don't want email
addresses used to link people to the project, because those
identities would then come under state and quasi state pressure.
If you add the recommended repository configuration defaults to your local repository configuration:
```bash
git config --local include.path ../.gitconfig
```
This will implement signed commits and will insist that you have `gpg` on your path, and that you have configured a signing key in your local config, and will refuse to pull updates that are signed by a gpg key that you have not locally trusted.
This may be inconvenient if you do not have `gpg` installed and set up.
It also means that subsequent pulls and merges will require you to have `gpg` locally trust the key `public_key.gpg`, and if you submit a pull request, the puller will need to locally trust your `gpg` public key.
`.gitconfig` adds several git aliases:
1. `git utcmt` to do a commit without recording your timezone in the git history
1. `git lg` to display the gpg trust information for the last few commits.
For this to be useful you need to import the repository public key
`public_key.gpg` into gpg, and locally sign that key.
1. `git graph` to graph the commit tree with signing status
1. `git alias` to display the git aliases.
```bash
# To verify that the signature on future pulls is
# unchanged.
gpg --import public_key.gpg
gpg --lsign 096EAE16FB8D62E75D243199BC4482E49673711C
```
We ignore the Gpg Web of Trust model and instead use the Zooko
identity model.
We use Gpg signatures to verify that remote repository code
is coming from an unchanging entity, not for Gpg Web of Trust. Web
of Trust is too complicated and too user hostile to be workable or safe.
Never --sign any Gpg key related to this project. --lsign it.
`.gitconfig` disallows merges unless you have told `gpg` to trust the
public key corresponding to the private key that signed the tip of
the root. So part of the pull request process is getting the puller to
trust your public key, and you will not be able to pull updates
unless you tell `gpg` to trust the key that is in the root directory as
`public_key.gpg`.
Never check any Gpg key related to this project against a public
gpg key repository. It should not be there.
`.gitconfig` also imposes a whitespace style.

View File

@ -0,0 +1,120 @@
---
title: Variable Length Quantity
---
I originally implemented variable length quantities following the standard.
And then I realized that an sql index represented as a merkle-patricia tree inherently sorts in byte string order.
Which is fine if we represent integers as fixed length integers in big endian format,
but does not correctly sort variable length quantities if we follow the standard:
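A minimal sketch of the problem, assuming "the standard" means the usual MIDI style variable length quantity: big endian groups of seven bits, with the high bit set on every byte except the last. The function name `vlq_standard` is hypothetical.

```python
def vlq_standard(n: int) -> bytes:
    """MIDI style VLQ for a non-negative integer (assumed to be 'the standard')."""
    if n < 0:
        raise ValueError("this encoding is for non-negative integers only")
    groups = [n & 0x7F]                   # last byte: continuation bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        n >>= 7
    return bytes(reversed(groups))

small, large = 16383, 16384
print(vlq_standard(small).hex())  # ff7f
print(vlq_standard(large).hex())  # 818000
# Byte string comparison puts the larger number first.
print(vlq_standard(small) < vlq_standard(large))  # False
```

The shorter encoding starts with a larger byte than the longer one, so a byte string index would place 16384 before 16383.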
So: To represent variable length signed numbers in byte string sortable order:
# For positive signed integers
If the leading bits are $10$, it represents a number in the range\
$0$ ... $2^6-1$ So only one byte
If the leading bits are $110$, it represents a number in the range\
$2^6$ ... $2^6+2^{13}-1$ So two bytes
if the leading bits are $1110$, it represents a number in the range\
$2^6+2^{13}$ ... $2^6+2^{13}+2^{20}-1$ So three bytes long.
if the leading bits are $1111\,0$, it represents a number in the range\
$2^6+2^{13}+2^{20}$ ... $2^6+2^{13}+2^{20}+2^{27}-1$ So four bytes long
(five bits of header, twenty seven bits to represent $2^{27}$ different
values as the trailing twenty seven bits of an ordinary thirty two bit
positive integer in big endian format).
if the leading bits are $1111\,10$, it represents a number in the range\
$2^6+2^{13}+2^{20}+2^{27}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}-1$
So five bytes long.
if the leading bits are $1111\,110$, it represents a number in the range\
$2^6+2^{13}+2^{20}+2^{27}+2^{34}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}-1$
So six bytes long.
if the leading bits are $1111\,1110$, it represents a number in the range\
$2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}-1$
So seven bytes long.
if the leading bits are $1111\,1111\,0$, it represents a number in the range\
$2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}-1$
So eight bytes long.
if the leading bits are $1111\,1111\,10$, it represents a number in the range\
$2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}+2^{62}-1$
So nine bytes long (ten bits of header, sixty two bits to represent $2^{62}$
different values as the trailing sixty two bits of an ordinary sixty four bit positive integer in big endian format).
if the leading bits are $1111\,1111\,110$, it represents a number in the range\
$2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}+2^{62}$ ... $2^6+2^{13}+2^{20}+2^{27}+2^{34}+2^{41}+2^{48}+2^{55}+2^{62}+2^{69}-1$
So ten bytes long.
And so on and so forth in the same pattern for positive signed numbers of unlimited size.
The reason for these complicated offsets is to ensure that the byte strings are strictly sequential.
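The scheme for non-negative integers might be sketched as follows. This assumes the pattern set by the one and two byte cases continues unchanged: an encoding of $n$ bytes has a header of $n$ one bits followed by a zero bit, leaving $7n-1$ data bits that hold the value minus the offset for that length. The names `encode_nonneg` and `decode_nonneg` are hypothetical, and the decoder takes the length from the byte string it is handed rather than scanning a stream.

```python
def encode_nonneg(value: int) -> bytes:
    """Encode a non-negative integer so that byte string order matches numeric order."""
    if value < 0:
        raise ValueError("non-negative integers only")
    nbytes, offset = 1, 0
    # Find the shortest length whose range contains the value.
    while value - offset >= 1 << (7 * nbytes - 1):
        offset += 1 << (7 * nbytes - 1)
        nbytes += 1
    data_bits = 7 * nbytes - 1
    header = ((1 << nbytes) - 1) << 1  # nbytes one bits, then a zero bit
    word = (header << data_bits) | (value - offset)
    return word.to_bytes(nbytes, "big")

def decode_nonneg(data: bytes) -> int:
    """Inverse of encode_nonneg for a byte string of exactly the encoded length."""
    nbytes = len(data)
    data_bits = 7 * nbytes - 1
    word = int.from_bytes(data, "big")
    if word >> data_bits != ((1 << nbytes) - 1) << 1:
        raise ValueError("header does not match length")
    offset = sum(1 << (7 * k - 1) for k in range(1, nbytes))
    return offset + (word & ((1 << data_bits) - 1))

print(encode_nonneg(0).hex())   # 80
print(encode_nonneg(63).hex())  # bf
print(encode_nonneg(64).hex())  # c000
```

Because each length starts exactly where the previous one ends, no value has two encodings, and the last encoding of one length sorts immediately before the first encoding of the next.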
# For negative signed integers
If the leading bits are $01$, it represents a number in the range\
$-2^6$ ... $-1$ So only one byte (two bits of header,
six bits to represent $2^6$ different values as the
trailing six bits of an ordinary eight bit negative integer).
If the leading bits are $001$, it represents a number in the range\
$-2^6-2^{13}$ ... $-2^6-1$ So two bytes (three bits of header,
thirteen bits to represent $2^{13}$ different values as the trailing
thirteen bits of an ordinary sixteen bit negative integer in big endian format).
if the leading bits are $0001$, it represents a number in the range\
$-2^6-2^{13}-2^{20}$ ... $-2^6-2^{13}-1$ So three bytes long.
if the leading bits are $0000\,1$, it represents a number in the range\
$-2^6-2^{13}-2^{20}-2^{27}$ ... $-2^6-2^{13}-2^{20}-1$
So four bytes long (five bits of header, twenty seven bits to represent
$2^{27}$ different values as the trailing twenty seven bits of
an ordinary thirty two bit negative integer in big endian format).
if the leading bits are $0000\,01$, it represents a number in the range\
$-2^6-2^{13}-2^{20}-2^{27}-2^{34}$ ... $-2^6-2^{13}-2^{20}-2^{27}-1$
So five bytes long.
if the leading bits are $0000\,001$, it represents a number in the range\
$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-1$
So six bytes long.
if the leading bits are $0000\,0001$, it represents a number in the range\
$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-1$
So seven bytes long.
if the leading bits are $0000\,0000\,1$, it represents a number in the range\
$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-1$
So eight bytes long.
if the leading bits are $0000\,0000\,01$, it represents a number in the range\
$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}-2^{62}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}-1$
So nine bytes long (ten bits of header, sixty two bits to represent $2^{62}$
different values as the trailing sixty two bits of an ordinary sixty four bit
negative integer in big endian format).
if the leading bits are $0000\,0000\,001$, it represents a number in the range\
$-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}-2^{62}-2^{69}$ ... $-2^6-2^{13}-2^{20}-2^{27}-2^{34}-2^{41}-2^{48}-2^{55}-2^{62}-1$
So ten bytes long.
And so on and so forth in the same pattern for negative signed numbers of unlimited size.
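A sketch of the negative case, under the same caveats: the name `encode_negative` is hypothetical, and it assumes that an encoding of $n$ bytes has a header of $n$ zero bits followed by a one bit, with the $7n-1$ data bits holding the trailing bits of the two's complement value after the offset for that length has been added back.

```python
def encode_negative(value: int) -> bytes:
    """Encode a negative integer so that byte string order matches numeric order."""
    if value >= 0:
        raise ValueError("negative integers only")
    magnitude = -value - 1  # maps -1 to 0, -64 to 63, and so on
    nbytes, offset = 1, 0
    # Find the shortest length whose range contains the value.
    while magnitude - offset >= 1 << (7 * nbytes - 1):
        offset += 1 << (7 * nbytes - 1)
        nbytes += 1
    data_bits = 7 * nbytes - 1
    # Python's & on a negative int acts on its two's complement form,
    # so this keeps the trailing data_bits bits of (value + offset).
    data = (value + offset) & ((1 << data_bits) - 1)
    word = (1 << data_bits) | data  # header: nbytes zero bits, then a one bit
    return word.to_bytes(nbytes, "big")

print(encode_negative(-1).hex())     # 7f
print(encode_negative(-64).hex())    # 40
print(encode_negative(-65).hex())    # 3fff
print(encode_negative(-8257).hex())  # 1fffff
```

Longer encodings have more leading zero bits and so sort before shorter ones, which is the right way round for negative numbers. The encoding of $-1$ is the byte `7f`, which sorts immediately before `80`, the byte that the positive scheme assigns to zero.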
# bitstrings
Bitstrings in a merkle patricia tree representing an sql index
are typically very short, so should be represented by a
variable length quantity, except for the leaf edge,
which is fixed size and large, so should not be
represented by variable length quantity.
We use the integer zero to represent this special case,
the integer one to represent the zero length bit string,
integers two and three to represent the one bit bitstring,
integers four to seven to represent the two bit bit string,
and so on and so forth.
In other words, we represent it as the integer obtained
by prepending a leading one bit to the bit string.
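The mapping between bitstrings and integers can be sketched in a few lines. The function names are hypothetical, and bitstrings are shown as Python strings of `0` and `1` purely for readability.

```python
def bitstring_to_int(bits: str) -> int:
    """Prepend a leading one bit and read the result as a binary integer."""
    if any(b not in "01" for b in bits):
        raise ValueError("bitstring may contain only 0 and 1")
    return int("1" + bits, 2)

def int_to_bitstring(n: int) -> str:
    """Inverse mapping; zero is reserved for the fixed size leaf edge."""
    if n < 1:
        raise ValueError("zero marks the leaf edge, not a bitstring")
    return bin(n)[3:]  # drop the '0b' prefix and the leading one bit

print(bitstring_to_int(""))    # 1
print(bitstring_to_int("0"))   # 2
print(bitstring_to_int("11"))  # 7
print(int_to_bitstring(5))     # 01
```

The leading one bit preserves the length of the bitstring, so `0` and `00` map to different integers, and the resulting integer can then be stored as a variable length quantity.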