---
title:
  Merkle-patricia Dac
# katex
...

# Definition

## Merkle-patricia Trees

A Merkle-patricia tree is a way of hashing a map, an associative
array, such that the map can have stuff added to it, or removed
from it, without having to rehash the entire map, and such that one
can prove a subset of the map, such as a single mapping, is part of
the whole map, without needing to have the whole map
present to construct the hash.

The need to have the entire blockchain present to validate the
current state of the blockchain and any particular fact about a
blockchain is a huge problem with existing blockchains, as they
grow enormous, and it is rapidly becoming a bigger problem.

In a Merkle dag, vertices are data structures, and edges are hashes,
as if one followed a link by looking up the preimage of a hash.
Obviously this is seldom an efficient way of actually implementing
edges, and one is in practice going to use a pointer or handle for
data structures in memory, and a rowid for structures in a database,
but this is an internal representation. The canonical form has no
pointers, no oids and no handles. These are internal representations
of the structure defined and implied by the hashes, and in
communications between humans and machines, and when defining
algorithms and operations on the data structure, the algorithm should
be defined, and the data communicated, as if one was actually using
the hashes, rather than oids and pointers.

Its practical application is constructing a global consensus on what
public keys have the right to control what digital assets (such as
crypto currencies, and globally defined human readable names) and
proving that everyone who matters agrees on ownership.

If a large group of peers, each peer acting on behalf of a large group
of clients each of whom has rights to a large number of digital assets,
agree on what public keys are entitled to control what digital assets,
then presumably their clients also agree, or they would not use that
peer.

Thus, for example, we don’t want the Certificate Authority to be able
to tell Bob that his public key is a public key whose corresponding
secret key is on his server, while at the same time telling Carol that Bob’s
public key is a public key whose corresponding secret key is in fact
controlled by the secret police.

The Merkle-patricia tree not only allows peers to form a consensus on an
enormous body of data, it allows clients to efficiently verify that any
quite small piece of data, any datum, is in accord with that consensus.

## Patricia trees

A patricia tree is a way of structuring a potentially very large list of
bitstrings sorted by bitstring such that bitstrings can be added or
deleted without resorting to shifting the whole list of bitstrings.

In practice, we are not interested in bitstrings, we are interested in
fields. We want to represent an sql table by a patricia tree, and do sql
operations on the tree as if it were a table.

But it is in fact a tree, and its interior vertices do not have complete
fields, they have bitstrings representing part of a field. If we have a
binary patricia tree representing a database table with $n$ entries, it
will have $n$ leaf vertices, and $n-1$ internal vertices.

So, to map from a bitstring to a field representing the primary key
of an sql index, we append to the bitstring a `1` bit, followed by as
many `0` bits as are needed to bring it up to the right
boundary of the field plus one additional bit.

If it is already at the right boundary, we append merely one
additional `1` bit.

The additional bit is a flag indicating a final vertex, a leaf vertex of the
index, false (`0`) for interior vertices, true (`1`) for leaf vertices of
the index, so we now have a full field, plus a flag.

A bitstring represents the path through the Merkle-patricia tree to a
vertex, and we will, for consistency with sql database terminology,
call the bitstring padded to one bit past the field boundary the key,
the key being the sql field plus the one additional trailing bit, the
field boundary flag. (We are dealing with the tree
representation of an sql table, so we need to know whether we have
finally reached the actual record, or are still walking through the
index, and the field boundary flag represents whether we have
reached the actual record or not.)

To obtain the bitstring from the key, we remove the trailing `0` bits
and the last `1` bit. Which is to say, if the field boundary flag is true, our
bitstring is the field, and if the field boundary flag is false our
bitstring is the field with the last `1` bit and the following `0` bits discarded.

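The padding rule above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and bitstrings are held as text strings of `0` and `1` characters for readability, which is an assumption of the sketch and not a proposed storage format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers. A bitstring is a std::string of '0' and '1'
// characters, chosen for readability only.

// Pad a vertex bitstring out to a key of field_width + 1 bits:
// the bitstring, a '1' marker bit, then '0' bits up to the flag position.
std::string key_from_bitstring(const std::string& bits, std::size_t field_width) {
    assert(bits.size() <= field_width);
    std::string key = bits + '1';
    key.append(field_width + 1 - key.size(), '0');
    return key;
}

// The final bit of the key is the field boundary flag: true for a leaf.
bool is_leaf_key(const std::string& key) {
    assert(!key.empty());
    return key.back() == '1';
}

// Undo the padding: drop the trailing '0' bits, then the last '1' bit.
std::string bitstring_from_key(const std::string& key) {
    const std::size_t last_one = key.find_last_of('1');
    assert(last_one != std::string::npos);
    return key.substr(0, last_one);
}
```

With a three bit field, the empty bitstring of the root pads to the key `1000` (field 4, flag false), and the full width bitstring `010` pads to `0101` (field 2, flag true).
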
In a patricia tree each vertex is associated with a bitstring.

When you walk the tree to the left child of a vertex, you add a
zero bit, plus the bits, if any, associated with that link, to predict the
bitstring of the left child, and when you walk to the right hand child,
a one bit, plus the bits, if any, associated with that link.

This enables you, given the bitstring you start with, and the bitstring of
the vertex you want to find, to find the path through the patricia tree.
And, if it is a Merkle-patricia tree, this enables you to not only
produce a short efficient proof that proves the presence of a
certain datum in an enormous pile of data, but also the absence of a datum.

Suppose we have a database table whose primary key is a three bit
integer, and it contains four records, with oids 2, 4, 5, and 6. The
big endian representations of those primary keys are 0b010, 0b100,
0b101, and 0b110.

The resulting patricia tree with infix keys is:

<svg
xmlns="http://www.w3.org/2000/svg"
width="29em" height="24em"
viewBox="0 -10 320 265"
style="background-color:#FFF" stroke-width="1.5"
stroke-linecap="round" >
<g font-family="'Times New Roman'" font-size="10" font-weight="400"
fill-rule="evenodd" fill="black" >
<path stroke="#000000" fill="none" d="
M 156,18 c10,60 -121,149 -111,209
M 156,18 c-8,55 70,20 60,69
c 8,55 -70,20 -63,70
M 273,227 c20,-30 -60,-100 -57,-140
M 197,227 c12,-35 -52,-35 -44,-70
c 2,35 -38,35 -32,70" />
<g fill="#FFF">
<g id="link_bits" font-size="18">
<text font-weight="bold" fill="#FFF">
<tspan y="74" x="134">10</tspan>
<tspan y="134" x="228">0</tspan></text>
<text fill="#000">
<tspan y="74" x="134">10</tspan>
<tspan y="134" x="228">0</tspan></text>
</g>
</g>
<use font-weight="400" href="#link_bits"/>
<g id="rect">
<rect width="66" height="27" x="123" y="4" rx="5"
fill="#0FF" />
<text y="2">
<tspan dy="12" x="126">bitstring</tspan>
<tspan dy="12" x="126" >key</tspan></text>
</g>
<text y="2">
<tspan dy="12" x="172" >""</tspan>
<tspan dy="12" x="153" >4, false</tspan></text>
<use transform="translate(60 70)" href="#rect"/>
<text y="72">
<tspan dy="12" x="238" >1</tspan>
<tspan dy="12" x="215" >6, false</tspan></text>
<use transform="translate(-3 140)"
href="#rect"/>
<text y="142">
<tspan dy="12" x="170" >10</tspan>
<tspan dy="12" x="152" >5, false</tspan></text>
<g transform="translate(-43 50)">
<use transform="translate(-68 160)"
href="#rect"/>
<text y="162">
<tspan dy="12" x="98" >010</tspan>
<tspan dy="12" x="91" >2, true</tspan></text>
<use transform="translate(8 160)"
href="#rect"/>
<text y="162">
<tspan dy="12" x="174" >100</tspan>
<tspan dy="12" x="167" >4, true</tspan></text>
<use transform="translate(84 160)"
href="#rect"/>
<text y="162">
<tspan dy="12" x="250" >101</tspan>
<tspan dy="12" x="243" >5, true</tspan></text>
<use transform="translate(160 160)"
href="#rect"/>
<text y="162">
<tspan dy="12" x="326" >110</tspan>
<tspan dy="12" x="319" >6, true</tspan></text>
</g>
</g>
</svg>

On the rightmost path from the top, the path gains two $1$ bits
because it goes right both times, then a $0$ bit from the link, thus the
bitstring identity of the path and the vertex is $110$.

We call the bits that a link adds, in addition to the bit that results
from the choice to go left or right, the skip bits, because we are
skipping levels in the binary tree.

On the leftmost path from the top, the path gains one $0$ bit
because it goes left, then $10$ from the link, thus the
bitstring identity of the path and the vertex is $010$.

The two middle paths have identity purely from the left/right
choices, not receiving any additional bits from the links other than
the bit that comes from going left or right.

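The way a path accumulates its bitstring can be sketched as follows. The `Link` struct and both function names are hypothetical, introduced only for this illustration, and bitstrings are again text strings of `0` and `1` characters.

```cpp
#include <string>
#include <vector>

// One link of a binary patricia tree (hypothetical layout for illustration).
struct Link {
    bool right;        // false: left child, adds '0'; true: right child, adds '1'
    std::string skip;  // skip bits carried by the link, often empty
};

// Bitstring of the child reached from `parent` over `link`:
// the parent bitstring, the direction bit, then the skip bits.
std::string child_bitstring(const std::string& parent, const Link& link) {
    return parent + (link.right ? '1' : '0') + link.skip;
}

// Bitstring of the vertex reached by following `path` from the root,
// whose own bitstring is empty.
std::string walk(const std::vector<Link>& path) {
    std::string bits;
    for (const Link& link : path) bits = child_bitstring(bits, link);
    return bits;
}
```

For the tree in the figure, going right and then right over the link with skip `0` gives `110`, and going left over the link with skip `10` gives `010`.
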
Each bitstring, thus each key field, identifies a vertex and a path
through the patricia tree.

We do not necessarily want to actually manipulate or represent
the bitstrings of vertices and skip fields as bitstrings. It is likely to
be a good deal more convenient to represent and manipulate keys, and to
represent the skip bits by the key of the target vertex.

Fields have meanings for the application using the Merkle-patricia
dag, bitstrings lack meaning.

But to understand what a patricia tree is, and to manipulate it, our
actions have to be equivalent to an algorithm described in terms of
bitstrings. We use keys because computers manipulate bytes better
than bits, just as we use pointers because we don't want to look up
the preimages of hashes in a gigantic table of hashes. But a Merkle
tree algorithm must work as if we were looking up preimages by
their hash, and sometimes we will have to look up a preimage by its
hash, and a patricia tree algorithm must work as if we were manipulating
bitstrings, and sometimes we will have to manipulate bitstrings.

The total number of vertices equals twice the number of leaves
minus one. Each parent node has as its identifier a sequence of
bits, not necessarily aligned on field boundaries, that both its children
have in common.

A Merkle-patricia dac is a patricia tree with binary radix (which is
the usual way patricia trees are implemented) where the hash of each
node depends on the hash and the skip of its two children, which means
that each node contains proof of the entire state of all its descendant
nodes.

The skip of a branch is the bit string that differentiates its bit
string from its parent, with the first such bit excluded as it is
implied by being a left or right branch. This is often the empty
bitstring, which when mapped to a byte string for hashing purposes, maps
to the empty byte string.

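A minimal sketch of a vertex hash that depends on the hash and the skip of both children follows. Everything in it is an assumption made for illustration: `toy_hash` is 64 bit FNV-1a standing in for a real cryptographic hash (it is not collision resistant), the preimage layout with length prefixed skips is invented here, and `example_root` hard codes the four record tree from the figure above.

```cpp
#include <cstdint>
#include <string>

using Hash = std::uint64_t;

// Stand-in for a real cryptographic hash: 64-bit FNV-1a.
// It is NOT collision resistant and is here only so the sketch runs.
Hash toy_hash(const std::string& bytes) {
    Hash h = 14695981039346656037ULL;
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// What a parent needs to know about one child.
struct Child {
    std::string skip;  // skip bits of the link, as '0'/'1' text, often empty
    Hash hash;         // hash of the child vertex
};

Hash leaf_hash(const std::string& datum) { return toy_hash("leaf:" + datum); }

// Serialize one child with a length prefix on the skip, so that two
// different (skip, hash) pairs cannot produce the same preimage.
void append_child(std::string& preimage, const Child& child) {
    preimage += std::to_string(child.skip.size()) + ':' + child.skip + ':' +
                std::to_string(child.hash) + ';';
}

// The hash of an interior vertex covers the skip and hash of both children.
Hash node_hash(const Child& left, const Child& right) {
    std::string preimage = "node:";
    append_child(preimage, left);
    append_child(preimage, right);
    return toy_hash(preimage);
}

// Root hash of the four record tree in the figure (keys 2, 4, 5, 6).
Hash example_root(const std::string& r2, const std::string& r4,
                  const std::string& r5, const std::string& r6) {
    const Hash v10 = node_hash({"", leaf_hash(r4)}, {"", leaf_hash(r5)});
    const Hash v1 = node_hash({"", v10}, {"0", leaf_hash(r6)});
    return node_hash({"10", leaf_hash(r2)}, {"", v1});
}
```

Changing any one record, or any one skip, changes the root. To check record 6 against a known root, a client needs only the hash of vertex `10`, the hash of leaf `010`, and the skips on the path.
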
It would often be considerably faster and more efficient to hash the
full bitstring, rather than the skip, and that may sometimes be not
merely OK, but required, but often we want the hash to depend only on
the data, and be independent of the metadata, as when the leaf index is
an arbitrary precision integer representing the global order of a
transaction, that is going to be constructed at some later time and
determined by a different authority.

Most of the time we will be using the tree to synchronize two
blocks of pending transactions, so though a count of the number of
children of a vertex or an edge is not logically part of a Merkle-patricia
tree, it will make synchronization considerably more
efficient, since the peer that has the block with fewer children wants
information from the peer that has the block with more children.

## A sequential append only collection of postfix binary trees

<svg
xmlns="http://www.w3.org/2000/svg"
width="30em" height="19.6em"
viewBox="0 170 214 140"
style="background-color:ivory"
stroke-width="1"
stroke-linecap="round" >
<g font-family="'Times New Roman'" font-size="10"
font-weight="400" fill-rule="evenodd" fill="black" >
<g id="blockchain_id" >
<ellipse cx="10" cy="240" fill="#0D0" rx="8" ry="5"/>
<text fill="black">
<tspan x="6" y="243.2">id</tspan>
</text>
</g>
<path stroke="#0D0" fill="none" d="
M9,236 c24,-36 55,-28 84,-28 s70,6 70,-9
M11,236 c30,-36 64,2 70,-14
M13,236 c20,-16 22,-6 22,0 c 0,3 0,6.3 2,6.3 q 2,0 3,-3.5
M14,237 c29,-14 4,12 2,15c -2,3 -1,6.5 1,6.5 q 2,0 3,-3.5
M15,238 c24,-10 -11,20.5 -11,28.5c -0,2 1,4 2,4 q 2,0 3,-4
"/>
<g id="balanced merkle tree" fill="none">
<path stroke="#F00" d="
M201,238 q 1,-4 4,-4 c6,0 -8,36.5 -2,36.5 q 2,0 3,-4
M164,200 c1,-4 10,-6 16,-6 c22,0 10,48 17,48 q 2,0 3,-3.5
M164,200 c2,-3 4,-4 10,-4 c22,0 -7,62 3,62 q 2,0 3,-3.5
M164,200 c2,-2 2,-2.3 6,-2.3 c16,0 -12,73 -4,73 q 2,0 3,-4
"/>
<g id="height_4_tree" >
<path stroke="#F00" d="
M82,221 c1,-4 6,-6 16,-6 c18,0 10, 27 16,27 q 2,0 3,-3
M82,221 c2,-3 2,-4 10,-4 c18,0 12, 41.5 18,41.5 q 2,0 3,-3.5
M82,221 c2,-2 2,-2.3 6,-2.3 c18,0 -14, 51.9 -5,51.9 q 2,0 3,-4
"/>
<path stroke="#000" d="
M81,222 c -4,-20 80,0 83,-22
c 3,6 -6,8 -6,22
"/>
<g id="height_3_tree">
<path stroke="#000" d="
M41,238 c 2,-14 39,-4 40,-16
c 3,4 -6,8 -6,16
"/>
<path stroke="#F00" d="
M42,237 c1,-1 1,-4 6,-4 c7,0 -4,25.5 3,25.5 q 2,0 3,-3.5
M42,237 c1,-1 2,-2.5 4,-2.5 c7,0 -14,36 -6,36 q 2,0 3,-4
"/>
<g id="height_2_tree">
<path stroke="#F00"
d="
M22,254 q 0,-4.5 3,-4.5 c5,0 -8,21 -3,21 q 2,0 3,-4
"/>
<path stroke="#000" d="
M21,254
c 2,-14 19,-4 20,-16
c 3,4 -4,8 -4,16
"/>
<g id="height_1_tree">
<path stroke="#000"
d="
M10,266 c0,-10 10,1 11.5,-13
M16,266 c0,-6 4,0 5.5,-13
"/>
<g id="leaf_vertex" >
<g style="stroke:#000;">
<path id="path1024"
d="
M 10,265 L9,272
M 10,265 L11,272
"/>
</g>
<rect id="merkle_vertex" width="4" height="4" x="8" y="264" fill="#00F"/>
</g><!-- end id="leaf vertex" -->
<use transform="translate(6)" href="#leaf_vertex"/>
<use transform="translate(11 -12)" href="#merkle_vertex"/>
</g><!-- end id="height_1_tree" -->
<use transform="translate(16)" href="#height_1_tree"/>
<use transform="translate(31 -28)" href="#merkle_vertex"/>
</g><!-- end id="height_2_tree" -->
<g >
<use transform="translate(34)" href="#height_2_tree"/>
<use transform="translate(71 -44)" href="#merkle_vertex"/>
</g>
</g><!-- end id="height_3_tree" -->
<use transform="translate(77)" href="#height_3_tree"/>
<use transform="translate(154 -66)" href="#merkle_vertex"/>
<use transform="translate(160)" href="#height_2_tree"/>
<use transform="translate(197)" href="#leaf_vertex"/>
</g><!-- end id="height_4_tree" -->
</g> <!-- end id="balanced merkle tree" -->
<text y="168">
<tspan dy="12" x="6" >Immutable append only file as a collection of</tspan>
<tspan dy="12" x="6" >balanced binary Merkle trees</tspan>
<tspan dy="12" x="6" >in postfix order</tspan>
</text>
<g id="merkle_chain">
<use transform="translate(0,60)" href="#blockchain_id"/>
<path
style="fill:none;stroke:#00D000;"
d="m 16,297 c 6,-14 9,16 14,0"/>
<g id="16_leaf_links">
<g id="8_leaf_links">
<g id="4_leaf_links">
<g id="2_leaf_links">
<g id="leaf_link">
<path
style="fill:none;stroke:#000;"
d="m 29,299 c 4,-6 4,-6 5.6,3 C 35,305 38,304 38.5,300"/>
<use transform="translate(20,33)" href="#leaf_vertex"/>
</g><!-- end id="leaf link" -->
<use transform="translate(10,0)"
href="#leaf_link"
/>
</g> <!-- end id="2 leaf links" -->
<use transform="translate(20,0)"
href="#2_leaf_links"
/>
</g> <!-- end id="4 leaf links" -->
<use transform="translate(40,0)"
href="#4_leaf_links"
/>
</g> <!-- end id="8 leaf links" -->
<use transform="translate(80,0)"
href="#8_leaf_links"
/>
</g> <!-- end id="16 leaf links" -->
<use transform="translate(160,0)"
href="#16_leaf_links"
/>
</g> <!-- end id="merkle chain" -->
<rect width="210" height=".4" x="8" y="276" fill="#000"/>
<text y="280">
<tspan dy="8" x="6" >Immutable append only file as a Merkle chain</tspan>
</text>
</g>
</svg>

This data structure means that instead of having one gigantic
proof that takes weeks to evaluate that the entire blockchain is
valid, you have an enormous number of small proofs that each
particular part of the blockchain is valid. This has four
advantages over the chain structure.

1. A huge problem with proof of share is "nothing at stake".
    There is nothing stopping the peers from pulling a whole
    new history out of their pocket.\
    With this data structure, there is something stopping them. They
    cannot pull a brand new history out of their pocket, because the
    clients have a collection of very old roots of very large balanced
    binary Merkle trees of blocks. They keep the hash paths to all their
    old transactions around, and if the peers invent a brand new history,
    the clients find that the context of all their old transactions has
    changed.
1. If a block gets lost or corrupted, the peer can identify the one
    specific block that is a problem. At present peers have to download,
    or at least re-index, the entire blockchain far too often, and a full
    re-index takes days or weeks.
1. It protects clients against malicious peers, since any claim the peer
    makes about the total state of the blockchain can be proven with
    $O(\log_2 n)$ hashes.
1. We don't want the transaction metadata to be handled
    outside the secure wallet system, so we need client wallets
    interacting directly with other client wallets, so we need any
    client to be able to verify that the other client is on a
    consensus about the state of the blockchain that is a
    successor, predecessor, or the same as its consensus, and
    each client to be able to verify for itself that the consensus claimed by its
    peer is generally accepted.

This is not a Merkle-patricia tree. This is a generalization of a Merkle
patricia dag to support immutability.

In a binary patricia tree each vertex has two links to other vertices,
one of which corresponds to appending a $0$ bit to the bitstring that
identifies the vertex and the path to the vertex, and one of which
corresponds to appending a $1$ bit to the bitstring.

In an immutable append only Merkle-patricia dag, vertices identified
by bit strings ending in a $0$ bit have a third hash link, that links to a
vertex whose bit string is truncated back by removing the trailing $0$
bits back to the rightmost $1$ bit and zeroing that $1$ bit. Thus, whereas in a
blockchain (Merkle chain) you need $n$ hashes to reach and prove
a vertex $n$ blocks back, in an immutable append only Merkle-patricia
dag, you only need $O(\log_2 n)$ hashes to reach a vertex $n$ blocks back.

The vertex $0010$ has an extra link back to the vertex $000$, the
vertices $0100$ and $010$ have extra links back to the vertex $00$, the
vertices $1000$, $100$, and $10$ have extra links back to the vertex $0$,
and so on and so forth.

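The target of the third hash link can be computed directly from the bitstring. The sketch below uses hypothetical function names and text strings of `0` and `1` characters. It treats a bitstring with no `1` bit as having no back link, which is an assumption of the sketch: the rule above has no rightmost `1` bit to zero in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if the vertex has a third hash link under the rule above: its
// bitstring ends in '0' and contains at least one '1' bit to zero.
bool has_back_link(const std::string& bits) {
    return !bits.empty() && bits.back() == '0' &&
           bits.find('1') != std::string::npos;
}

// Bitstring of the vertex the third hash link points to: drop the
// trailing '0' bits back to the rightmost '1' bit, then zero that bit.
std::string back_link(const std::string& bits) {
    assert(has_back_link(bits));
    const std::size_t last_one = bits.find_last_of('1');
    std::string target = bits.substr(0, last_one + 1);
    target[last_one] = '0';
    return target;
}
```

This reproduces the examples in the text: `0010` links back to `000`, both `0100` and `010` link back to `00`, and `1000`, `100`, and `10` all link back to `0`.
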
This enables clients to reach any previous vertex through a chain of
hashes, and thus means that each new item in sequence is a hash of
all previous data in the tree. Each new item has a hash
commitment to all previous items.

The clients keep the old roots of the balanced binary trees of
blocks around, so the peers cannot sodomize them. This will matter
more and more as the blockchain gets bigger and bigger, resulting
in ever fewer peers with ever greater power and ever more clients,
whose interests are apt to be different from those of the ever fewer
and ever more powerful peers.

The superstructure of balanced binary Merkle trees allows us to
verify any part of it with only $O(\log_2 n)$ hashes, and thus to verify that
one version of this data structure that one party is using is a later
version of the same data structure that another party is using.

This reduces the amount of trust that clients have to place in peers.
When the blockchain gets very large there will be rather few peers
and a great many clients, thus there will be a risk that the peers will
plot together to bugger the clients. This structure enables a client
to verify that any part of the blockchain is what his peer says it is,
and thus avoids the risk that a peer may tell different clients different
accounts of the consensus. Two clients can quickly verify that they
are on the same total order and total set of transactions, and that
any item that matters to them is part of this same total order and
total set.

When the chain becomes very big, sectors and disks will be failing
all the time, and we don't want such failures to bring everything to a
screaming halt. At present, such failures far too often force you to
reindex the blockchain, and redownload a large part of it, and this
will happen more and more as the
blockchain becomes enormous.

And, when the chain becomes very big, most people will be
operating clients, not peers, and they need to be able to ensure
that the peers are not lying to them.

### storage

We would like to represent an immutable append only data
structure by append only files, and by sql tables with sequential and
ever growing oids.

When we defined the key for a Merkle-patricia tree, the key
definition gave us the parent node with a key field in the middle of
its children, infix order. For the tree depicted above, we want postfix order.

Normally, if the bitstring is a full field width, the vertex contains the
information we actually care about, while if the bitstring is less than
the field width, it just contains hashes ensuring the data is
immutable, that the past consensus has not been changed
underneath us, so, regardless of how the data is actually physically
stored on disk, these belong in different sql tables.

So, the rowid of a vertex that has a full field width sized bitstring is
simply that bitstring, while the rowid of its parent vertices is obtained
by appending $1$ bits to pad the bitstring out to full field width, and
subtracting a count of the number of $1$ bits in the original bitstring,
`std::popcount`, which gives us sequential and ever increasing oids
for the parent vertices, if the leaf vertices, the vertices with full field
width bitstrings, are sequential and ever increasing.

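As a check on the rowid rule for parent vertices, here is a small sketch. The function name and the choice to pass the bitstring as an integer plus a length are assumptions for illustration. It counts bits with `std::bitset::count`, which gives the same result as `std::popcount` but also compiles before C++20.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Rowid of an interior (parent) vertex.
//   bits        : the vertex bitstring, held in the low `length` bits
//   length      : number of bits in the bitstring, less than field_width
//   field_width : width of the full field in bits
// Pad with '1' bits to full width, then subtract the number of '1' bits
// in the original bitstring.
std::uint64_t interior_rowid(std::uint64_t bits, unsigned length,
                             unsigned field_width) {
    assert(length < field_width && field_width < 64);
    assert((bits >> length) == 0);
    const unsigned pad = field_width - length;
    const std::uint64_t padded = (bits << pad) | ((std::uint64_t{1} << pad) - 1);
    return padded - std::bitset<64>(bits).count();
}
```

With a three bit field, the seven interior vertices `00`, `01`, `0`, `10`, `11`, `1`, and the root get rowids 1 through 7 in that order, which is the order in which postfix traversal completes them.
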
Let us suppose the leaf nodes of the tree depicted above are fixed size $c$,
and the interior vertices are fixed size $d$ ($d$ is probably thirty two or
sixty four bytes) and they are being physically stored in
memory or a file in sequence.

Let us suppose the leaf nodes are stored with the interior vertices
and are sequentially numbered.

Then the location of leaf node $n$ begins at
$n\times c+\big(n-\operatorname{popcount}(n)\big)\times d$, where
$\operatorname{popcount}$ is `std::popcount`, because
$n-\operatorname{popcount}(n)$ interior vertices precede leaf $n$ in
postfix order (which unfortunately lacks a simple
relationship to the bitstring of a vertex corresponding to a complete
field, which is the field that represents the meaning that we actually
care about).

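The offset of a leaf can be sketched as below, reading the formula as $n$ leaves of size $c$ plus $n-\operatorname{popcount}(n)$ interior vertices of size $d$ stored before leaf $n$. The function name is hypothetical, and `std::bitset::count` again stands in for `std::popcount`.

```cpp
#include <bitset>
#include <cstdint>

// Byte offset at which leaf n begins, when leaves of size c and interior
// vertices of size d are stored together in postfix order.
std::uint64_t leaf_offset(std::uint64_t n, std::uint64_t c, std::uint64_t d) {
    // The n leaves before leaf n form popcount(n) complete binary trees,
    // which together contain n - popcount(n) interior vertices.
    const std::uint64_t interior_before = n - std::bitset<64>(n).count();
    return n * c + interior_before * d;
}
```

For example, with $c=100$ and $d=32$, leaf 4 is preceded by four leaves and by the interior vertices `00`, `01`, and `0`, so it begins at $4\times 100+3\times 32=496$.
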
# Blockchain

A Merkle-patricia block chain represents *an immutable past and a constantly changing present*.

It represents an immutable and ever growing sequence of transactions,
and also a large and mutable present state of the present database that
is the result of those transactions, the database of unspent transaction
outputs.

When we are assembling a new block, the records live in memory as native
format C++ objects. Upon a new block being finalized, they get written
to disk in key order, with implementation dependent offsets between
records and implementation dependent compression, which compression
likely reflects canonical form. Once written to disk, they are accessed
through native format records in memory, disk records being brought
into memory in native format as needed, and the least recently loaded
entry, or least recently used entry, being discarded. Even when we are
operating at larger scale than Visa, a block representing five minutes
of transactions fits easily in memory.

Further, a patricia tree is a tree. But we want, when we have the Merkle
patricia tree representing registered names organized by names, or the
Merkle-patricia tree representing as yet unspent transaction outputs, its
Merkle characteristic to represent a directed acyclic graph. If two
branches have the same hash, despite being at different positions and
depths in the tree, all their children will be identical. And we want to
take advantage of this, in that the block chain will be a directed acyclic
graph, each block being a tree representing the state of the system at
that block commitment, but that tree points back into previous block
commitments for those parts of the state of the system that have not
changed. So the hash of the node in such a tree will identify, probably
through a rowid, a record of the block it was originally constructed
for, and its index in that tree.

A Merkle-patricia directed acyclic graph, Merkle-patricia dac, is a
Merkle dac, like a git repository or the block chain, with the patricia
key representing the path of hashes, and acting as an index through that
chain of hashes to find the data that you want.

The key will thread through different computers under the control of
different people, thus providing a system of witness that the current
global consensus hash accurately reflects past global consensus hashes,
and that each entity's version of the past agrees with the version it
previously espoused.

This introduces some complications when a portion of the tree represents
a database table with more than one index.

[Ethereum has a discussion and
definition](https://github.com/ethereum/wiki/wiki/Patricia-Tree) of this
data structure.

Suppose, when the system is at scale, we have a thousand trillion entries
in the public, readily accessible, and massively replicated part of the
blockchain. (I intend that every man and his dog will also have a
sidechain, every individual, every business. The individual will
normally not have his side chain publicly available, but in the event of
a dispute, may make a portion of it visible, so that certain of his
payments, and the invoices they were payments for, become visible to
others.)

In that case, a new transaction output is typically going to require
forty thirty two byte hashes, taking up about two kilobytes in total on
any one peer. And a single person to person payment is typically going to
take ten transaction outputs or so, taking twenty kilobytes in total on
any one peer. And this is going to be massively replicated by a few
hundred peers, taking about four megabytes in total.

(A single transaction will typically be much larger than this, because
it will mingle several person to person payments.)

Right now you can get a system with sixty four terabytes of hard disk
and thirty two gigabytes of ram for under six thousand dollars, south of a hundred
dollars per terabyte, so storing everything forever is going to cost
about a twentieth of a cent per person to person payment. And a single
such machine will be good to hold the whole blockchain for the first few
trillion person to person payments, good enough to handle PayPal volumes
for a year.

“OK”, I hear you say. “And after the first few trillion transactions?”

Well then, if we have a few trillion transactions a year, and only a few
hundred peers, then the clients of any one peer will be doing about ten
billion transactions a year. If he profits half a cent per transaction,
he is making about fifty million a year. He can buy a few more sixty
four terabyte computers every year.

The target peer machine we will write for will have thirty two gigabytes
of ram and sixty four terabytes of hard disk, but our software should
run fine on a small peer machine, four gigabytes of ram and two
terabytes of hard disk, until the crypto currency surpasses Bitcoin.

# vertex identifiers

We need a canonical form for all data structures, the form which is
hashed, even if it is not convenient to use or manipulate the data in
that form on a particular machine with particular hardware and a
particular compiler.

A patricia tree representation of a field and record of fields does
not gracefully represent variable sized records.

If we represented the bitstring that corresponds to the block
number, the block height, as having a large number of leading
zero bits, so that it corresponds to a sixty three bit integer (we need
the additional low order bit for operations translating the bitstring
to its representation as a key field or rowid field), a fixed field of sixty
four bits would do us fine for a trillion years or so.

But I have an aesthetic objection to representing things that are not
fixed sized as fixed sized.

Therefore I am inclined to represent bit strings as a count of bytes, a
byte string containing the zero padded bitstring, the bitstring being
byte aligned with the field boundary, and a count of the distance in
bits between the right edge of the bitstring and the right edge of
the field, that being the height of the interior vertex above the
leaf vertices containing the actual data that we are interested in, in
its representation as an sql table.

We do not hash the leading zero bytes of a bitstring that is part of
an integer field because we do not know, and do not care, how
many zero bytes there are. A particular machine running a program
compiled by a particular compiler will represent that integer with a
particular sufficiently large machine word, but the hash cannot
depend on the word size of a particular machine.

A particular machine will represent the bitstring that is part of an
integer field with a particular number of leading zero bytes,
depending on its hardware and its compiler, but this cannot be
allowed to affect the representation on the wire or the value of the hash.

If one peer represents the block number as a thirty two bit value,
and another peer as a sixty four bit value, and thus thinks the
bitstring has four more leading zero bytes than the former peer
does, this should have no effect. They should both get the same
hashes, because the preimage of our hash should be independent
of the number of leading zero bytes.

For integer fields such as the block number, we would like to
represent integers in a form independent of the computer word
size, so we do not know the alignment from start of field for a
bitstring that is part of an integer field, only the size of the byte
aligned bitstring, and how far the end of the bitstring is from
the end of the integer field. Each particular peer executing the
algorithm then applies as many leading zero bytes to the bitstring
as suits the way it represents integers.

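One way to read the proposed canonical form is sketched below. The byte layout is an assumption made for illustration, not a settled wire format: a one byte count, then the zero padded bitstring as big endian bytes with leading zero bytes dropped, then a one byte height in bits. The function name `canonical_bitstring` is hypothetical. The point of the sketch is only that a peer using thirty two bit integers and a peer using sixty four bit integers produce the same bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical layout: [byte count][significant bytes, big endian][height].
//   padded_bits : the bitstring in place within the integer field, with
//                 every bit below the bitstring set to zero
//   height      : distance in bits from the right edge of the bitstring
//                 to the right edge of the field
template <typename UInt>
std::vector<std::uint8_t> canonical_bitstring(UInt padded_bits,
                                              std::uint8_t height) {
    static_assert(std::is_unsigned<UInt>::value,
                  "expects an unsigned integer type");
    std::vector<std::uint8_t> out;
    out.push_back(0);  // placeholder for the byte count
    // Walk from the most significant byte down, skipping leading zero
    // bytes, so the machine word size leaves no trace in the output.
    for (std::size_t i = sizeof(UInt); i-- > 0;) {
        const auto byte =
            static_cast<std::uint8_t>((padded_bits >> (8 * i)) & 0xFF);
        if (byte != 0 || out.size() > 1) out.push_back(byte);
    }
    out[0] = static_cast<std::uint8_t>(out.size() - 1);
    out.push_back(height);
    return out;
}
```

For example, the bitstring `10010` sitting eight bits above the right edge of the block number field is the padded value `0x1200`, and it encodes as the bytes `2, 0x12, 0x00, 8` whichever integer width the peer uses.
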
The skip field of a link crossing a field boundary into an integer field
should not tell the machine following that link how many leading
zero bytes to add to the bitstring, but where the first non zero byte
of the bitstring is above the right edge of the integer, and the peer
interpreting that skip field will add as many leading zero bytes to
the bitstring as it finds handy for its hardware.

Some fields, notably text strings, do not have a definite right hand
boundary, representing the boundary inline. In that case, we
represent the vertex depth below the start of field, rather than the
vertex height above the end of field.

We always start walking the vertices representing an immutable
append only Merkle-patricia tree knowing the bitstring, so their
preimages do not need to contain a vertex bitstring, nor do their
links need to add bits to the bitstring, because all the bits added
or subtracted are implicit in the choice of branch to take, so those
links do not contain representations of a skip field bit string either.

However, when passing blocks around, we do need to communicate
the bitstring of a block, and when passing a hash path, we do need
to communicate the bitstring of the root vertex of the path, and
many of the hashes will be interior to a block, and thus their links
do need bitstrings for their skip fields, and we will need to sign
those messages, thus need to hash them, so we need a canonical
hash of a bitstring, which requires a canonical representation of a
bitstring.

A bitstring lives in a field, so the position of the bitstring
relative to the field boundary needs a canonical representation,
though when we are walking the tree, this information is usually
implicit, so does not inherently need to be present in the preimage of
the vertex. But, since we are putting byte aligned bitfields in byte
strings, we need to know where the bitstring of a skip field for a link
ends within the byte, which is most conveniently done by giving the
field alignment of the end of the bitstring within the field as part of
the vertex skip fields, assuming there is a skip field, which there
frequently will not be.