finally figured out how to represent numbers and variable
length bitfields so that they will sort correctly in a Merkle Patricia tree. Have written no end of rubbish on this which needs to be deleted or modified
parent 06b9fc4017
commit 3c6ec5283d
@ -1320,7 +1320,7 @@ verification. Not sure how long it takes to produce a proof that a large
|
|||||||
number of proofs were verified.
|
number of proofs were verified.
|
||||||
|
|
||||||
What you want is to be able to prove that a final hash is the root of an
|
What you want is to be able to prove that a final hash is the root of an
|
||||||
enormous merkle tree, some generalization of a merkle-patricia tree,
|
enormous merkle tree, some generalization of a Merkle-patricia tree,
|
||||||
representing an immutable append only data structure consisting of a
|
representing an immutable append only data structure consisting of a
|
||||||
sequence of piles of transactions, and the state generated by these
|
sequence of piles of transactions, and the state generated by these
|
||||||
transactions, represents a valid branch of a chain of signatures, that the
|
transactions, represents a valid branch of a chain of signatures, that the
|
||||||
|
@ -80,7 +80,7 @@ The additional bit is flag indicating a final vertex, a leaf vertex of the
|
|||||||
index, false (`0`) for interior vertices, true (`1`) for leaf vertices of
|
index, false (`0`) for interior vertices, true (`1`) for leaf vertices of
|
||||||
the index -- so we now have a full field, plus a flag.
|
the index -- so we now have a full field, plus a flag.
|
||||||
|
|
||||||
A bitstring represents the path through the merkle patricia tree to
|
A bitstring represents the path through the Merkle-patricia tree to
|
||||||
vertex, and we will, for consistency with sql database terminology,
|
vertex, and we will, for consistency with sql database terminology,
|
||||||
call the bitstring padded to one bit past the field boundary the key,
|
call the bitstring padded to one bit past the field boundary the key,
|
||||||
the key being the sql field plus the one additional trailing bit, the
|
the key being the sql field plus the one additional trailing bit, the
|
||||||
@ -104,7 +104,7 @@ a one bit,plus the bits if any associated with that link.
|
|||||||
|
|
||||||
This enables you, given the bitstring you start with, and bitstring of
|
This enables you, given the bitstring you start with, and bitstring of
|
||||||
the vertex you want to find, to find the path through the patricia tree.
|
the vertex you want to find, to find the path through the patricia tree.
|
||||||
And, if it is a Merkle patricia tree, this enables you to not only
|
And, if it is a Merkle-patricia tree, this enables you to not only
|
||||||
produce a short efficient proof that proves the presence of a
|
produce a short efficient proof that proves the presence of a
|
||||||
certain datum in an enormous pile of data, but the absence of a datum.
|
certain datum in an enormous pile of data, but the absence of a datum.
|
||||||
|
|
||||||
@ -209,7 +209,7 @@ the bitstrings of vertices and skip fields as bitstrings. It is likely to
|
|||||||
be a good deal more convenient to represent and manipulate keys, and to
|
be a good deal more convenient to represent and manipulate keys, and to
|
||||||
represent the skip bits by the key of the target vertex.
|
represent the skip bits by the key of the target vertex.
|
||||||
|
|
||||||
Fields have meanings for the application using the Merkle patricia
|
Fields have meanings for the application using the Merkle-patricia
|
||||||
dag, bitstrings lack meaning.
|
dag, bitstrings lack meaning.
|
||||||
|
|
||||||
But to understand what a patricia tree is, and to manipulate it, our
|
But to understand what a patricia tree is, and to manipulate it, our
|
||||||
@ -434,12 +434,12 @@ one of which corresponds to appending a $0$ bit to the bitstring that
|
|||||||
identifies the vertex and the path to the vertex, and one of which
|
identifies the vertex and the path to the vertex, and one of which
|
||||||
corresponds to adding a $1$ bit to the bitstring.
|
corresponds to adding a $1$ bit to the bitstring.
|
||||||
|
|
||||||
In an immutable append only Merkle patricia dag vertices identified
|
In an immutable append only Merkle-patricia dag vertices identified
|
||||||
by bit strings ending in a $0$ bit have a third hash link, that links to a
|
by bit strings ending in a $0$ bit have a third hash link, that links to a
|
||||||
vertex whose bit string is truncated back by removing the trailing $0$
|
vertex whose bit string is truncated back by removing the trailing $0$
|
||||||
bits back to rightmost $1$ bit and zeroing that $1$ bit. Thus, whereas in
|
bits back to rightmost $1$ bit and zeroing that $1$ bit. Thus, whereas in
|
||||||
blockchain (Merkle chain) you need $n$ hashes to reach and prove
|
blockchain (Merkle chain) you need $n$ hashes to reach and prove
|
||||||
a vertex $n$ blocks back, in an immutable append only Merkle patricia
|
a vertex $n$ blocks back, in an immutable append only Merkle-patricia
|
||||||
dag, you only need $O(\log_2 n)$ hashes to reach a vertex $n$ blocks back.
|
dag, you only need $O(\log_2 n)$ hashes to reach a vertex $n$ blocks back.
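
The rule for the third hash link can be sketched in a few lines of Python. This is only an illustration of the rule as worded above: the function name `back_link` and the use of text strings of `0` and `1` characters to stand for bitstrings are assumptions made for the example, not part of the design.

```python
from typing import Optional


def back_link(bits: str) -> Optional[str]:
    """Return the bitstring targeted by the extra (third) hash link.

    Hypothetical helper. Only vertices whose bitstring ends in a 0 bit
    have the extra link. The target is found by removing the trailing
    0 bits back to the rightmost 1 bit and zeroing that 1 bit.
    """
    if not bits.endswith("0"):
        return None  # bitstring ends in 1 (or is empty): no third link
    stripped = bits.rstrip("0")  # drop the trailing 0 bits
    if not stripped:
        return None  # all zeros: there is no 1 bit to step back to
    return stripped[:-1] + "0"  # zero the rightmost 1 bit


print(back_link("0010"))  # prints 000
```

Under this reading the vertex `0010` links back to `000`, and a vertex such as `0100` links back to `00`, so each extra link steps back over a whole subtree rather than a single block.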
|
||||||
|
|
||||||
The vertex $0010$ has an extra link back to the vertex $000$, the
|
The vertex $0010$ has an extra link back to the vertex $000$, the
|
||||||
@ -492,7 +492,7 @@ We would like to represent an immutable append only data
|
|||||||
structure by append only files, and by sql tables with sequential and
|
structure by append only files, and by sql tables with sequential and
|
||||||
ever growing oids.
|
ever growing oids.
|
||||||
|
|
||||||
When we defined the key for a Merkle patricia tree, the key
|
When we defined the key for a Merkle-patricia tree, the key
|
||||||
definition gave us the parent node with a key field in the middle of
|
definition gave us the parent node with a key field in the middle of
|
||||||
its children, infix order. For the tree depicted above, we want postfix order.
|
its children, infix order. For the tree depicted above, we want postfix order.
|
||||||
|
|
||||||
@ -682,7 +682,7 @@ represent the vertex depth below the start of field, rather than the
|
|||||||
vertex height above the end of field.
|
vertex height above the end of field.
|
||||||
|
|
||||||
We always start walking the vertexes representing an immutable
|
We always start walking the vertexes representing an immutable
|
||||||
append only Merkle patricia tree knowing the bitstring, so their
|
append only Merkle-patricia tree knowing the bitstring, so their
|
||||||
preimages do not need to contain a vertex bitstring, nor do their
|
preimages do not need to contain a vertex bitstring, nor do their
|
||||||
links need to add bits to the bitstring, because all the bits added
|
links need to add bits to the bitstring, because all the bits added
|
||||||
or subtracted are implicit in the choice of branch to take, so those
|
or subtracted are implicit in the choice of branch to take, so those
|
||||||
|
@ -9,9 +9,9 @@ in protocols tend to become obsolete. Therefore, for future
|
|||||||
upwards compatibility, we want to have variable precision
|
upwards compatibility, we want to have variable precision
|
||||||
numbers.
|
numbers.
|
||||||
|
|
||||||
Secondly, to represent integers within a patricia merkle tree representing a database index, we want all values to be left field aligned, rather than right field aligned.
|
Secondly, to represent integers within a Merkle-patricia tree representing a database index, we want all values to be left field aligned, rather than right field aligned.
|
||||||
|
|
||||||
## Merkle patricia dag
|
## Merkle-patricia dag
|
||||||
|
|
||||||
We intend to have a vast Merkle dag, and a vast collection of immutable
|
We intend to have a vast Merkle dag, and a vast collection of immutable
|
||||||
append only data structures. Each new block in the append only data
|
append only data structures. Each new block in the append only data
|
||||||
@ -40,29 +40,29 @@ package.
|
|||||||
|
|
||||||
## Compression algorithm preserving sort order
|
## Compression algorithm preserving sort order
|
||||||
|
|
||||||
We want to represent integers by byte strings whose lexicographic order reflects their order as integers, which is to say, when sorted as a left aligned field, sort like integers represented as a right aligned field. (Because a Merkle patricia tree has a hard time with right aligned fields)
|
We want to represent integers by byte strings whose lexicographic order reflects their order as integers, which is to say, when sorted as a left aligned field, sort like integers represented as a right aligned field. (Because a Merkle-patricia tree has a hard time with right aligned fields)
|
||||||
|
|
||||||
To do this we have a field that is a count of the number of bytes, and the size of that field is encoded in unary.
|
To do this we have a field that is a count of the number of bytes, and the size of that field is encoded in unary.
|
||||||
|
|
||||||
Thus a single byte value, representing integers in the range $0\le n \lt 2^7$ starts with a leading zero bit
|
Thus a single byte value, representing integers in the range $0\le n \lt 2^7$ starts with a leading zero bit
|
||||||
|
|
||||||
A two byte value, representing integers in the range $2^7\le n \lt 2^{13}+2^7$ starts with the bits 100
|
A two byte value, representing integers in the range $2^7\le n \lt 2^{13}+2^7$ starts with the bits 10 0
|
||||||
|
|
||||||
A three byte value, representing integers in the range $2^{13}+2^7 \le n \lt 2^{21}+2^{13}+2^7$ starts with the bits 101
|
A three byte value, representing integers in the range $2^{13}+2^7 \le n \lt 2^{21}+2^{13}+2^7$ starts with the bits 10 1
|
||||||
|
|
||||||
A four byte value representing integers in the range $2^{21}+2^{13}+2^7 \le n \lt 2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 11000
|
A four byte value representing integers in the range $2^{21}+2^{13}+2^7 \le n \lt 2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 110 00
|
||||||
|
|
||||||
A five byte value representing integers in the range $2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 11001
|
A five byte value representing integers in the range $2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 110 01
|
||||||
|
|
||||||
A six byte value representing integers in the range $2^{35}+2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 11010
|
A six byte value representing integers in the range $2^{35}+2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 110 10
|
||||||
|
|
||||||
A seven byte value representing integers in the range $2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 11011
|
A seven byte value representing integers in the range $2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 110 11
|
||||||
|
|
||||||
An eight byte value representing integers in the range $2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{57}+2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 1110000
|
An eight byte value representing integers in the range $2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{57}+2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 1110 000
|
||||||
|
|
||||||
A nine byte value representing integers in the range $2^{57}+2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{65}+2^{57}+2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 1110001
|
A nine byte value representing integers in the range $2^{57}+2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7 \le n \lt 2^{65}+2^{57}+2^{51}+2^{43}+2^{35}+2^{27}+2^{21}+2^{13}+2^7$ starts with the bits 1110 001
|
||||||
|
|
||||||
Similarly the bits 111 0111 indicate a fifteen byte value representing 113 bit integers.
|
Similarly the bits 1110 111 indicate a fifteen byte value representing 113 bit integers.
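
A minimal Python sketch of the unsigned encoding described above may make the pattern easier to check. It assumes that a run of $k$ one bits followed by a zero bit announces a $k$ bit count field, that the value is $2^k$ plus the count bytes long, and that each length's range starts where the previous length's range ends. The function name `encode_unsigned` and the fifteen byte cap are choices made for the example only.

```python
def encode_unsigned(n: int, max_bytes: int = 15) -> bytes:
    """Encode n so that byte-string order matches integer order.

    Hypothetical helper following the prefix pattern in the text:
    0 | 10 c | 110 cc | 1110 ccc, where the count bits c select the
    byte length within each group.
    """
    if n < 0:
        raise ValueError("unsigned values only")
    offset = 0  # smallest integer representable at the current length
    for length in range(1, max_bytes + 1):
        k = length.bit_length() - 1      # number of leading one bits
        count = length - (1 << k)        # value stored in the count field
        prefix_bits = 2 * k + 1          # k ones, one zero, k count bits
        prefix = (((1 << k) - 1) << (k + 1)) | count
        payload_bits = 8 * length - prefix_bits
        if n - offset < (1 << payload_bits):
            word = (prefix << payload_bits) | (n - offset)
            return word.to_bytes(length, "big")
        offset += 1 << payload_bits      # move on to the next length's range
    raise OverflowError("value exceeds the largest supported encoding")


print(encode_unsigned(127).hex())  # prints 7f
print(encode_unsigned(128).hex())  # prints 8000
```

Because no encoding is a prefix of another and longer encodings always carry a larger prefix, comparing the resulting byte strings lexicographically gives the same order as comparing the integers.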
|
||||||
|
|
||||||
To represent signed integers so that signed integers sort correctly with each other (but not with unsigned integers) the leading bit indicates the sign, a one bit for positive signed integers, and a zero bit for negative integers, and if the signed integer is negative, we invert the bits of the byte count. Thus signed integers in the range $-2^6\le n \lt 2^6$ are represented by the corresponding eight bit value with its leading bit inverted.
|
To represent signed integers so that signed integers sort correctly with each other (but not with unsigned integers) the leading bit indicates the sign, a one bit for positive signed integers, and a zero bit for negative integers, and if the signed integer is negative, we invert the bits of the byte count. Thus signed integers in the range $-2^6\le n \lt 2^6$ are represented by the corresponding eight bit value with its leading bit inverted.
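
The single byte case can be checked with a short Python sketch. It assumes that "the corresponding eight bit value" means the eight bit two's complement representation, and the name `encode_small_signed` is made up for the example. Longer signed values are not covered here.

```python
def encode_small_signed(n: int) -> bytes:
    """One-byte encoding of a signed integer in -2**6 <= n < 2**6.

    Hypothetical helper: takes the eight bit two's complement value
    and inverts its leading bit, so that byte order matches integer
    order.
    """
    if not -(1 << 6) <= n < (1 << 6):
        raise ValueError("only the single byte range is sketched here")
    return bytes([(n & 0xFF) ^ 0x80])


print(encode_small_signed(-1).hex())  # prints 7f
print(encode_small_signed(0).hex())   # prints 80
```

Under that assumption the negative values occupy `0x40` to `0x7f` and the non-negative values `0x80` to `0xbf`, which agrees with a sign bit followed by an inverted one bit length prefix for negative numbers.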
|
||||||
|
|
||||||
@ -96,6 +96,103 @@ We display a value in the range $0\le n \lt 58/2$ as itself,
|
|||||||
|
|
||||||
a value $n$ in the range $58/2\le n \lt \lfloor 58*2^{-2}\rfloor*58 +58/2$ as the base 58 representation of $n+58*(58/2-1)$
|
a value $n$ in the range $58/2\le n \lt \lfloor 58*2^{-2}\rfloor*58 +58/2$ as the base 58 representation of $n+58*(58/2-1)$
|
||||||
|
|
||||||
|
## Variable length bit fields
|
||||||
|
|
||||||
|
To represent variable length bit fields in the postfix sort order,
|
||||||
|
such that a shorter bit field sorts after all longer bit fields
|
||||||
|
with same leading bits:
|
||||||
|
|
||||||
|
We break it into seven bit fields, with a final field representing zero to six bits.
|
||||||
|
|
||||||
|
A seven bit field is represented by a byte ending in a zero low order bit.
|
||||||
|
|
||||||
|
A variable length $m$ bit field where m is 0 to 6 (seven possible
|
||||||
|
values) is represented by a fixed width eight bit field:
|
||||||
|
|
||||||
|
Where if\
|
||||||
|
$j$ is the bitfield interpreted as a number\
|
||||||
|
$m$ is the length of the bitfield\
|
||||||
|
$c$ is a count of the set bits in the bitfield
|
||||||
|
|
||||||
|
The value of the eight bit field is:\
|
||||||
|
$(j+1)*2^{(8-m)}-2*c-3$
|
||||||
|
|
||||||
|
------------------------
variable     7 bit
bit field    bitfield
-----------  -----------
000000       0000 0001

000001       0000 0011

00000        0000 0101

000010       0000 0111

000011       0000 1001

00001        0000 1011

0000         0000 1101

000100       0000 1111

000101       0001 0001

00010        0001 0011

000110       0001 0101

000111       0001 0111

00011        0001 1001

0001         0001 1011

000          0001 1101

...          ...

111101       1110 1011

11110        1110 1101

111110       1110 1111

111111       1111 0001

11111        1111 0011

1111         1111 0101

111          1111 0111

11           1111 1001

1            1111 1011

empty        1111 1101
------------------------
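
The mapping can be reproduced with a short Python sketch. It treats the final field as a vertex in a binary tree six levels deep, numbers the vertices in postfix order starting from zero, and emits twice that number plus one, so that the low order bit is always set. The closed form used in the code was derived from that ordering and checked against the opening and closing rows of the table, and should be read as an assumption. The name `postfix_byte` and the `depth` parameter are likewise made up for the example.

```python
def postfix_byte(bits: str, depth: int = 6) -> int:
    """Map a bit field of 0..depth bits to its byte in postfix order.

    Hypothetical helper. j is the field read as a number, m its
    length, c its count of set bits.
    """
    m = len(bits)
    if m > depth or set(bits) - {"0", "1"}:
        raise ValueError("expected at most %d characters of 0 and 1" % depth)
    j = int(bits, 2) if bits else 0
    c = bits.count("1")
    # Vertices visited before this one in a postfix walk of the tree.
    index = (j + 1) * (1 << (depth + 1 - m)) - c - 2
    return 2 * index + 1  # odd, so it cannot be mistaken for a full field


for field in ["000000", "000001", "00000", "1", ""]:
    print(field or "empty", format(postfix_byte(field), "08b"))
```

A shorter field therefore sorts after every longer field that shares its leading bits, which is the postfix property asked for at the start of this section.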
|
||||||
|
|
||||||
|
### SQL blobs.
|
||||||
|
|
||||||
|
In order for blobs in a database representing bitfields to sort
|
||||||
|
correctly, we do not use seven bit nibbles, but eight bit bytes,
|
||||||
|
with a final byte representing zero to seven bits as an eight bit byte.
|
||||||
|
|
||||||
|
For this we use the mapping:
|
||||||
|
Where if\
|
||||||
|
$j$ is the bitfield interpreted as a number\
|
||||||
|
$m$ is the length of the bitfield\
|
||||||
|
$c$ is a count of the set bits in the bitfield
|
||||||
|
|
||||||
|
The value of the eight bit field is:\
|
||||||
|
$j*(2^{(7-m)}-1)+c$
|
||||||
|
|
||||||
|
The difference is that blob is preceded by a count field
|
||||||
|
that is not used in the sort order, which is tricky to
|
||||||
|
do in a Merkle-patricia tree representing an sql index.
|
||||||
|
|
||||||
## Use case
|
## Use case
|
||||||
|
|
||||||
QR codes and prefix free number encoding is useful in cases where we want data to be self describing – this bunch of bits is to be interpreted in a certain way, used in a certain action, means one thing, and not another thing. At present there is no standard for self description. QR codes are given meanings by the application, and could carry completely arbitrary data whose meaning and purpose comes from outside, from the context.
|
QR codes and prefix free number encoding is useful in cases where we want data to be self describing – this bunch of bits is to be interpreted in a certain way, used in a certain action, means one thing, and not another thing. At present there is no standard for self description. QR codes are given meanings by the application, and could carry completely arbitrary data whose meaning and purpose comes from outside, from the context.
|
||||||
@ -118,7 +215,7 @@ will only be one, and it will be a long time before there are two.
|
|||||||
|
|
||||||
When I say "arbitrarily large" I do not mean arbitrarily large, since this creates the possibility that someone could break something by sending a number bigger than the software can handle. There needs to be an absolute limit, such as sixty four bits, on representable numbers. But the limit should be larger than is ever likely to have a legitimate use.
|
When I say "arbitrarily large" I do not mean arbitrarily large, since this creates the possibility that someone could break something by sending a number bigger than the software can handle. There needs to be an absolute limit, such as sixty four bits, on representable numbers. But the limit should be larger than is ever likely to have a legitimate use.
|
||||||
|
|
||||||
# Solutions
|
# Other Solutions
|
||||||
|
|
||||||
## Zero byte encoding
|
## Zero byte encoding
|
||||||
|
|
||||||
@ -128,7 +225,7 @@ When I say "arbitrarily large" I do not mean arbitrarily large, since this creat
|
|||||||
|
|
||||||
QUIC expresses a sixty two bit number as one to four sixteen bit numbers. This is the fastest to encode and decode.
|
QUIC expresses a sixty two bit number as one to four sixteen bit numbers. This is the fastest to encode and decode.
|
||||||
|
|
||||||
## Leading bit as number boundary
|
## VLQ Leading bit as number boundary
|
||||||
|
|
||||||
But it seems to me that the most efficient reasonably fast and elegant
|
But it seems to me that the most efficient reasonably fast and elegant
|
||||||
solution is a variant on utf8 encoding, though not quite as fast as the
|
solution is a variant on utf8 encoding, though not quite as fast as the
|
||||||
|
@ -79,7 +79,7 @@ around and are disinclined to make it available. And if they did make it
|
|||||||
available, the same peer would appear in far too many different and
|
available, the same peer would appear in far too many different and
|
||||||
unrelated branches of the tree, creating excessive [Kademlia] lookup costs.
|
unrelated branches of the tree, creating excessive [Kademlia] lookup costs.
|
||||||
|
|
||||||
## Merkle Patricia tree of signatures
|
## Merkle-patricia tree of signatures
|
||||||
|
|
||||||
Suppose that every block of the root primary blockchain contains hash of Merkle-patricia keys of signatures of blobs.
|
Suppose that every block of the root primary blockchain contains hash of Merkle-patricia keys of signatures of blobs.
|
||||||
|
|
||||||
@ -376,8 +376,8 @@ rather than what our enemies in Ethereum want done.
|
|||||||
|
|
||||||
The key is writing a language that operates on what looks to it like sql
|
The key is writing a language that operates on what looks to it like sql
|
||||||
tables, to produce proof that the current state, expressed as a collection of
|
tables, to produce proof that the current state, expressed as a collection of
|
||||||
tables represented as a Merkle Patricia tree, is the result of valid
|
tables represented as a Merkle-patricia tree, is the result of valid
|
||||||
operations on a collection of transactions, represented as Merkle patricia
|
operations on a collection of transactions, represented as Merkle-patricia
|
||||||
tree, that acted on the previous current state, that allows generic
|
tree, that acted on the previous current state, that allows generic
|
||||||
transactions, on generic tables, rather than Ethereum transactions on
|
transactions, on generic tables, rather than Ethereum transactions on
|
||||||
Ethereum data structures.
|
Ethereum data structures.
|
||||||
@ -454,7 +454,7 @@ rocket and calling it a space plane.
|
|||||||
|
|
||||||
A blockchain is of course a chain of blocks, and at scale, each block would be far too immense for any one peer to store or process, let alone the entire chain.
|
A blockchain is of course a chain of blocks, and at scale, each block would be far too immense for any one peer to store or process, let alone the entire chain.
|
||||||
|
|
||||||
Each block would be a Merkle patricia tree, or a Merkle tree of a number of Merkle patricia trees, because we want the block to be broad and flat, rather than deep and narrow, so that it can be produced in a massively parallel way, created in parallel by an immense number of peers. Each block would contain a proof that it was validly derived from the previous block, and that the previous block’s similar proof was verified. A chain is narrow and deep, but that does not matter, because the proofs are “scalable”. No one has to verify all the proofs from the beginning, they just have to verify the latest proofs.
|
Each block would be a Merkle-patricia tree, or a Merkle tree of a number of Merkle-patricia trees, because we want the block to be broad and flat, rather than deep and narrow, so that it can be produced in a massively parallel way, created in parallel by an immense number of peers. Each block would contain a proof that it was validly derived from the previous block, and that the previous block’s similar proof was verified. A chain is narrow and deep, but that does not matter, because the proofs are “scalable”. No one has to verify all the proofs from the beginning, they just have to verify the latest proofs.
|
||||||
|
|
||||||
Each peer would keep around the actual data and actual proofs that it cared about, and the chain of hashes linking the data it cared about to Merkle root of the latest block.
|
Each peer would keep around the actual data and actual proofs that it cared about, and the chain of hashes linking the data it cared about to Merkle root of the latest block.
|
||||||
|
|
||||||
|
@ -53,15 +53,15 @@ You find the expected scale height, the amount that causes the probability of a
|
|||||||
|
|
||||||
But if both sides have vast collections of identical or near identical transactions, as is highly likely because they probably just synchronized with the same people, each item in a filter is going to convey very little information. Further, you can never be sure that you are completely synchronized except by setting a lot of bits for each item.
|
But if both sides have vast collections of identical or near identical transactions, as is highly likely because they probably just synchronized with the same people, each item in a filter is going to convey very little information. Further, you can never be sure that you are completely synchronized except by setting a lot of bits for each item.
|
||||||
|
|
||||||
## Merkle Patricia tree
|
## Merkle-patricia tree
|
||||||
|
|
||||||
So, you build a Merkle patricia tree.
|
So, you build a Merkle-patricia tree.
|
||||||
|
|
||||||
And then you want to transmit a filter that represents the upper portion of the tree where the likelihood of a discrepancy between Bob's tree and Carol's tree is around fifty percent. When you see a discrepancy, you go deeper into that part of the tree on the next sub round. A large part of the time, the discrepancy will be a single transaction. When you have isolated all the discrepancies, rinse and repeat. Eventually the root hashes will agree, so the snapshot the Bob's concurrent process took is now synchronized to Carol, and the snapshot that Carol's concurrent process took is now synchronized to Bob. But new transactions have probably arrived, so time to take the next snapshot.
|
And then you want to transmit a filter that represents the upper portion of the tree where the likelihood of a discrepancy between Bob's tree and Carol's tree is around fifty percent. When you see a discrepancy, you go deeper into that part of the tree on the next sub round. A large part of the time, the discrepancy will be a single transaction. When you have isolated all the discrepancies, rinse and repeat. Eventually the root hashes will agree, so the snapshot the Bob's concurrent process took is now synchronized to Carol, and the snapshot that Carol's concurrent process took is now synchronized to Bob. But new transactions have probably arrived, so time to take the next snapshot.
|
||||||
|
|
||||||
You discover how deep that is by initially sending the full filter of vertex and leaf hashes for just a portion of the address space covered by the tree. From what shows up, in the next round you will be roughly right for filter depth.
|
You discover how deep that is by initially sending the full filter of vertex and leaf hashes for just a portion of the address space covered by the tree. From what shows up, in the next round you will be roughly right for filter depth.
|
||||||
|
|
||||||
You do want to use a cryptographically strong hash for the identifier of each transaction, because that is global public information, and we do not want people to be able to cook up transactions that will force hash collisions, because that would enable them to engage in Byzantine defection. But you want to use Murmur for the vertices of the tree that represents transactions that Bob does not yet know whether Carol already has, since that is bilateral information maintained by the concurrent process that is managing Bob's connection with Carol, so Byzantine defection is impossible. When, however, Bob's concurrent process managing the connection with Carol whips up a Merkle patricia tree, it should use Murmur3, because there will be a lot of such processes generating a lot of Merkle patricia trees, but only one cryptographic hash representing each transaction. Lots of such trees are generated, and lots discarded.
|
You do want to use a cryptographically strong hash for the identifier of each transaction, because that is global public information, and we do not want people to be able to cook up transactions that will force hash collisions, because that would enable them to engage in Byzantine defection. But you want to use Murmur for the vertices of the tree that represents transactions that Bob does not yet know whether Carol already has, since that is bilateral information maintained by the concurrent process that is managing Bob's connection with Carol, so Byzantine defection is impossible. When, however, Bob's concurrent process managing the connection with Carol whips up a Merkle-patricia tree, it should use Murmur3, because there will be a lot of such processes generating a lot of Merkle-patricia trees, but only one cryptographic hash representing each transaction. Lots of such trees are generated, and lots discarded.
|
||||||
|
|
||||||
[SMhasher]:https://github.com/aappleby/smhasher
|
[SMhasher]:https://github.com/aappleby/smhasher
|
||||||
|
|
||||||
@ -80,7 +80,7 @@ where $g=11400714819323198485$, the odd number nearest to $2^{64}$ divided by th
|
|||||||
|
|
||||||
Which would be a disastrously weak hash if our starting values were highly ordered, but is likely to suffice because our starting values are strongly random. Needless to say, it has absolutely no resistance to cryptographic attack, but such an attack is pointless, because our starting values are cryptographically strong, our resulting values don't involve any public commitments and we intend to reveal the preimage in due course.
|
Which would be a disastrously weak hash if our starting values were highly ordered, but is likely to suffice because our starting values are strongly random. Needless to say, it has absolutely no resistance to cryptographic attack, but such an attack is pointless, because our starting values are cryptographically strong, our resulting values don't involve any public commitments and we intend to reveal the preimage in due course.
|
||||||
|
|
||||||
Come to think of it, we can get away with 64 bit hashes, provided we subsample the underlying cryptographically strong 256 bit hashes differently each time, since we do not need to get absolutely perfect synchronization in any one synchronization event. We can live with the occasional rare Merkle patricia tree that gives the same hash for two different sets of transactions. The error will be cleaned up in the next synchronization event.
|
Come to think of it, we can get away with 64 bit hashes, provided we subsample the underlying cryptographically strong 256 bit hashes differently each time, since we do not need to get absolutely perfect synchronization in any one synchronization event. We can live with the occasional rare Merkle-patricia tree that gives the same hash for two different sets of transactions. The error will be cleaned up in the next synchronization event.
|
||||||
|
|
||||||
Thus the hash of two 64 bit hashes, $U$ and $V$, is $(Ug+V)\%2^{64}$.
|
Thus the hash of two 64 bit hashes, $U$ and $V$, is $(Ug+V)\%2^{64}$.
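
As a sketch, the combining step is a single multiply and add reduced modulo $2^{64}$. The constant is the value of $g$ given earlier in this file. The names `G`, `MASK64` and `combine` are chosen for the example, and this is an illustration of the formula rather than the project's code.

```python
G = 11400714819323198485   # the constant g given in the text
MASK64 = (1 << 64) - 1     # reduce modulo 2**64 by masking


def combine(u: int, v: int) -> int:
    """Hash of two 64 bit hashes U and V: (U*g + V) % 2**64.

    Not resistant to deliberate collisions. It relies on the inputs
    being subsamples of cryptographically strong hashes.
    """
    return (u * G + v) & MASK64


left, right = 0x0123456789ABCDEF, 0xFEDCBA9876543210
print(hex(combine(left, right)))
```

The operation is not symmetric in its arguments, so swapping the two child hashes of a vertex generally changes the result.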
|
||||||
|
|
||||||
|
@ -4,7 +4,7 @@ title: Variable Length Quantity
|
|||||||
|
|
||||||
I originally implemented variable length quantities following the standard.
|
I originally implemented variable length quantities following the standard.
|
||||||
|
|
||||||
And then I realized that an sql index represented as a merkle-patricia tree inherently sorts in byte string order.
|
And then I realized that an sql index represented as a Merkle-patricia tree inherently sorts in byte string order.
|
||||||
Which is fine if we represent integers as fixed length integers in big endian format,
|
Which is fine if we represent integers as fixed length integers in big endian format,
|
||||||
but does not correctly sort variable length quantities if we follow the standard:
|
but does not correctly sort variable length quantities if we follow the standard:
|
||||||
|
|
||||||
@ -106,13 +106,13 @@ So no longer using these complicated offset for the number itself,
|
|||||||
but are using them for the byte count.
|
but are using them for the byte count.
|
||||||
We use the negative of the count, in order to get the correct
|
We use the negative of the count, in order to get the correct
|
||||||
sort order on the underlying byte strings, so that they can be
|
sort order on the underlying byte strings, so that they can be
|
||||||
represented in a Merkle patricia tree representing an index.
|
represented in a Merkle-patricia tree representing an index.
|
||||||
|
|
||||||
And so on and so forth in the same pattern for negative signed numbers of unlimited size.
|
And so on and so forth in the same pattern for negative signed numbers of unlimited size.
|
||||||
|
|
||||||
# bitstrings
|
# bitstrings
|
||||||
|
|
||||||
Bitstrings in merkle patricia tree representing an sql index
|
Bitstrings in Merkle-patricia tree representing an sql index
|
||||||
are typically very short, so should be represented by a
|
are typically very short, so should be represented by a
|
||||||
variable length quantity, except for the leaf edge,
|
variable length quantity, except for the leaf edge,
|
||||||
which is fixed size and large, so should not be
|
which is fixed size and large, so should not be
|
||||||
|