Saner plan for representing bit strings.
Because I realized that the merkle vertex encloses them and gives them order and the merkle vertex is itself enclosed and given order.
parent 022d794d5f
commit a88fc58e71
@@ -207,22 +207,18 @@ are typically very short, so should be represented by a
variable length quantity, which does not need to have the correct
bytestring sort order.

So a bitstring of six bits or fewer is represented as a single byte
with a leading zero bit, in which the bits following the first one bit
are the bitstring. If the leading bit is one, the byte is instead a
byte count for the byte aligned bits that follow. Because we know the
start alignment, the beginning of the bitfield is implicit, and the
final byte encodes a bit field of zero to seven bits. This can
represent a bytestring of one to 128 bytes. However, unreasonably large
counts, which would mean variable length bytestrings holding
unreasonably large bitstrings, are reserved for future expansion, since
the largest bitstring that will be valid in normal usage will occupy
thirty three bytes, being a full sized hash followed by a byte
representing the zero length bitstring.
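A minimal sketch of the one byte form and the counted form described above, in Python. The function names are hypothetical. The text does not say how the final byte of the counted form marks where its zero to seven bits end, so this sketch assumes a single one bit terminator followed by zero padding, which makes a lone 0x80 byte the zero length bit field. It also assumes the low seven bits of the count byte hold the byte count minus one.

```python
def encode_bitstring(bits: str) -> bytes:
    """Encode a string of '0'/'1' characters (hypothetical helper)."""
    if len(bits) <= 6:
        # Short form: leading zero bit, then a one bit marker, then the bits.
        return bytes([(1 << len(bits)) | int(bits or "0", 2)])
    # Counted form. ASSUMPTION: the bits are terminated by a single one bit
    # and zero padded, so the final byte carries zero to seven bits.
    padded = bits + "1"
    padded += "0" * (-len(padded) % 8)
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    if len(body) > 128:
        raise ValueError("more than 128 bytes: reserved for future expansion")
    # ASSUMPTION: low seven bits of the count byte hold (byte count - 1).
    return bytes([0x80 | (len(body) - 1)]) + body


def decode_bitstring(data: bytes) -> str:
    """Inverse of encode_bitstring; does no validation of malformed input."""
    head = data[0]
    if head & 0x80 == 0:
        if head == 0:
            raise ValueError("short form needs a one bit marker")
        # Drop the marker bit; what follows it is the bitstring.
        return f"{head:b}"[1:]
    count = (head & 0x7F) + 1
    bits = "".join(f"{b:08b}" for b in data[1:1 + count])
    # Everything before the last one bit is payload (assumed terminator).
    return bits[:bits.rindex("1")]
```

Under these assumptions a full 256 bit hash becomes a count byte followed by thirty three bytes, the last of which is 0x80.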
I have no end of clever ideas for representing them in fully compressed
form, but all we actually need is a count of the bits of the vertex, the
difference counts for the number of bits in the left and right edges,
the left bytestring, which contains the parent and left bitstrings, and,
if the right bitstring is more than one bit longer than the parent
bitstring, the difference bytes for the right bitstring.

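The fields listed above could be held in a record like the following Python sketch. All names here are hypothetical. The sketch assumes that the first bit of each edge beyond the parent is the branching bit (0 for left, 1 for right), which is why a right bitstring only one bit longer than the parent needs no difference bytes. It also reads "difference bytes" as the right edge's bits after that branching bit, packed left aligned with zero padding. Neither assumption is stated in the text.

```python
from dataclasses import dataclass


def pack_bits(bits: str) -> bytes:
    """Pack a '0'/'1' string into bytes, left aligned and zero padded."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def unpack_bits(data: bytes, nbits: int) -> str:
    """Return the first nbits bits of data as a '0'/'1' string."""
    return "".join(f"{b:08b}" for b in data)[:nbits]


@dataclass
class VertexBits:
    parent_len: int    # count of the bits of the vertex
    left_extra: int    # difference count for the left edge
    right_extra: int   # difference count for the right edge
    left_bytes: bytes  # parent bitstring followed by the left bitstring
    right_diff: bytes  # empty unless the right edge adds more than one bit


def make_vertex(parent: str, left: str, right: str) -> VertexBits:
    # left and right are the full edge bitstrings, each extending parent.
    assert left.startswith(parent) and right.startswith(parent)
    # ASSUMPTION: the branching bit is 0 on the left and 1 on the right.
    assert left[len(parent)] == "0" and right[len(parent)] == "1"
    return VertexBits(
        parent_len=len(parent),
        left_extra=len(left) - len(parent),
        right_extra=len(right) - len(parent),
        left_bytes=pack_bits(left),
        right_diff=pack_bits(right[len(parent) + 1:]),
    )


def recover(v: VertexBits) -> tuple[str, str, str]:
    """Rebuild (parent, left, right) bitstrings from the stored fields."""
    left = unpack_bits(v.left_bytes, v.parent_len + v.left_extra)
    parent = left[:v.parent_len]
    right = parent + "1" + unpack_bits(v.right_diff, v.right_extra - 1)
    return parent, left, right
```

For example, `make_vertex("1011", "10110", "10111")` stores no difference bytes at all, because the right edge is fully determined by the parent and the implicit branching bit.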
If the bitstring representing the edge brings us to the end of the
field, it is a leaf edge, which is a different type, being a pointer to
what is being indexed rather than a pointer to another patricia vertex,
and it may have additional data.
If we want to be terribly clever at optimization: if both leaf
bitstrings are longer than the parent bitstring by only one bit, we
store just the bytes containing the parent bitstring. Otherwise we store
the bytestring containing all the bits of the longest edge bitstring,
plus the difference bytes for the shorter bitstring if it is longer
than its parent by more than one bit.

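The selection rule above might look like this in Python. The function name is hypothetical and the result is left as '0'/'1' strings rather than packed bytes. As an assumption not stated in the text, the first bit of each edge beyond the parent is treated as an implicit branching bit, so the difference for the shorter edge is whatever follows that bit.

```python
def stored_bitstrings(parent: str, left: str, right: str) -> tuple[str, str]:
    """Return (main, diff): the bits to store in full, and the extra bits
    needed for the shorter edge (empty when they can be inferred)."""
    extra_left = len(left) - len(parent)
    extra_right = len(right) - len(parent)
    if extra_left == 1 and extra_right == 1:
        # Both edges add only the branching bit: the parent alone suffices.
        return parent, ""
    # Store the longest edge in full; it already contains the parent bits.
    longer, shorter = (left, right) if extra_left >= extra_right else (right, left)
    short_extra = len(shorter) - len(parent)
    # ASSUMPTION: the shorter edge's first extra bit is implied by the
    # branch direction, so only the bits after it are kept.
    diff = shorter[len(parent) + 1:] if short_extra > 1 else ""
    return longer, diff
```

A reader of the stored form would still need the difference counts for both edges to know which edge was stored in full and how many difference bits to expect.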
An edge in a Merkle-patricia sql index contains the bit path
of the thing pointed to, and the completely unrelated hash of the