From a88fc58e7176a87128babe21c01129a3673b1617 Mon Sep 17 00:00:00 2001
From: "reaction.la"
Date: Fri, 3 Nov 2023 23:19:26 +0000
Subject: [PATCH] Saner plan for representing bit strings.

Because I realized that the merkle vertex encloses them and gives them
order and the merkle vertex is itself enclosed and given order.
---
 docs/number_encoding.md | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/docs/number_encoding.md b/docs/number_encoding.md
index a85ba94..c7abfcc 100644
--- a/docs/number_encoding.md
+++ b/docs/number_encoding.md
@@ -1,5 +1,5 @@
 ---
-# katex
+#katex
 title: Number encoding
 sidebar: true
 ...
@@ -207,22 +207,18 @@ are typically very short, so should be represented by a variable
 length quantity. Which does not need to have the correct bytestring
 sort order.
 
-So for bitstrings of six bits or less, we represent it as a byte with
-a leading zero bit, and the bits following the first one bit are
-the bitstring, and if the leading bit is one, it is a byte count
-of byte aligned bits. Because we know the start alignment, the
-beginning of the bitfield is implicit, and the final byte encodes
-a bit field of zero to seven bits. This can represent a bytestring
-of one to 128 bytes. However unreasonably large values, representing
-variable length bytestrings representing unreasonably large bitstrings,
-we reserve for future expansion, since the largest bitstring that will
-be valid in normal usage will be thirty three bytes, being a full sized
-hash followed by a byte representing the zero length bitstring.
+I have no end of clever ideas to represent them in fully compressed form,
+but all we actually need is a count of the bits of the vertex, the difference
+counts for the number of bits in the left and right edges,
+the left bytestring which contains the parent and left bitstrings, and,
+if the right bitstring is more than one bit longer than the parent bitstring,
+the difference bytes for the right bitstring.
 
-If the bitstring representing the edge brings us the end of field, it
-is leaf edge, which is a different type, being a pointer to what is
-being indexed rather than a pointer to another patricia vertex and
-may have additional data.
+If we want to be terribly clever at optimization, if both leaf bitstrings
+are only greater by one than the parent bistring, we have bytes containing
+the parent bitstring, otherwise the bytestring containing all the bits of
+the longest edge bitstring, plus the difference bytes for the shorter
+bitstring if it is longer than its parent by more than one.
 
 An edge in a Merkle-patricia sql index contains the bit path of the
 thing pointed to, and the completely unrelated hash of the
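
Reviewer note: the simple (unoptimized) layout described in the added paragraph can be sketched roughly as below. This is only an illustration under stated assumptions, not the project's actual format: the names `pack_bits`, `unpack_bits`, `EncodedVertex`, `encode_vertex` and `decode_vertex` are hypothetical, bits are assumed to be packed most significant bit first with zero padding, and the left edge is assumed to continue the parent bitstring with a 0 bit and the right edge with a 1 bit, which is why the first bit of the right edge never needs to be stored.

```python
from dataclasses import dataclass


def pack_bits(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters into bytes, MSB first, zero padded."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def unpack_bits(data: bytes, nbits: int) -> str:
    """Recover the first nbits bits of a packed bytestring as a '0'/'1' string."""
    return "".join(f"{b:08b}" for b in data)[:nbits]


@dataclass
class EncodedVertex:
    vertex_bits: int   # count of the bits of the vertex (parent) bitstring
    left_extra: int    # difference count: bits the left edge adds to the parent
    right_extra: int   # difference count: bits the right edge adds to the parent
    left_bytes: bytes  # parent bits followed by the left edge bits
    right_diff: bytes  # right edge bits after the implied first bit; empty if right_extra == 1


def encode_vertex(parent: str, left: str, right: str) -> EncodedVertex:
    # Assumption: both edges extend the parent by at least one bit,
    # the left edge continuing with 0 and the right edge with 1.
    assert left.startswith(parent) and right.startswith(parent)
    left_extra = len(left) - len(parent)
    right_extra = len(right) - len(parent)
    assert left_extra >= 1 and right_extra >= 1
    assert left[len(parent)] == "0" and right[len(parent)] == "1"
    right_tail = right[len(parent) + 1:]  # skip the implied branch bit
    return EncodedVertex(len(parent), left_extra, right_extra,
                         pack_bits(left), pack_bits(right_tail))


def decode_vertex(v: EncodedVertex) -> tuple[str, str, str]:
    left = unpack_bits(v.left_bytes, v.vertex_bits + v.left_extra)
    parent = left[:v.vertex_bits]
    right = parent + "1" + unpack_bits(v.right_diff, v.right_extra - 1)
    return parent, left, right
```

For example, `decode_vertex(encode_vertex("1011", "10110", "1011110"))` gives back the three original bitstrings, and when the right edge is exactly one bit longer than the parent no difference bytes are stored for it at all.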