---
title: Scaling, trust and clients
...
The fundamental strength of the blockchain architecture is that it is an immutable public ledger. The fundamental flaw of the blockchain architecture is that it is an immutable public ledger.
This is a problem for privacy and fungibility, but what is really biting is scalability, the sheer size of the thing. Every full peer has to download every transaction that anyone ever did, evaluate that transaction for validity, and store it forever. And we are running hard into the physical limits of that. Every full peer on the blockchain has to know every transaction and every output of every transaction that ever there was.
As someone said when Satoshi first proposed what became bitcoin: “it does not seem to scale to the required size.”
And here we are now, fourteen years later, at rather close to that scaling limit. And for fourteen years, very smart people have been looking for a way to scale without limits.
And, at about the same time as we are hitting scalability limits, “public” is becoming a problem for fungibility. The fungibility crisis and the scalability crisis are hitting at about the same time. The fungibility crisis is hitting eth and is threatening bitcoin.
That the ledger is public enables the blood diamonds attack on crypto currency. Some transaction outputs could be deemed dirty, and rendered unspendable by centralized power, and eventually, to avoid being blocked, you have to make everything KYC, and then, even though you are fully compliant, you are apt to get arbitrarily and capriciously blocked because the government, people in quasi government institutions, or random criminals on the revolving door between regulators and regulated decide they do not like you for some whimsical reason. I have from time to time lost small amounts of totally legitimate fiat money in this fashion, as international transactions become ever more difficult and dangerous, and recently lost an enormous amount of totally legitimate fiat money in this fashion.
Eth is highly centralized, and the full extent that it is centralized and in bed with the state is now being revealed, as tornado eth gets demonetized.
Some people in eth are resisting this attack. Some are not.
Bitcoiners have long accused eth of being a shitcoin, which accusation is obviously false, though with the blood diamonds attack under way on eth it is likely to become true. It is not a shitcoin, but I have long regarded it as likely to become one. Which expectation may well come true shortly.
A highly centralized crypto currency is closer to being an unregulated bank than a crypto currency. Shitcoins are fraudulent unregulated banks posing as crypto currencies. Eth may well be about to turn into a regulated bank. When bitcoiners accuse eth of being a shitcoin, the truth in their accusation is dangerous centralization, and dangerous closeness to the authorities.
The advantage of crypto currency is that as elite virtue collapses, the regulated banking system becomes ever more lawless, arbitrary, corrupt, and unpredictable. An immutable ledger ensures honest conduct. But if a central authority has too much power over the crypto currency, they get to retroactively decide what the ledger means. Centralization is a central point of failure, and in a world of ever more morally debased and degenerate elites, will fail. Maybe Eth is failing now. If not, it will likely fail by and by.
# Scaling
The Bitcoin blockchain has become inconveniently large, and evaluating it
from beginning to end to determine the current mutable state is apt to fail
half way through: your computer takes a very long time, and is apt to fail
before making it all the way.
And, to take over the world, it needs to become one hundred times larger.
Instead of five hundred gigabytes of blockchain, fifty terabytes. And if you
have eight eight-terabyte drives attached to your computer, it is a big,
expensive computer, and drives keep failing from time to time. And they
are not the only thing that fails from time to time.
It is doable, but the only people that would be full peers on the blockchain
would be a bunch of corporations and quite wealthy individuals. You
would have a small data centre, rather than a computer. Which is likely to
have bad consequences.
## Kademlia
[Kademlia]:https://codethechange.stanford.edu/guides/guide_kademlia.html
"Distributed Hash Tables with Kademlia"
{target="_blank"}
[Kademlia] is a patricia tree of hashes, except that instead of direct links to
data, we have groups of peers that are known to be up, running, and
handling that vertex of the patricia tree.
Trouble is, if we need to look up an enormous number of links, we don't
want to walk through that tree an enormous number of times, since each
step in the walk is relatively slow.
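Kademlia's structure follows from one simple metric: peers and keys share a single id space, and distance between two ids is their XOR. A minimal sketch in Python (the names and the single-bucket lookup here are illustrative, not any particular implementation):

```python
import hashlib

def node_id(name: str) -> int:
    # A 160-bit identifier, as in classic Kademlia.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def xor_distance(a: int, b: int) -> int:
    # Kademlia's distance metric: "closeness" is the XOR of two ids.
    return a ^ b

def closest_peers(key: int, peers: list[int], k: int = 3) -> list[int]:
    # Each lookup step asks the k closest known peers for peers even
    # closer to the key; here we just sort one known bucket of peers.
    return sorted(peers, key=lambda p: xor_distance(p, key))[:k]
```

Each hop roughly halves the distance to the key, which is exactly why a walk is slow: a lookup costs a logarithmic number of round trips, and doing an enormous number of lookups multiplies that cost.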
So, an algorithm to distribute the blockchain over a very large number of
peers is a little bit tricky. We cannot have the tree split into too many tiny
fragments. If we are looking for a tiny bit of data, a transaction, each input
for a transaction, and each output for a transaction, we want to get a whole
pile of such little bits in one go over a single connection.
And, if we are using massively replicated transaction checking, rather than
recursive zk-snarks, we need to make sure that the community evaluating
each transaction is big enough and random enough, and need reputation
management and the capacity to kick out peers that invalidly evaluate
state. So we want a big pile of peers on each branch of the transaction tree.
And each group of peers is providing services to and receiving services
from many other groups of peers. We have to incentivize this in a way that
ensures that each peer is managing a decently large branch of the tree, and
each branch of the tree is managed by a decently large number of peers.
Which is a hard problem, which the bittorrent community has been failing
to solve for a long time, though in the case of bittorrent, the primary way
this problem manifests is that peers have a big pile of old data sitting
around and are disinclined to make it available. And if they did make it
available, the same peer would appear in far too many different and
unrelated branches of the tree, creating excessive [Kademlia] lookup costs.
## Merkle Patricia tree of signatures
Suppose that every block of the root primary blockchain contains the hash of a Merkle-patricia tree whose keys map to signatures of blobs.
Each blob represents the current state, the current commitment, of anyone who wants to be able to prove that he is committing to a single state, a single append only chain of states, to anyone who wants to know,
telling the same story to Bob as to Carol. This can be used to prevent Byzantine defection, but does
not in itself prevent it. The peers on the blockchain do not know who is committing to what, or
whether their sequence of commitments is internally consistent.
Among the Byzantine defections it can prevent is forking a chain of signatures, but there is no end
of algorithms where we want to exclude Byzantine defection. Such algorithms are often complex, and
have results that are hard to explain and hard to make use of. It is a powerful tool, but not obvious
how to take advantage of it such that everyone gets his due.
Suppose this is the hash of the state sequence of Bob. Then everyone sharing this preimage can know that all the other people participating in this state sequence are seeing the same preimage. Which is a good start on preventing bad things from happening, but does not in itself prevent bad things from happening.
We all know the company with good reputation, that gets into financial difficulties, sells its good name to new management, which cashes in the long accrued reputation for a quick buck, then the phones
stop being answered, the website gradually ceases to work, and large amounts of spam mail arrives.
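The fork-prevention property can be illustrated with a plain hash chain (a sketch, not the actual commitment scheme: each new state hashes in the previous commitment, so a single latest hash commits Bob to one append-only history):

```python
import hashlib

GENESIS = b"\x00" * 32

def commit(prev_commitment: bytes, state: bytes) -> bytes:
    # Each commitment chains to the previous one, so the latest hash
    # commits the signer to a single append-only sequence of states.
    return hashlib.sha256(prev_commitment + state).digest()

def chain(states: list[bytes]) -> bytes:
    c = GENESIS
    for s in states:
        c = commit(c, s)
    return c
```

If Bob shows Carol a commitment to the history `[a, b]` while showing Dave a commitment to `[a, x]`, the two hashes differ, so comparing latest commitments through the public tree exposes the fork, telling the same story to Bob as to Carol.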
## sharding, many blockchains
Coins in a shard are shares in [sovereign cipher corporations] whose
primary asset is a coin on the primary blockchain that vests power over
their name and assets in a frequently changing public key. Every time
money moves from the main chain to a sidechain, or from one sidechain to
another, the old coin is spent, and a new coin is created. The public key on
the mainchain coin corresponds to [a frequently changing secret that is distributed]
between the peers on the sidechain in proportion to their stake.
The mainchain transaction is a big transaction between many sidechains,
that contains a single output or input from each side chain, with each
single input or output from each sidechain representing many single
transactions between sidechains, and each single transaction between
sidechains representing many single transactions between many clients of
each sidechain.
The single big mainchain transaction merkle chains to the total history of
each sidechain, and each client of a sidechain can verify any state
information about his sidechain against the most recent sidechain
transaction on the mainchain, and routinely does.
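The routine check a client performs is ordinary Merkle path verification: rebuild the path from the data you care about up to the root committed on the mainchain. A minimal sketch (the real structure is a Merkle-patricia tree; the binary tree and hashing details here are illustrative):

```python
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    level = [h(l) for l in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last hash on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_path(leaves: list[bytes], index: int) -> list:
    # Sibling hashes from the leaf up to the root.
    level = [h(l) for l in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sib = index ^ 1
        path.append((level[sib], sib < index))  # (sibling, sibling-is-left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def verify(leaf: bytes, path: list, root: bytes) -> bool:
    acc = h(leaf)
    for sib, sib_is_left in path:
        acc = h(sib + acc) if sib_is_left else h(acc + sib)
    return acc == root
```

The client needs only its own leaf and a logarithmic number of sibling hashes, which is what makes routinely checking sidechain state against the latest mainchain transaction cheap.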
# Client trust
When there are billions of people using the blockchain, it will inevitably
only be fully verified by a few hundred or at most a few thousand major
peers, who will inevitably have [interests that do not necessarily coincide]
with those of the billions of users, who will inevitably have only client
wallets.
[interests that do not necessarily coincide]:https://vitalik.ca/general/2021/05/23/scaling.html
"Vitalik Buterin talks blockchain scaling"
{target="_blank"}
And a few hundred seems to be the minimum size required to stop peers
with a lot of clients from doing nefarious things. At scale, we are going to
approach the limits of distributed trust.
There are several cures for this. Well, not cures, but measures that can
alleviate the disease.
None of these are yet implemented, and we will not get around to
implementing them until we start to take over the world. But it is
necessary that what we do implement be upwards compatible with this scaling design:
## proof of share
Make the stake of a peer the value of coins (unspent transaction outputs)
that were injected into the blockchain through that peer. This ensures that
the interests of the peers will be aligned with the whales, with the interests
of those that hold a whole lot of value on the blockchain. Same principle
as a well functioning company board. A company board directly represents
major shareholders, whose interests are for the most part aligned with
ordinary shareholders. (This is apt to fail horribly when an accounting or
law firm is on the board, or a converged investment fund.) This measure
gives power to the whales, who do not want their hosts to do nefarious things.
## client verification
Every single client verifies the transactions that it is directly involved in,
and a subset of the transactions that gave rise to the coins that it receives.
If it verified the ancestry of every coin it received all the way back, it
would have to verify the entire blockchain, but it can verify the biggest
ancestor of the biggest ancestor and a random subset of ancestors, so
invalid transactions are going to immediately generate problems. If every
client unpredictably verifies a small number of transactions, the net effect
is going to be that most transactions are going to be unpredictably verified
by several clients.
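The effect of this unpredictable spot checking is easy to estimate: if each of $C$ clients independently samples $s$ of $T$ transactions, a given transaction escapes all checks with probability $(1 - s/T)^C$. A sketch (the numbers are purely illustrative):

```python
import random

def coverage_probability(total_tx: int, samples_per_client: int, clients: int) -> float:
    # Chance that a given transaction is checked by at least one client,
    # when each client independently samples a few random transactions.
    miss = (1 - samples_per_client / total_tx) ** clients
    return 1 - miss

def simulate(total_tx: int, samples_per_client: int, clients: int, seed: int = 1) -> float:
    # Monte-Carlo check of the same quantity: fraction of transactions
    # that got verified by at least one client.
    rng = random.Random(seed)
    checked = set()
    for _ in range(clients):
        checked.update(rng.sample(range(total_tx), samples_per_client))
    return len(checked) / total_tx
```

With clients comparable in number to transactions, even ten random checks per client leaves almost no transaction unexamined, which is why cheating is going to immediately generate problems.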
## lightning layer
The [lightning layer] is the correct place for privacy and contracts because
we do not want every transaction, let alone every contract, appearing on
the mainchain. Keeping as much stuff as possible *off* the blockchain helps
with both privacy and scaling.
## zk-snarks
Zk-snarks, zeeks, are not yet a solution. They have enormous potential
benefits for privacy and scaling, but as yet, no one has quite found a way.
A zk-snark is a succinct proof that code *was* executed on an immense pile
of data, and produced the expected, succinct, result. It is a witness that
someone carried out the calculation he claims he did, and that calculation
produced the result he claimed it did. So not everyone has to verify the
blockchain from beginning to end. And not everyone has to know what
inputs justified what outputs.
As "zk-snark" is not a pronounceable word, I am going to use the word "zeek"
to refer to the blob proving that a computation was performed, and
produced the expected result. This is an idiosyncratic usage, but I just do
not like acronyms.
The innumerable privacy coins around based on zk-snarks are just not
doing what has to be done to make a zeek privacy currency that is viable
at any reasonable scale. They are scams, intentionally, or by negligence,
unintentionally. All the zk-snark coins are doing the step from a set
$N$ of valid coins, valid unspent transaction outputs, to set $N+1$, in the
old fashioned Satoshi way, and sprinkling a little bit of zk-snark magic
privacy pixie dust on top (because the task of producing a genuine zeek
proof of coin state for step $N$ to step $N+1$ is just too big for them).
Which is, intentionally or unintentionally, a scam.
Zk-snarks are not yet an effective solution for scaling the blockchain, for
to scale the blockchain, you need a concise proof that any spend in the
blockchain was spent only once, and while a zk-snark proving this is
concise and capable of being quickly evaluated by any client, generating
the proof is an enormous task.
### what is a Zk-stark or a Zk-snark?
Zk-snark stands for “Zero-Knowledge Succinct Non-interactive Argument of Knowledge.”
A zk-stark is the same thing, except “Transparent”, meaning it does not have
the “toxic waste problem”, a potential secret backdoor. Whenever you create
zk-snark parameters, you create a backdoor, and how do third parties know that
this backdoor has been forever erased?
Zk-stark stands for Zero-Knowledge Scalable Transparent ARgument of Knowledge, where “scalable” means the same thing as “succinct”.
Ok, what is this knowledge that a zk-stark is an argument of?
Bob can prove to Carol that he knows a set of boolean values that
simultaneously satisfy certain boolean constraints.
This is zero knowledge because he proves this to Carol without revealing
what those values are, and it is “succinct” or “scalable”, because he can
prove knowledge of a truly enormous set of values that satisfy a truly
enormous set of constraints, with a proof that remains roughly the same
reasonably small size regardless of how enormous the set of values and
constraints are, and Carol can check the proof in a reasonably short time,
even if it takes Bob an enormous time to evaluate all those constraints over all those booleans.
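Concretely, the statement being proven has the shape “I know a witness satisfying these constraints.” A toy, deliberately non-zero-knowledge illustration of just that statement (the clause encoding is purely illustrative; a real system arithmetizes the constraints rather than checking them directly):

```python
# Each constraint is a clause of (variable index, required value) literals;
# a clause is satisfied when at least one literal matches the witness,
# and the whole system when every clause is satisfied.
def satisfies(witness: list, constraints: list) -> bool:
    return all(
        any(witness[i] == want for i, want in clause)
        for clause in constraints
    )

# A toy instance: (x0 OR x1) AND (NOT x0 OR x2).
constraints = [
    [(0, True), (1, True)],
    [(0, False), (2, True)],
]
witness = [True, False, True]
```

The zeek lets Bob convince Carol that such a `witness` exists without showing it to her, and with a proof whose size does not grow with the number of clauses.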
Which means that Carol could potentially check the validity of the
blockchain without having to wade through terabytes of other people's
data in which she has absolutely no interest.
Which means that each peer on the blockchain would not have to
download the entire blockchain, keep it all around, and evaluate it from the beginning. They could just keep around the bits they cared about.
The peers as a whole have to keep all the data around, and make certain
information about this data available to anyone on demand, but each
individual peer does not have to keep all the data around, and not all the
data has to be available. In particular, the inputs to the transaction do not
have to be available, only proof that they existed, were used once and only
once, and that the output in question is the result of a valid transaction
whose outputs are equal to its inputs.
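What the zeek must attest about each hidden transaction can be stated as a plain state-transition check, which the proof establishes without revealing the inputs. A sketch (the coin ids and dict representation of the unspent-output set are illustrative):

```python
def apply_transaction(utxo: dict, inputs: list, outputs: dict) -> dict:
    # The statement a zeek would prove about one hidden transaction:
    # every input is a currently unspent output, each is spent exactly
    # once, and the output values sum to the input values.
    if any(i not in utxo for i in inputs):
        raise ValueError("input missing or already spent")
    if sum(utxo[i] for i in inputs) != sum(outputs.values()):
        raise ValueError("value not conserved")
    next_utxo = {k: v for k, v in utxo.items() if k not in inputs}
    next_utxo.update(outputs)
    return next_utxo
```

The peers need only the succinct proof that some such valid transition happened, not the `utxo`, `inputs`, or `outputs` themselves.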
Unfortunately producing a zeek of such an enormous pile of data, with
such an enormous pile of constraints, could never be done, because the
blockchain grows faster than you can generate the zeek.
### zk-stark rollups, zeek rollups
Zk-stark rollups are a privacy technology and a scaling technology.
A zeek rollup is a zeek that proves that two or more other zeeks were verified.
Instead of Bob proving to Alice that he knows the latest block was valid, having evaluated every transaction, he proves to Alice that *someone* evaluated every transaction.
Fundamentally a ZK-stark proves to the verifier that the prover who generated the zk-stark knows a solution to an NP-complete problem. Unfortunately the proof is quite large, and the relationship between that problem and anything that anyone cares about is extremely elaborate and indirect. The proof is large and costly to generate, even if not that costly to verify, not that costly to transmit, and not that costly to store.
So you need a language that will generate such a relationship. And then you can prove, for example, that a hash is the hash of a valid transaction output, without revealing the value of that output, or the transaction inputs.
But if you have to have such a proof for every output, that is a mighty big pile of proofs, costly to evaluate and a vast pile of data costly to store. If you have a lot of zk-snarks, you have too many.
So, rollups.
Instead of proving that you know an enormous pile of data satisfying an enormous pile of constraints, you prove you know two zk-starks.
Each of which proves that someone else knows two more zk-starks. And the generation of all these zk-starks can be distributed over all the peers of the entire blockchain. At the bottom of this enormous pile of zk-starks is an enormous pile of transactions, with no one person or one computer knowing all of them, or even very many of them.
Instead of Bob proving to Carol that he knows every transaction that ever there was, and that they are all valid, Bob proves that for every transaction that ever there was, someone knew that that transaction was valid. Neither Carol nor Bob know who knew, or what was in that transaction.
You produce a proof that you verified a pile of proofs. You organize the information about which you want to prove stuff into a merkle tree, and the root of the merkle tree is associated with a proof that you verified the proofs of the direct children of that root vertex. And proof of each of the children of that root vertex proves that someone verified their children. And so forth all the way down to the bottom of the tree, the origin of the blockchain, proofs about proofs about proofs about proofs.
And then, to prove that a hash is a hash of a valid transaction output, you just produce the hash path linking that transaction to the root of the merkle tree. So with every new block, everyone has to just verify one proof once. All the child proofs get thrown away eventually.
Which means that peers do not have to keep every transaction and every output around forever. They just keep some recent roots of the blockchain around, plus the transactions and transaction outputs that they care about. So the blockchain can scale without limit.
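The recursive structure can be sketched with stand-in “proofs”; here each proof is just a hash committing to what was verified, where a real system would emit a zeek proving the child verifications rather than merely committing to them:

```python
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def leaf_proof(tx: bytes) -> bytes:
    # Stand-in for a zeek that some peer verified this one transaction.
    return h(b"verified-tx", tx)

def rollup(left: bytes, right: bytes) -> bytes:
    # Stand-in for a zeek proving that two child zeeks were verified.
    return h(b"verified-children", left, right)

def rollup_root(txs: list[bytes]) -> bytes:
    # Pairwise roll proofs up until one root proof remains; each level
    # can be produced in parallel by different peers.
    level = [leaf_proof(t) for t in txs]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [rollup(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Everyone checks only the latest root proof; the child proofs, like the child hashes here, can eventually be thrown away.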
ZK-stark rollups are a scaling technology plus a privacy technology. If you are not securing people's privacy, you are keeping an enormous pile of data around that nobody cares about (except a hostile government), which means your scaling does not scale.
And, as we are seeing with Tornado, some people in Eth do not want that vast pile of data thrown away.
To optimize scaling to the max, you optimize privacy to the max. You want all data hidden as soon as possible, as completely as possible, so that everyone on the blockchain is not drowning in other people's data. The less anyone reveals, and the fewer the people they reveal it to, the better it scales, and the faster and cheaper the blockchain can do transactions, because you are pushing the generation of zk-starks down to the parties who are themselves directly doing the transaction. Optimizing for privacy is almost the same thing as optimizing for scalability.
The fundamental problem is that in order to produce a compact proof that
the set of coins, unspent transaction outputs, of state $N+1$ was validly
derived from the set of coins at state $N$, you actually have to have those
sets of coins, which is not very compact at all, and generate a compact
proof about a tree lookup and cryptographic verification for each of the
changes in the set.
This is an inherently enormous task at scale, which will have to be
factored into many, many subtasks, performed by many, many machines.
Factoring the problem up is hard, for it not only has to be factored, divided
up, it has to be divided up in a way that is incentive compatible, or else
the blockchain is going to fail at scale because of peer misconduct:
transactions are just not going to be validated. Factoring a problem is hard,
and factoring that has to be mindful of incentive compatibility is
considerably harder. I am seeing a lot of good work grappling with the
problem of factoring, dividing the problem into manageable subtasks, but
it seems to be totally oblivious to the hard problem of incentive compatibility at scale.
Incentive compatibility was Satoshi's brilliant insight, and the client trust
problem, too many people running client wallets and not enough people
running full peers, is a failure of Satoshi's solution to that problem to scale.
Existing zk-snark solutions fail at scale, though in a different way. With
zk-snarks, the client can verify the zeek but producing a valid zeek in the
first place is going to be hard, and will rapidly get harder as the scale
increases.
A zeek that succinctly proves that the set of coins (unspent transaction
outputs) at block $N+1$ was validly derived from the set of coins at
block $N$, and can also prove that any given coin is or is not in that
set, is going to have to be a proof about many, many zeeks produced by
many, many machines: a proof about a very large dag of zeeks, each zeek
a vertex in the dag proving some small part of the validity of the step from
consensus state $N$ of valid coins to consensus state $N+1$ of valid coins. The owners of each of those machines that produced a tree vertex for the step from set $N$ to set $N+1$ will need a reward proportionate
to the task that they have completed, the validity of the reward will
need to be part of the proof, and there will need to be a market in those
rewards, with each vertex in the dag preferring the cheapest source of
child vertexes. Each of the machines would only need to have a small part
of the total state $N$, and a small part of the transactions transforming state
$N$ into state $N+1$. This is hard but doable, but I am just not seeing it done yet.
### The problem with zk-snarks
Last time I checked, [Cairo] was not ready for prime time.
[Cairo]:https://starkware.co/cairo/
"Cairo - StarkWare Industries Ltd."
Maybe it is ready now.
Starkware's [Cairo] now has a zk-stark friendly elliptic curve. But they
suggest it is better to prove identity another way, I would assume by
proving that the preimage contains a secret that is the same as another
pre-image contains. For example, that the transaction was prepared from
unspent transaction outputs whose full preimage is a secret known only to
the rightful owners of the outputs.
Their main use of this zk-stark friendly elliptic curve is to enable recursive
proofs of verification, hash based proof of elliptic curve based proofs.
[pre-image of a hash]:https://berkeley-desys.github.io/assets/material/lec5_eli_ben_sasson_zk_stark.pdf
An absolutely typical, and tolerably efficient, proof is to prove one knows
the [pre-image of a hash]. And then, of course, one wants to also prove
various things about what is in that pre-image.
I want to be able to prove that the [pre-image of a hash has certain
properties, among them that it contains proofs that I verified that the
pre-image of hashes contained within it have certain properties](https://cs251.stanford.edu/lectures/lecture18.pdf).
[Polygon]:https://www.coindesk.com/tech/2022/01/10/polygon-stakes-claim-to-fastest-zero-knowledge-layer-2-with-plonky2-launch/
[Polygon], with four hundred million dollars in funding, claims to have
accomplished this.
[Polygon] is funding a variety of zk-snark initiatives, but the one that claims
to have recursive proofs running over a Merkle root is [Polygon zero](https://blog.polygon.technology/zkverse-polygons-zero-knowledge-strategy-explained/),
which claims:
> Plonky2 combines the best of STARKs, fast proofs and no trusted
> setup, with the best of SNARKs, support for recursion and low
> verification cost ...
>
> ... transpiled to ZK bytecode, which can be executed efficiently in our VM running inside a STARK.
So, if you have their VM that can run inside a stark, and their ZK
bytecode, you can write your own ZK language to support a friendly
system, instead of an enemy system - a language that can do what we want done,
rather than what our enemies in Ethereum want done.
The key is writing a language that operates on what looks to it like SQL
tables, to produce proof that the current state, expressed as a collection of
tables represented as a Merkle Patricia tree, is the result of valid
operations on a collection of transactions, also represented as a Merkle
patricia tree, that acted on the previous state. That allows generic
transactions on generic tables, rather than Ethereum transactions on
Ethereum data structures.
But it is a four hundred million dollar project that is in the pocket of our
enemies. On the other hand, if they put their stuff in Ethereum, then I
should be able to link an allied proof into an enemy proof, producing a
side chain with unlimited side chains, that can be verified from its own
root, or from Ethereum's root.
To solve the second problem, we need an [intelligible scripting language for
generating zk-snarks], a scripting language that generates serial verifiers
and massively parallel map-reduce proofs.
[intelligible scripting language for
generating zk-snarks]:https://www.cairo-lang.org
"Welcome to Cairo
A Language For Scaling DApps Using STARKs"
It constructs a byte code that gets executed in a STARK. It is designed to
compile Ethereum contracts to that byte code, and likely our enemies will
fuck the compiler, but hard for them to fuck the byte code.
Both problems are being actively worked on. Both problems need a good deal
more work, last time I checked. For end user trust in client wallets
relying on zk-snark verification to be valid, at least some of the end
users of client wallets will need to themselves generate the verifiers from
the script.
For trust based on zk-snarks to be valid, a very large number of people
must themselves have the source code to a large program that was
executed on an immense amount of data, and must themselves build and
run the verifier to prove that this code was run on the actual data at least
once, and produced the expected result, even though very few of them will
ever execute that program on actual data, and there is too much data for
any one computer to ever execute the program on all the data.
Satoshi's fundamental design was that all users should verify the
blockchain, which becomes impractical when the blockchain approaches four
hundred gigabytes. A zk-snark design needs to redesign blockchains from
the beginning, with distributed generation of the proof, but the proof for
each step in the chain, from mutable state $N$ to mutable state $N+1$, from set
$N$ of coins, unspent transaction outputs, to set $N+1$ of coins only being
generated once or quite a small number of times, with its
generation being distributed over all peers through map-reduce, while the
proof is verified by everyone, peer and client.
For good verifier performance, with acceptable prover performance, one
should construct a stark that can be verified quickly, and then produce
a libsnark proof that it was verified at least once ([libsnark proof generation
being costly], but the proofs are very small and quickly verifiable).
At the end of the day, we still need the code generating and executing the
verification of zk-snarks to be massively replicated, in order that all
this rigmarole with zk-snarks and starks is actually worthy of producing
trust.
[libsnark proof generation being costly]:https://eprint.iacr.org/2018/046.pdf
"Scalable computational integrity:
section 1.3.2: concrete performance"
This is not a problem I am working on, but I would be happy to see a
solution. I am seeing a lot of scam solutions, that sprinkle zk-snarks over
existing solutions as magic pixie dust, like putting wings on a solid fuel
rocket and calling it a space plane.
[lightning layer]:lightning_layer.html
[sovereign cipher corporations]:social_networking.html#many-sovereign-corporations-on-the-blockchain
[a frequently changing secret that is distributed]:multisignature.html#scaling
### How a fully scalable blockchain running on zeek rollups would work
A blockchain is of course a chain of blocks, and at scale, each block would be far too immense for any one peer to store or process, let alone the entire chain.
Each block would be a Merkle patricia tree, or a Merkle tree of a number of Merkle patricia trees, because we want the block to be broad and flat, rather than deep and narrow, so that it can be produced in a massively parallel way, created in parallel by an immense number of peers. Each block would contain a proof that it was validly derived from the previous block, and that the previous block's similar proof was verified. A chain is narrow and deep, but that does not matter, because the proofs are “scalable”. No one has to verify all the proofs from the beginning, they just have to verify the latest proofs.
Each peer would keep around the actual data and actual proofs that it cared about, and the chain of hashes linking the data it cared about to the Merkle root of the latest block.
All the immense amount of data in the immense blockchain that anyone
cares about would need to exist somewhere, but it would not have to exist
*everywhere*, and everyone would have a proof that the tiny part of the
blockchain that they keep around is consistent with all the other tiny parts
of the blockchain that everyone else is keeping around.
# sharding within each single very large peer
Sharding within a single peer is an easier problem than sharding the
blockchain between mutually distrustful peers capable of Byzantine
defection, and the solutions are apt to be more powerful and efficient.
When we go to scale, when we have very large peers on the blockchain,
we are going to have to have sharding within each very large peer, which will
multiprocess in the style of Google's massively parallel multiprocessing,
where scaling and multiprocessing is embedded in interactions with the
massively distributed database, either on top of an existing distributed
database such as Rlite or Cockroach, or we will have to extend the
consensus algorithm so that the shards of each cluster form their own
distributed database, or extend the consensus algorithm so that peers can
shard. As preparation for the latter possibility, we need to have each peer
only form gossip events with a small and durable set of peers with which it
has lasting relationships, because the events, as we go to scale, tend to
have large and unequal costs and benefits for each peer. Durable
relationships make sharding possible, but we will not worry too much about
sharding until a forty terabyte blockchain comes in sight.
For sharding, each peer has a copy of a subset of the total blockchain, and
some peers have a parity set of many such subsets. Each peer has a subset
of the set of unspent transaction outputs as of consensus on total order at
one time, and is working on constructing a subset of the set of unspent
transaction outputs as of a recent consensus on total order. Each peer has
all the root hashes of all the balanced binary trees of all the subsets, but
not all the subsets. Each peer has durable relationships with a set of peers
that have the entire collection of subsets, and two durable relationships
with peers that have parity sets of all the subsets.
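The parity sets work like RAID parity: the XOR of equal-sized subsets lets any single lost subset be rebuilt from the parity block and the survivors. A minimal sketch (real deployments would likely want a proper erasure code tolerating multiple losses; single-parity XOR is the simplest case):

```python
from functools import reduce

def parity(subsets: list) -> bytes:
    # XOR parity over equal-sized data subsets.
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), subsets)

def recover(parity_block: bytes, survivors: list) -> bytes:
    # XOR of the parity block with all surviving subsets
    # yields the one missing subset.
    return parity(survivors + [parity_block])
```

This is why a peer holding a parity set of many subsets backs up all of them at the storage cost of one.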
Each subset of the append only immutable set of transactions is represented
by a balanced binary tree of hashes representing $2^n$ blocks of
the blockchain, and each subset of the mutable set of unspent transaction
outputs is a subsection of the Merkle-patricia tree of transaction outputs,
which is part of a directed acyclic graph of all consensus sets of all past
consensus states of transaction outputs, but no one keeps that entire graph
around once it gets too big, as it rapidly will, only various subsets of it.
But they keep the hashes around that can prove that any subset of it was
part of the consensus at some time.
Gossip vertexes immutably added to the immutable chain of blocks will
contain the total hash of the state of unspent transactions as of a previous
consensus block, thus the immutable and ever growing blockchain will contain
an immutable record of all past consensus Merkle-patricia trees of
unspent transaction outputs, and thus of the past consensus about the
dynamic and changing state resulting from the immutable set of all past
transactions.
For very old groups of blocks to be discardable, it will from time to time be
necessary to add repeat copies of old transaction outputs that are still
unspent, so that the old transactions that gave rise to them can be
discarded, and one can then re-evaluate the state of the blockchain starting
from the middle, rather than the very beginning.