<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<style>
body {
max-width: 30em;
margin-left: 2em;
}
p.center {text-align:center;}
</style>
<link rel="shortcut icon" href="../rho.ico">
<title>Transaction Volume</title>
</head>
<body>
<p><a href="./index.html"> To Home page</a> </p>
<h1>Transaction Volume</h1>
<hr/>
<h2>Total number of bitcoin transactions </h2>
<p>Was four hundred million on 2019-05-04, occupying four hundred gigabytes on disk. I don’t know how many spent and unspent transaction outputs that adds up to, but probably a billion or two. Thus a hash table mapping coin public keys to transactions, at 64 bytes for each coin, fits in a couple of hundred gigabytes. Not a huge problem today when four terabyte hard disks are standard, but in early versions of our competing rhocoin system, it is simpler just to store all transaction outputs in an sqlite3 database, where an unspent transaction output goes in a table with a field linking it to its transaction, and a spent transaction output goes in another table with a field linking it to the transaction from which it is an output, and the transaction to which it is an input. Each transaction will be part of the canonical Merkle-patricia tree, but the transaction outputs will not in themselves be – a proof that a transaction output is valid or invalid will contain the hash chain deriving from the transaction, in transactions ordered by block number, and within blocks by hash.</p>
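<p>A minimal sketch of the two-table layout described above, using Python's built-in <code>sqlite3</code> module. The table names, column names, and the <code>spend</code> helper are assumptions made for illustration, not a settled schema.</p>

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative only.
SCHEMA = """
CREATE TABLE tx (
    tx_id   INTEGER PRIMARY KEY,
    tx_hash BLOB NOT NULL UNIQUE,
    block   INTEGER NOT NULL
);
CREATE TABLE unspent_output (
    pubkey    BLOB PRIMARY KEY,
    amount    INTEGER NOT NULL,
    source_tx INTEGER NOT NULL REFERENCES tx(tx_id)
);
CREATE TABLE spent_output (
    pubkey    BLOB PRIMARY KEY,
    amount    INTEGER NOT NULL,
    source_tx INTEGER NOT NULL REFERENCES tx(tx_id),
    spent_by  INTEGER NOT NULL REFERENCES tx(tx_id)
);
"""

def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db

def spend(db, pubkey, spending_tx):
    """Move one output from the unspent table to the spent table."""
    with db:  # commit on success, roll back on error
        row = db.execute(
            "SELECT amount, source_tx FROM unspent_output WHERE pubkey = ?",
            (pubkey,)).fetchone()
        if row is None:
            raise KeyError("no such unspent output")
        db.execute("DELETE FROM unspent_output WHERE pubkey = ?", (pubkey,))
        db.execute("INSERT INTO spent_output VALUES (?, ?, ?, ?)",
                   (pubkey, row[0], row[1], spending_tx))
```

<p>The unspent table stays small and hot, while the spent table only ever grows, which is one reason to keep them apart.</p>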
<p>But our canonical tree is going to have to contain namecoins ordered by name, rather than in transaction order, to enable short proofs of valid authority over a name.</p>
<hr/>
<h2>Bandwidth</h2>
<p>A bitcoin transaction is typically around 512 bytes, but could be a lot less. A transaction needs a transaction type, one or more inputs, and one or more outputs. A simple input or output would consist of the type, the amount, the time, and a public key. Type sixteen bits, time forty eight bits, amount sixty four bits, public key two hundred and fifty six bits, total forty eight bytes. A typical transaction has two inputs and two outputs, total 192 bytes. Frequently need another hash to link to relevant information, such as what this payment is for, another thirty two bytes, total 224 bytes. </p>
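<p>The byte counts above can be checked with a packed layout. The field widths come from the paragraph; the field order, the big-endian packing, and the names <code>ENTRY</code> and <code>pack_entry</code> are assumptions of this sketch.</p>

```python
import struct

# Widths from the text: 16-bit type, 48-bit time, 64-bit amount, 256-bit key.
# Field order and big-endian, unpadded packing are assumptions.
ENTRY = struct.Struct(">H6sQ32s")

def pack_entry(kind, time_ms, amount, pubkey):
    """Serialize one simple input or output."""
    return ENTRY.pack(kind, time_ms.to_bytes(6, "big"), amount, pubkey)

ENTRY_BYTES = ENTRY.size             # 2 + 6 + 8 + 32
TX_BYTES = 4 * ENTRY_BYTES           # two inputs and two outputs
TX_WITH_LINK_BYTES = TX_BYTES + 32   # plus one 256-bit hash link
```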
<p>We will frequently store the transaction and the resulting balances, as if together, though likely for internal optimization reasons, actually stored separately, so 196 bytes. </p>
<p>Visa handles about 2000 transactions per second. Burst rate about four thousand per second. </p>
<p>Paypal 115 transactions per second. </p>
<p>Bitcoin four to seven transactions per second. </p>
<p>So, to compete with bitcoin, volume ten kbps. </p>
<p>So, to compete with paypal, volume two hundred kbps. </p>
<p>So, to compete with visa, volume six Mbps. </p>
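<p>These bandwidth figures are just transactions per second times bytes per transaction times eight. A small sketch of that arithmetic follows; the function name is hypothetical, and the result depends on which transaction size you assume (224 bytes for the compact format, 512 for a typical bitcoin transaction), so the rounded figures above are order-of-magnitude estimates.</p>

```python
def required_bps(tx_per_second, bytes_per_tx):
    """Sustained bits per second needed to relay every transaction once."""
    return tx_per_second * bytes_per_tx * 8

# Rates taken from the text; 224 bytes is the compact size derived earlier.
RATES = {"bitcoin": 7, "paypal": 115, "visa": 2000}
COMPACT = {name: required_bps(rate, 224) for name, rate in RATES.items()}
TYPICAL = {name: required_bps(rate, 512) for name, rate in RATES.items()}
```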
<p>My home connection is fifteen Mbps up, forty five Mbps down.</p>
<p>Common home internet connections are twelve Mbps, common small business internet connections are one hundred Mbps. So you will be able to peer even when we reach Visa volumes, but you probably will not want to unless you are moderately wealthy or you are operating a business, so need a high capability connection anyway.</p>
<p>The i2p hidden network can handle about 30 kbps, so we can put the whole thing on the i2p hidden network, until we are wiping the floor with Paypal. Ideally, we want a client download, a host and peer download, and a business host and peer download, where you just say “install” and next thing you know you have a website offering “&lt;item name here&gt; for &lt;99.99999&gt; megaros”. (During initial development, we will fix the price at one megaro per US$.)</p>
<p>A peer on a laptop can easily handle Paypal volumes on a typical free advertising supported coffee shop connection, which is around 10 Mbps down, 5 Mbps up. We need to allow peers to drop in and out. If you are a client, and the peer that hosts you drops out, you can go to another peer and ask it to host your wallet. If you are doing seriously anonymous transactions, well, just bring the laptop running your peer to the coffee shop.</p>
<p>Assuming a block happens every five minutes, then visa is three hundred megabyte blocks, 8 Mbps.</p>
<p>Paypal is ten megabyte blocks, 0.5 Mbps.</p>
<p>Bitcoin is one megabyte blocks, 28 kbps.</p>
<p>After two years at bitcoin volumes, our blockchain will be two hundred gigabytes, at which point we might want to think of a custom format for sequential patricia trees as a collection of immutable files, append only files which grow into immutable files, and wal files that get idempotently accumulated into the append only files.</p>
<p>Initially, our volume will probably only be one transaction every thirty seconds or so, at which rate it will take two years to reach a gigabyte, and everything fits in memory, making thinking about the efficiency of disk access irrelevant. We just load up everything as necessary, and keep it till shutdown.</p>
<p>During initial development, need an ICO, including a market place buying and selling megaros. </p>
<hr/>
<h2>Sorting</h2>
<p>Suppose two thousand transactions per second, one kilobyte per transaction, block time of two hundred and fifty six seconds. That is half a gigabyte per block, easily held all in ram in native format. Then, when the block is finalized, we write the block to disk in sorted order as serialized POD objects, with a range algorithm for randomly accessing the desired 4KB block within the 500MB block. For the uniformly distributed part of the address space, you guess which 4K block and hit it up to find an index at the 4K block boundary.</p>
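<p>The “guess which 4K block” step is an interpolation lookup. A sketch under stated assumptions: keys are integers uniformly distributed over a fixed bit width, and <code>first_keys</code> (a hypothetical name) holds the smallest key in each 4K page, standing in for the index read at each page boundary.</p>

```python
def guess_page(key, n_pages, key_bits=256):
    """Guess the page holding `key`, assuming keys are uniform over
    [0, 2**key_bits) and pages hold them in sorted order."""
    return (key * n_pages) >> key_bits

def find_page(key, first_keys, key_bits=256):
    """first_keys[i] is the smallest key stored in page i.
    Start at the guess, then step left or right to the correct page."""
    i = guess_page(key, len(first_keys), key_bits)
    while i > 0 and first_keys[i] > key:
        i -= 1
    while i + 1 < len(first_keys) and first_keys[i + 1] <= key:
        i += 1
    return i
```

<p>With uniformly distributed keys the guess is usually right or off by one page, so a lookup typically costs one or two page reads.</p>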
<p>For the initial release, all these smarts about 4K block boundaries are overkill. We assume the upper parts of the tree are already in memory as native form objects, and proceed directly through the densely packed sorted disk data to get whatever part of the tree we care about into memory.</p>
<p>We have a tree of native objects in memory. If we are running short of room, free them up, and restore them on demand by going through the compressed serialized POD objects.</p>
<p>We then have a mechanism for efficiently accessing a node by its key, but not however by its hash, except in the common case that the key of the node reflects the hash of the subnode that it signs.</p>
<p>Since we sort stuff onto disk in patricia order, it is quicker to look stuff up by index than by hash, since objects with similar indexes will be near each other.</p>
<hr/>
<h2>Storage</h2>
<p>Instead of only keeping transactions, we keep transactions and accounts. Thus peers can place old transactions in offline storage, or just plain dump them. The hash tree ensures that they are immutable, but you don’t actually have to keep old transactions and old account balances around. Thus, total chain volume is bounded. We can replace branches of the Merkle tree in old blocks that lead only to used transactions by their hash – we operate with a potentially incomplete Merkle-patricia dac.</p>
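<p>Replacing a branch by its hash leaves the root hash unchanged, which is what makes an incomplete tree safe to operate on. The sketch below shows this on a plain binary Merkle tree; it is a simplification, not the Merkle-patricia structure itself, and the class names and the use of SHA-256 are assumptions.</p>

```python
import hashlib

def h(data):
    return hashlib.sha256(data).digest()

class Leaf:
    def __init__(self, payload):
        self.payload = payload
    def digest(self):
        return h(b"\x00" + self.payload)

class Node:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def digest(self):
        return h(b"\x01" + self.left.digest() + self.right.digest())

class Pruned:
    """Stands in for a discarded subtree and remembers only its hash."""
    def __init__(self, subtree):
        self._digest = subtree.digest()
    def digest(self):
        return self._digest
```

<p>For example, wrapping a subtree of spent transactions in <code>Pruned</code> frees its contents while every hash above it stays valid.</p>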
<p>On the other hand, perhaps I worry too much about disk storage, which has been doubling every year. Since we only plan to have a thousand or so full peers, when we have seven billion client wallets, they will be able to afford to store every transaction for everyone ever. Visa volume of 2000 transactions per second implies fifty gigabytes per day. So, if when we reach visa volumes, a typical peer has sixteen disks each of sixty four terabytes, it will take fifty years to fill them up, by which time we will likely have figured out a way of ditching old transactions.</p>
<p>Suppose we have a thousand peers, and seven billion clients, and each client wants ten thousand transactions stored forever. Each peer has to store forty thousand terabytes of data. Each peer has seven million clients, so each peer is going to be quite a large business, and by that time standard hard disks will probably be about one hundred terabytes, so each peer is going to need about four hundred hard disks on a local network of a hundred computers, twenty thousand clients per hard disk, which is not going to be a burdensome cost – other scaling problems will no doubt bite us hard before then. At present two hundred terabyte systems are common, though not among private individuals. Forty thousand is gigantic, though perhaps it will be less gigantic in a few years. OpenZFS handles hundreds of terabytes just fine. Not sure what will happen with tens of thousands of terabytes.</p>
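<p>The storage estimates in the last two paragraphs can be reproduced as follows. The 512 bytes per stored transaction is an assumption borrowed from the typical bitcoin size in the Bandwidth section; the other inputs are the figures given in the text.</p>

```python
TB = 10**12
GB = 10**9

# Every transaction for everyone, stored on each peer.
clients = 7 * 10**9
tx_per_client = 10_000
bytes_per_tx = 512                 # assumption, not stated in this section
total_bytes = clients * tx_per_client * bytes_per_tx
disks_per_peer = total_bytes / (100 * TB)   # one hundred terabyte disks

# Sixteen 64 TB disks filling at the stated fifty gigabytes per day.
days_to_fill = (16 * 64 * TB) / (50 * GB)
years_to_fill = days_to_fill / 365
```

<p>This gives roughly thirty six thousand terabytes and a little under four hundred disks per peer, and a fill time in the mid fifties of years, in line with the rounded figures above.</p>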
<p>The trouble with disk storage is that sector failure rates have not been falling. When one sector fails you tend to lose everything. One bad sector can take down the disk operating system, and one bad sector will take down an Sqlite database. We can mitigate this problem by having several Sqlite databases, with the most active blocks in solid state drive storage, and the immutable consensus blocks (immutable in having root hashes that have been the subject of consensus, though they can lose some branches) spread over several very large disk drives. Perhaps Sqlite will address this problem, so that bad sectors only cause minor local losses, which can easily be replaced by data from another peer, or perhaps when we get big enough that it is a problem we will then write a system that directly writes the Merkle-patricia dac to raw disk storage in such a manner that failed sectors only cause minor losses of a few elements of the Merkle-patricia dac, easily fixed by getting replacement data from peers.</p>
<p>The way Sqlite works at present, having all our data in one big Sqlite database is not going to work when the block chain gets too big, no matter how big disks become, for sooner or later one sector will fail, and we don’t want the failure of one sector to require a full rebuild of a potentially gigantic database.</p>
<p>A data structure that dies on the loss of one sector becomes unusable when it reaches two terabytes or so, and the trouble is that our existing disk operating systems, and databases such as Sqlite, are frequently fragile to the loss of a single sector. To scale to our target size (every transaction, for everyone in the world, forever) we have to have a system that tolerates a few sectors dropping out here and there, and replaces them by requesting the lost data on those few sectors from peers. For the moment, however, we can get by using multiple Sqlite databases, each one limited to about six hundred gigabytes. They will die infrequently, and when they die, the peers should reload the lost data from other peers. The cost of disk sector bitrot increases as the square of the size of the Sqlite database, because the likelihood of the entire database dying of bit rot of a single sector increases proportionally to size, and the cost of recopying the data from peers increases proportionally to the size. Lose one sector of an Sqlite database, you may well lose the entire database, which is an increasing problem as disks and databases get bigger, and likely to become unbearable at around two to four terabytes.</p>
<p>But for the moment, three six terabyte drives don’t cost all that much, and with split databases, with some blocks in one Sqlite database, and other blocks in another, we could scale to twenty times the current size of bitcoin storage, and when we hit that limit, solve the problem when it becomes a problem. We don’t really need to purge old transactions, though we build the architecture so that it is doable.</p>
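<p>The square law, and the benefit of splitting, can be put in a few lines. This is a toy model under loud assumptions: one bad sector kills exactly the database containing it, failures are proportional to database size, and the failure rate used is a made-up placeholder rather than a measured figure.</p>

```python
def expected_recopy_tb(total_tb, n_databases, failures_per_tb_year=0.01):
    """Expected terabytes recopied from peers per year.
    The failure rate is an illustrative placeholder, not measured data."""
    size = total_tb / n_databases
    # chance that this database dies, times what it costs to refetch it
    per_database = (failures_per_tb_year * size) * size
    return n_databases * per_database
```

<p>In this model, doubling the size of a single database quadruples the expected recopying, while splitting the same data over ten databases cuts it tenfold.</p>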
<p>The architecture is that it is all one gigantic Merkle-patricia dac organized into blocks, with the older blocks immutable, and the way this data is stored on a particular peer is an implementation detail that can differ between one peer and the next. Each peer sees the other peer’s Merkle-patricia dac, not the way the other peer’s tree is stored on disk. It sees the canonical representation, not the way the particular peer represents the tree internally. This approach enables us to be flexible about storage as technology progresses, and as the amount of data to be stored increases. Maybe we will wind up storing the tree as architecture dependent representations of records written directly to raw disk sectors without any file system other than the tree itself. By the time sector bitrot becomes a problem, Sqlite may well be fixed so that it can lose a bit of data here and there to sector bitrot, and report that data lost and in need of recovery, rather than the entire database dying on its ass. And if not, by the time it becomes a problem, the community will be big enough and wealthy enough to issue a fix, either to Sqlite, or by creating a disk system that represents arbitrarily large Merkle-patricia dacs directly on disk sectors over multiple disks over multiple computers, rather than a database stored on files that are then represented on disk sectors.</p>
<p>For a database of two terabytes or so, we can keep everything in one Sqlite database, though it is probably more efficient to have the active blocks on a solid state drive, and the older blocks on a disk drive, running on top of a standard operating system. Eight terabytes can store two billion transactions, which will fail horribly at handling all the world’s transactions, and can only keep up with visa for a few days, but we can keep up with paypal for long enough to hack up something that can handle massive disk arrays, and ditches stale transactions.</p>
<p>Assume clients keep their transactions forever, peers keep their own client transactions for a long time, but dump the transactions of other clients after a month or so. </p>
<p>Then to compete with Bitcoin, need about three gigabytes of storage, ever, about the size of a movie. Typical home hard disks these days are one thousand gigabytes. </p>
<p>Then to compete with Paypal, need about fifty gigabytes of storage, ever, about the size of a television series. Typical home hard disks these days are one thousand gigabytes. </p>
<p>Then to compete with Visa, need about one terabyte of storage, ever, about the size of a typical home consumer hard disk drive. </p>
<p>So when we are wiping the floor with visa, <em>then</em> only wealthy people with good internet connections will be peers. </p>
<p>If keeping only live transactions, and assume each entity has only a hundred live transactions, then an eight terabyte hard drive can support a billion accounts. Paypal has about three hundred million accounts, so we can beat paypal using Sqlite on standard disk drives, without doing anything clever. And then we can start correspondent banking, and start using a custom storage system designed to support Merkle trees directly on raw disks.</p>
<p>If we put a block chain Merkle tree oriented storage system on top of Sqlite, then we can shard, write direct to disk, whatever, without disturbing the rest of the software, allowing us to beat visa. We always query the Merkle tree of the current block, with most of the lower branches of the Merkle tree pointing back into previous blocks. So if you ask, what is the record whose identifier is such and such in the current block, you will probably get the answer, "it is record such and such in a previous block", which will likely go through several steps of recursion, as if we had a billion Sqlite databases. And why not have a billion Sqlite databases, in which a hundred are in one Sqlite database, and a hundred in another? And many processes. If we have one block every 256 seconds, then in ten years we have a million blocks, and a table locating each block in some arbitrary place, associated with some arbitrary channel to some arbitrary process is manageable, even without the trivial optimization of handling ranges of blocks. This implies that once a block is immutable, it is handed off to some process whose job is to resolve read only Merkle queries on blocks in a certain range. So we have a table locating immutable blocks, and the processes that can read them, which allows us to trivially shard to many databases, each on its own hard disk, and near trivially shard to clusters of computers. Assume a per peer table that assigns groups of blocks to directories on particular disks on particular hosts with only one of these groups being mutable. Then we can have shardable storage on day one. And assume we can have multiple processes, some responsible for some blocks and some responsible for others. Each peer could be a large local network over many machines. Of course, we are still potentially bottlenecked on assembling the current block, which will be done by a single cpu attached to a single local disk, but we are not bottlenecked for disk space.</p>
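<p>The per peer table that assigns ranges of blocks to stores could look something like the sketch below. The class name, the store strings, and the choice of a sorted list with binary search are all assumptions; any ordered map would serve.</p>

```python
import bisect

class BlockLocator:
    """Maps a block number to the store responsible for it. Each entry
    covers the blocks from its first block up to the next entry."""
    def __init__(self):
        self._starts = []   # first block number of each range, ascending
        self._stores = []   # where that range lives (host, directory, ...)

    def add_range(self, first_block, store):
        i = bisect.bisect_left(self._starts, first_block)
        self._starts.insert(i, first_block)
        self._stores.insert(i, store)

    def locate(self, block):
        i = bisect.bisect_right(self._starts, block) - 1
        if i < 0:
            raise KeyError(block)
        return self._stores[i]

def block_number(seconds, period=256):
    """Block index for a time in seconds, one block per `period` seconds."""
    return seconds // period
```

<p>Ten years of 256 second blocks is a little over 1.2 million blocks, so even a table with one entry per block would fit in memory; ranges just make it smaller.</p>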
<hr/>
<h2>Storage and transmission structure</h2>
<p>We will not have a Bitcoin style chain of blocks, instead a binary Merkle-patricia dac. But operations on this tree will be grouped into 256 second phases so we will be doing something very like a block every 256 seconds. (Or whatever time the global consensus configuration file specifies – there is global consensus config data, per peer config data, and per client config data, all of them in yaml.)</p>
<p>For block synchronization we need a tree of transaction events organized by block, hence organized by time, at least for the high order part. To prove balance to a client without giving him the entire tree, need a tree of balances. To prove to a client name key association, need a tree of names. To prove no monkey business with names, need a tree of events affecting names organized by names. To prove no monkey business with balances, need a tree of transactions organized by client. (Leaves on the tree of transactions would consist of the key to the actual transaction.) And, of course, a rather trivial tree to hold the global config data. There will be a chain of leap seconds in anticipation of the day that the standards authorities pull their fingers out of their ass. The leap second data has to be signed by an authority specified in the global configuration data, but this will serve no actual useful function until the day there really is such an authority. For the moment the world represents time as if leap seconds never happened, as for example in communication between mail servers, and no one cares about the precise value of durations – it only matters that time is monotonic and approximately linear.</p>
<p>When we want more precise time comparisons, as when we are synchronizing and worried about the round trip time, we just get data on both computers’ clocks as relevant and give up on a global true time. So if one computer ticks a thousand times a second, and the other two fifty six times a second, then, for round trip concerns, no biggie.</p>
<p>Data will be ordered, both in storage and transmission, by key, where the high order bits of the key are block number, which is the high order bits of the time. Transactions are applied according to the time designated by the sender, rounded to some multiple of a thousand milliseconds, and then the public key of the recipient. Order in which transactions are applied matters for bouncing transactions. Transactions between the same sender and the same recipient shall be limited to one per block period. If someone wants to transact more often than that, has to batch transactions using the microtransaction protocol. For internal optimization, there will be additional indexes, but that is what we are going to hash over, and that will be the order of our hash tree, our storage, and our transmissions. </p>
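<p>A sketch of the ordering key described above. The function names, the 8 byte time field, and rounding down (rather than to nearest) are assumptions. With a 256 second block period the block number is literally the time in seconds with its low eight bits dropped.</p>

```python
def ordering_key(time_ms, recipient_pubkey, period_s=256):
    """Sort key: block number, then sender-designated time rounded down
    to a whole second, then the recipient's public key."""
    seconds = time_ms // 1000
    block = seconds // period_s
    return (block, seconds, recipient_pubkey)

def key_bytes(key):
    """Big-endian encoding whose bytewise order matches the tuple order.
    The block number is implied by the high order bits of the time."""
    _, seconds, pubkey = key
    return seconds.to_bytes(8, "big") + pubkey
```

<p>Sorting by <code>key_bytes</code> gives the same order as sorting the tuples, so one ordering can serve the hash tree, storage, and transmission.</p>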
<p>Our global hash will be a binary Merkle-patricia dac of hashes, with the leaf hashes corresponding to the above. </p>
<p>We use two kinds of time – millisecond time modulo 2^32 for managing connections, and second time for times that go into the global consensus block chain.</p>
<p>C++11 <code>std::chrono::</code> seems to be the library that actually fixes these problems, with a steady clock for duration, which we will access in milliseconds modulo 2^32, and a system clock which we will access for global time in seconds since the epoch modulo 2^64.</p>
<p>Both kinds of time ignore the leap second issue. Whenever seconds are presented for human readership, we use ISO 8601 format, which evades the leap second issue. Whenever durations are employed internally, we use milliseconds past the epoch modulo 2^32, and do not assume computer clocks are synchronized, or even both running at the same rate, though we do assume that both are measuring something rather close to milliseconds.</p>
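<p>The point of working modulo 2^32 is that subtraction still gives the right duration across a wrap. The sketch below uses Python's monotonic clock as a stand-in for the C++ steady clock; the function names are hypothetical.</p>

```python
import time

MASK32 = 0xFFFFFFFF

def now_ms32():
    """Steady-clock milliseconds modulo 2**32 (wraps about every 49.7 days)."""
    return (time.monotonic_ns() // 1_000_000) & MASK32

def elapsed_ms32(earlier, later):
    """Duration between two modulo-2**32 readings of the same clock.
    Correct across one wrap, provided the true duration is under 2**32 ms."""
    return (later - earlier) & MASK32

def now_seconds():
    """System-clock seconds since the epoch, for global consensus time."""
    return int(time.time())
```

<p>Only differences between readings from the same machine are meaningful here; readings from two machines are never compared directly.</p>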
<p><code>boost::posix_time::ptime</code> sucks.</p>
<p>We give up on handling leap seconds until the standards people get their act together and make it easy for everyone.</p>
<p>Time will be stored as a binary UT1 value, and not stored as local time. When displayed for humans, time will be displayed as if UTC time, in accordance with <a href="https://www.ietf.org/rfc/rfc3339.txt">RFC 3339</a> though it is actually an approximation to UT1. If the data is being displayed to the local user momentarily, it will be displayed as local time plus the offset to UT1, for example 2019-04-12T23:20:50.52-05:00, representing 11:20:50.52 PM on 2019 April the twelfth in a time zone five hours behind UT1. If the data is not just being displayed momentarily, but will be recorded in a log file, it will always be recorded as UT1 time, as for example 2019-04-13T04:20:50.52Z, not UT1 time plus local time offset, because humans are likely to compare log files in different time zones, and when looking at logs long after the event, don’t care much what the local time was.</p>
<p>If we are recording the local time plus offset, remember that confusingly 1996-12-19T15:39:57-08:00 means that to get universal time from local time, we have to add eight hours, not subtract eight hours.<br/>
<code>1996-12-19T15:39:57-08:00</code> is, surprisingly and illogically, the same moment as:<br/>
<code>1996-12-19T23:39:57+00:00</code> add it, don’t subtract it.</p>
<p><code>1996-12-19T24:00:00+00:00</code> is, unsurprisingly and logically the same moment as:<br/>
<code>1996-12-20T00:00:00+00:00</code></p>
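<p>The sign convention is easy to check with the Python standard library, which parses the offset form directly. The helper <code>log_stamp</code> is a hypothetical name for the always-universal-time format that log files should use.</p>

```python
from datetime import datetime, timezone

# Parse the local-time-plus-offset form and convert to universal time.
local = datetime.fromisoformat("1996-12-19T15:39:57-08:00")
utc = local.astimezone(timezone.utc)

def log_stamp(moment):
    """Format an offset-aware datetime for a log file: universal time, Z suffix."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

<p>Here <code>utc</code> is eight hours later on the clock than <code>local</code>, confirming that a negative offset is added back to reach universal time.</p>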
<p>We can store in any format, and transmit in any mutually agreed format, but the protocol will hash as if in full length binary. Any protocol is transmitting a representation of the binary data, which should be uniquely and reversibly mapped to a human representation of the data, but we do not need global agreement on a human representation of the data. Because multiple human representations of the data are possible, any such representation should contain an identifier of the representation being used. Similarly for storage and transmission. Since global agreement on storage, transmission, and human representations of the data is not guaranteed, format identifiers for storage and transmission formats will be 128 bits in the binary representation. Human readable formats will be in yaml. </p>
<p>For a hash to be well defined, we need a well defined mapping between a yaml text stream, and a binary stream. A single yaml document can have many equivalent text representations, so we have to hash the binary data specified by the yaml document. Which means a yaml document must be stored in an object of specific type defined outside the document, and we hash the canonical form of that object. The object type will be identified by a yaml directive <code>%&lt;typename&gt;</code>, and if the named fields of the yaml document fail to correspond to the fields of the binary object, it is an error, and the document is rejected. </p>
<p>The yaml document representing a binary object has a human readable type name in the form of a percent directive, and human readable field names in the form of a mapping between named values, but the binary object it represents does not contain these field names, and its type identifier, if it has one, is an integer. The mapping between human readable names in the yaml document and offset locations within the binary object occurs outside the object, and outside the yaml document. </p>
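<p>To illustrate hashing the object rather than the text, here is a sketch in which already-parsed documents are plain dictionaries and the canonical form is a fixed big-endian layout. Both are stand-ins: the real design would parse yaml and encode with ASN.1 canonical PER. The <code>TYPES</code> table, the type id 7, and the field names are invented for the example.</p>

```python
import hashlib
import struct

# Hypothetical type table, kept outside both the document and the object:
# integer type id, then (field name, struct format) in canonical order.
TYPES = {
    "payment": (7, [("amount", ">Q"), ("time", ">Q"), ("recipient", "32s")]),
}

def canonical_bytes(type_name, fields):
    """Encode named fields in the fixed order and widths of their type."""
    type_id, layout = TYPES[type_name]
    if set(fields) != {name for name, _ in layout}:
        raise ValueError("fields do not match type " + type_name)
    out = struct.pack(">H", type_id)
    for name, fmt in layout:
        out += struct.pack(fmt, fields[name])
    return out

def canonical_hash(type_name, fields):
    return hashlib.sha256(canonical_bytes(type_name, fields)).digest()
```

<p>Two documents that list the same fields in a different order, or with different whitespace, produce the same hash, and a document with missing or unexpected fields is rejected.</p>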
<p>Different compilers tend to wind up implementing object padding differently, and of course there is always the endianness issue, big endian versus little endian. For a binary object to have one unique representation really requires ASN.1. </p>
<p>ASN.1 can generate aligned Canonical Packed encoding, which is what we are going to need to hash, and aligned packed encoding, which is what we will need for everything except human readability. It can also generate JSON encoding rules, which is type information in JSON, which we manually edit into a human readable description of how the object should look in yaml, and manually edit into a yaml to and from Canonical Aligned packed encoding.
For the data represented by a yaml document to have a unique well defined hash, the yaml document has to reference an ASN.1 specification in a percent directive. Try the III ASN.1 Mozilla library to compile ASN.1 into C++. Our yaml library should generate code to read yaml into and out of an object whose representation in memory is compiler and machine dependent, and our ASN.1 library should generate code to read data into and out of the same object whose representation in memory is compiler and machine dependent. </p>
<p>A connection between two peers will represent the data in an ASN.1 PER format, but not necessarily canonical aligned PER format. The representation will be idiosyncratic to that particular connection. Every connection could use a different representation. They probably all use the same representation, but nothing requires them to do so. A machine could have a hundred connections active, each with a different representation, though it probably will not. </p>
<p>A branch of our immutable binary tree, or the portion of the entire global immutable tree leading to a particular branch, can be output as yaml, or input from yaml, in order that we can figure out what the machines are doing, and when we read a branch in yaml, we can always immediately learn whether that branch chains to the global consensus, albeit a portion of that chain may run through a local wallet, then to a transaction in that wallet, which transaction is globally known. </p>
<p>Our immutable tree can only contain types known to the code, and compiled into the code. ASN.1 and yaml can represent arbitrary structures, but our code will not be able to handle arbitrary structures, and shall swiftly reject lots of valid, but unexpected, yaml, though a wallet might well know types that are not globally known. ASN.1 canonical aligned PER can only represent types that are expected by the code, thus everything that can be read has to be expected and capable of being handled by canonical aligned PER. </p>
<p>A conversation involving many people is a tree in reverse temporal order to the hash tree. However, we are only going to link such conversations into the block chain in relation to a financial transaction, in which case we want to link it in hash chain order – the bill or offer, and any documents explicitly included in the bill or offer. Texts in chat contain a time, an author, a chatroom identifier, and the fact that they are a reply to a previous text. Constructing a tree forward in time involves a search for all replies. It is not stored in tree form, as texts do not contain links to what replies to them. Within wallet storage such a text is globally identified by its hash, which is globally meaningful for any participant in the conversation. Of course for payments, there are usually only two people in the conversation, but if a text is referenced by a transaction in the global consensus chain, you can send it to anyone, along with what it is a reply to, and what that is a reply to, whereupon they become meaningful, and immutable, for the recipients. So you can send a text to a third party that includes another text, or several such texts. If you include a third party text, and what this third party text is a reply to, then the reply link will become meaningful to the recipient. Otherwise, when he clicks on the reply link, he will get an error. The wallet contains a bunch of records that are indexed by hash, the hash globally and uniquely identifying immutable data. Texts are almost free form, having a time, an originator, a title of under 140 characters, which is likely to be the entire content of the text, and a destination, which may be a chatroom, or a particular subtype of chatroom, a chatroom with only two permitted members, and free form text which may contain references to other texts, generally in the form of a quote or title, which links to the actual prior in time text. </p>
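<p>A sketch of hash-indexed texts with backward-only reply links. The <code>Wallet</code> class, its field names, and the use of sorted-key JSON as a stand-in for the canonical binary encoding are all assumptions made for illustration.</p>

```python
import hashlib
import json

def text_hash(text):
    """Hash of a text record. Sorted-key JSON is only a stand-in for the
    canonical binary encoding."""
    blob = json.dumps(text, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

class Wallet:
    def __init__(self):
        self.texts = {}                     # hash -> text record

    def add(self, author, time, room, title, reply_to=None):
        text = {"author": author, "time": time, "room": room,
                "title": title, "reply_to": reply_to}
        key = text_hash(text)
        self.texts[key] = text
        return key

    def parent(self, key):
        """Follow the reply link; KeyError if this wallet lacks that text."""
        return self.texts[self.texts[key]["reply_to"]]

    def replies(self, key):
        """Texts do not link forward, so finding replies means a search."""
        found = [k for k, t in self.texts.items() if t["reply_to"] == key]
        return sorted(found, key=lambda k: self.texts[k]["time"])
```

<p>Following a reply link is a single lookup, while building the forward tree costs a scan, and a wallet that was sent a reply without the text it replies to gets an error on the link.</p>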
<p>Clicking on the link expands it inline with a four tab indent, and a tree outline rooted on the left. If the tree gets too deep, say four tabs deep (configurable) we collapse it by hiding sister expansions, collapsing parent nodes, and applying tail recursion. The tree of a conversation thread is similarly displayed, also in correct time order (collapsing and tail recursing as needed to prevent it drifting too far right) but it is not a hash tree, and you cannot link it into the block chain. You can only link a particular text, and prior texts that it explicitly includes by hash, into the blockchain. Only their root hash is actually stored in the blockchain. The actual texts, though their hash globally identifies them across all wallets, are only stored in some particular wallets, and can easily wind up lost or erased. </p>
<p>When peers negotiate a protocol, they negotiate a representation of the data, which may be different for any pair of peers, but what is being represented is globally agreed. </p>
<p>What is being represented is a global binary Merkle-patricia dac of blocks of variable length binary data. In transmission we generally do not transmit the entire hash if the entire subtree is available, but a sample of the hash, which sample is different in every connection between every pair of peers. This means that a peer cannot easily fake keeping the data. Since the entire tree represents a global consensus, the content of every block must be understandable to every peer so that every peer can determine that every block is compliant with the global rules, so every block begins with a type identifier, which normally defines a fixed length for the block and the rule for generating a hash for that block. The tree never carries opaque data, though it may well carry a hash code that identifies off block chain opaque data.</p>
<p>Whatever protocol is employed, the software can and will routinely express the representation in its global canonical binary form, and in human readable form, so that when disagreements occur in reaching consensus the human masters of the machines can agree on what is going on and what should be done about it, regardless of the representation used internally to store and transmit data. </p>
<p>Although the consensus will be represented by both a human and a robotic board and CEO, there shall be no automatic rule update mechanism, except according to individual human choices. Each new consensus shall represent a consensus of peers managing a majority of the shares, and to change the rules will require that the board and the CEO persuade a substantial majority of the human masters of the peers to install new software, which they may do, or may refrain from doing. The consensus will be continually updated according to robotic choices, but the rules the robots are following are installed by humans – or not.</p>
<p>During each block period, peers will be accumulating and sharing transactions for the current and previous block periods. </p>
<p>They will be trying to reach consensus for earlier block periods by sharing transactions <em>or by excluding transactions that did not get widely shared. </em> </p>
<p>They will attempt the Paxos protocol to announce a hash reflecting a widely shared consensus for block periods before that (more than three block periods before the present), starting with the block for which no widely shared consensus has yet been announced.</p>
<p>So they do not even attempt to announce a consensus until at least ten minutes have passed.</p>
<p>When a consensus exists for the previous block period, the hash is computed as if each transaction were associated with the resulting account balances, and as if each transaction were transmitted with the ensuing account balances, though for internal optimization reasons they are not necessarily stored or transmitted in this fashion.</p>
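<p>A minimal sketch of hashing a transaction as if it carried the resulting balances. The choice of SHA-256, the 8 byte big-endian balance encoding, and the name <code>transaction_hash</code> are all assumptions made for illustration; the canonical rule is whatever the block type defines.</p>

```python
import hashlib


def transaction_hash(tx_bytes: bytes, sender_balance: int, recipient_balance: int) -> bytes:
    """Hash a transaction together with the balances that result from it.

    Hypothetical layout: the raw transaction, then the sender's resulting
    balance, then the recipient's, each balance as 8 big-endian bytes.
    """
    h = hashlib.sha256()
    h.update(tx_bytes)
    h.update(sender_balance.to_bytes(8, "big"))
    h.update(recipient_balance.to_bytes(8, "big"))
    return h.digest()


# The same transaction hashes differently if the claimed balances differ,
# so a peer cannot agree on the transaction while disagreeing on its effect.
honest = transaction_hash(b"pay 5 from A to B", 95, 105)
wrong = transaction_hash(b"pay 5 from A to B", 95, 106)
```

<p>However the data is actually stored, two peers that compute the same hash have agreed on both the transaction and the balances it produced.</p>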
<p>This potentially allows a peer to throw away all data on which there is a consensus, except the most recent transaction balances and key names, since clients rely on local storage to prove a transaction has taken place, and what it was for. The peer has to prove to clients that the recent balance changes were correct, so has to keep those transactions and show them on demand.</p>
<p>A peer must be able to produce any block related to a client that was generated while that client was hosted by the peer, and any block that is part of a very recent consensus, so cannot throw those away, but is only required to produce the hashes linking that to the global hash, not required to produce any block that ever there was. </p>
<p> A client’s wallet should contain all blocks relevant to that client, and all hashes relating those blocks to the global hash, but if for some reason, for example back up and restore, his wallet does not, and he changes hosts, part of his history may simply become inaccessible. Everything shall continue to function correctly, even if that portion of the history is forever lost to everyone, and represented only by a hash, with the items that the hash represents forever lost. He will know his balance is such and such, but how it came to be such and such may be forever lost.</p>
<p>For speed, we will maintain a table of accounts and several tables of transactions, one of them ordered by recipient and one of them by sender, but the system shall act as if we had a single table, in which transactions held balance data and links to the previous transactions for both accounts, and hashes shall reflect this fictitious structure, which is storage efficient but speed inefficient.</p>
<p>We shall locally employ 64 bit hash tables of 256 bit quantities, wherever a 256 bit quantity would otherwise have to be internally stored multiple times. </p>
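<p>One reading of this, sketched under assumptions: each distinct 256 bit quantity is stored once, and everything else refers to it by a small local handle that fits comfortably in 64 bits. The class <code>InternTable</code> and its methods are hypothetical names, and a real implementation would live in the database rather than in memory.</p>

```python
import hashlib


class InternTable:
    """Stores each distinct 256 bit value once and hands out small integer handles."""

    def __init__(self):
        self._handle_of = {}  # 32 byte value -> handle
        self._value_of = []   # handle -> 32 byte value

    def intern(self, value: bytes) -> int:
        """Return the handle for value, allocating a new one on first sight."""
        if len(value) != 32:
            raise ValueError("expected a 256 bit (32 byte) quantity")
        handle = self._handle_of.get(value)
        if handle is None:
            handle = len(self._value_of)
            self._handle_of[value] = handle
            self._value_of.append(value)
        return handle

    def lookup(self, handle: int) -> bytes:
        """Recover the full 256 bit value from its local handle."""
        return self._value_of[handle]


table = InternTable()
key = hashlib.sha256(b"example public key").digest()
first = table.intern(key)
second = table.intern(key)  # same value, so the same handle comes back
```

<p>Handles are purely local. They never appear in the canonical form or in any hash, so two peers may assign different handles to the same key without disagreeing about anything.</p>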
<hr/>
<h2>Micro and nano transactions</h2>
<p>Micro and nano transactions are trust based. </p>
<p>In nano transactions, the parties keep a running tab, but do not generate any unforgeable cryptographic record of the tab. If something goes wrong, the wronged party breaks the connection, and temporarily blocks the other party. </p>
<p>From time to time, they record the tab into a micro transaction, which one or both parties cryptographically sign.</p>
<p> Between any two parties there can only be one tab, and only one tab record can be entered into the global consensus in any one time period. </p>
<p>Each party knows his tabs most recently recorded in the global block chain with all other parties – they have a local image of the part of the global block chain that is relevant to themselves. </p>
<p>If there is no existing tab, the parties agree to start it at zero. From time to time the party being billed signs a new tab balance. If both parties are being billed, if the balance moves both up and down, they both independently sign it. </p>
<p>When a tab record is entered in the block chain, that transfers the difference between the most recent tab record, if any, and the new tab record. It is just an ordinary transaction like any other, but the payment conversation it chains to records the interaction of the parties, and they can prove it to third parties.</p>
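<p>The arithmetic can be sketched as follows. The function name <code>settlement_amount</code> and the use of plain integers for amounts are assumptions for illustration only.</p>

```python
from typing import Optional


def settlement_amount(previous_tab: Optional[int], new_tab: int) -> int:
    """Amount transferred when new_tab is entered in the block chain.

    If no tab was recorded before, it is treated as starting at zero.
    A negative result means the balance moved back toward the billed party.
    """
    start = 0 if previous_tab is None else previous_tab
    return new_tab - start


# A tab first recorded at 120 units, then at 150, then at 140.
payments = [
    settlement_amount(None, 120),
    settlement_amount(120, 150),
    settlement_amount(150, 140),
]
```

<p>The payments always sum to the latest recorded tab, so only the most recent tab record between two parties needs to be known in order to settle the next one.</p>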
<p>The parties generate tab records at frequent intervals, but only infrequently submit them to the global block chain. </p>
<p>It may well happen, however, that when the tab record goes into the block chain, there is no money to pay. When this happens the defaulting party cannot make expenditures until the money is paid – his account is locked until more money goes into it. (Which will probably never happen, so we want to minimize storage associated with such accounts.) But this does not in itself do much good, since identities are extremely cheap.</p>
<p>To generate positive reputation, an account must have existed for a long time with positive money and no defaults, which is not cheap, and must have made payments to lots of other accounts which have existed for a long time with positive reputation, which is also not cheap. But how does this square with throwing away the records? </p>
<p>A user’s wallet and host retain his positive records, and the global blockchain only keeps the most recent default, forever, or for a very long time before altogether throwing away the dud account. The wallet knows what has gone into the global chain, and keeps a record of what should go into the global chain, and the normal wallet refrains from any micro or nano transactions that would result in default.</p>
<hr/>
<h2>Zooko’s quadrangle, human readable names</h2>
<p>The root of the identity of every client and peer will be a public key, or the hash thereof. A client can reserve a human readable name, provided that they have <<code>name_reservation_deposit</code>> amount of money in the account. </p>
<p>The client UI will not let the user accidentally spend this money and lose the name, so it will be presented separately from the rest of his balance. A peer has to have such a name, which is also the name of the special client account that controls the peer. </p><p>
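A minimal sketch of how a wallet might keep the deposit out of the balance it offers for spending. The function name <code>spendable_balance</code>, and the assumption that each reserved name locks exactly one deposit, are illustrative only.

```python
def spendable_balance(balance: int, names_held: int, name_reservation_deposit: int) -> int:
    """Balance the UI offers for ordinary spending after setting aside name deposits."""
    reserved = names_held * name_reservation_deposit
    # Never show a negative figure; an underfunded account simply has nothing to spend.
    return max(balance - reserved, 0)


# An account holding 1000 units and two names, with a deposit of 300 per name.
shown = spendable_balance(balance=1000, names_held=2, name_reservation_deposit=300)
```
</p><p>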
Retaining a human readable username requires a minimum balance, which may be changed from time to time with due warning. This implies global consensus on a configuration YAML file. We have a rule that a new configuration file takes effect <<code>transition_time</code>> days after being voted in, provided that it is never voted out in the meantime, where <<code>transition_time</code>> is a value set by the configuration file currently in effect. </p><p>
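The transition rule can be sketched as follows. The function name <code>takes_effect</code> and the use of calendar dates are assumptions, as is the choice to treat a vote-out on the effective day itself as too late.

```python
from datetime import date, timedelta
from typing import Optional


def takes_effect(voted_in: date, voted_out: Optional[date],
                 transition_time_days: int, today: date) -> bool:
    """True if a configuration voted in on voted_in is in effect on today.

    transition_time_days comes from the configuration currently in effect,
    not from the new one. voted_out is None if it was never voted out.
    """
    effective_on = voted_in + timedelta(days=transition_time_days)
    voted_out_in_time = voted_out is not None and effective_on > voted_out
    return today >= effective_on and not voted_out_in_time


# Voted in on 1 January with a 30 day transition: effective from 31 January.
pending = takes_effect(date(2020, 1, 1), None, 30, date(2020, 1, 30))
active = takes_effect(date(2020, 1, 1), None, 30, date(2020, 1, 31))
cancelled = takes_effect(date(2020, 1, 1), date(2020, 1, 15), 30, date(2020, 2, 15))
```
</p><p>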
You can reserve as many names as you please, but each requires a not altogether trivial amount of money. </p><p>
You can sell a username to another public key. </p><p>
The blockchain keeps the usernames, the owning key, and the host for the owning key. You have to ask the host for further details, like contacting the actual owner of the user name. Host links you to the wallet, wallet links you to a small graphic, a 140 character text, optionally a larger text, and a web page. The guy who controls the web page will be able to create a websocket connection to his wallet that logs you in, and sends messages to both wallets. </p><p>
You can look up something like a blog page of the user, which blog page is not necessarily kept on the host. You can message the user, but unless you get on the whitelist, this involves a very small fee. </p><p>
You can also access a chatroom, by invitation, or by knowing the chatroom name and a shared secret. Messaging someone creates a chatroom to which the sender and all recipients have access, to which any party can give a name and a shared secret, or invite others. If you message some people in the chatroom but not others, you create a new chatroom, which forms a tree of chatrooms. </p><p>
The point, purpose, and intent of this infrastructure is to enable secure encrypted conversations, and to allow people to make and accept offers within the system. Chat with money. But we want to integrate with the web, after the fashion of the payment processor. You send the client, and his order, to the payment processor, and the payment processor comes back with a page hit saying he has paid for order such and such.</p><p>
The browser is inherently insecure, so your wallet client will allow encrypted non html secured chat. It will also provide a websocket connection (socketio) so that web pages can deal with it.</p><p>
It is useless to attempt immutable records of html pages, since they can, and do, mutate according to circumstances. We can, however, create immutable records of pdf pages, which should live, not in the blockchain, but in the wallet’s associated directory tree. HTMLDOC will convert html to pdf. The wallet, the client, and the blockchain will set up a browser to server connection between two blockchain names, but anything that is going to chain into the blockchain, such as a record of what is being paid for, cannot be html. All html data and authentication shall be transient, shall be local to the parties, and shall go away at end of session. If a web page wants to create a durable record for the wallet, it has to be pdf, plain text, or YAML plain text. (Which it sends to the wallet by websocket or socketio.)</p><p>
To get a logged on browser to server connection between two blockchain identities, your wallet client will talk to the other guy’s wallet client, which will talk to his webserver, and your wallet will launch the browser with a regular https connection with logon cookie representing that a browser obeying a wallet with one blockchain key is logged on to a webserver that obeys a wallet with the other blockchain key.</p><p>
When the time comes to do a transaction, the wallet clients will exchange encrypted non html messages, in response to websocket messages by browser and webserver, and the payment will be linked to one such signed message, held off blockchain by both parties in their wallets. Logging on will always be done in the blockchain client, not the browser, and the keystroke that actually agrees to make the payment will be done in the blockchain client, not the browser.</p>
<p style="background-color : #ccffcc; font-size:80%">These documents are
licensed under the <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">Creative
Commons Attribution-Share Alike 3.0 License</a></p>
</body>
</html>