Also, needed to understand Byzantine fault tolerant paxos better. Still do not.
Syntax and semantics of identity
The problem is, we need a general syntax and semantics to express identity.
Our use cases are likely to include a big pile of documents signed by diverse people, with no contact information, some of them encrypted so that they can only be read by people with certain private keys, with no indication of the public key corresponding to that private key.
So, what is our ascii armoured signature going to look like?
If we are ascii armouring, we are likely signing a utf8 string. It will be hashed as a count based string, introduced by an arbitrary precision integer count and followed by a null that is not included in the count, even if it is a null terminated string with no count, or a count based string that normally has no null terminator. This ensures that it is impossible to concoct a sequence of strings that has the same hash when the strings are grouped differently, and that different computers with different word lengths and different endianness will generate the same hash for the same string or sequence of separate strings.
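A minimal sketch of that hashing rule follows. The text above does not fix the encoding of the arbitrary precision count or the hash function, so the seven bits per byte encoding in `encode_count` and the use of SHA-256 are assumptions made purely for illustration, as are the function names.

```python
import hashlib


def encode_count(n: int) -> bytes:
    """Encode a non-negative integer of any size, seven bits per byte.

    Assumption: a LEB128 style encoding. The document only says
    "arbitrary precision integer".
    """
    out = bytearray()
    while True:
        low, n = n & 0x7F, n >> 7
        if n:
            out.append(low | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low)
            return bytes(out)


def hash_strings(*strings: str) -> str:
    """Hash a sequence of strings so that regrouping them changes the hash."""
    h = hashlib.sha256()  # assumption: any cryptographic hash would do
    for s in strings:
        data = s.encode("utf-8")
        h.update(encode_count(len(data)))  # the count excludes the null
        h.update(data)
        h.update(b"\x00")  # trailing null, always present
    return h.hexdigest()


# The same characters grouped differently give different hashes.
print(hash_strings("ab", "c") == hash_strings("a", "bc"))
```

Because the count is fed to the hash byte by byte, the result does not depend on the word length or endianness of the machine doing the hashing.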
A sig block consists of:
{sig 252 bit bitstring as base sixty four characters, arbitrary sequence of non base sixty four characters, 252 bit bitstring as base sixty four characters, optional arbitrary sequence of non base sixty four characters, 256 bit bitstring as base sixty four characters representing the public key, optionally followed by whitespace or linefeed characters, followed by arbitrary utf8 characters representing the nickname of that public key, which must start with a non whitespace character and may not contain }, the name being followed by }
Or alternatively the nickname may be represented by a nickname block composed of {nick, optionally followed by an arbitrary sequence of bracketing or whitespace or symmetric ascii characters followed by “, followed by the nickname, followed by ” followed by the reverse sequence, followed by }
A single signed string may have several different but equivalent ascii armorings:
- unquoted
    - blank line or start of document.
    - ::: on a line by itself
    - string to be signed on a line by itself (or lines by itself if it contains line feeds)
    - ::: followed by sig block.
    - blank line
- quoted
    - blank line or start of document.
    - ::: followed by arbitrary sequence of bracketing or whitespace or symmetric ascii characters followed by “, on a line by itself
    - string to be signed on a line by itself (or lines by itself if it contains line feeds)
    - ” followed by the reverse arbitrary sequence followed by ::: followed by sig block.
    - blank line
- inline unquoted
    - [ string to be signed ] followed by sig block.
- inline quoted
    - [ followed by arbitrary sequence of bracketing or symmetric ascii characters followed by “, followed by string to be signed followed by ” followed by reverse arbitrary sequence of bracketing or symmetric ascii characters, followed by ] followed by sig block.
A signature represents an identity. If a means of contacting that identity is to be represented, it will be represented outside of and separately from that signature.
Very commonly we want to sign not just one arbitrary string, but an arbitrary string and or a nickname and or a public key. This is done similarly to the above, with the public key introduced by a hash sign, and the nickname bare or in a nick block.
For example:
:::
Hi
:::
John Hancock#0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghi
{sig 0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefgh 0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefgh}
Or:
[Hi]
John Hancock#0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghi
{sig 0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefgh 0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefgh}
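To make the first example concrete, here is a rough sketch of how a program might pull the fields out of an unquoted signed block of that shape. The base sixty four alphabet is not spelled out in this section, so the pattern simply assumes ascii letters and digits. The field widths (42 characters for each 252 bit half of the signature, 43 for the public key) are read off the example. The names `SIGNED_BLOCK` and `parse_signed_block` are hypothetical, and nothing here verifies the signature.

```python
import re

# Assumption: base sixty four digits are ascii letters and digits only.
B64 = r"[0-9A-Za-z]"

SIGNED_BLOCK = re.compile(
    r"^:::\n(?P<message>.*?)\n:::\n"                      # string to be signed
    r"(?P<nickname>[^#\n]+)#(?P<public_key>" + B64 + r"{43})\n"
    r"\{sig (?P<sig_a>" + B64 + r"{42}) (?P<sig_b>" + B64 + r"{42})\}$",
    re.DOTALL,  # the signed string may itself contain line feeds
)


def parse_signed_block(text: str):
    """Return the fields of an unquoted signed block, or None if malformed."""
    match = SIGNED_BLOCK.match(text.strip("\n"))
    return match.groupdict() if match else None


example = "\n".join([
    ":::",
    "Hi",
    ":::",
    "John Hancock#0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghi",
    "{sig 0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefgh "
    "0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefgh}",
])
fields = parse_signed_block(example)
print(fields["nickname"], fields["message"])
```

A regular expression is enough only for this simplest armoring. The quoted forms need a real parser, since the quote marks can nest.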
We cheerfully assume that strings have no semantically and syntactically significant characters, and if they do have semantically significant characters, we bracket the string with angle quotes, or if the string contains angle quotes, with angle quotes and an arbitrary string of non significant bracketing, symmetric, or whitespace ascii characters around the angle quotes.
#, “, ”, ", :::, [, ], {, and } are semantically and syntactically significant in certain contexts, and to distinguish between the endless variety of uses to which they will be put, the closing ::: or the opening { will generally be immediately followed by a short label identifying the particular use to which it is being put. A starting ::: is preceded by a blank line or start of document, and an ending ::: is followed by a {label ...} identifying the use to which it is put, followed by a blank line. The ending ::: is a ternary operator, linking several fields.
Thus, if someone obnoxiously wanted line feeds, angle quotes, and curly brackets in a nickname, perhaps:
John Hancock {prince of “darkness”}
a signed block of text might then look like:
:::
Hi
::: (|“
John Hancock {prince of “darkness”}”|) #0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghi
{sig 0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefgh 0123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefgh}
with the (|“ … ”|) acting as quote marks that will cause whatever is inside them to be treated as a single string by the surrounding representation. To avoid the need for escaping special characters, we allow an infinite variety of quotation symbols.
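A minimal sketch of how such quotation symbols could be written and read back. The mirror table, the candidate opening sequences, and the function names are illustrative assumptions. The only rule taken from the text is that the closing sequence is ” followed by the reverse of whatever preceded the “.

```python
# Bracketing characters and their mirror images. Symmetric characters
# such as | mirror to themselves.
MIRROR = {"(": ")", "[": "]", "{": "}", "<": ">"}
MIRROR.update({close: open_ for open_, close in list(MIRROR.items())})


def mirrored(opener: str) -> str:
    """Reverse an opening sequence, swapping each bracket for its mirror."""
    return "".join(MIRROR.get(c, c) for c in reversed(opener))


def quote(s: str, candidates=("", "|", "(|", "[(|")) -> str:
    """Wrap s in the first candidate quotation whose closer is absent from s."""
    for opener in candidates:
        closer = "”" + mirrored(opener)
        if closer not in s:
            return opener + "“" + s + closer
    raise ValueError("no candidate quotation is safe for this string")


def read_quoted(text: str) -> tuple[str, str]:
    """Split text that starts with a quotation into (quoted string, rest)."""
    start = text.index("“")
    closer = "”" + mirrored(text[:start])
    end = text.index(closer, start + 1)
    return text[start + 1:end], text[end + len(closer):]


nickname = "John Hancock {prince of “darkness”}"
inner, rest = read_quoted("(|“" + nickname + "”|) #0123")
print(inner == nickname, repr(rest))
print(read_quoted(quote(nickname))[0] == nickname)
```

Nothing inside the quoted string is ever escaped. The writer just picks a longer quotation until the closing sequence no longer occurs in the string.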
Our goal in breaking from uri syntax, json syntax, and even yaml and markdown syntax, is a syntax that allows arbitrarily complex cryptographic expressions to be represented to the end user in a way that is as intuitive as they can be.
This syntax is inspired by Pandoc markdown, which extended the syntax of markdown to allow better combination of markdown with html and css, while still holding on to as much readability as possible.
Whitespace characters can be liberally inserted and will generally be ignored, except that they have to be balanced when demarking quotes, and except when they are part of the string or the nickname. Literal strings will only occur within these labelled bracketing operators, or within quotes inside these labelled bracketing operators.
In an environment where lines are represented by something other than line feed characters, the lines shall be converted into line feed characters before being hashed and signed, and line feeds in the signed string may be represented by lines native to the environment when the ascii armored signed string is displayed in an environment where lines are represented by something other than line feed characters.
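As a small illustration, assuming the only foreign line conventions we meet are carriage return plus line feed and bare carriage return, normalization before hashing might look like this. The function names are hypothetical and SHA-256 merely stands in for whatever hash is finally chosen.

```python
import hashlib


def normalize_lines(text: str) -> str:
    """Convert CRLF and bare CR line endings to line feeds."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def digest_for_signing(text: str) -> str:
    """Hash the utf8 bytes of the text after line normalization."""
    return hashlib.sha256(normalize_lines(text).encode("utf-8")).hexdigest()


# The same two line message from three environments hashes identically.
print(digest_for_signing("Hi\r\nthere") == digest_for_signing("Hi\nthere"))
print(digest_for_signing("Hi\rthere") == digest_for_signing("Hi\nthere"))
```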
Primary and most urgent use case
Our primary use case, however, is not mere identity, but is a link that will bring you to a shopping cart, containing a link to a checkout, containing a link that will say “Your order has been placed” and will generate a record that the payment has been attempted, containing a link that can check on your payment status, and generate a review or respond to the vendor, which should be followed eventually by an email like message from the vendor that your payment has gone through, also containing a link that could generate a review and respond to the vendor.
Our secondary case is signed messages subject to usenet style authentication and hashtag style pooling, distribution, and grouping (rather than usenet hierarchical grouping), and to spam filtering by the distributors (which is necessarily indistinguishable and inseparable from political filtering, because a lot of political messaging is spam, containing repetitiously similar messages, which messages are apt to be misleading and manipulative). And with social media style following, liking, downvoting, and reposting, where a repost should not result in the same message appearing in someone’s feed multiple times, but should appear in the feeds of the followers of the reposting party who did not follow the posting party. A repost should appear together with the names of those reposters that you are following. Friending should be a request to the client’s server for permission to communicate, but following should grant permission and capability for one way communication, while successfully sending a message automatically by default grants permission to communicate one way, with one way one on one communication permitting two way one on one communication. Thus mutual following should by default permit mutual two way one on one communication.
Mutual following should by default imply friending, and friending should by default imply mutual following.
We should also allow exploding messages, which are authenticated by the client to his server, and authenticated by his server, but you have to login, perhaps anonymously or under a throwaway identity, to his server in order to see them – they cannot go in usenet style pooling.
An identity should be able to function as host and server if the owner of that identity chooses to set that identity to continually publish its signed network address (obtained from its host). This enables anyone to send messages directly to that identity, but normally most such messages are automatically ignored and discarded.
Thus if an identity is selling something controversial or arguably illegal, and is functioning as a server, in order that people can make purchases, anyone in the world can discover and monitor its network address. Therefore, purchases have to be possible between client non host identities, whose network address is not widely known. The transaction sequence (link, shopping cart, purchase screens) has to be email like.
Our primary use case for exploding messages is discussing the manufacture, sale, and purchase of goods, such as guns, which may attract attacks.
In this case you have a non radioactive server carrying signed messages, some of them mildly radioactive and carrying links to a mildly radioactive server, carrying highly radioactive messages, some of them exploding, which contain links to contactable identities through which radioactive goods and services can be bought and sold, leading to signed or exploding reviews supporting or detracting from the reputations of sellers and purchasers of highly radioactive goods.
This implies a cryptographic resource identifier has names and a public key, or a blockchain transaction output identifier, but may be useless without a locator which identifies a host. As a general rule, your host will know where to find that host.
Cryptographic resource identifiers should be visible to the human as per message petnames of the entity embedding the resource identifier, qualified by the user’s petname, if any, and the entity’s nickname, with the embedding defaulting to the entity’s nickname, unless the composer provides a per message petname.
Because messages are widely distributed, using authentication rather than signing would not make a difference and would be difficult to implement. The identities signing therefore would need to be discardable, but they will need to permit the possibility of one on one contact, usually through a peer that hosts the client identity, perhaps directly, if the sender wishes it. If it is a sales message, which is our primary use case, it will frequently be convenient to be contactable directly, usually through a named blockchain transaction output, supported by the pool of signed network addresses associated with blockchain outputs, which typically contains not the network addresses of the keys signed by the blockchain key, but the network addresses of keys signed by that key as having authority to communicate on its behalf.
We aim for a pseudonymous currency supporting reputations, not an anonymous currency. We worry first about identity, and actually locating that entity is an ad hoc afterthought cobbled on top. We want cryptographic resource identifiers, which in some cases have the side effect of locating the resource, in other cases have the effect of identifying a byte string that has somehow arrived on your computer, or discovering that you don’t have certain items of evidence that should have arrived on your computer, but have not, and should be searched for on the cloud.
The foundation of our system is probabilistically unique cryptographic identifiers, 256 bit identifiers.
Cannot comply with uri syntax
I would like to conform to the Uniform Resource Locator and Uniform Resource Identifier syntax and semantics, but they are too married to the existing system, and we are going to have subtly different semantics and much larger semantics. They have reserved no end of characters for their own syntax, with their own semantics, and expressing complicated semantics within a syntax designed for different and more restricted semantics, with useful characters reserved by that syntax, just does not fit too well. And they never got hip to utf8. And they prohibit spaces.
There is an obvious problem in permitting unicode in identifiers, since the confusible identifier problem is insoluble even with ascii only identifiers, and becomes enormously harder with unicode identifiers. So, since no end of people have tried unsuccessfully to solve it with ascii identifiers, we will make absolutely no attempt to solve it. Instead, we make it more difficult to use confusible identifiers maliciously, and trust people to do their best to make their identifiers distinctive.
The usual malicious use of confusible identifiers is a man in the middle attack: “This is an urgent notice from BigImportantFinancialInstitution. You need to log in immediately or all your money will be frozen or lost.”
Then you log in to an entity that has a name that can be confused with that BigImportantFinancialInstitution, give it all the secrets that you share with the actual BigImportantFinancialInstitution, and lose all your money.
Obviously this is not going to work with cryptographic secrets, since they are unshared. You login with a strong private secret, instead of a weak shared secret. So we should just go on living with the possibility and likelihood of confusible identifiers, as we have already been doing for a very long time, and expect that they will cause far fewer problems.
If you contact a confusible entity, and it looks confusible to your client program, the interface will detect the incompatibility, and demand a distinctive petname, rendering it no longer confusible. Thus confusion becomes a client problem in a small space of names, rather than a system problem in a very large space of names. People who create clients will worry about it when and if people who use clients are bothered by it. Which they probably will not be. We will not worry about it unless people start demanding a fix. Which is only going to happen if malicious people find a use for a confusible. Which, if we do other parts of our job right, is not going to happen. Experience has demonstrated that it is always easy to maliciously create a confusible name. The solution is to set things up so that the malicious person can do little harm with it, so that no one wants to create confusible names.
Spaces are forbidden in a uri, because one routinely passes uris as arguments on the command line, and arguments are separated by whitespace. But lack of spaces is a severe blow to intelligibility, and in particular a severe blow against Zooko nicknames. One way around this rule is to have a rule for interpreting command line arguments that if an item is contained in angle quotes or brackets, it is one item, and to have as part of the scheme syntax and schema a rule for interpreting the object, that if it begins with an angle quote or bracket, it ends at the matching angle quote or bracket, and similarly for any item within it. If an operator expects an argument, and there is a bracket or angled quote mark immediately after it or before it, then everything inside the brackets, up to the matching bracket, is one argument.
The no spaces rule is a reflection of the widespread use of lexers when we should be using parsers. Lexers fail to provide sufficient expressive power. We should break compatibility by demanding that anything that can handle our expressions uses a command line parser rather than a lexer, because you are just not going to be able to handle Zooko nicknames in a lexer.
The uri syntax is written for a lexer, not a parser, and the command line syntax is written for a lexer. Bad!
A program that accepts cryptographic identifiers on its command line is just going to have to parse, rather than lex, its command line. Which is a pain because all the standard libraries for command line handling are lexers, not parsers.
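As a rough sketch of the difference, the following splits a command line on whitespace except inside matched brackets or angle quotes, so a nickname with spaces survives as one argument. The set of bracket pairs and the name `split_arguments` are assumptions, and a real parser would also have to honor the reversible quotation sequences described earlier.

```python
# Opening characters and the closers they demand (an assumed set).
PAIRS = {"“": "”", "(": ")", "[": "]", "{": "}"}


def split_arguments(line: str) -> list[str]:
    """Split on whitespace, keeping bracketed or quoted items whole."""
    args, current, stack = [], [], []
    for ch in line:
        if stack and ch == stack[-1]:
            stack.pop()                      # matching closer found
            current.append(ch)
        elif ch in PAIRS and (not stack or stack[-1] != "”" or ch == "“"):
            # Inside angle quotes only nested angle quotes are significant.
            stack.append(PAIRS[ch])
            current.append(ch)
        elif ch.isspace() and not stack:
            if current:                      # whitespace ends an argument
                args.append("".join(current))
                current = []
        else:
            current.append(ch)
    if stack:
        raise ValueError("unbalanced bracket or quote")
    if current:
        args.append("".join(current))
    return args


print(split_arguments("send “John Hancock”#0123 --to [Bob the builder]"))
```

The stack is what makes this a parser rather than a lexer: whether a space separates arguments depends on the nesting context, not on the character alone.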
In order to allow people to identify themselves and their computers on the cloud with distinct and intelligible cryptographic resource identifiers, we are going to assume parsers everywhere and unicode everywhere. If parsers are far from everywhere, they should be.
We could easily conform to the uniform resource identifier syntax and semantics by adding arbitrarily complex syntax and semantics within the authority field RFC2396 Section 3.2.
Oracle’s way of dealing with this problem was to have a pile of alphanumeric keywords separated by dots in the authority. And then they could have all the syntax and semantics they needed on the keywords, which were lexed by the dot separators within the authority field. This, of course, resulted in long and unintelligible authority fields full of verbose boilerplate. For intelligibility, we need to have the cryptographic resource locator look as if it is an unknown scheme that has no authority component.
Trouble is that the restricted character set (no spaces, no unicode beyond ascii seven bit, and restrictions on the use of ascii seven bit punctuation characters) means that the resulting uris would not be visually distinctive – it would be hard for end users to make their uri look different and memorable. One way out of this conundrum is to have our own non conformant cryptographic resource identifiers, which can be made visually distinctive, and which can be, if needed, mapped to indecipherable gibberish that all looks alike using % byte codes (I refuse to call them octet codes), but which conforms to the permitted character set within the authority field.
Html tag restrictions prevent us from using " or ' marks inside an attribute (because an attribute may be contained inside " or ') and html attributes prohibit newlines, and provide no way of escaping newlines into an attribute. At least we can escape " inside an attribute with &quot; and ' with &#039;, or with % encoded octets, but this is useless because we don’t want our entities to look different inside html than they do to the parser.
RFC3986 Section 2.2 effectively restricts lexing and parsing to ninety three characters, excluding spaces (because a uri is likely to be in a context that lexes on spaces), ", and ', and then lexes and parses as one operation by reserving no end of special characters, forgetting decades of study of the lexing and parsing problem.
As they attempt to squeeze more and more semantics into this impoverished syntax, the syntax and semantics become ever more impoverished. And we plan to permit a whole lot of new semantics into the authority field.
The impoverished syntax and semantics led to the triple slash in the file scheme. Before they could get to the field with file syntax and semantics, they had to have an empty field with authority syntax. We want visually distinctive cryptographic resource identifiers, without distracting rubbish wrapping them.
We do not need the // to distinguish the authority field, because we have several different bases for authority, and need to distinguish them within the authority field. Which means that anything parsing the cryptographic resource identifier as uri is going to demand relative url syntax and character set for the whole thing.
Digression on parsers
There is no real difference between a resource locator and a resource identifier, because usually a resource identity depends on a name assigned by some authority, and when you contact that authority, it is likely to know where the resource is, and a resource locator may well refer to a resource that cannot be located, resulting in the ubiquitous 403 and 404 messages. Thus they necessarily have the same syntax and semantics. So, they are best all called resource identifiers, particularly as the purpose of cryptography is to identify, not to locate.
But still, we are going to have to be able to recognize normal uris, so we will have to start an absolute resource identifier with a scheme, and might as well end the scheme name with a colon, though we might piss on them by having our scheme name be a unicode character that is not permitted, under some circumstances followed by a space which is not permitted either.
But, on the other hand, we would like environments that do not know about cryptographic resource identifiers to at least be able to recognize it as an unknown scheme, so I guess rho: it is, at least as an option. But after that colon, it is our playing field, and no longer “universal”. Past the colon, a much bigger syntax than the overly grandiose “universal” will apply.
The universal scheme is inherently far from “universal”, because as soon as you put anything in a “universal” scheme other than how to recognize the scheme identifier, it ceases to be universal.
The “universal” resource identifier scheme contains no end of stuff that belongs not in the “universal” scheme but in particular schemes.
The reason for the irritating double slash is protocol relative authorities. You could potentially leave out the http:, in which case you have to distinguish the authority from a local directory name.
The triple slash in file:/// makes little sense. It is there because it stands for file://(implied authority)/. The extra slash is there because of “universal” syntax and semantics that are not applicable to file systems. Unfortunately, however, while file systems have a subset of universal semantics, we have a superset.
In our case, we might reference keys by name anywhere, and they might have local names or blockchain names. So we could have explicit keys, keys referenced by local name (which might well be a chain of names specifying a path through a local tree of bookmarks and contacts), keys referenced by unspent transaction output sequence number (which are supposed to generate an exception if already spent), keys referenced by transaction sequence number (the number is unchanged by spending, but should not generate an exception if spent), and keys referenced by the name of the string that they own on the blockchain. And since local and blockchain names are just strings, you have to identify what the name refers to.
So, rather than the // system for designating authorities, we will have a label, which may be a single reserved character, or a string followed by a “:”
We do not want to restrict possible names, since we want maximum distinctness, intelligibility, and recognizability. We want names to have available to them spaces and the full scope of characters.
In order that we can use brackets to denote a string entity containing terminals that could be interpreted as syntactically significant, rather than just more string, our syntax will have to denote different kinds of strings. Obviously a bracket enclosed string can contain operators that, if interpreted as operators rather than part of the string, produce a symbol that is not of string type, so the parser, when looking at an expression to see if it can be a string, can accept as strings symbols that in some contexts would be operators.
Any sequence of strings is a string, whose value is that of all the strings concatenated.
Any sequence of non whitespace unreserved characters is a string.
Any whitespace between two strings is a string.
Reserved characters can be used as part of a string in contexts where, if interpreted as syntactically significant, the resulting non terminal would be of unexpected type.
A html entity like '&rdquo;' is a single character string, in this case a string containing an unbalanced quote mark.
A unicode entity may also be represented by one or more % encoded bytes (or as the standards people, who still bow down to the tyranny of punched card machines long turned to rust call them, “octets”).
Of course many environments will throw up if you have spaces or unicode characters within something that looks like a uniform resource identifier, in which case you can make your string out of html entities or % encoded bytes corresponding to utf8 encoded characters. It will look ugly and incomprehensible, impossible to read and very hard to write, but the ascii armor will protect it in an environment that does not like non ascii. Tidy throws up at spaces or unicode beyond ascii in a domain name inside a uri, but is happy with unknown schemes, and happy with html entities representing non ascii unicode inside a domain name inside a uri, so you can armor any string into something that tidy will happily accept as an unknown scheme. But the parser is written for the twenty first century, and is intended to accept cryptographic resource identifiers that are human intelligible. We still are under the tyranny of standards set to accommodate punch card machines and it is long past time that we broke compatibility with those standards.
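For illustration, both kinds of ascii armor can be produced with the Python standard library. The function names here are hypothetical, and which characters a given environment insists on armoring is an assumption: this sketch percent encodes everything outside the unreserved set, and entity encodes everything outside ascii plus the html special characters.

```python
import html
from urllib.parse import quote, unquote


def percent_armor(name: str) -> str:
    """Percent encode the utf8 bytes of every reserved or non ascii character."""
    return quote(name, safe="")


def percent_unarmor(armored: str) -> str:
    return unquote(armored)


def entity_armor(name: str) -> str:
    """Replace &, <, > and all non ascii characters with html entities."""
    escaped = html.escape(name, quote=False)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def entity_unarmor(armored: str) -> str:
    return html.unescape(armored)


name = "John Hancock “prince”"
print(percent_armor(name))
print(entity_armor(name))
```

Either form round trips back to the human readable string, so the ugly version need only exist in transit.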
Arbitrary precision integers, arbitrary width windowed integers, public keys, bitstrings, and the like, are represented by a sequence of base sixty four digits immediately following, without intervening spaces, something that the parser knows should be immediately followed by a public key or whatever. They may also be represented by h% followed by base hexadecimal digits, b% followed by base two digits, but base sixty four is the default. An integer must be represented by %d followed by decimal digits. Thus we can reference names that are distinctive and unique, and reference them even in cryptographic resource identifiers that have to be transported in text that restricts available characters.
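A sketch of reading the prefixed forms, taking the prefixes exactly as written above (h%, b%, and %d). The default base sixty four form is deliberately left out, because this section does not specify the digit alphabet. The name `parse_integer` is hypothetical.

```python
# Prefix to radix, exactly as the prefixes are spelled in the text.
PREFIXES = {"h%": 16, "b%": 2, "%d": 10}


def parse_integer(token: str) -> int:
    """Parse an h%, b%, or %d prefixed arbitrary precision integer."""
    for prefix, base in PREFIXES.items():
        if token.startswith(prefix):
            digits = token[len(prefix):]
            # Reject signs, underscores, and spaces that int() would accept.
            if not (digits.isascii() and digits.isalnum()):
                raise ValueError("bad digits after " + prefix)
            return int(digits, base)
    raise ValueError("base sixty four literals are not handled in this sketch")


print(parse_integer("h%ff"), parse_integer("b%101"), parse_integer("%d42"))
```

Python integers are arbitrary precision, so a 256 bit key written in hexadecimal parses without any special handling.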
Relative resource identifiers that may be relative to a cryptographic resource identifier or a universal resource identifier will have to be valid.
The left hand part of a cryptographic resource identifier has to be a scheme, and the right hand part is likely to be a standard relative uri. The central part however is going to be a cryptographic authority, and for that we need a much broader syntax than “universal”.
And that cryptographic authority might well have several hosts with several temporary subkeys and several network addresses, and several human agents with several temporary keys and several network addresses. And each of those network addresses should have the up to date public keys of all of them. Everyone should be the equivalent of a domain name server for the groups of which he is part.
The parser that parses a cryptographic resource identifier is going to encounter no end of things that require look up over large local databases, databases in the cloud, and finally, lookup on the target host. Thus parsing a cryptographic resource identifier tends to be equivalent to locating the resource. A 403 or 404 is a parsing failure because of an undefined symbol.
If an entity referenced in the identifier has existence, and perhaps class and parse type, then some host, possibly far away, has given it that existence and parse type, so for the parse to succeed, the parser has to connect to that host.
The straightforward case of Zooko cryptographic resource identifiers is that your equivalent of a domain name is a public key. You look up the network address and the public key actually used for communication in the equivalent of the domain name system, and get a public key signed by the domain name key. Then you communicate with that address, with your communications being authenticated by that key. There are likely three destination public keys involved in making the connection: a durable public key whose corresponding private key is likely not on any computer, but written down on a page in an old book on the bookshelf of the rightful owner of that public key, a durable but replaceable key signed by that master key, and a session key that provides perfect forward secrecy in that the corresponding secret key is discarded when the connection is closed.
#4397439879483774378943798
Represents the public key of an entity that knows the corresponding secret key, or a block of data, commonly a widely shared and distributed block of data, that when hashed according to its schema has that hash. If a public key, you use it to make an authenticated connection or to authenticate a signature, if a hash, authenticate data. If you can find it, you probably already know which it is and what you are going to do with it.
If it is a public key, and you somehow locate the network address of that entity, it will either authenticate itself with that key, or authenticate itself with a public key signed by that entity authorizing it to act as agent in some capacity.
Bob
#4397439879483774378943798
Represents the entity known to #4397439879483774378943798 as Bob (though if it is part of a signature, it means that that entity knows itself as Bob, that that is the nickname of the identity that this key represents). Likely the entity itself, possibly with a different public key, possibly an agent authorized to speak and act on behalf of #4397439879483774378943798, or possibly some random guy on his friends list. If you have contacted #4397439879483774378943798, you have likely contacted Bob, and if not, #4397439879483774378943798 may be of help in contacting him. In the unlikely event that the entity has a Bob on its contact list, and a different Bob on its authorized agent list, this will bring up the agent.
Bob
#4397439879483774378943798
The entity known to #4397439879483774378943798 as “Bob”, but not authorized to act on behalf of #4397439879483774378943798. For another entity, Bob is probably a different Bob, and the same Bob probably has a different name at that entity.
Receivables
#4397439879483774378943798
The entity known to #4397439879483774378943798 as “Receivables”, and authorized to act for that entity in some role. Possibly the entity itself.
#4397439879483774378943798/foo
A data object on the computer that identifies itself on the network with this public key, or a public key authorized by this public key. The data object is typically itself a name table, hence
#4397439879483774378943798/foo/bar/program_name/arbitrary data for program.
Uris have tied themselves in knots distinguishing between a file, and data passed to the program represented by that file to execute. Probably better to just say that anything to the right of the slash is entirely up to the entity to the immediate left of the slash to interpret, and if it contains spaces and suchlike, use windows command line string representation rules, quote marks and escape codes.
rho:#4397439879483774378943798
rho:Bob#4397439879483774378943798
Bob@#4397439879483774378943798
Receivables.#4397439879483774378943798
fit into the Uniform Resource Identifier scheme, poorly.
#4397439879483774378943798/foo
fits into the catchall leftover part of the Uniform Resource Identifier scheme.
rho:Bob@Carol.Dave#4397439879483774378943798/foo
Does not fit into it in the slightest, and I think the idea of compatibility with the URN system is a lost cause.
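Purely as an illustration of the shapes listed above, a first cut at taking such an identifier apart might look like the sketch below. It assumes the optional scheme is literally rho:, that the key is a run of ascii letters and digits after the #, and that everything after the first slash is left for the key holder to interpret. The names before the # are kept as raw text, separators included, since the meaning of @ and . needs lookups that no regular expression can do. `ResourceId` and `split_identifier` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class ResourceId(NamedTuple):
    scheme: Optional[str]  # "rho" or None
    names: str             # raw text before the #, e.g. "Bob@Carol.Dave"
    key: str               # digits of the public key or hash
    path: Optional[str]    # uninterpreted text after the first slash


PATTERN = re.compile(
    r"^(?:(?P<scheme>rho):)?"   # optional scheme
    r"(?P<names>[^#/]*)"        # names, possibly empty, possibly with spaces
    r"#(?P<key>[0-9A-Za-z]+)"   # assumption: key digits are alphanumeric
    r"(?:/(?P<path>.*))?$",
    re.DOTALL,
)


def split_identifier(text: str) -> ResourceId:
    match = PATTERN.match(text)
    if match is None:
        raise ValueError("not a cryptographic resource identifier")
    return ResourceId(**match.groupdict())


print(split_identifier("rho:Bob@Carol.Dave#4397439879483774378943798/foo"))
```

This only separates the fields. Resolving the names to keys and hosts is the part that, as described above, turns parsing into locating.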
But public keys are non memorable and difficult to humanly distinguish. We need Zooko’s quadrangle.
Zooko’s Quadrangle
Obviously you need to be able to reference human readable names on the blockchain, which is the fourth corner of [Zooko’s triangle].
[Zooko’s triangle] : ./zookos_triangle.html
Location
We want any identity to be able to send end to end encrypted messages to any other identity, but we don’t want ten thousand scammers to be able to spam an identity of which they know nothing other than that he can send money over the internet, nor do we want to allow distributed denial of service attacks on ordinary users (big peers can take care of themselves).
Uniform Resource Locators, URLs, are not semantically or syntactically distinct from Uniform Resource Identifiers that are wrapped around methods for finding stuff on the internet, and methods for finding stuff on the internet are wrapped around Uniform Resource Identifiers, but this was designed for a smaller and more trusting world. Today, the problem is not finding data, but rather preventing hostile and ill willed people from finding data and from interfering with communication by supplying falsified data. So we need names that are rooted on cryptographic foundations.
If you have a system of naming that can securely identify that you are getting the right connection and the right data, you can wrap location around it and attach location to it ad hoc. SQL is a language for generating on the fly ad hoc efficient methods for accessing data that is specified in ways that bear a very indirect relationship to its location.
The equivalent of a web page in the cloud obviously has to have a globally unique human readable name, being the website page and the identifier on the website, which is associated with a network address that hostile parties can find, to mount a DDoS attack or send the cops around, and associated with a public key, or chain of public keys. But we do not want the equivalent capability to send messages to humans, or to find them. Yet we want everyone to be able to talk privately to everyone.
Suppose someone is publishing information which he wants widely known. He wants to cooperate with people who will cooperate with him, supply them with good information that is arguably grounds to act in his interest and their own. Well, often unpleasant people, or the government, do not want that to happen. For example people cooperating to build guns, and providing information on building guns. The government might well prefer that people do not have guns. Or a wealthy Chinese man wants to use the Chinese diaspora to move his assets to where the party cannot get at those assets. Or someone simply has money, and other people are hoping to scam him into giving them some, or give him a hard time so that he pays them to go away, or just hate him because he has money. Then he likely wants to pay for and operate the server publishing that information disconnected from his tax number, his real address, and a face that can be beaten in. So he wants the server identity and network address widely known, but he does not want the network address through which he makes payments for that server known to anyone except the people he is paying for the server.
In order to send a message directly to an identity, you are going to need its network address (which might be intermediated by a peer) but we don’t want to make the network address of every identity public. Sometimes, often, the network address associated with an identity needs to be a narrowly shared secret.
We can fix distributed denial of service attacks by inserting a proof of work demand in the syn-ack of the three way connection setup handshake (syn, syn-ack, ack). The work has to be supplied in the ack, before the server allocates any memory or performs any expensive asymmetric cryptography operations.
Most of the time a server is responding with a message that is intended for human attention. This is OK, because the client requesting the message is under human control, so the human initiates.
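A hashcash style sketch of how the server can demand work without allocating anything per client. Everything concrete here is an assumption made for illustration: SHA-256, the twelve bit difficulty, the sixty second window, and the trick of deriving the challenge from a server secret plus the client address and a timestamp so that the server can recompute it when the ack arrives instead of remembering it.

```python
import hashlib
import hmac
import os

SERVER_SECRET = os.urandom(32)   # hypothetical per-server secret, never sent
DIFFICULTY_BITS = 12             # assumed; a real server would tune this under load


def make_challenge(client_addr: str, timestamp: int) -> bytes:
    """Stateless challenge: recomputable from the secret, so nothing is stored."""
    msg = f"{client_addr}|{timestamp}".encode()
    return hmac.new(SERVER_SECRET, msg, hashlib.sha256).digest()


def leading_zero_bits(digest: bytes) -> int:
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def work_hash(challenge: bytes, nonce: int) -> bytes:
    return hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()


def solve(challenge: bytes, bits: int) -> int:
    """Client side: brute force a nonce, the expensive part."""
    nonce = 0
    while leading_zero_bits(work_hash(challenge, nonce)) < bits:
        nonce += 1
    return nonce


def verify(client_addr: str, timestamp: int, nonce: int, now: int,
           bits: int = DIFFICULTY_BITS, max_age: int = 60) -> bool:
    """Server side: two hashes, no table lookup, no per-client memory."""
    if not 0 <= now - timestamp <= max_age:
        return False
    challenge = make_challenge(client_addr, timestamp)
    return leading_zero_bits(work_hash(challenge, nonce)) >= bits


# syn arrives at time 1000; the syn-ack carries the challenge and timestamp
challenge = make_challenge("203.0.113.5:4000", 1000)
nonce = solve(challenge, DIFFICULTY_BITS)              # client does the work
print(verify("203.0.113.5:4000", 1000, nonce, now=1005))
```

The asymmetry is the point: the client performs a few thousand hashes on average, the server performs two.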
If the server could send web pages uninvited, that would be extremely bad.
So no one should be able to get a human readable message unless it is a response to something, such as another human readable message.
So how do we get things started?
Everything is message/response, which makes the stream nature of TCP a bad idea. And, being engineers, we are apt to break things down to messages, which presupposes we can send any message to anyone in isolation, but a message has to happen in a connection, and a connection may require a relationship.
Once we have a relationship, we can figure out a connection within this relationship, and once we have a connection in the context of this relationship (which may be many to many), then we can conceptualize messages as isolated units within the context of that connection, within the context of that relationship.
We want groups of people to be able to securely communicate, which the Jami UI does: the name of the room is a weakly durable shared secret – but the people have to use this secret to find each other, which means that their meeting point location has to be a publicly known network address, and their meeting point location then knows all of their network addresses. But a UI for this is not urgent. Rather, we want buyers to be able to find sellers, and buyers and sellers to acquire reputation.
For reviews, creating reputation, we are going to need usenet like distribution. For conversations regarding the transaction, email like distribution. The client logs in with the people he is transacting with to send and receive personal messages regarding the good or service, or he logs in with his message server from time to time to send or receive messages. This limits the number of people that can see the connection between his network address and his public key.
Implementation
signed
- Anyone can check that some data is signed by a key, and such data can be passed around in a pool, usenet style.
authenticated
- You got the data directly from an entity that has the key. You know it came from that key, but cannot prove it to anyone else.
access
- A key with authorization from another key does something.
authorization
- A key is given authorization by a key with authority.
authority
- A key with authority can give other keys authorization. Every key has unlimited authority to do whatever it wants on its own computers, and with its own reputation. It may grant other keys authorization to access certain services on its computers and to perform certain acts in the name of its reputation.
We do not want the key on the server to be the master key that owns the server name, because keys on servers are too easily stolen. So we want it to be a key granted authority by the key that owns the server name.
There are no end of cases that we will eventually need to handle where one key grants some authority to another key, so we need a general mechanism and general format for this, with the particular cases we are now implementing being particular cases of this general mechanism and general format.
We will have authority to respond to automatic and anonymous queries, analogous to hitting a web page, authority to receive crypto currency, authority to promise goods and services in return for crypto currency (which authorities will often belong to different keys), authority to receive messages intended for human consumption, and authority to authenticate messages from a human identity (which will typically belong to the same key).
And a data structure that associates a network address or rendezvous server with a key, which data structure may be widely distributed, or narrowly distributed.
A data structure corresponds to a record or structure of records in a database. We are talking about a way of synchronizing databases, and all databases start as a human readable human writable text file. So, we need a network database that has public information, and a network database that has private information.
So, we need a collection of data akin to
/etc/hosts
- public data, the broad consensus, agreed data known to the wider community.
~/.ssh/known_hosts
- privately known data about the community that cannot be widely shared because others might not trust it, and you might not trust others. You may want to share this with those you trust, and get it from those you trust, but your set of people that you trust is unlikely to agree with someone else’s and needs to be curated by a human.
~/.ssh/config
- And there is data you want to keep secret.
Public data has to rest on a foundation of idiosyncratic data, which has to rest on a foundation of secret data. If you do it the other way around, as with peer review, then public data gets manipulated for hostile purposes, as with certificate authorities and domain name service.
So we are going to start with idiosyncratic and narrowly shared contact information in private databases. The typical operation is to look up a petname (guaranteed locally unique), find the controlling key for that petname (probabilistically unique), and find a publicly accessible key (probabilistically unique) which has the petname key as its root master key.
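A minimal sketch of that private lookup, assuming an in-memory table. The class names, field names, and the placeholder subkey string are all hypothetical. The one property the sketch enforces is the one the text guarantees: petnames are unique locally, while keys are only probabilistically unique and so are not checked.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contact:
    master_key: str                 # controlling key for the petname
    working_key: str                # publicly accessible key rooted at master_key
    address: Optional[str] = None   # may be absent: not every identity is contactable


class PrivateDirectory:
    """Idiosyncratic, narrowly shared contact information."""

    def __init__(self) -> None:
        self._by_petname: dict[str, Contact] = {}

    def add(self, petname: str, contact: Contact) -> None:
        # Local uniqueness is something we can and must guarantee ourselves.
        if petname in self._by_petname:
            raise ValueError(f"petname {petname!r} already in use")
        self._by_petname[petname] = contact

    def resolve(self, petname: str) -> Contact:
        return self._by_petname[petname]


directory = PrivateDirectory()
directory.add("Bob", Contact("#4397439879483774378943798",
                             "#placeholder-subkey",
                             "198.51.100.7:4000"))
print(directory.resolve("Bob").working_key)
```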
Format for a key granting authority to a key
types
We have, as always everywhere, additive and multiplicative types, with no general and universal way of ascertaining the type of data. You generally know from context. After all, if you know where to find the data, you probably know what the type is.
Any universal type, or universal way of discovering the type, is apt to turn into a constraint, and people find, as with json, clever ways of working around that constraint, which break everything.
But situations do arise where one has to discover the type of an opaque record.
In general, the type of value represented by a hash may be unknown. A communication channel is an indefinitely long sequence of records of unknown additive type. A file is a large object of unknown type, that is apt to be part of a big pile of objects of miscellaneous type.
To open a connection, we need protocol negotiation.
Unix originally planned to have executable files identified in the directory field, but found it had far too many types of executable file, and started to supplement this with the file header.
zip files, tar files, and tar.gz files are in unix identified to the user by the file name, but to various utilities by the file header.
So, we have a type that identifies type, and we have a type that consists of such an identifier, followed by the object so identified - which is the equivalent of unix's file header.
We need a name for such a data structure. It is not a datum, it is a way of making data out of an unending string of bytes, it is a way of identifying the data that bytes represent. It is a blob (Binary Large OBject) beginning with a schema identifier. It is a record with a schema id. It is a typed object. Having trouble naming it.
OK, we will call a schema identifier, followed by the fields it identifies, a schere (SCHema REcord). It is a schema attached to the record whose fields are implied by the schema. It can contain strings, arbitrary precision integers, hashes, elliptic points, elliptic scalars, and other scheres. A collection of scheres can only be parsed from left to right, because if we do not know what to expect, we cannot know where one ends and the next begins.
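A sketch of why scheres parse only left to right. The wire details are assumptions chosen for illustration, not the format: arbitrary precision integers are written as base 128 varints, strings as a varint byte count followed by utf8, and the two schema ids in `SCHEMAS` are made up.

```python
def encode_varint(n: int) -> bytes:
    """Arbitrary precision unsigned integer, seven bits per byte, low bits first."""
    if n < 0:
        raise ValueError("unsigned only")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Hypothetical schema table: schema id -> the field kinds that follow it.
SCHEMAS = {1: ("int", "string"), 2: ("string",)}


def encode_schere(schema_id: int, fields: list) -> bytes:
    out = bytearray(encode_varint(schema_id))
    for kind, value in zip(SCHEMAS[schema_id], fields):
        if kind == "int":
            out += encode_varint(value)
        else:
            raw = value.encode("utf-8")
            out += encode_varint(len(raw)) + raw
    return bytes(out)


def decode_schere(buf: bytes, pos: int = 0) -> tuple[int, list, int]:
    """Only the schema id tells us how many bytes this schere occupies."""
    schema_id, pos = decode_varint(buf, pos)
    fields = []
    for kind in SCHEMAS[schema_id]:
        if kind == "int":
            value, pos = decode_varint(buf, pos)
        else:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        fields.append(value)
    return schema_id, fields, pos


def decode_stream(buf: bytes) -> list[tuple[int, list]]:
    pos, out = 0, []
    while pos < len(buf):
        schema_id, fields, pos = decode_schere(buf, pos)
        out.append((schema_id, fields))
    return out


stream = encode_schere(1, [300, "Bob"]) + encode_schere(2, ["Receivables"])
print(decode_stream(stream))
```

There is no delimiter between the two scheres in `stream`. The second one can be found only by finishing the first, which is the left to right constraint in action.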
Well, that is OK for objects at rest, but protocol negotiation is a different animal.
We may well in future want to implement an extensive capability system, so we have to have a data format that gives room for future extensions to do all sorts of unforeseeable things – for example one key might want to issue a transferable time limited right to access a particular file to another key.
A byte stream preceded by a schema identifier that defines what those bytes mean is not a byte string. It is a string of integers, scalars, points, hashes, strings, and whatnot. The schema identifier tells us what fields follow, thereby enabling us to parse them out of an otherwise endless stream of otherwise meaningless data.
When one is dealing with scheres in C++ one is probably going to have a pointer that could point to a variety of different structs (an additive type that points to a variety of multiplicative types), which you would represent in C++ as a std::variant type.
Which implies that the party writing and the party reading have to have agreement on the mapping between schemas and the integers identifying schemas. When two people share data, the protocol they agree upon implies a complete set of integer identifiers of a set of schemas, which means we will frequently wind up issuing new protocols, and wind up with a fair bit of protocol negotiation.
Suppose we have a file just sitting by itself, not part of a connection. Well, it just has to start with the schema id that identifies the protocol it would have if it was part of a connection – so a protocol id is itself a schema id, which like any schema id gives the mapping of the schema ids whose scheres it contains.
Finding the network address of a master key, and the authority of the subkey that represents that master key at that network address
Suppose someone has a high value reputation, and he wants anyone anywhere to be able to connect to his server, and obtain an authenticated connection, a connection that the person connecting knows is controlled by the entity that has this reputation.
He has the master secret corresponding to the public key connected to this reputation, but because it is a high value secret, it is not on any computer anywhere. It is written in the margin of a page in a bible kept on his bookshelf, so cannot be used to authenticate the connection.
His secret master key is used to sign a public subkey, which resides on a little used computer seldom if ever connected to the network, which signs another public key on a computer generally connected to the internet, which signs the public keys on his servers. When a client computer connects to one of his servers, and asks for a connection authenticated with this reputation, the client gets a connection authenticated by a key signed by a key signed by a key signed by the key whose secret is no longer in any computer, but is in the margin of a bible on someone’s bookshelf.
If the client computer has no copy of a certificate signed by this master key, it cannot connect because it does not know the network address. If it has a copy that testifies to the network address, but the certificate is timed out or nonexistent, it asks for a connection authenticated by the master key, and then, if it does not get a connection authenticated by the master key, gets a certificate authenticating a key communicated over the initial encrypted but unauthenticated connection, and forms a connection authenticated by the certified key. If it does not get either a certificate and authentication by the certified key or authentication by the master key, the connection fails, and it goes looking for a certificate.
But how does the entity that signs its network address know what its network address is?
If the master contacts it over the internet, no problem. The master knows the network address at which he contacted it and has authority to tell it – but that might be a merely local network address. How does a random computer find its network address?
The master on his client that he controls probably knows the network address of the server that he also controls, that being how he likely controls it, but this is not guaranteed to also be the situation.
If an entity wants to sign a network address so that others can contact it, it is publishing its network address into a pool of network addresses, so it already has other entities it can talk to and ask “What is my IP?” So it tries some entities at random, asks for its IP over the authenticated connection, and asks them to open a connection on its port number to that IP.
Its prior is a high probability that they will all give the same answer and few if any give a different answer, and a prior that all of them will fail to call back or most of them will call back. It updates its priors after each call, until it has a high probability that the majority agree on its IP, and it is definitely contactable at this port, or uncontactable at this port.
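A deliberately simplified stand in for that updating: rather than a real Bayesian posterior, this sketch keeps a tally and stops once enough peers have answered and a large enough share agree. The thresholds, the `ask` callback, and its return shape are assumptions for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Tuple

Reply = Optional[Tuple[str, bool]]   # (address the peer saw, did it call back) or no answer


def discover_address(peers: Iterable, ask: Callable[[object], Reply],
                     min_reports: int = 5,
                     agreement: float = 0.75) -> Optional[Tuple[str, bool]]:
    """Poll peers one at a time until we are fairly sure of our own address.

    Returns (address, reachable), or None if the peers never converge.
    """
    tally: Counter = Counter()
    answered = 0
    callbacks = 0
    for peer in peers:
        reply = ask(peer)
        if reply is None:            # peer did not answer; try the next one
            continue
        address, called_back = reply
        answered += 1
        callbacks += called_back
        tally[address] += 1
        top_address, top_count = tally.most_common(1)[0]
        if answered >= min_reports and top_count / answered >= agreement:
            reachable = callbacks / answered > 0.5
            return top_address, reachable
    return None


def fake_ask(peer: int) -> Reply:
    """Simulated peers: one is mistaken or lying, the rest agree and call back."""
    if peer == 1:
        return ("10.0.0.1:4000", False)
    return ("198.51.100.7:4000", True)


print(discover_address(range(10), fake_ask))
```

A fuller version would weight peers by how much they are trusted and would treat "everyone fails to call back" as strong evidence of being behind a NAT or firewall rather than as mere disagreement.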
Finally, the format
After long, long, long, discussion on the requirements for a format and the meaning of the format:
A signature consists of a schema id for signatures, followed by an arbitrary schere, followed by the public key, and the two elliptic scalars that form the Schnorr signature. The null schere, whose schema id is zero, is permissible, in which case the signature is a proof that the secret corresponding to this public key is known, which matters if it is used in a multisignature.
We also have schema ids for multisignatures – one for signature with two keys with two distinct roles, and one for a variable number of keys with symmetric and equivalent roles.
When a schere is signed, it is public and goes into public shared data. When it is merely communicated over an authenticated channel, the recipient handles it as if signed by the sender, but cannot prove to anyone else it originated from the sender, so stores the same data in its private database, rather than its public database.
Public information on network addresses and key authorities will be signed, and that it is signed implies it is public and should be widely distributed. But often we do not want this information distributed, in which case these data structures should be authenticated but not signed, in which case it gets stored in the wallet and is not routinely and automatically distributed.
We will by default couple information on network addresses to information on key authorization, so that we do not run into the DNS fake network address problem (lack of certification) nor the CA key repudiation problem (failure to get up to date certificates).
But if authorization is distributed with network addresses, rather than being provided by the authorized key on request, then the authorization has to be signed by both the key authorized, and the key doing the authorizing. We don’t want random scammers to be able to claim to be the real power behind the power, neither do we want captured websites hanging on to authorizations that have been obsoleted because of capture.
A network address will be signed by the key whose network address it is, not by the key granting it authority. A key needs no authorization to tell us where it can be contacted, and any key giving us the network address of some other key would need evidence of authority to do so, which would get us deep in the weeds.
However, because the final link in the chain is jointly signed, it may contain the network address as well. Or we may have two scheres, one signed by the last vertex in the chain giving authorization, and one signed only by the final leaf giving the network address. The former would be best for stable network addresses, the latter best for unstable network addresses. Both should be permissible.
When the network address changes, the key probably changes, and vice versa, so they should usually and normally be distributed together. Or maybe no contact information at all, implying that the owner of the master key only wants to be contacted by people who have received contact information by another, less widely shared channel. Maybe, as with product or supplier reviews, he wants everyone to be able to check his signature on data widely redistributed by someone else, but does not want everyone who reads that widely redistributed data to be able to send him messages.
OK, the arbitrary signed schere in the case we are here discussing is an authentication schere. Which is only meaningful as part of a signature, so will only ever be hashed as part of a full authentication, so has no hash rule as an isolated schere, so we can re-use schema ids, authentication data being useless without being contained within a signature.
The authentication schema consists of the key that is being granted authority, an arbitrary precision integer that represents a set of flags, an authentication date as a multiple of 256 seconds, and an end date to encourage people to reauthenticate every now and then. The authentication date signifies that the authentication supersedes all earlier authentications, and also that the authentication does not take effect until the indicated date. (If you have a bunch of servers with a bunch of keys all representing one master key, you have to re-authenticate all of them with the same start date.) An end date earlier than the start date is invalid, shall be rejected, and have no effect, except that an end date of zero indicates the authentication remains valid till superseded, that is, till the client sees a certificate with a later start date. A certificate with an end date some astronomical time into the future may be rejected, or may be silently discarded and have no effect. Use an end date of zero to represent “never expires”. A certificate with a start date some unreasonable time into the future will not spread through the network till its start date draws nigh.
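The date rules above can be sketched as follows. Assumptions made for illustration: dates count 256 second ticks from the unix epoch, "astronomical" means more than roughly a hundred years ahead, the end date is inclusive, and all names are hypothetical.

```python
from dataclasses import dataclass

TICK_SECONDS = 256
# Assumed cut-off for an "astronomical" end date: about one hundred years.
MAX_END_HORIZON_TICKS = 100 * 365 * 24 * 3600 // TICK_SECONDS


@dataclass(frozen=True)
class Authentication:
    key: str     # the key being granted authority
    flags: int   # set of authority flags
    start: int   # authentication date, in ticks
    end: int     # end date in ticks; zero means "never expires"


def to_ticks(unix_seconds: int) -> int:
    return unix_seconds // TICK_SECONDS


def is_well_formed(cert: Authentication, now: int) -> bool:
    if cert.end == 0:                 # valid till superseded
        return True
    if cert.end < cert.start:         # invalid, shall be rejected
        return False
    return cert.end <= now + MAX_END_HORIZON_TICKS


def is_in_effect(cert: Authentication, now: int) -> bool:
    if not is_well_formed(cert, now):
        return False
    if now < cert.start:              # does not take effect until its start date
        return False
    return cert.end == 0 or now <= cert.end


def supersedes(new: Authentication, old: Authentication) -> bool:
    """A later start date supersedes all earlier authentications."""
    return new.start > old.start


cert = Authentication("subkey", flags=0b11, start=100, end=0)
print(is_in_effect(cert, now=150), is_in_effect(cert, now=50))
```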
A key that has been authenticated has authority to grant another key the same authority, or a subset of the flags that have been set for it, with an end date less than or equal to its own end date, and a start date greater than or equal to its own start date.
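The delegation rule is a subset check on flags plus an interval check on dates. In this sketch the treatment of an end date of zero is an assumption: "never expires" is taken to be later than any finite end date, so a parent with a finite end date cannot hand out a grant that never expires. The names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    flags: int   # authority flags as a bit mask
    start: int   # start date
    end: int     # end date; zero means "never expires"


def may_delegate(parent: Grant, child: Grant) -> bool:
    """True if a key holding `parent` is entitled to issue `child`."""
    flags_ok = child.flags & ~parent.flags == 0      # no flag the parent lacks
    start_ok = child.start >= parent.start
    if parent.end == 0:
        end_ok = True                                # parent never expires
    else:
        end_ok = child.end != 0 and child.end <= parent.end
    return flags_ok and start_ok and end_ok


def chain_valid(chain: list[Grant]) -> bool:
    """Every link must stay within the authority of the link before it."""
    return all(may_delegate(a, b) for a, b in zip(chain, chain[1:]))


root_grant = Grant(flags=0b0110, start=100, end=200)
print(may_delegate(root_grant, Grant(flags=0b0100, start=120, end=180)))
print(may_delegate(root_grant, Grant(flags=0b1000, start=120, end=180)))
```

Signature checking is left out entirely. This is only the bookkeeping that says whether a grant, once its signature verifies, was within the grantor's power to make.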
- Bit revokes all other keys. It signifies that this chain of authorizations invalidates all previous chains of authorizations from the same root, where one chain is previous to the other if the start times in its chain, considered as a Dewey number, are earlier than those of the other. A chain that is identical except for the times gets invalidated anyway, but this bit is a key revocation bit – when you don’t want some other key trusted any more. Most of the time there will be one and only one valid key chain for one root, and most of the time this bit will be set.
- Bit indicates that this key may be used to authenticate a connection as under the control of the entity at the root of the chain. This subkey can do anything the master key can do, except extend its timeout, or create subsubkeys with a timeout beyond its own (the equivalent of a gpg subkey: it can sign, authenticate, accept payment, whatever).
- Bit is the signing bit. It indicates that this key can sign data on behalf of this master key, sign as the entity at the root of the chain. One typically signs data that will be delivered to the recipient through an untrusted intermediary, as for example, downloading a rhocoin wallet, or a peer making an assertion about the most recent root or block of the blockchain.
- Bit indicates that this key may be used to make an offer in the identity of the entity at the root of the chain.
- Bit indicates that this key may be used to accept crypto currency as the entity at the root of the chain. An offer is likely to be made by the contactable key at the leaf of the chain which has no authority to accept payment, which requests payment to an uncontactable key closer to the root of the chain which does have authority to accept payment. A payment request identifies the rhocoin public keys it wants by an elliptic scalar, and the public key of the rhocoin is the accepting key multiplied by that scalar. The payer therefore has proof he paid that entity and is owed something, even if the money goes astray.
- Bit indicates that more authorities follow, in the form of an ordered sequence of arbitrary precision integers, terminated by zero, thus enabling people to roll their own authorities, ad hoc.
For authorities whose number is less than 64, the bitstring representation and the list of integers representation are equivalent - we may provide a long bit mask, or a zero terminated list of integers. Some implementations may refuse to accept bitstrings longer than 64 bits, generating a bad data exception.
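A sketch of the equivalence between the two representations. The numbering is an assumption the text does not settle: authorities are numbered from 1 so that zero is free to act as the terminator, authority n occupies bit n-1 of the mask, and "ordered" is taken to mean strictly ascending.

```python
MAX_MASK_BITS = 64   # implementations may refuse anything longer


def mask_to_list(mask: int) -> list[int]:
    """Bit mask -> ascending authority numbers, followed by the zero terminator."""
    if mask < 0:
        raise ValueError("bad data: negative mask")
    if mask.bit_length() > MAX_MASK_BITS:
        raise ValueError("bad data: bitstring longer than 64 bits")
    numbers = [bit + 1 for bit in range(mask.bit_length()) if mask >> bit & 1]
    return numbers + [0]


def list_to_mask(numbers: list[int]) -> int:
    """Zero terminated, strictly ascending authority numbers -> bit mask."""
    mask = 0
    previous = 0
    for n in numbers:
        if n == 0:                       # terminator: anything after it is not ours
            return mask
        if n <= previous:
            raise ValueError("bad data: authorities out of order")
        mask |= 1 << (n - 1)
        previous = n
    raise ValueError("bad data: missing zero terminator")


print(mask_to_list(0b100101))
print(bin(list_to_mask([1, 3, 6, 0])))
```

The list form has no upper limit, so `list_to_mask([70, 0])` succeeds while `mask_to_list` refuses the resulting mask. That mirrors the text: small authority numbers round trip, and large ones are where ad hoc authorities live.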
We will have a variety of contact information schemas. The contact information needs to be signed by the key to be contacted, but that turns out to have surprisingly messy logistics in getting things rolling. When you are setting up a server, you want to both grant authority to its secret key (which is what “Let’s Encrypt” does), and publish its network address, which is what the DNS does. With “Let’s Encrypt” you publish the network address insecurely, then “Let’s Encrypt” insecurely finds a host key claiming the name at that network address. Which system works only because of the good behavior of the centralized authority authorizing domain name service.
The entity that has the master secret somehow controls both machines. That is, after all, what we want to prove to third parties. To generate the necessary certificates, the machine without the master secret has to have a connection to the machine that has the master secret. One initiates a connection to the other, then they generate the necessary certificates. The master at that point learns the slave public key and proves to himself that the network address works. But the machine with the key does not necessarily know its own network address. The party authorizing the key of the machine being authorized does know it, and the two machines can trust each other because they are under the control of the same party.
Whenever two parties communicate, they can verify the external network addresses associated with the other’s key, but not the external network address associated with their own key. And if we are talking authorization, we have a trust relationship that can be used to prove the network address to the key at that network address. To avoid the key revocation problem, it is easier and safer if network address information is distributed with authorization information.
The key actually used is authenticated by the master key through a chain of certificates, which are normally gathered together in yet another higher level schere, which to be valid must be valid at every link in the chain. This higher level schere contains all the signatures, plus network address information, but the individual links in the chain, and the signed network address, are independently valid, and do not have to be actually contiguous and in order in a chain to be useful. The higher level schere is useful merely because it is sometimes convenient to pack related data as one big ordered bundle, a pile of facts only useful because they prove one fact, the beginning and end of the chain.
If someone is relying on a chain of authorities, any key in that chain can sign a network address for itself or any descendants in the chain. But this requires yet another schema, which should combine a grant of authority with a network address.