---
title: Review of Cryptographic libraries
...

# Noise Protocol Framework

The Noise Protocol Framework matters because it is used by Wireguard to do something related to what we intend to accomplish.

Noise is an existing messaging protocol, implemented in
Wireguard as a UDP only protocol.

My fundamental objective is to secure the social net, particularly the
social net where the money is, the value of most corporations being
the network of customer relationships, employee relationships,
supplier relationships, and employee roles.

This requires that instead of packets being routed to network
addresses identified by certificate authority names and the domain
name system, they are routed to public keys, each corresponding to a
private key derived from the master secret of a wallet.

## Wireguard Noise

Wireguard maps network addresses to public keys, and then to the
possessor of the secret key corresponding to that public key. We
need a system that maps names to public keys, and then packets to
the possessor of the secret key, so that you can connect to a service
on some port of some computer, which you locate by its public key.

Existing software looks up a name, finds a thirty-two bit or one
hundred twenty-eight bit value, and then connects. We need that name
to map through software that we control to a durable and attested
public key. For random strangers not listed in the conf file, that
public key is then locally, arbitrarily and temporarily mapped into
Wireguard subnets. This mapping is actually a local and temporary
handle to that public key, which is then mapped back to the public
key, which is then mapped to the network address of the actual owner
of that secret key by software that we control. So software that we
do not control thinks it is using network addresses, but is actually
using local handles to public keys. These handles are mapped to
network addresses supported by our virtual network card, which sends
the packets off, encapsulated in Wireguard style packets identified by
the public key of their destination, to a host in the cloud identified
by its actual network address. That host then routes them by public
key, either to a particular local port on that host itself, or to
another host, which then routes them, eventually, by public key to a
particular port.

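The chain of mappings can be sketched as below. This is a minimal illustration, not Wireguard code: the class `HandleTable`, the subnet `10.99.0.0/24`, the name `alice.example`, and the two lookup dictionaries are all hypothetical stand-ins for the name system and routing layer that we would control.

```python
import ipaddress


class HandleTable:
    """Maps public keys to stable local addresses (handles) in a private subnet."""

    def __init__(self, subnet: str):
        # Iterator over the usable host addresses of the subnet.
        self._free = iter(ipaddress.ip_network(subnet).hosts())
        self._by_key = {}     # public key -> local handle address
        self._by_handle = {}  # local handle address -> public key

    def handle_for(self, public_key: bytes):
        """Return the handle for a key, allocating one on first use."""
        if public_key not in self._by_key:
            addr = next(self._free)  # StopIteration once the subnet is exhausted
            self._by_key[public_key] = addr
            self._by_handle[addr] = public_key
        return self._by_key[public_key]

    def key_for(self, handle):
        """Map a handle back to the public key it stands for."""
        return self._by_handle[ipaddress.ip_address(handle)]


# Hypothetical stand-ins for the name system and the routing layer.
NAME_TO_KEY = {"alice.example": b"\x01" * 32}
KEY_TO_ENDPOINT = {b"\x01" * 32: ("203.0.113.7", 51820)}


def connect_address(name: str, table: HandleTable):
    """What legacy software sees: a name resolves to an ordinary looking address."""
    return table.handle_for(NAME_TO_KEY[name])


def route(handle, table: HandleTable):
    """What the virtual network card does: handle -> public key -> real endpoint."""
    key = table.key_for(handle)
    return key, KEY_TO_ENDPOINT[key]
```

The handle depends only on the public key, never on a port, so repeated lookups of the same name always give legacy software the same local address.
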
For random strangers on the internet, we have to in effect NAT
them into our Wireguard subnets, and we don't want them able to
connect to arbitrary ports, so we in effect give them NAT type port forwarding.

It will frequently be convenient to have only one port forwarded
address per public key, in which case our Wireguard fork needs to
accept several public keys, one for each service.

The legacy software process running on the client initiates a
connection to a name and a port, from a random client port. The
legacy server process receives it on the whitelisted port, ignoring
the port requested, if only one incoming port is whitelisted for
this key, or on the requested whitelisted port if more than one port
is whitelisted. It replies to the original client port, which was
encapsulated, with the port being replied to encapsulated in the
message secured and identified by public key, and the receiving
networking software on the client has temporarily whitelisted that
client port for messages coming from that server key. Such
"temporary" whitelisting should last for a very long time, since we
might have quiet but very long lived connections. We do not want
random people on the internet messaging us, but we do want the
services that we have messaged to be able to message us back at
random times.

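The two whitelisting rules above can be sketched as follows. The function and class names are hypothetical, and the thirty day default lifetime is only an assumption standing in for "a very long time".

```python
import time


def deliver_port(whitelisted_ports, requested_port):
    """Pick the local port an inbound packet for one service key is delivered to.

    Returns None when the packet should be dropped.
    """
    if len(whitelisted_ports) == 1:
        # Only one port is whitelisted for this key: ignore the requested port.
        return next(iter(whitelisted_ports))
    if requested_port in whitelisted_ports:
        return requested_port
    return None


class ReplyWhitelist:
    """Client side: which (server key, client port) pairs may receive messages."""

    def __init__(self, lifetime_seconds=30 * 24 * 3600):  # assumed: thirty days
        self.lifetime = lifetime_seconds
        self._expiry = {}  # (server_key, client_port) -> expiry time

    def note_outbound(self, server_key, client_port, now=None):
        """Record that we messaged this server key from this client port."""
        now = time.time() if now is None else now
        self._expiry[(server_key, client_port)] = now + self.lifetime

    def allows(self, server_key, client_port, now=None):
        """True if this server key may still message this client port."""
        now = time.time() if now is None else now
        return self._expiry.get((server_key, client_port), 0) > now
```

A server key that we never messaged is simply absent from the table, so its packets are refused without any per-stranger state.
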
One confusing problem is that stable ports are used to identify a
particular service, and random ports a particular connection, and we
have to disentangle this relationship and distinguish connection
identifiers from service identifiers. We would like public keys to
identify services rather than hosts, but sometimes they will not.

Whitelists and history help us disentangle them when connecting to
legacy software, and, within the protocol, they need to be
distinguished even though they will be lumped back together when
talking to legacy software. Internally, we need to distinguish
between connections and services. A service is not a connection.

Note that the new Google https (QUIC) allows many short lived streams,
hence many connections, identified by a single server service port
and a single random client port, which ordinarily would identify a
single connection. A connection corresponds to a single concurrent
process within client software, and a single concurrent process within
server software, and many messages may pass back and forth between
these two processes and are handled sequentially by those
processes, which have retrospective agreement about their total shared state.

So we have four very different kinds of things, which old type ports
mangle together:

1.  a service, which is always available as long as the host is up
    and the internet is working, which might have no activity for
    a very long time, or might have thousands of simultaneous
    connections to computers from all over the internet
1.  a connection, which might live while inactive for a very long time,
    or might have many concurrent streams active simultaneously
1.  a stream, which has a single concurrent process attached to it
    at both ends, and typically lives only to send a message and
    receive a reply. A stream may pass many messages back and
    forth, which both ends process sequentially. If a stream is
    inactive for longer than a quite short period, it is likely to be
    ungracefully terminated. Normally, it does something, and
    then ends gracefully, and the next stream and the next
    concurrent process starts when there is something to do.
    While a stream lives, both ends maintain state, albeit in a
    request reply, the state lives only briefly.
1.  a message.

Representing all this as a single kind of port, and packets going
between ports of a single kind, inherently leads to the mess that we
now have. They should have been thought of as different classes
derived from a common base class.

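One way to picture the four kinds of thing as classes derived from a common base is sketched below. The base class name `Channel` and every field and method are hypothetical, chosen only to show the containment: a service has connections, a connection has streams, a stream has messages.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Channel:
    """Hypothetical common base: what an old style port number tried to name."""
    ident: int


@dataclass
class Message(Channel):
    payload: bytes = b""


@dataclass
class Stream(Channel):
    """Short lived; both ends process its messages sequentially."""
    messages: List[Message] = field(default_factory=list)

    def send(self, payload: bytes) -> Message:
        msg = Message(ident=len(self.messages), payload=payload)
        self.messages.append(msg)
        return msg


@dataclass
class Connection(Channel):
    """Long lived, possibly idle; carries any number of concurrent streams."""
    streams: Dict[int, Stream] = field(default_factory=dict)

    def open_stream(self) -> Stream:
        stream = Stream(ident=len(self.streams))
        self.streams[stream.ident] = stream
        return stream


@dataclass
class Service(Channel):
    """Always available; may have zero or thousands of connections."""
    connections: Dict[int, Connection] = field(default_factory=dict)

    def accept(self) -> Connection:
        conn = Connection(ident=len(self.connections))
        self.connections[conn.ident] = conn
        return conn
```

With this split, a legacy port number only ever appears at the edge, where a `Service` or `Connection` identifier is translated for software that knows nothing else.
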
[Endpoint-Independent Mapping]: https://datatracker.ietf.org/doc/html/rfc4787 {target="_blank"}

Existing software is designed to work with the explicit whitelisting
provided by port forwarding through NATs with [Endpoint-Independent Mapping],
and the implicit (but inconveniently transient) whitelisting provided
by NAT translation, so we make it look like that to legacy software.
To legacy client software, it is as if it is sending its packets
through a NAT, and to legacy server software, it is as if it is
sending its packets through a NAT with port forwarding. But
we make the mapping extremely long lived, since we can rely on
stable identities and have no shortage of them. And we also want
the port mappings (actually internal port whitelistings, they would
be mappings if this was actual NAT) associated with each such
mapping to be extremely stable and long lived.

[Endpoint-Independent Mapping] means that the NAT reuses the
address and port mapping for subsequent packets sent from the
same internal IP address and port (X:x) to any external IP address
and port (Y:y). X1':x1' equals X2':x2' for all values of Y2:y2, which
our architecture inherently tends to force unless we do something
excessively clever, since we should not muck with ports randomly chosen.

For us, [Endpoint-Independent Mapping] means that the mapping between
external public keys of random strangers not listed in our
configuration files, and the internal ranges of the Wireguard fork
interface is stable, very long lived and *independent of port numbers*.

## Noise architecture

[Noise](https://noiseprotocol.org/) is an architecture and a design document, not source code.
Example source code exists for it, though the
[C example](https://github.com/rweather/noise-c) uses a build architecture that
may not fit with what I want, and uses protobuf, enemy software. It
is also designed to use several different implementations of the
core crypto protocols, one of them being libsodium, while I want a
pure libsodium only version. It might be easier to implement my
own version, using the existing versions as a guide, in particular and
especially Wireguard's version, since it is in wide use. I will
probably have to walk through the existing version.

Noise is built around the ingenious central concept of using as the
nonce the hash of past shared and acknowledged data, which is
AEAD secured but sent in the clear. This saves significant space
on very short messages, since you have to secure shared state
anyway. It regularly and routinely renegotiates keys, and thus has no $2^{64}$
limit on messages. A 128 bit hash sample suffices for the nonce,
since the nonce of the next message will reflect the 256 bit hash of
the previous message, hence contriving a hash that has the same
nonce does the adversary no good. It is merely a denial of service.

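A minimal sketch of the running hash idea as described above, assuming SHA-256 as the hash. The class name `Transcript` and the way the initial value is derived from a label are assumptions for illustration. The chaining step has the same shape as the MixHash operation in the Noise specification, which supplies the running hash to the AEAD as associated data; sampling 128 bits of it as the nonce follows the description in this section.

```python
import hashlib


class Transcript:
    """Running hash over everything both sides have sent and acknowledged."""

    def __init__(self, protocol_label: bytes):
        # Assumption: start from the hash of a label both sides already agree on.
        self.h = hashlib.sha256(protocol_label).digest()

    def mix(self, data: bytes) -> None:
        """h = SHA-256(h || data), so h commits to everything mixed so far."""
        self.h = hashlib.sha256(self.h + data).digest()

    def nonce(self) -> bytes:
        """A 128 bit sample of the 256 bit running hash."""
        return self.h[:16]
```

Two ends that have mixed the same data in the same order derive the same nonce without sending it. Any divergence changes the full 256 bit hash, and therefore every later nonce.
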
I initially thought that this meant it had to be built on top of a
reliable messaging protocol, and it tends to be described as if it did,
but Wireguard uses a bunch of designs and libraries in its protocol,
with Noise pulling most of them together, and I need to copy,
rather than re-invent, their work.

On the face of it, Wireguard does not help with what I want to do.
But I am discovering a whole lot of low level stuff related to
maintaining a connection, and Wireguard incorporates that low level stuff.

Noise goes underneath, and should be integrated with, reliable
messaging. It has a built in message size limit of just under
$2^{16}$ bytes (65535). It is not just an algorithm, but very specific code.

Noise is messaging code, here now, and present in Wireguard
as a UDP only cryptographic protocol. I need to implement my
messaging system as a fork of Wireguard.

Wireguard uses base64, and my bright idea of slash6 gets in the
way. I am going to use base52 for any purposes for which my bright
idea would have been useful, so it should be rewritten to base64 regardless.

Using the hash of shared state goes together with immutable
append only Merkle‑patricia trees like ham and eggs, though you
don't need to keep the potentially enormous data structure around.
When a connection has no activity for a little while, you can discard
everything except a very small amount of data, primarily the keys,
the hash, the block number, the MTU, and the expected timings.

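The small amount of state worth keeping for an idle connection might look like the record below. This is only a sketch: the field names, the choice of one key per direction, and a single round trip figure for "expected timings" are all assumptions.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DormantConnection:
    """Everything retained once a connection has gone quiet."""
    send_key: bytes
    receive_key: bytes
    state_hash: bytes          # hash of all shared state so far
    block_number: int
    mtu: int
    round_trip_seconds: float  # assumed form of the expected timings


def park(live: dict) -> DormantConnection:
    """Copy only the resumable summary out of a live connection's state.

    Anything else in `live` (buffers, tree nodes, history) is left for
    the caller to discard.
    """
    return DormantConnection(**{f.name: live[f.name] for f in fields(DormantConnection)})
```

Because the hash already commits to the whole history, resuming needs nothing more than this record.
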
The Noise system for hashing all past data is complicated and ad
hoc. For greater generality and more systematic structure, for a
simpler fundamental structure with fewer arbitrary decisions about
particular types of data, it needs to be rewritten as hashing like an
immutable append only Patricia Merkle tree. This instantly and
totally breaks interoperability with existing Wireguard, so to talk
to the original Wireguard, the fork has to know what it is talking to.
Presumably Wireguard has a protocol negotiation mechanism that
you can hook. If it does not, well, it breaks, and the nature of the
thing that a public key addresses has to be flagged anyway, since I
am using Ristretto public keys, and they are not. Also, I have to move
Wireguard from NaCl encryption to Libsodium encryption, because
NaCl is an attack vector.

Wireguard messages are distinguishable on the wire, which is odd,
because Noise messages are inherently white noise, and destination
keys are known in advance. Looks like enemy action by the bad guys at NaCl.

I think we need a fork that, if a key is a legacy key type, talks
legacy Wireguard, and if it is a new type (probably coming from our
domain name system, though it can also be placed in `.conf` files),
talks with packets indistinguishable from white noise to an adversary
that does not know the key.

Old type session initiation messages are distinguishable from
random noise. For new type session initiation messages to a server
with an old type id and a new type id on the same port, make sure
that the new type session initiation packet does not match the old
type format, which may require both ends to try a variety of guesses
if their expectations are violated. This opens a DOS attack, but that
is OK. You just shut down that connection. DOS resistance is going to
require messages readily distinguishable from random noise, but we
don't send those messages unless facing workloads suggestive of DOS,
that is, unless under heavy session initiation load.

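To make "does not match" concrete: a Wireguard handshake initiation is 148 bytes long and begins with a cleartext type byte of 1 followed by three reserved zero bytes. The sketch below checks for that header and re-rolls a new type packet that happens to collide with it. The function names are hypothetical, and `make_packet` stands for whatever would produce a new type initiation that looks uniformly random.

```python
WG_INITIATION_LEN = 148                  # length of a Wireguard handshake initiation
WG_INITIATION_HEADER = b"\x01\x00\x00\x00"  # type 1, then three reserved zero bytes


def looks_like_legacy_initiation(packet: bytes) -> bool:
    """True if the packet has the cleartext shape of a legacy initiation."""
    return len(packet) == WG_INITIATION_LEN and packet[:4] == WG_INITIATION_HEADER


def new_type_initiation(make_packet):
    """Call make_packet() until the result cannot be mistaken for a legacy one.

    make_packet is a hypothetical callable returning the bytes of a new
    type initiation, for example built with a fresh ephemeral key each call.
    """
    while True:
        packet = make_packet()
        if not looks_like_legacy_initiation(packet):
            return packet
```

A uniformly random 148 byte packet collides with the legacy header with probability $2^{-32}$, so the loop almost never runs twice.
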
Ristretto keys are uncommon, and are recognizable as Ristretto
keys, but not if they are sent in unreduced form.

Build on top of a fork of Wireguard a messaging system that delivers
messages not to network addresses, but to Zooko names (which
might well map to a particular port on a particular host, but whose
network address and port may change without people noticing or caring).

Noise is a messaging protocol. Wireguard is a messaging protocol
built on top of it that relies on public keys for routing messages.
Most of the work is done. It is not what I want built, but it has an
enormous amount of commonality. I plan a very different
architecture, but that is a re-arrangement of existing structures
already done. I am going to want Kademlia and a blockchain for the
routing, rather than a pile of local text files mapping IPs to nameless
public keys. Wireguard is built on `.conf` text files the way the
Domain Name System was built on `hosts` files. It almost does the job,
but needs a Kademlia based domain name system on top and integrated with it.

# [Libsodium](./building_and_using_libraries.html#instructions-for-libsodium)

# I2P

The [Invisible Internet Project](https://geti2p.net/en/about/intro) does a great deal of the chat capability that you want. You need to interface with their stuff, rather than duplicate it. In particular, your wallet identifiers need to be I2P identifiers, or have corresponding I2P identifiers, and your anonymized transactions should use the I2P network.

They have a substitute for UDP, and a substitute for TCP, and your anonymized transactions are going to use that.

# Amber

[Amber](https://github.com/bernedogit/amber)

Not as fast and efficient as libsodium, and further from Bernstein. Supports base 58, but [base58check](https://en.bitcoin.it/wiki/Base58Check_encoding#Base58_symbol_chart) is specifically bitcoin protocol, supporting run time typed checksummed cryptographically strong values. Note that any value you are displaying in base 58 form might as well be bitstreamed, for the nearest match between base 58 and base two is that 58^7^ is only very slightly larger than 2^41^, so you might as well use your prefix free encoding for the prefix.

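The arithmetic behind that remark: 58^7^ = 2,207,984,167,552 and 2^41^ = 2,199,023,255,552, so seven base 58 characters hold a 41 bit value with about 0.4% of the code space left over. A sketch of bitstreaming in 41 bit chunks, using the Bitcoin base 58 alphabet; the function names are hypothetical.

```python
# Bitcoin base 58 alphabet: no 0, O, I or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def encode_chunk(value: int) -> str:
    """Encode one 41 bit value as exactly seven base 58 characters."""
    if not 0 <= value < 2**41:
        raise ValueError("value must fit in 41 bits")
    digits = []
    for _ in range(7):
        value, d = divmod(value, 58)
        digits.append(ALPHABET[d])
    return "".join(reversed(digits))  # most significant digit first


def decode_chunk(text: str) -> int:
    """Inverse of encode_chunk."""
    value = 0
    for ch in text:
        value = value * 58 + ALPHABET.index(ch)
    return value
```

Fixed width chunks mean a long value can be encoded and decoded piece by piece, with no big integer division over the whole value.
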
[Curve25519](https://github.com/msotoodeh/curve25519)

Thirty two byte public key, thirty two byte private key.

Key agreement is X25519.

Signing is ED25519. Sixty four byte signature.

The trouble is that Amber does not include Bernstein’s assembly language optimizations.

[ED25519/Donna](https://github.com/floodyberry/ed25519-donna) does include Bernstein’s assembly language optimizations, but is designed to compile against OpenSSL. Probably needs some customization to compile against Amber. Libsodium is designed to be uncontaminated by NSA.

ED25519 does not directly support [Schnorr signatures](schnorr-signatures.pdf), its group being of nonprime order. Schnorr signatures can do multisig, useful for atomic exchanges between blockchains, which are multisig, or indeed arbitrary algorithm sig. With some cleverness and care, they support atomic exchanges between independent blockchains.

An explanation of how to do [Schnorr multisignatures](https://www.ietf.org/archive/id/draft-ford-cfrg-cosi-00.txt) [using ED25519](https://crypto.stackexchange.com/questions/50448/schnorr-signatures-multisignature-support#50450).

The Amber library packages all these in what is allegedly easy to incorporate form, but does not have Schnorr multisignatures.

[Bernstein paper](https://ed25519.cr.yp.to/software.html).

The fastest library I can find for pairing based crypto is [herumi](https://github.com/herumi/mcl).

How does this compare to [Curve25519](https://github.com/bernedogit/amber)?

There is a good discussion of the performance tradeoff for crypto and IOT in [this Internet Draft](https://datatracker.ietf.org/doc/draft-ietf-lwig-crypto-sensors/), currently in IETF last call:

From the abstract:

> This memo describes challenges associated with securing
> resource-constrained smart object devices. The memo describes a possible
> deployment model where resource-constrained devices sign message
> objects, discusses the availability of cryptographic libraries for
> small devices and presents some preliminary experiences with those
> libraries for message signing on small devices. Lastly, the memo
> discusses trade-offs involving different types of security
> approaches.

The draft contains measurements and evaluations of libraries, allegedly
including Herumi. But I don’t see any references to the Herumi library in
that document, nor any evaluations of the time required for pairing based
cryptography in that document. Relic-Toolkit is not Herumi and is supposedly
markedly slower than Herumi.

Looks like I will have to compile the libraries myself and run tests on them.