<!DOCTYPE html>
<html lang="en"><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<style>
body {
max-width: 30em;
margin-left: 2em;
}
p.center {
text-align:center;
}
</style><title>generic client server program</title></head>
<body>
<p><a href="./index.html"> To Home page</a> </p>
<h1>Generic Client Server Program</h1><p>
We need a MIDL-like language that specifies messages and
generates skeleton client server programs with key handling
and full protocol negotiation. The distributed public
keys are hashes of rules for recognizing valid full
asymmetric encryption public keys, and the generated code
interfaces with a key distribution and key management system.
The skeleton programs should provide DDoS resistance, end to end
encryption, support for once and only once messaging,
with or without in order delivery, and support for at least
once messaging, with or without in order delivery.
</p>
<p>We already have such a message specification language, Cap'n Proto, which will generate message crackers and message builders, but we don’t have a client server generator for conversing on the generated messages. Cap'n Proto can nonetheless generate quite a lot. We also have the Boost Asynch library and the wxWidgets library, all of which overlap to some substantial extent with what is being proposed here.</p>
<p>A single endpoint is displayed to the user with a
single name. For all communications and all
protocols with an endpoint, under all applications, that
the particular user should conceive of as a single
site with a single name, and for all applications
displaying information from that site to a single user,
there should be a single encryption and authentication
key per session: many channels per protocol, many
protocols per session, one key for all. This avoids
the security holes that result from the browser
recreating keys many times in what the user perceives
to be a single session with a single entity. In the
browser, web pages nominally from the same origin can
modify web pages secured using a different certificate
and a different certificate authority. To avoid the
endless security complications that ensue, all
connections in a single session should rest on a single
shared secret: we should never recreate shared secrets
from public keys in the course of a session or a
transaction. </p><p>
The keying mechanism should support secret handshakes,
for the core revolutionary problem is the long march
through the institutions. </p><p>
The keying mechanism should also support the corporate
form. It is not yet clear whether anything special needs
to be done to support signatures that represent
shareholder votes.
</p><p>
The protocol negotiation mechanism should enable anyone
to add their own protocol without consulting anyone and
without causing disruption or significant inefficiency,
while guaranteeing that once a connection is
set up, both ends are talking the same protocol.
</p><p>
The keying mechanism shall start with a hash type
identifier, followed by the hash. The hash is a hash of
a rule prescribing a valid key in full binary.
Permissible rules are:</p><ul><li>
The hash is the hash of the full key itself. </li><li>
The hash is the hash of a full key plus a name path.
This rule implies that a valid key consists of that key
signing another key, which signs another key, which signs
another key, where the keys are named according to the name
path, have expiry dates that are mutually consistent
(a key cannot sign a key that expires beyond its own
expiry), and have not yet expired. Each key in the
sequence except the first is declared by the preceding
signature to have a name, creation date, and expiry date,
and that name agrees with the name path in the hash.
The highest key in the chain has a name and creation
date specified by the hash, but no expiry date. Where
successive keys in the chain of signatures have the same
name, they correspond to a single link. </li><li>
The full key itself with an expiry date and preceding
hash key. We can then change short keys, which
implies that many short keys can correspond to the same
identity. </li></ul><p>
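The identifier format and the first two rules above might be sketched roughly as follows. This is only an illustration: the one-byte type code, the choice of SHA-256, the length-prefixed encoding of key plus name path, and every function name are assumptions made for the example, not part of the design.</p>

```python
import hashlib
from typing import Sequence

# Hypothetical registry: a one-byte hash type identifier names the algorithm.
HASH_TYPES = {1: "sha256"}

def key_identifier(rule_bytes: bytes, hash_type: int = 1) -> bytes:
    """Return the hash type identifier followed by the hash of the rule."""
    digest = hashlib.new(HASH_TYPES[hash_type], rule_bytes).digest()
    return bytes([hash_type]) + digest

def rule_full_key(full_key: bytes) -> bytes:
    """First rule: the rule is simply the full public key."""
    return full_key

def rule_key_with_path(full_key: bytes, name_path: Sequence[str]) -> bytes:
    """Second rule: a full key plus a name path (the encoding is assumed)."""
    prefix = len(full_key).to_bytes(2, "big")  # makes the split unambiguous
    return prefix + full_key + "/".join(name_path).encode("utf-8")

def expiries_consistent(expiries: Sequence[int], now: int) -> bool:
    """Check the expiry dates of the signed keys below the top key.

    No key may expire later than the key that signed it,
    and no key in the chain may already have expired.
    """
    for parent, child in zip(expiries, expiries[1:]):
        if child > parent:
            return False
    return all(expiry > now for expiry in expiries)
```

<p>The top key is left out of <code>expiries_consistent</code> because, as stated above, it has no expiry date. Signature checking is omitted entirely.</p>
<p>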
Whenever one writes a client server program, and whenever
one writes multi threaded software on the shared-nothing
model (for example, Python with threads connected by
queues), one tends to wind up violating the <a href="http://c2.com/cgi/wiki?DontRepeatYourself">Don’t
Repeat Yourself</a> principle and the <a href="http://c2.com/cgi/wiki?OnceAndOnlyOnce">Once And
Only Once</a> principle. In any very large and very long
lived project this eventually leads to total code
meltdown: it becomes impossible to make any further
substantial changes to the project that involve changing
the interface between client and server, even when it
becomes apparent that the architecture contains utterly
disastrous failings that are complete show stoppers.
</p><p>
The present browser crisis with https is one example,
among a great many, of such an insoluble show stopper.
</p><p>
Many of the requirements for a generic server program
are discussed in Bernstein’s <a href="http://qmail_security.pdf">retrospective on
Qmail</a>. The Unix generic server program is inetd.
</p><p>
A server program has to recover from crashes, and give
performance that degrades gracefully in a resource
limited situation rather than crashing and burning.
</p><p>
Inetd listens for network connections. When a
connection is made, inetd runs another program to handle
the connection. For example, inetd can run qmail-smtpd
to handle an incoming SMTP connection. The qmail-smtpd
program doesn’t have to worry about networking,
multitasking, etc.; it receives SMTP commands from one
client on its standard input, and sends the responses to
its standard output. </p><p>
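To make the inetd division of labour concrete, here is a toy handler in Python. It is a sketch, not qmail-smtpd and not real SMTP: the command set, the reply strings, and the name <code>handle_connection</code> are invented for illustration. The only point is that the handler sees one client as a pair of text streams and contains no networking code.</p>

```python
import io
import sys

def handle_connection(inp, out):
    """Serve exactly one client: read command lines from inp, write replies to out."""
    for line in inp:
        command = line.strip()
        if command.upper() == "QUIT":
            out.write("221 bye\n")
            break  # the handler exits; inetd starts a fresh one for the next client
        out.write("250 ok " + command + "\n")
    out.flush()

# Under inetd the call would be handle_connection(sys.stdin, sys.stdout),
# because inetd attaches the accepted socket to standard input and output.
# The same function can be exercised without a network:
demo_out = io.StringIO()
handle_connection(io.StringIO("HELO example\nQUIT\n"), demo_out)
```

<p>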
If a particular instance of qmail-smtpd crashes or
hangs, it is not a problem. Each instance does limited
processing, stuffs the result into a queue, activates the
program that processes the queue if it is not presently
running, and shuts down. The queue is processed with
limited parallelism, and is thus somewhat resistant to
resource limited crash and burn. </p><p>
I envisage something much more capable: instead
of specifying an IO stream, one specifies an interface,
which gets compiled into message crackers, a skeleton
program with protocol negotiation, and unit tests.
</p><p>
In accordance with the <a href="http://c2.com/cgi/wiki?DontRepeatYourself">Don’t
Repeat Yourself</a> principle, the specification is human
readable, and records the history of changes and variants
in the specification. The history of changes and
variants is compiled into protocol negotiation code for
both sides of the interface, and the specification itself
is compiled into message generators and message crackers
for both sides of the interface. The generated code
is regenerated on every major make, and is never checked
into source code control, nor edited by humans, though
the generated skeleton code may be copied by humans as a
starting point. </p><p>
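As a hedged sketch of what negotiation code compiled from a version history could reduce to, the example below picks the highest version that both sides know. The integer version numbers, the <code>SPEC_HISTORY</code> table, and the function name are assumptions for illustration; the design above does not say how versions are identified.</p>

```python
from typing import Iterable, Optional

# Hypothetical history as a specification might record it:
# version number mapped to the messages that version defines.
SPEC_HISTORY = {
    1: ["hello", "get"],
    2: ["hello", "get", "put"],
    3: ["hello", "get", "put", "watch"],
}

def negotiate(ours: Iterable[int], theirs: Iterable[int]) -> Optional[int]:
    """Return the highest version both sides support, or None if there is none.

    The rule is symmetric, so both ends reach the same answer independently.
    """
    common = set(ours) & set(theirs)
    return max(common) if common else None
```

<p>Because the function gives the same result whichever side calls it, a connection that is set up at all is set up with both ends on the same version. When it returns <code>None</code> the connection should be refused rather than guessed at.</p>
<p>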
We use a compiler compiler such as CoCo/R to generate a
compiler that compiles the interface specification to the
program. An interface specification specifies several
interface versions. In the absence of C code, the
compiler generates a skeleton program. In the presence
of C code for the previous version of the protocol, it
adds a skeleton for the new version of the protocol to the
existing program. (I favor CoCo, because it can compile
itself, though it does not seem a very popular choice.)
</p><p>
This functionality is akin to MIDL. MIDL was a
Microsoft compiler that compiled interface description
language into C++ code that implemented that interface
and interface negotiation, thereby ensuring that
everyone used the same methods to find out what
interfaces were available, and to advertise what
interfaces they made available. </p><p>
MIDL/IDL/COM was designed for calls within a single
address space and a single thread, and worked great for
this problem, but the efforts to extend it to inter
thread, inter process, and cross network calls varied
from bad to catastrophic. </p><p>
Google’s protobuf is designed to work between threads.
It is a message specification protocol, but lacks the
key feature of MIDL, protocol negotiation: there is no
run time guarantee that data was serialized the way it
will be deserialized. </p><p>
We can also use a meta programming system such as Boost,
which gives C++ lisp like meta programming
capabilities. Unfortunately Boost, though no doubt an
excellent language, runs on the virtual machine provided
by compilers to implement templates, and the most trivial
operations suck up large amounts of stack space and
symbol space, so despite the coolness of the language, I
expect it to die horribly in real world applications.
</p><p>
We want message crackers, so as to protect against the
buffer overflow problem. But what about the resource
limit problem? </p><p>
Launching a new program for every connection is costly,
even in Unix, and much more costly in Windows. I
envisage that the server program will use the TBB,
creating a new thread for each connection. That is
efficient, but it means that a failure in one connection
can inconvenience all others, and that a bug can allow one
thread to access information from other threads. For
the latter problem, I think the answer is just "no bugs",
or at least no bugs that allow access to random memory.
But there is one bug we are bound to have: resource
exhaustion. </p><p>
How does the generic server program, the program
generated for a particular interface specification,
handle resource exhaustion? </p><p>
We need our program to be neutral to DDoS: it must not
allow any operation that is cheap for an anonymous
attacker’s machine but expensive for the server
machine. We also need our program to degrade gracefully
when legitimate usage exceeds its capability. </p><p>
First, when establishing new connections, we have a
limited cache for connections in the process of being
established. If that cache is exceeded, we send an
encrypted cookie to the client, and stop holding state
for connections in progress. See the discussion of
defenses against a distributed denial of service attack
on TCP: SYN flooding and TCP cookies. </p><p>
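A minimal sketch of the stateless cookie idea, using only the Python standard library. Note the substitution: the text above calls for an encrypted cookie, whereas this example authenticates the cookie with an HMAC instead of encrypting it, which is enough to let the server forget the half-open connection. The secret, the sixty second lifetime, the cookie layout, and the function names are all assumptions for illustration.</p>

```python
import hashlib
import hmac
import os
import struct

SERVER_SECRET = os.urandom(32)  # hypothetical per-instance secret
COOKIE_LIFETIME = 60            # seconds; an assumed value

def make_cookie(client_addr: str, now: int, secret: bytes = SERVER_SECRET) -> bytes:
    """Build a cookie the server can verify later without storing anything."""
    stamp = struct.pack(">Q", now)  # 8-byte issue time
    tag = hmac.new(secret, stamp + client_addr.encode("utf-8"), hashlib.sha256).digest()
    return stamp + tag

def check_cookie(cookie: bytes, client_addr: str, now: int,
                 secret: bytes = SERVER_SECRET) -> bool:
    """Accept only an unmodified, unexpired cookie issued to this address."""
    if len(cookie) != 8 + 32:
        return False
    stamp, tag = cookie[:8], cookie[8:]
    expected = hmac.new(secret, stamp + client_addr.encode("utf-8"), hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    issued = struct.unpack(">Q", stamp)[0]
    return COOKIE_LIFETIME >= now - issued >= 0
```

<p>The server spends one hash per cookie and holds no per-client state until the client returns a valid cookie, so a flood of connection attempts from forged addresses costs the attacker about as much as it costs the server.</p>
<p>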
Our address space for connections is large and of variable
precision. Each incoming packet contains a clear index
to a shared secret, which is used to decrypt the first
block in the incoming packet. That block has to be in the
correct format and in window for the connection stream, or
the packet gets discarded. After the decryption, we
have the connection stream identifier, which may contain
an index to a larger set of shared secrets, a set of
connections and streams larger than the in memory list
of shared secrets for the initial block. </p><p>
Having identified that the stream is legit, we then
check whether the packet corresponds to a
presently running thread of a presently running
protocol interpreter. If it does, we dispatch the packet
to that thread. If it does not, but the protocol
interpreter is running, we dispatch the packet, and the
protocol interpreter recovers the state from the database
and launches a new thread for that stream. Similarly,
if the protocol interpreter is not running . . .
</p><p>
Each thread in the protocol interpreter times out after
a bit of inactivity, and saves its state to the
database. Persistently busy threads time out
regardless, to discriminate against needy clients. When
no threads remain, the protocol interpreter shuts down.
From time to time the master interpreter, which
handles all protocols, launches new instances of the
protocol interpreter. Failure of an old instance to
shut down within a reasonable time is detected and
reported as an error. </p><p>
The master interpreter monitors resource usage, and
gracefully declines to open new connections, and shuts
down hoggish old connections, when resources get
scarce. The skeleton interpreter generated for each
protocol has a cache limit and a database limit for the
number of connections, and an idle and an absolute time
limit on cache and database connections. When the cache
limit is exceeded, the connection information goes to the
database; when the database limit is exceeded, the
connection is terminated. </p><p>
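The two count limits can be sketched as a pair of least-recently-used tiers. This is an illustration under assumptions: the class and method names are invented, an in-memory <code>OrderedDict</code> stands in for the real database, and the idle and absolute time limits are left out to keep the example short.</p>

```python
from collections import OrderedDict

class ConnectionStore:
    """Two-tier connection state: a bounded cache that spills to a bounded database."""

    def __init__(self, cache_limit, database_limit):
        self.cache_limit = cache_limit
        self.database_limit = database_limit
        self.cache = OrderedDict()     # in-memory state, least recently used first
        self.database = OrderedDict()  # stand-in for persistent storage
        self.terminated = []           # ids of connections that were dropped

    def touch(self, conn_id, state):
        """Record activity on a connection, spilling or terminating as the limits require."""
        self.database.pop(conn_id, None)  # revive the connection if it had been spilled
        self.cache[conn_id] = state
        self.cache.move_to_end(conn_id)
        while len(self.cache) > self.cache_limit:
            old_id, old_state = self.cache.popitem(last=False)
            self.database[old_id] = old_state   # cache limit exceeded: go to database
        while len(self.database) > self.database_limit:
            dead_id, _ = self.database.popitem(last=False)
            self.terminated.append(dead_id)     # database limit exceeded: terminate
```

<p>With a cache limit of 2 and a database limit of 1, touching connections a, b, c, d in turn leaves c and d in the cache, b in the database, and a terminated.</p>
<p>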
Internet facing programs will always encounter malicious
data. The two huge problems in C and C++ internet
facing programs are buffer overflow and malloc failure.
It is possible to take care of buffer overflow just by
tightly vetting all string inputs. All string inputs
have to have a defined format and a defined maximum length,
and be vetted for conformity. This is doable, problem
solved. The protocol definition should specify
constraints on strings, with default constraints if none
are specified, and the code generated by the protocol
interpreter should contain such checks, guaranteeing that
all input strings have defined maximums, and defined
forbidden or permitted characters. Malloc, however, is a
harder problem. No one is going to write and test error
recovery from every malloc. We therefore have to
redefine malloc in the library as malloc_nofail. If
someone <em>is</em> going to write error recovery code,
he can explicitly call malloc_can_fail. If
malloc_nofail fails, the program instance shuts down,
thereby relieving the resource shortage by degrading service
if the malloc failure is caused by server overload, or
frustrating the attack if the malloc failure is caused by
some clever attack.</p><p>
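The string vetting half of this can be sketched as a per-field constraint with defaults, of the kind generated code might carry. Everything here is an assumption for illustration: the class name <code>FieldSpec</code>, the default maximum of 256 bytes, the default of printable ASCII, and the use of a regular expression to express permitted characters. The malloc half is specific to C and is not shown.</p>

```python
import re

class FieldSpec:
    """Constraint on one string field: a maximum length and a permitted character class."""

    def __init__(self, max_len=256, permitted=r"[\x20-\x7e]"):
        # Defaults apply when the protocol definition gives no constraint.
        self.max_len = max_len
        self.pattern = re.compile("(?:" + permitted + ")*")

    def vet(self, raw: bytes) -> str:
        """Return the decoded field, or raise ValueError if it does not conform."""
        if len(raw) > self.max_len:
            raise ValueError("field exceeds maximum length")
        try:
            text = raw.decode("ascii")
        except UnicodeDecodeError:
            raise ValueError("field is not ASCII")
        if self.pattern.fullmatch(text) is None:
            raise ValueError("field contains a forbidden character")
        return text

    def accepts(self, raw: bytes) -> bool:
        """Convenience wrapper: True if vet() would succeed."""
        try:
            self.vet(raw)
        except ValueError:
            return False
        return True
```

<p>The length check comes first so that an oversized input is rejected before any further work is done on it.</p>
<p>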
We cannot ensure that nothing can ever go wrong, therefore
we must have a crash-and-restart process that detects
process failure and automatically relaunches. Unix encourages
this code pattern by providing inetd. This is perhaps
overly costly, but continually spawning and killing off
new processes is inherently robust. Therefore, every so
often, we must spawn a new instance, and every so often,
old instances must be destroyed. </p><p>
The protocol interpreter should automatically generate
such a robust architecture: a working skeleton program
that is internet facing and architected in ways that make
programs derived from it unlikely to fail under attack.
The experience with web servers is that the efficient and
robust solution is multiple instances, each with multiple
threads. For robustness, retire an instance after a
while. Use one thread per client session, and put new
sessions with the same client in the same instance where
possible (what is a session is sometimes unclear). After
a while, an old instance gets no new threads, and is
eventually forcibly shut down if it does not shut itself
down when it is out of client threads and has no prospect
of getting new ones.
</p>
<p style="background-color : #ccffcc; font-size:80%">These documents are
licensed under the <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">Creative
Commons Attribution-Share Alike 3.0 License</a></p>
</body></html>