"How to communicate peer-to-peer through NAT firewalls"
Another source says that "most NAT tables expire within 60 seconds, so
NAT keepalive allows phone ports to remain open by sending a UDP
packet every 25-50 seconds".
The no-brainer way is for each party to ping the other at a mutually agreed
time every 15 seconds, which is a significant cost in bandwidth. But a
server with 4Mib/s of internet bandwidth can support keepalives for a couple
of million clients. On the other hand, someone on cell phone data with thirty
peers is going to make a significant dent in his bandwidth.
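To put rough numbers on that cost, here is a back-of-the-envelope sketch. The 32 byte on-the-wire packet size (a 20 byte IPv4 header, an 8 byte UDP header, and 4 bytes of payload) is my assumption, not a measured figure; the 15 second interval and the peer counts are the ones above. It counts outbound traffic only.

```python
# Back-of-the-envelope keepalive cost.
# PACKET_BYTES is an assumption: 20-byte IPv4 header + 8-byte UDP header
# + 4 bytes of payload. Link-layer framing is ignored.
PACKET_BYTES = 32
INTERVAL_SECONDS = 15


def keepalive_bytes_per_second(peers, packet_bytes=PACKET_BYTES,
                               interval=INTERVAL_SECONDS):
    """Outbound bytes per second spent on keepalives to `peers` peers."""
    return peers * packet_bytes / interval


def keepalive_bytes_per_month(peers, days=30):
    """Outbound keepalive bytes over `days` days of continuous connection."""
    return keepalive_bytes_per_second(peers) * 86400 * days


if __name__ == "__main__":
    server = keepalive_bytes_per_second(2_000_000)
    phone = keepalive_bytes_per_month(30)
    print(f"server, 2 million clients: {server / 2**20:.2f} MiB/s outbound")
    print(f"phone, 30 peers: {phone / 1e6:.0f} MB/month outbound")
```

The replies coming back roughly double these figures, and a larger keepalive payload scales them in proportion.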
With client-to-client keepalives, a client will probably seldom have more
than a dozen peers. Suppose each keepalive is sent 15 seconds after the
counterparty's previous packet, or when an expected keepalive is not
received, and each keepalive acks received packets. If we are not receiving
expected acks or expected keepalives, we send nack keepalives
(hello-are-you-there packets), one per second, until we give up.
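A minimal sketch of that schedule for one peer follows. The class name, the two second grace period, and the limit of ten nacks before giving up are all assumptions of mine; the text only fixes the 15 second and one second intervals. The ack payload carried by each keepalive is left out, and time is passed in by the caller so the logic can be exercised without a clock.

```python
from dataclasses import dataclass

KEEPALIVE_INTERVAL = 15.0  # seconds after the counterparty's previous packet
NACK_INTERVAL = 1.0        # seconds between hello-are-you-there packets
GRACE = 2.0                # assumption: slack allowed for transit delay
MAX_NACKS = 10             # assumption: the text does not say when to give up


@dataclass
class PeerKeepalive:
    """Decides what, if anything, to send to one peer right now."""
    last_heard: float                  # when the peer's last packet arrived
    last_sent: float = float("-inf")   # when we last sent a keepalive or nack
    nacks_sent: int = 0

    def on_packet_received(self, now):
        """Any packet from the peer resets the schedule."""
        self.last_heard = now
        self.nacks_sent = 0

    def poll(self, now):
        """Return 'wait', 'keepalive', 'nack', or 'give_up'."""
        if self.last_heard >= self.last_sent:
            # The peer spoke last: our keepalive is due 15 s after its packet.
            if now - self.last_heard >= KEEPALIVE_INTERVAL:
                self.last_sent = now
                return "keepalive"
            return "wait"
        # We spoke last: the peer's keepalive is due 15 s after ours.
        if self.nacks_sent == 0:
            due = self.last_sent + KEEPALIVE_INTERVAL + GRACE
        else:
            due = self.last_sent + NACK_INTERVAL
        if now < due:
            return "wait"
        if self.nacks_sent >= MAX_NACKS:
            return "give_up"
        self.nacks_sent += 1
        self.last_sent = now
        return "nack"
```

The caller polls each peer on a timer, sends whatever `poll` returns, and calls `on_packet_received` for every packet that arrives from that peer.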
This algorithm should not be set in stone, but should rather be an
option in the connection negotiation, so that we can adopt new algorithms as
the NAT problem changes, as it continually does.
If two parties are trying to set up a connection through a third-party broker,
they both fire packets at each other (at each other's IP as seen by the
broker) at the same broker time minus half the broker round trip time. If
they don't get a packet within the sum of the broker round trip times, they
keep firing with slow exponential backoff until the connection is achieved,
or until the exponential backoff approaches the twenty second limit.
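The firing schedule can be sketched as a generator of send times. I am reading "broker time minus half the broker round trip time" as each party subtracting half of its own round trip time to the broker, and the backoff factor of 1.5 is my guess at what "slow" means; both are assumptions, as are the function and parameter names.

```python
def firing_schedule(agreed_broker_time, my_rtt, peer_rtt,
                    backoff_factor=1.5, limit=20.0):
    """Yield the times at which to fire a setup packet at the peer.

    agreed_broker_time: the instant, on the broker's clock, both sides agreed on.
    my_rtt, peer_rtt:   each party's round trip time to the broker, in seconds.
    backoff_factor:     assumption, the growth of the gap between retries.
    limit:              stop once the gap would reach this many seconds.
    """
    t = agreed_broker_time - my_rtt / 2.0
    yield t                      # the first, synchronised shot
    delay = my_rtt + peer_rtt    # wait this long for the peer's packet
    while delay < limit:
        t += delay
        yield t                  # retry, then wait longer next time
        delay *= backoff_factor


if __name__ == "__main__":
    for when in firing_schedule(100.0, my_rtt=0.2, peer_rtt=0.4):
        print(f"fire at broker time {when:.2f}")
```

The caller stops consuming the generator as soon as a packet arrives from the peer; if the generator runs out, the attempt has failed.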
Their initial setup packets should be steganographed as TCP startup
handshake packets.
We assume a global map of peers that form a mesh whereby you can get
connections, but not everyone has to participate in that mesh. They can be
clients of such a peer, and only inform selected counterparties as to whom
they are a client of.
The protocol for a program to open port forwarding is part of Universal Plug and Play, UPnP, which was invented by Microsoft but is now ISO/IEC 29341 and is implemented in most SOHO routers.
But it is generally turned off, either by default or manually. Needless to say, if relatively benign Bitcoin software can poke a hole in the
firewall and set up a port forward, so can botnet malware.
The standard for poking a transient hole in a NAT is STUN, which only works for UDP – but generally works – not always, but most of the time. This problem everyone has dealt with, and there are standards, but not libraries, for dealing with it. There should be a library for dealing with it – but then you have to deal with names and keys, and have a reliability and bandwidth management layer on top of UDP.
But if our messages are reasonably short and not terribly frequent, as client messages tend to be, link level buffering at the physical level will take care of bandwidth management, and reliability consists of message received, or message not received. For short messages between peers, we can probably go UDP and retry.
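"UDP and retry" for a short message can be as little as the following sketch. The function name, the four attempts, and the doubling timeout are my assumptions; any reply from the addressee is treated as "message received", and running out of attempts is treated as "message not received".

```python
import socket


def send_with_retry(sock, payload, addr, attempts=4, timeout=0.5):
    """Send `payload` to `addr` over UDP until a reply comes back.

    Returns the reply bytes, or None if every attempt went unanswered.
    """
    wait = timeout
    for _ in range(attempts):
        sock.sendto(payload, addr)
        sock.settimeout(wait)
        try:
            reply, sender = sock.recvfrom(2048)
        except socket.timeout:
            wait *= 2            # back off before the next attempt
            continue
        if sender == addr:       # ignore packets from anyone else
            return reply
    return None
```

The receiver has to tolerate duplicates, since a retry may follow a message that did arrive but whose reply was lost.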
STUN and ISO/IEC 29341 are incomplete, and most libraries that supply implementations are far too complete – you just want a banana, and you get the entire jungle.
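To give a sense of how small the banana is, here is a sketch of the client side of a STUN binding exchange in the RFC 5389 message format, using only the standard library. It builds the 20 byte request and pulls an IPv4 address out of the XOR-MAPPED-ADDRESS attribute of the reply. It deliberately ignores authentication, IPv6, retransmission, and the older MAPPED-ADDRESS attribute, and the function names are mine.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request():
    """Return (packet, transaction_id) for a STUN binding request."""
    transaction_id = os.urandom(12)
    # Header: message type, attribute length (none), magic cookie.
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_binding_response(data, transaction_id):
    """Return (ip, port) as seen by the server, or None if unusable."""
    if len(data) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", data[:8])
    if (msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE
            or data[8:20] != transaction_id):
        return None
    pos, end = 20, min(len(data), 20 + length)
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        if (attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8
                and value[1] == 0x01):          # 0x01 means IPv4
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = struct.pack("!I", xaddr ^ MAGIC_COOKIE)
            return socket.inet_ntoa(addr), port
        pos += 4 + attr_len + (-attr_len % 4)   # values are padded to 4 bytes
    return None
```

To use it, send the request from the UDP socket you intend to receive on to a STUN server (the standard port is 3478) and hand the reply to `parse_binding_response`. The address it returns is the one to advertise to the counterparty.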
Note that the internet does not in fact use the OSI model though everyone talks as if it did. Internet layers correspond only vaguely to OSI layers, being instead:
Assume an identity system that finds the entity you want to
talk to.
If it is behind a firewall, you cannot notify it, cannot
send an interrupt, cannot ring its phone.
Assume the identity system can notify it. Maybe it has a
permanent connection to an entity in the identity system.
Your target agrees to take the call. The identity system
informs both parties of each other’s IP address and the port
number on which they will be taking the call.
Both parties send off introduction UDP packets to the
other’s IP address and port number – thereby punching holes
in their firewall for return packets. When they get
a return packet, an introduction acknowledgement, the
connection is assumed established.
It is that simple.
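The exchange above can be sketched as a single function that both parties run, once the identity system has told each of them the other's address. The packet contents, the function name, and the timings are assumptions. One step goes beyond the description: a party that receives an acknowledgement sends one acknowledgement back before stopping, so that a peer whose first introductions were dropped by a not-yet-open firewall still learns that its packets are getting through.

```python
import socket
import time

INTRO = b"INTRO"            # hypothetical introduction packet
INTRO_ACK = b"INTRO_ACK"    # hypothetical introduction acknowledgement


def punch(sock, peer_addr, interval=0.2, deadline=5.0):
    """Exchange introductions with peer_addr over the bound UDP socket `sock`.

    Returns True once an acknowledgement from peer_addr arrives,
    False if none arrives within `deadline` seconds.
    """
    sock.settimeout(interval)
    give_up_at = time.monotonic() + deadline
    while time.monotonic() < give_up_at:
        # Each outbound packet opens (or refreshes) our own firewall hole.
        sock.sendto(INTRO, peer_addr)
        try:
            data, sender = sock.recvfrom(2048)
        except socket.timeout:
            continue
        if sender != peer_addr:
            continue                       # not from the party we expect
        if data == INTRO:
            sock.sendto(INTRO_ACK, peer_addr)
        elif data == INTRO_ACK:
            # Assumption, not in the description: ack the ack once.
            sock.sendto(INTRO_ACK, peer_addr)
            return True
    return False
```

Both sides must call `punch` on the same socket, and hence the same port, that the identity system advertised, because the hole in the firewall is specific to that port.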
Of course, networks are necessarily nondeterministic, so
all beliefs about the state of the network need to be
represented in a Bayesian manner, and any assumption must
be handled in such a manner that the computer is capable
of doubting it.
We have a finite, and slowly changing, probability that our
packets get into the cloud, and a finite and slowly changing
probability that our messages get from the cloud to our
target. We have a finite probability that our target
has opened its firewall, and a finite probability that our
target can open its firewall, which transitions to an
extremely high probability when we get an
acknowledgement, a probability that then diminishes over
time.
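A deliberately crude sketch of one such belief, the probability that the target's firewall is currently open to us, follows. It is a single number where a real representation would need far more state, and every constant in it (the baseline, the half life, the loss rate, the value after an acknowledgement) is an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class HoleBelief:
    """Belief that the target's firewall is currently open to us."""
    p_open: float = 0.5      # prior before any evidence
    baseline: float = 0.05   # assumption: where belief settles with no evidence
    half_life: float = 30.0  # assumption: seconds for excess belief to halve
    p_loss: float = 0.1      # assumption: chance a probe or its reply is lost

    def on_ack(self):
        """An acknowledgement is strong evidence, but never certainty."""
        self.p_open = 0.999

    def on_silence(self, seconds):
        """Old evidence fades: drift back toward the baseline."""
        keep = 0.5 ** (seconds / self.half_life)
        self.p_open = self.baseline + (self.p_open - self.baseline) * keep

    def on_no_reply(self):
        """Bayes update after a probe that went unanswered.

        P(no reply | open) = p_loss, P(no reply | closed) = 1.
        """
        open_and_lost = self.p_open * self.p_loss
        closed = 1.0 - self.p_open
        self.p_open = open_and_lost / (open_and_lost + closed)
```

Because the belief never reaches exactly one, a run of unanswered probes can always drag it back down, which is what being capable of doubt amounts to here.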
As I observe in [Estimating Frequencies from Small Samples](./estimating_frequencies_from_small_samples.html) any adequately flexible representation of the state of
the network has to be complex, a fairly large body of data,