Because protocols need to be changed, improved, and fixed from time to
time, it is essential to have a protocol negotiation step at the start of every networked interaction, and protocol requirements at the start of every store-and-forward communication.
But we also want anyone, anywhere, to be able to introduce new
protocols without having to coordinate with everyone else, because attempts to
coordinate the introduction of new protocols have ground to a halt as
more and more people become involved in coordinating and making decisions.
The IETF is paralyzed and moribund.
So we need a large enough address space that anyone can give his
protocol an identifier without fear of stepping on someone else’s identifier.
But this entails inefficiently long protocol identifiers, which can become
painful if we have lots of protocol negotiation, where one system asks
another what protocols it supports: lots of protocols, in lots of
variants, each with a long name.
So our system forms a guess as to the likelihood of a protocol, and then
sends or requests enough bits to reliably identify that protocol. But this
means it must estimate probabilities from limited data. If one’s data is
limited, priors matter, and thus a Bayesian approach is required.
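As a sketch of that idea: a near-optimal variable-length code assigns a protocol roughly $-\log_2 p$ bits, where $p$ is its estimated probability (the Shannon coding argument). The function below is a hypothetical illustration, not part of any wire format.

```python
import math

# Hypothetical sketch: give each protocol an identifier whose length in
# bits is about -log2 of its estimated probability, so that common
# protocols negotiate cheaply while the address space stays effectively
# unbounded for rare, newly minted protocols.
def identifier_bits(estimated_probability: float) -> int:
    return max(1, math.ceil(-math.log2(estimated_probability)))
```

A protocol we expect half the time costs one bit to identify; a one-in-a-million protocol costs twenty.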
With each successive sample, we see the beta-distributed part of the probability
distribution keep getting smaller, and the delta-distributed part of the
probability keep getting bigger.
And our estimate that the second sample will also be X is
$$\frac{8}{9}$$
After two samples, n=2, our new estimate is
Probability that the run will eventually end: $\frac{1}{4}$
Probability distribution: $\frac{1}{4}·3ρ^2+\frac{3}{4}·δ(1−ρ)$
And our estimate that the third sample will also be X is $\frac{15}{16}$
By induction, after n samples, all of them members of category X, our new
estimate for one more sample is
$$1-(n+2)^{-2}=\frac{(n+3)×(n+1)}{(n+2)^2}$$
Our estimate that the run will continue forever is
$$\frac{(n+1)}{n+2}$$
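These closed forms can be checked mechanically. The sketch below assumes, as the numbers quoted above suggest, a prior that is an equal-weight mixture of a uniform density on ρ and a delta function at ρ = 1; that mixture prior is my inference from the quoted values, not stated in the text.

```python
from fractions import Fraction

# Assumed prior: (1/2) * uniform on rho  +  (1/2) * delta at rho = 1.
# After n samples, all in category X, the likelihood multiplies by rho^n.
def after_n_successes(n):
    cont_mass = Fraction(1, 2 * (n + 1))   # mass of (1/2) * rho^n on [0, 1]
    delta_mass = Fraction(1, 2)            # unchanged by successes: 1^n = 1
    total = cont_mass + delta_mass
    p_forever = delta_mass / total         # posterior weight of delta at rho = 1
    # Predictive probability the next sample is also X: E[rho | data].
    p_next = (Fraction(1, 2 * (n + 2)) + delta_mass) / total
    return p_next, p_forever

# The closed forms in the text hold exactly for every n.
for n in range(1, 50):
    p_next, p_forever = after_n_successes(n)
    assert p_next == 1 - Fraction(1, (n + 2) ** 2)   # 1 - (n+2)^(-2)
    assert p_forever == Fraction(n + 1, n + 2)       # (n+1)/(n+2)
```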
which corresponds to our intuition on the question “all men are mortal”: if we find no immortals in one hundred men, we think it highly improbable that we will encounter any immortals in a billion men.
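Under the same assumed mixture prior (half uniform, half a delta at ρ = 1), that intuition can be made numeric: the probability that the next m samples are also all X is ρ^m integrated against the posterior, which comes out in closed form.

```python
from fractions import Fraction

# Posterior after n successes (mixture prior assumed):
#   ((n+1)/(n+2)) * rho^n  +  ((n+1)/(n+2)) * delta(1 - rho).
# Integrating rho^m against it gives the chance the run lasts m more samples.
def p_next_m_all_x(n, m):
    return Fraction(n + 1, n + 2) * Fraction(n + m + 2, n + m + 1)

# One hundred men, all mortal; what of the next billion?
p = p_next_m_all_x(100, 10**9)
# Almost entirely the (n+1)/(n+2) = 101/102 weight of the delta: about 0.99.
```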
In contrast, if we assume a pure Beta distribution, with no delta functions, the probability that the run continues forever is zero.
The metalogistic distribution is like the Beta distribution in that
its Bayesian update is also a metalogistic distribution, but has more terms,
as many terms as are required for the nature of the thing being represented.
The Beta distribution plus two delta functions is a metalogistic distribution
if we stretch the definition of the metalogistic distribution slightly.
The Beta distribution represents the probability of a probability
(since we are using it for its Bayesian update capability).
For example, we have a collection of urns containing red and blue balls,
and from time to time we draw a ball out of an urn and replace it,
whereupon the Beta distribution is our best guess
about the likelihood that the urn contains a certain ratio of red and blue balls
(assuming the urns are enormously large,
and always contain at least some red and at least some blue balls).
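The urn case is what makes the Beta distribution convenient: its Bayesian update is just counting. A minimal sketch, with a uniform Beta(1, 1) starting prior chosen for illustration:

```python
# Conjugate update for the urn example: with a Beta(a, b) prior on the
# fraction of red balls, each observed draw simply increments a counter.
def beta_update(a, b, draws):
    for ball in draws:
        if ball == "red":
            a += 1
        else:
            b += 1
    return a, b

# Starting from a uniform Beta(1, 1) prior, three reds and one blue
# give a Beta(4, 2) posterior, whose mean is 4 / (4 + 2) = 2/3.
a, b = beta_update(1, 1, ["red", "red", "blue", "red"])
```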
Suppose, however, the urns contain gravel, the size of each piece
of gravel in an urn being normally distributed, and we want to
estimate the mean size and standard deviation of the gravel in an urn,
rather than the ratio of red balls to blue balls.
(Well, the size $s$ cannot be normally distributed, because $s$ is strictly non-negative, but perhaps $\ln(s)$, or $s\ln(s)$, or $(s/a -a/s)$ is normally distributed.)
Whereupon our Bayesian updates become more complex,
and our prior has to contain difficult-to-justify information
(no boulders or dust in the urns), but we are still doing Bayesian updates,
hence the Beta distribution, and its generalization, the metalogistic distribution, still apply.
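As a sketch of what "more complex" means here: if we model $\ln(s)$ as normal with unknown mean and variance, the conjugate prior is normal-inverse-gamma, and the update is still closed-form, just with four hyperparameters instead of two counters. The hyperparameter values and the gravel sizes below are made up for illustration.

```python
import math

# Normal-inverse-gamma conjugate update for a normal likelihood with
# unknown mean and variance (standard textbook formulas).
def nig_update(mu0, kappa0, alpha0, beta0, data):
    n = len(data)
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)   # sum of squared deviations
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2
    beta_n = beta0 + ss / 2 + kappa0 * n * (xbar - mu0) ** 2 / (2 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n

# Log-sizes of a few pieces of gravel drawn from one urn (made-up data).
log_sizes = [math.log(s) for s in (2.0, 3.1, 2.4, 2.9)]
posterior = nig_update(mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0, data=log_sizes)
```

The "no boulders or dust" prior information shows up in the choice of the starting hyperparameters, which is exactly the difficult-to-justify part.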