modified:   docs/estimating_frequencies_from_small_samples.md

modified:   docs/libraries/cpp_automatic_memory_management.md
modified:   docs/libraries/time.md
modified:   docs/manifesto/consensus.md
renamed:    docs/notes/big_cirle_notation.md -> docs/notes/big_circle_notation.md
modified:   docs/writing_and_editing_documentation.md
This commit is contained in:
reaction.la 2024-06-28 07:52:41 +00:00
parent 9c5a393a93
commit 4678fba3ce
No known key found for this signature in database
14 changed files with 137 additions and 129 deletions

docs/design/mkdocs.sh Normal file → Executable file

docs/estimating_frequencies_from_small_samples.md

@@ -317,6 +317,25 @@ cases this is not going to be the cases, for a real is usually going to have some
finite distribution, and thus it is more likely to be OK to use the uncorrected
delta distribution
## Haldane prior $α=0$ $β=0$
This is a pathological and unusable prior, because $B(0,0)$ is infinite.
We cannot actually do anything with it, but it has the desirable characteristic
that "almost none" and "almost all" are highly weighted, giving us much the same result as above.
It presupposes that we truly know nothing, but one has to start with some ground under one's feet.
One must always start with some leap of faith.
Since it is unusable, one may take $α=ɛ$, $β=ɛ$ where $ɛ$ is some small quantity; the smaller $ɛ$ is,
the smaller one's initial leap of faith. If $ɛ$ is small, this prior gives us a reasonable result for "all men are mortal".
Maybe immortals exist, but are enormously rare, and a small number of samples, assuming this prior,
gives us confidence that immortal men are extremely rare, but still predicts that they are very rare, not nonexistent.
$ɛ = \frac{1}{2}$ is a prior that generally gives results that are not too silly,
and avoids the complexity of our dual distribution delta function.
On the other hand, our dual distribution just gives us saner and more commonsensical conclusions.
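To make the leap of faith concrete: with the prior $α=β=ɛ$, after observing $n$ men, all of them mortal, the posterior is $B(ɛ,\,n+ɛ)$, and the expected probability that the next man is immortal is

$$\frac{ɛ}{n+2ɛ}$$

which for small $ɛ$ and respectable $n$ is tiny, but never zero: very rare, not nonexistent.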
## the metalog (metalogistic) distribution
The metalogistic distribution is like the Beta distribution in that
@@ -326,25 +345,30 @@ as many terms as are required for the nature of the thing being represented.
The Beta distribution plus two delta functions is a metalogistic distribution
if we stretch the definition of the metalogistic distribution slightly.
The Beta distribution represents the probability of a probability
(since we are using it for its Bayesian update capability).
For example we have a collection of urns containing red and not red balls,
and from time to time we draw a ball out of an urn and replace it,
whereupon the Beta distribution is our best guess
about the likelihood that it contains a certain ratio of red and not red balls
(also assuming the urns are enormously large,
and also always contain at least some red and at least some not red balls)
Suppose, however, the jars contain gravel, the size of each piece
of gravel in a jar being normally distributed, and we want to
estimate the size and standard deviation of the gravel in an urn,
rather than the ratio of red balls and not red balls.
(Well, the size $s$ cannot be normally distributed, because $s$ is strictly non negative,
but perhaps $\ln(s)$, or $s\ln(s)$, or $(s/a - a/s)$ is normally distributed, or
perhaps we should assume a Poisson distribution.)
Whereupon our Bayesian updates become more complex,
and our prior has to contain difficult to justify information
(no boulders or dust in the urns), but we are still doing Bayesian updates,
hence the Beta distribution, or rather its generalization
the metalogistic distribution, still applies.
Typically we want to exclude outliers as evidence, so we assume
a metalogistic distribution that is the sum of two terms, one
narrowly distributed, and one broadly distributed with a long tail.
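As a rough sketch of why the two-term shape tames outliers, here is a two-component normal mixture standing in for the narrow-plus-broad metalogistic; the weights, means, and widths are illustrative assumptions, not part of the scheme above:

```c++
#include <cmath>
#include <cstdio>

const double PI = 3.14159265358979323846;

// density of a normal distribution
double normal_pdf(double x, double mu, double sigma) {
    double z = (x - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * PI));
}

int main() {
    // the narrow component carries most of the weight,
    // the broad, long tailed component catches outliers
    double w_narrow = 0.9, w_broad = 0.1;
    double mu = 0.0, sigma_narrow = 1.0, sigma_broad = 20.0;
    for (double x : {0.5, 2.0, 50.0}) {
        double pn = w_narrow * normal_pdf(x, mu, sigma_narrow);
        double pb = w_broad * normal_pdf(x, mu, sigma_broad);
        // posterior weight of the narrow component for this observation:
        // the outlier at 50.0 is attributed almost entirely to the broad
        // component, so it barely moves the estimate of the narrow one
        std::printf("x = %5.1f  P(narrow | x) = %.6f\n", x, pn / (pn + pb));
    }
}
```

An observation near the center updates the narrow component almost exclusively, while an outlier is absorbed by the long tail.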

docs/libraries/cpp_automatic_memory_management.md

@@ -2,7 +2,9 @@
title:
C++ Automatic Memory Management
...
# Memory Safety
Modern, mostly memory safe C++ is enforced by:\
- Microsoft safety checker
@@ -151,7 +153,7 @@ objects that reference resources, primarily unique_pointer.
`std::move(t)` is equivalent to `static_cast<decltype(t)&&>(t)`, causing move
semantics to be generated by the compiler.
`t`, the compiler assumes, is converted by your move constructor or move assignment into a valid state where your destructor will not need to do anything very costly.
`std::forward<T>(t)` causes move semantics to be invoked iff the thing referenced
is an rvalue, typically a compiler generated temporary, *conditionally*
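A minimal sketch of that conditional behaviour, using the usual forwarding-reference pattern (the `wrapper` and `sink` names are illustrative):

```c++
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> store;

void sink(std::string s) { store.push_back(std::move(s)); }

// Forwards its argument as an rvalue only if the caller passed an rvalue.
template <typename T>
void wrapper(T&& t) {
    sink(std::forward<T>(t));
}

int main() {
    std::string keep = "copied";
    wrapper(keep);               // lvalue: keep is copied, still usable afterwards
    wrapper(std::string("tmp")); // rvalue: the temporary is moved, not copied
}
```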
@@ -348,41 +350,11 @@ use with placement new
pmc->~MyClass(); //Explicit call to destructor.
aligned_free(p);
# spans
A span is a non owning pointer and count.
For usage of spans ([the replacement for bare naked non owning pointers
subject to pointer
arithmetic](http://codexpert.ro/blog/2016/03/07/guidelines-support-library-review-spant/))
For usage of string spans ([String
spans](http://codexpert.ro/blog/2016/03/21/guidelines-support-library-review-string_span/))
These are pointers to char arrays. There does not seem to be a UTF8
string_span.
It would perhaps be more useful if we also had owning pointers with counts.
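A minimal sketch of a span as that non owning pointer and count, here using C++20 `std::span` (`gsl::span` is equivalent for this purpose):

```c++
#include <cstdio>
#include <span>
#include <vector>

// Takes a view of someone else's storage: no ownership, no copy.
int sum(std::span<const int> s) {
    int total = 0;
    for (int x : s) total += x;
    return total;
}

int main() {
    int arr[] = {1, 2, 3};
    std::vector<int> vec = {4, 5, 6};
    std::printf("%d %d\n", sum(arr), sum(vec)); // same function views both
}
```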
# The Curiously Recurring Template Pattern
@@ -456,85 +428,54 @@ named functor class.
To construct a lambda in the heap:
```c++
int a{}, b{}, c{};                  // captured variables must exist in scope
auto p = new auto([a, b, c]() {});  // the lambda object lives in the heap
auto q = new auto([](int x) { return x * x; });
int result = (*q)(5);               // calls the lambda with the argument 5
delete p;
delete q;
```
Objects inside the lambda are constructed in the heap.
If you want to declare a pointer type that can point to any lambda with the same signature, or indeed any instance of
a class that supports operator(), you can use std::function:
```C++
#include <functional>
// Declare a pointer to a lambda that takes an int and returns an int
std::function<int(int)>* lambdaPtr = new std::function<int(int)>([](int x) { return x * x; });
// Use the lambda
int result = (*lambdaPtr)(5);
// Clean up
delete lambdaPtr;
```
This way, `lambdaPtr` is a pointer to a `std::function` that can store any callable object, including a lambda, that takes an int and returns an int.
Similarly for placement `new`, and for `unique_ptr`.
## callbacks
In C, a callback is implemented as an ordinary function pointer, and a pointer to void, which is then cast to a data
structure of the desired type.
What the heavy C++ machinery of `std::function` does is bind the two together.
Sometimes you want to have indefinitely many data structures, which are dynamically allocated
and then discarded.
Sometimes you want to have a single data structure that gets overwritten frequently. The latter is
preferable when it suffices, since it means that asynch callback code is more like sync code.
In one case, you would allocate the object every time, and when done with it, discard it.
`std::function<int(int)> increm{[](int arg){ return arg + 1; }};`
presumably does this behind the scenes.
In the other case it would be a member variable of a struct that hangs around and is continually
re-used.
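A minimal sketch of the contrast (names illustrative): the C style passes the function pointer and the state pointer separately, while `std::function` binds the two together:

```c++
#include <cstdio>
#include <functional>

struct Counter { int n = 0; };

// C style: a bare function pointer plus a void* that we cast back to our state.
void c_style_callback(void* state) {
    static_cast<Counter*>(state)->n += 1;
}

void c_style_fire(void (*cb)(void*), void* state) { cb(state); }

// C++ style: std::function carries both the code and the captured state.
void cpp_style_fire(const std::function<void()>& cb) { cb(); }

int main() {
    Counter counter;
    c_style_fire(c_style_callback, &counter);
    cpp_style_fire([&counter] { counter.n += 1; });
    std::printf("%d\n", counter.n); // 2
}
```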
# auto and decltype(variable)
@@ -553,7 +494,7 @@ procedurally, as by defining a lambda, but by defining a derived type.
# Variable length Data Structures
C++ just does not handle them well, unless you embed an `std::vector` in them,
which can result in messy reallocations.
One way is to drop back into old style C, and tell C++ not to fuck
@@ -576,9 +517,6 @@ around.
return output;
}
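A minimal sketch of that old style C approach, assuming a flexible array member (a C idiom that mainstream C++ compilers accept as an extension), so the header and the variable length payload live in one allocation:

```c++
#include <cstdlib>
#include <cstring>

// Old style C: header plus variable length payload in a single allocation.
struct Blob {
    std::size_t len;
    unsigned char data[]; // flexible array member
};

Blob* make_blob(const unsigned char* bytes, std::size_t len) {
    Blob* b = static_cast<Blob*>(std::malloc(sizeof(Blob) + len));
    b->len = len;
    std::memcpy(b->data, bytes, len);
    return b; // release with std::free; no reallocation games
}
```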
# for_each
template<class InputIterator, class Function>

docs/libraries/mkdocs.sh Normal file → Executable file

docs/libraries/time.md

@@ -2,20 +2,26 @@
title: time
sidebar: true
notmine: false
abstract: >-
  We plan to have a network with all parties agreeing on the consensus network time.
  Peers running on linux systems that claim their tai time is accurate should
  stubbornly steer the consensus time towards the tai time, but not too stubbornly,
  because their tai time can be wrong.
---
We plan to have a network with all parties agreeing on the consensus network time,
which accommodates leap seconds by being rubbery,
passing at 1001 milliseconds per second, or 999 milliseconds per second, or so,
when a leap second happens.
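A minimal sketch of such rubbery time, assuming consensus time is derived from the local steady clock by an agreed offset and rate (the `ConsensusClock` name and fields are illustrative):

```c++
#include <chrono>

// Consensus time = offset + rate * steady time.
// The consensus process nudges rate to roughly 0.999..1.001,
// so a leap second is absorbed gradually instead of as a jump.
struct ConsensusClock {
    double offset_s;  // agreed offset from the steady epoch, in seconds
    double rate;      // agreed seconds of consensus time per steady second

    double now() const {
        auto steady = std::chrono::duration<double>(
            std::chrono::steady_clock::now().time_since_epoch());
        return offset_s + rate * steady.count();
    }
};
```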
# timekeeping primitives
[chrono]:https://en.cppreference.com/w/cpp/chrono {target="_blank"}
the C++ library [chrono] reports the steady time, guaranteed not to jump,
and guaranteed to pass at very very close to one second per actual second,
but not guaranteed to have any particular relationship with any other machine,
and also the global official time, the system time, with no guarantees
that it is not wildly wrong, and which once in a while jumps by a second,
and, on C++20, the tai time, guaranteed to not jump.
However there are linux-specific calls that report the system time,
the global posix time on linux, *and uncertainty in that time*.
``` c
#include <stdio.h>
@@ -39,6 +45,8 @@ int main()
}
```
# ad hoc consensus time
Those machines that do not have an accurate global time will try to be
at the median of all the machines that they have direct connection with
while biasing the consensus towards the passage of one second per second,
@@ -50,12 +58,14 @@ they will stubbornly drag their heels in moving to the new consensus.
Those that do have an accurate global time will try to be nearer to the
global time, while remaining inside two thirds of the distribution.
If the network
time differs too much from the authoritative tai time, they will be as
close as they can be to the authoritative tai time, while remaining inside
the majority consensus, thus causing the consensus to drift towards
the authoritative tai time.
This describes an ad hoc mechanism for keeping consensus.
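A minimal sketch of one step of this rule for a peer that trusts its own tai time (the one-sixth quantiles and names are illustrative assumptions):

```c++
#include <algorithm>
#include <cstddef>
#include <vector>

// One step of the ad hoc consensus: pick a target inside the middle
// two thirds of neighbour times, as close as possible to my tai time.
double consensus_step(std::vector<double> neighbour_times, double my_tai_time) {
    std::sort(neighbour_times.begin(), neighbour_times.end());
    std::size_t n = neighbour_times.size();
    double low  = neighbour_times[n / 6];        // roughly the 1/6 quantile
    double high = neighbour_times[(5 * n) / 6];  // roughly the 5/6 quantile
    return std::clamp(my_tai_time, low, high);
}
```

A peer without an accurate tai time would instead target the plain median, biased towards the passage of one second per second.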
# Bayesian consensus
(People agree on metalog distribution of consensus time)
[gamma distribution]:../estimating_frequencies_from_small_samples.html#beta-distribution
{target="_blank"}
@@ -73,3 +83,10 @@ Or cleverer still and accommodate leap seconds by consensus
on both the time and the rate of passage of consensus time relative
to steady time by a [delta distribution], in which case we have
a three dimensional $α$ and $β$
But both distributions suffer from the pathology that outliers will have
a large effect, when they should have little effect.
So we need a metalog distribution that represents the sum of two
distributions, one narrow, and one wide and long tailed,
so that outliers affect primarily the long tailed distribution.

docs/manifesto/consensus.md

@@ -5,4 +5,33 @@ title: >-
sidebar: true
notmine: false
abstract: >-
  Consensus is a hard problem, and considerably harder when you have shards.
  Nakamoto Satoshi's implementation of Nakamoto consensus, aptly described
  as wading through molasses, failed to scale.
---
# Failure of Bitcoin consensus to scale
Mining pools, ASICs
# Monero consensus
RandomX prevented ASICs, because RandomX mining is best done on a general purpose CPU,
but as Monero got bigger, it came to pass that payouts got bigger. And because
more and more people were trying to mine a block, each miner's payouts got rarer and rarer.
So miners joined mining pools, so as to get smaller, but more regular and
predictable, payouts more often. Which destroys the decentralization objective
of mining, giving the central authority running the mining pool dangerously great
power over the blockchain.
Monero's workaround for this is P2Pool, a mining pool without centralization.
But not everyone wants to use P2Pool.
Monero has a big problem with people preferring to mine in big pools, because
of the convenience provided by the dangerously powerful central authority.
It is easier for the individual miner to let the center make all the decisions that
matter, but many of these decisions matter to other people, and the center could
make decisions that are contrary to everyone else's interests. Even if the
individual miner is better off than when mining solo, this could well make everyone,
including the individual miner, worse off, because he and everyone else may be
adversely affected by other people's decision to pool mine.

docs/manifesto/mkdocs.sh Normal file → Executable file

docs/mkdocs.sh Normal file → Executable file

docs/names/mkdocs.sh Normal file → Executable file

docs/notes/mkdocs.sh Normal file → Executable file

docs/rootDocs/mkdocs.sh Normal file → Executable file

docs/setup/mkdocs.sh Normal file → Executable file

docs/writing_and_editing_documentation.md

@@ -222,7 +222,7 @@ This allows multiline, but visual studio code does not like it. Visual Studio C
Header      Aligned Aligned         Aligned
----------- ------- --------------- -------------------------
First row   12.0    Example of a row that
                    spans multiple lines.

Second row  5.0     Here's another one. Note
                    the blank line between