modified:   docs/estimating_frequencies_from_small_samples.md

modified:   docs/libraries/cpp_automatic_memory_management.md
modified:   docs/libraries/time.md
modified:   docs/manifesto/consensus.md
renamed:    docs/notes/big_cirle_notation.md -> docs/notes/big_circle_notation.md
modified:   docs/writing_and_editing_documentation.md
reaction.la 2024-06-28 07:52:41 +00:00
parent 9c5a393a93
commit 4678fba3ce
No known key found for this signature in database
14 changed files with 137 additions and 129 deletions

docs/design/mkdocs.sh Normal file → Executable file

docs/estimating_frequencies_from_small_samples.md

@@ -317,6 +317,25 @@ cases this is not going to be the cases, for a real is usually going to have some
finite distribution, and thus it is more likely to be OK to use the uncorrected
delta distribution
## Haldane prior $α=0$ $β=0$
This is a pathological and unusable prior, because $B(0,0)$ is infinite.
We cannot actually do anything with it, but it has the desirable characteristic
that almost none or almost all are highly weighted, giving us much the same result as above.
It presupposes that we truly know nothing, but one has to start with some ground under one's feet.
One must always start with some leap of faith.
Since it is unusable, one may take $α=ɛ$, $β=ɛ$ where $ɛ$ is some small quantity; the smaller $ɛ$ is,
the smaller one's initial leap of faith. If $ɛ$ is small, this prior gives us a reasonable result for "all men are mortal".
Maybe immortals exist, but are enormously rare; a small number of samples, assuming this prior,
gives us confidence that immortal men are extremely rare, but still predicts very rare, not nonexistent.
$ɛ = \frac{1}{2}$ (the Jeffreys prior) is a prior that generally gives results that are not too silly,
and avoids the complexity of our dual distribution delta function.
On the other hand, our dual distribution just gives us saner and more commonsensical conclusions.
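To make the effect of $ɛ$ concrete, here is the posterior predictive after observing $n$ mortal men and no immortals under a $\text{Beta}(ɛ, ɛ)$ prior (a standard conjugate-prior result, not spelled out in the original text):

$$P(\text{next man mortal} \mid n \text{ mortal}, 0 \text{ immortal}) = \frac{n+ɛ}{n+2ɛ}$$

For $ɛ = \frac{1}{2}$ and $n = 100$ this is $\frac{100.5}{101} \approx 0.995$: immortals are predicted to be very rare, but not nonexistent.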
## the metalog (metalogistic) distribution

The metalogistic distribution is like the Beta distribution in that
@@ -326,25 +345,30 @@ as many terms as are required for the nature of the thing being represented.
The Beta distribution plus two delta functions is a metalogistic distribution
if we stretch the definition of the metalogistic distribution slightly.
The Beta distribution represents the probability of a probability
(since we are using it for its Bayesian update capability).
For example we have a collection of urns containing red and not red balls,
and from time to time we draw a ball out of an urn and replace it,
whereupon the Beta distribution is our best guess
about the likelihood that it contains a certain ratio of red and not red balls
(also assuming the urns are enormously large,
and also always contain at least some red and at least some not red balls).
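A minimal sketch of that urn update, assuming the standard Beta conjugate update for Bernoulli draws (the struct and names are mine, illustrative only):

```c++
#include <iostream>

// A sketch of the urn update: Beta(α, β) is conjugate to Bernoulli draws,
// so each draw just increments a pseudo-count.
struct BetaEstimate {
    double alpha, beta;                 // pseudo-counts of red and not red
    void observe(bool red) { (red ? alpha : beta) += 1.0; }
    double mean() const { return alpha / (alpha + beta); } // E[fraction red]
};

int main() {
    BetaEstimate b{0.5, 0.5};           // ɛ = 1/2 prior
    for (bool draw : {true, true, false, true}) b.observe(draw);
    std::cout << b.mean() << '\n';      // (0.5 + 3) / (1 + 4) = 0.7
}
```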
Suppose, however, the jars contain gravel, the size of each piece
of gravel in a jar being normally distributed, and we want to
estimate the size and standard deviation of the gravel in an urn,
rather than the ratio of red balls and not red balls.
(Well, the size $s$ cannot be normally distributed, because $s$ is strictly non negative,
but perhaps $\ln(s)$, or $s\ln(s)$, or $(s/a -a/s)$ is normally distributed, or
perhaps we should assume a Poisson distribution.)
Whereupon our Bayesian updates become more complex,
and our prior has to contain difficult to justify information
(no boulders or dust in the urns), but we are still doing Bayesian updates,
hence the Beta distribution, or rather its generalization
the metalogistic distribution, still applies.
Typically we want outliers to carry little evidential weight, so we assume
a metalogistic distribution that is the sum of two terms, one
narrowly distributed, and one broadly distributed with a long tail.
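As a sketch of why the two-term mixture discounts outliers (my notation, assuming a simple two-component mixture with weight $w$):

$$f(x) = w\,f_{\text{narrow}}(x) + (1-w)\,f_{\text{wide}}(x), \qquad 0 < w < 1$$

An outlier has negligible likelihood under $f_{\text{narrow}}$, so nearly all of its likelihood comes from $f_{\text{wide}}$, and the Bayesian update moves the narrow component very little.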

docs/libraries/cpp_automatic_memory_management.md

@@ -2,7 +2,9 @@
title:
C++ Automatic Memory Management
...
# Memory Safety
Modern, mostly memory-safe C++ is enforced by:\
- Microsoft safety checker
@@ -151,7 +153,7 @@ objects that reference resources, primarily unique_pointer.
`std::move(t)` is equivalent to `static_cast<decltype(t)&&>(t)`, causing move
semantics to be generated by the compiler.

`t`, the compiler assumes, is converted by your move constructor or move assignment into a valid state where your destructor will not need to do anything very costly.
`std::forward<T>(t)` causes move semantics to be invoked iff the thing referenced
is an rvalue, typically a compiler generated temporary, *conditionally*
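A minimal sketch of the distinction (my example, not from the original text):

```c++
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> sink;

void take(std::string s) {
    sink.push_back(std::move(s)); // s is ours to gut: steal its buffer
}

template <typename T>
void relay(T&& t) {
    take(std::forward<T>(t)); // moves iff the caller passed an rvalue
}

int main() {
    std::string a = "persistent";
    relay(a);                   // lvalue: copied, a remains valid
    relay(std::string("temp")); // rvalue: moved, no copy made
}
```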
@@ -348,41 +350,11 @@ use with placement new
pmc->~MyClass(); //Explicit call to destructor.
aligned_free(p);
# spans

A span is a non-owning pointer and count.

It would perhaps be more useful if we also had owning pointers with counts.
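A minimal sketch of `std::span` (C++20) as that non-owning pointer-and-count view; the example is mine, not the document's:

```c++
#include <cstdio>
#include <span>
#include <vector>

// A span is a non-owning (pointer, count) view over contiguous elements.
int sum(std::span<const int> s) {
    int total = 0;
    for (int x : s) total += x; // bounds come from the stored count
    return total;
}

int main() {
    int arr[] = {1, 2, 3};
    std::vector<int> vec = {4, 5, 6};
    std::printf("%d %d\n", sum(arr), sum(vec)); // both convert implicitly
}
```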
# The Curiously Recurring Template Pattern
@@ -456,85 +428,54 @@ named functor class.

To construct a lambda in the heap:

```c++
auto p = new auto([a,b,c](){});
auto q = new auto([](int x) { return x * x; });
int result = (*q)(5); // Calls the lambda with the argument 5
delete p;
delete q;
```

But if you have pointers to lambdas, std::function is more useful than a lambda, because then you can declare their pointer type.

If you want to declare a pointer type that can point to any lambda with the same signature, or indeed any instance of
a class that supports operator(), you can use std::function:

```C++
#include <functional>
// Declare a pointer to a lambda that takes an int and returns an int
std::function<int(int)>* lambdaPtr = new std::function<int(int)>([](int x) { return x * x; });
// Use the lambda
int result = (*lambdaPtr)(5);
// Clean up
delete lambdaPtr;
```

This way, lambdaPtr is a pointer to a std::function that can store any callable object, including lambdas, that takes an int and returns an int.

similarly placement `new`, and `unique_ptr`.

## callbacks

In C, a callback is implemented as an ordinary function pointer, and a pointer to void, which is then cast to a data
structure of the desired type.

What the heavy C++ machinery of `std::function` does is bind the two together.

Sometimes you want to have indefinitely many data structures, which are dynamically allocated
and then discarded.

Sometimes you want to have a single data structure that gets overwritten frequently. The latter is
preferable when it suffices, since it means that async callback code is more like sync code.

In one case, you would allocate the object every time, and when done with it, discard it.

In the other case it would be a member variable of a struct that hangs around and is continually
re-used.
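A minimal sketch of the two styles side by side (my example, with illustrative names, assuming nothing beyond the standard library):

```c++
#include <cstdio>
#include <functional>

// The C style: a bare function pointer plus a void* context,
// cast back to the real type inside the callback.
struct Counter { int n = 0; };

void on_event_c(void* ctx) { static_cast<Counter*>(ctx)->n += 1; }

int main() {
    Counter c;
    void (*fp)(void*) = on_event_c;
    fp(&c);                        // caller must carry the context separately

    // The C++ style: std::function binds code and context into one object.
    std::function<void()> cb = [&c] { c.n += 1; };
    cb();
    std::printf("%d\n", c.n);      // prints 2
}
```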
# auto and decltype(variable)
@@ -553,7 +494,7 @@ procedurally, as by defining a lambda, but by defining a derived type.
# Variable length Data Structures

C++ just does not handle them well, except if you embed an `std::vector` in them,
which can result in messy reallocations.
One way is to drop back into old style C, and tell C++ not to fuck One way is to drop back into old style C, and tell C++ not to fuck
@@ -576,9 +517,6 @@ around.
return output;
}
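A sketch of that old style C approach (mine, not the document's elided code): the flexible array member is a C99 idiom that most C++ compilers accept as an extension.

```c++
#include <cstdlib>
#include <cstring>

// A variable length structure allocated in one contiguous block.
struct Blob {
    size_t len;
    char data[]; // flexible array member
};

Blob* make_blob(const char* src, size_t len) {
    Blob* b = static_cast<Blob*>(std::malloc(sizeof(Blob) + len));
    b->len = len;
    std::memcpy(b->data, src, len);
    return b; // caller releases with std::free(b)
}
```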
# for_each

template<class InputIterator, class Function>

docs/libraries/mkdocs.sh Normal file → Executable file

docs/libraries/time.md

@@ -2,20 +2,26 @@
title: time
sidebar: true
notmine: false
abstract: >-
  We plan to have a network with all parties agreeing on the consensus network time;
  peers running on linux systems that claim their tai time is accurate should
  stubbornly steer the consensus time towards tai time, but not too stubbornly,
  because their tai time may itself be wrong.
---

# timekeeping primitives

[chrono]:https://en.cppreference.com/w/cpp/chrono {target="_blank"}

the C++ library [chrono] reports the steady time, guaranteed not to jump,
and guaranteed to pass at very very close to one second per actual second,
but not guaranteed to have any particular relationship with any other machine;
the global official time, the system time, with no guarantees
that it is not wildly wrong, and which once in a while jumps by a second;
and, on C++20, the tai time, guaranteed to not jump.

However there are linux specific calls that report the system time,
the global posix time on linux, *and uncertainty in that time*:
``` c
#include <stdio.h>
@@ -39,6 +45,8 @@ int main()
}
```
# ad hoc consensus time
Those machines that do not have an accurate global time will try to be Those machines that do not have an accurate global time will try to be
at the median of all the machines that they have direct connection with at the median of all the machines that they have direct connection with
while biasing the consensus towards the passage of one second per second, while biasing the consensus towards the passage of one second per second,
@@ -50,12 +58,14 @@ they will stubbornly drag their heels in moving to the new consensus.
Those that do have an accurate global time will try to be nearer to the
global time, while remaining inside two thirds of the distribution.
If the network
time differs from the authoritative tai time, they will be as
close as they can be to the authoritative tai time, while remaining inside
the majority consensus, thus causing the consensus to drift towards
the authoritative tai time.
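A sketch of that heel-dragging median rule (my code, not the project's actual protocol; `max_slew` is an illustrative parameter):

```c++
#include <algorithm>
#include <vector>

// Each round, move toward the median of the peer clock offsets, but by no
// more than max_slew, biasing toward the passage of one second per second.
// Assumes at least one peer offset.
double consensus_offset(std::vector<double> peer_offsets, double own_offset,
                        double max_slew) {
    std::sort(peer_offsets.begin(), peer_offsets.end());
    double median = peer_offsets[peer_offsets.size() / 2];
    return own_offset + std::clamp(median - own_offset, -max_slew, max_slew);
}
```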
# Bayesian consensus

(People agree on a metalog distribution of consensus time.)
[gamma distribution]:../estimating_frequencies_from_small_samples.html#beta-distribution
{target="_blank"}
@@ -73,3 +83,10 @@ Or cleverer still and accommodate leap seconds by consensus
on both the time and the rate of passage of consensus time relative
to steady time by a [delta distribution], in which case we have
a three dimensional $α$ and $β$
But both distributions suffer from the pathology that outliers will have
a large effect, when they should have little effect.

So we need a metalog distribution that represents the sum of two
distributions, one narrow, and one wide and long tailed,
so that outliers affect primarily the long tailed distribution.
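In mixture terms, the fraction of an observation's evidential weight that lands on the narrow component is the standard posterior responsibility (my notation, not the document's):

$$w_{\text{narrow}}(x) = \frac{w\,f_{\text{narrow}}(x)}{w\,f_{\text{narrow}}(x) + (1-w)\,f_{\text{wide}}(x)}$$

For an outlier, $f_{\text{narrow}}(x) \approx 0$, so almost all of its weight falls on the wide, long tailed component, and the consensus estimate barely moves.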

docs/manifesto/consensus.md

@@ -5,4 +5,33 @@ title: >-
sidebar: true
notmine: false
abstract: >-
  consensus is a hard problem, and considerably harder when you have shards.
  Satoshi Nakamoto's implementation of Nakamoto consensus, aptly described
  as wading through molasses, failed to scale.
---
# Failure of Bitcoin consensus to scale
Mining pools, ASICs
# Monero consensus
RandomX prevented ASICs, because RandomX mining is best done on a general purpose CPU,
but as Monero got bigger, it came to pass that payouts got bigger, and because
more and more people were trying to mine a block, any one miner's wins got rarer and rarer.

So miners joined mining pools, so as to get smaller but more regular and
predictable payouts. Which destroys the decentralization objective
of mining, giving the central authority running the mining pool dangerously great
power over the blockchain.

Monero's workaround for this is P2Pool, a mining pool without centralization.
But not everyone wants to use P2Pool.

Monero has a big problem with people preferring to mine in big pools, because
of the convenience provided by the dangerously powerful central authority.
It is easier for the individual miner to let the center make all the decisions that
matter, but many of these decisions matter to other people, and the center could
make decisions that are contrary to everyone else's interests. Even if the
individual miner is better off than he would be mining solo, this could well make everyone,
including the individual miner, worse off, because he and everyone may be
adversely affected by other people's decision to pool mine.

docs/manifesto/mkdocs.sh Normal file → Executable file

docs/mkdocs.sh Normal file → Executable file

docs/names/mkdocs.sh Normal file → Executable file

docs/notes/mkdocs.sh Normal file → Executable file

docs/rootDocs/mkdocs.sh Normal file → Executable file

docs/setup/mkdocs.sh Normal file → Executable file

docs/writing_and_editing_documentation.md

@@ -222,7 +222,7 @@ This allows multiline, but visual studio code does not like it. Visual Studio C
Header        Aligned   Aligned          Aligned
-----------   -------   ---------------  -------------------------
First row     12.0      Example of a row that
                        spans multiple lines.

Second row    5.0       Here's another one. Note
                        the blank line between