From 4678fba3ceac53c60ea3b64999968f67a48df1e4 Mon Sep 17 00:00:00 2001 From: "reaction.la" Date: Fri, 28 Jun 2024 07:52:41 +0000 Subject: [PATCH] modified: docs/estimating_frequencies_from_small_samples.md modified: docs/libraries/cpp_automatic_memory_management.md modified: docs/libraries/time.md modified: docs/manifesto/consensus.md renamed: docs/notes/big_cirle_notation.md -> docs/notes/big_circle_notation.md modified: docs/writing_and_editing_documentation.md --- docs/design/mkdocs.sh | 0 ...timating_frequencies_from_small_samples.md | 38 ++++- .../cpp_automatic_memory_management.md | 152 ++++++------------ docs/libraries/mkdocs.sh | 0 docs/libraries/time.md | 43 +++-- docs/manifesto/consensus.md | 31 +++- docs/manifesto/mkdocs.sh | 0 docs/mkdocs.sh | 0 docs/names/mkdocs.sh | 0 ...rle_notation.md => big_circle_notation.md} | 0 docs/notes/mkdocs.sh | 0 docs/rootDocs/mkdocs.sh | 0 docs/setup/mkdocs.sh | 0 docs/writing_and_editing_documentation.md | 2 +- 14 files changed, 137 insertions(+), 129 deletions(-) mode change 100644 => 100755 docs/design/mkdocs.sh mode change 100644 => 100755 docs/libraries/mkdocs.sh mode change 100644 => 100755 docs/manifesto/mkdocs.sh mode change 100644 => 100755 docs/mkdocs.sh mode change 100644 => 100755 docs/names/mkdocs.sh rename docs/notes/{big_cirle_notation.md => big_circle_notation.md} (100%) mode change 100644 => 100755 docs/notes/mkdocs.sh mode change 100644 => 100755 docs/rootDocs/mkdocs.sh mode change 100644 => 100755 docs/setup/mkdocs.sh diff --git a/docs/design/mkdocs.sh b/docs/design/mkdocs.sh old mode 100644 new mode 100755 diff --git a/docs/estimating_frequencies_from_small_samples.md b/docs/estimating_frequencies_from_small_samples.md index af6dae6..f55918d 100644 --- a/docs/estimating_frequencies_from_small_samples.md +++ b/docs/estimating_frequencies_from_small_samples.md @@ -317,6 +317,25 @@ cases this is not going to be the cases, for a real is usually going to have som finite distribution, and thus it is more 
likely to be OK to use the uncorrected delta distribution +## Haldane prior $α=0$ $β=0$ + +This is a pathological and unusable prior, because $B(0,0)$ is infinite. +We cannot actually do anything with it, but it has the desirable characteristic +that almost none or almost all are highly weighted, giving us much the same result as above. + +It presupposes that we truly know nothing, but one has to start with some ground under one's feet. +One must always start with some leap of faith. + +Since it is unusable, one may take $α=ɛ$, $β=ɛ$ where $ɛ$ is some small quantity; the smaller $ɛ$ is, +the smaller one's initial leap of faith. If $ɛ$ is small, this gives us a reasonable result for "all men are mortal". +Maybe immortals exist, but are enormously rare, and a small number of samples, assuming this prior, +gives us confidence that immortal men are extremely rare, but still predicts only very rare, not nonexistent. + +$ɛ = \frac{1}{2}$ is a prior that generally gives results that are not too silly, +and avoids the complexity of our dual distribution delta function. + +On the other hand, our dual distribution just gives us saner and more commonsensical conclusions. + ## the metalog (metalogistic) distribution The metalogistic distribution is like the Beta distribution in that @@ -326,25 +345,30 @@ as many terms as are required for the nature of the thing being represented. The Beta distribution plus two delta functions is a metalogistic distribution if we stretch the definition of the metalogistic distribution slightly. - The Beta distribution represents the probability of a probability (since we are using it for its Bayesian update capability). 
-For example we have a collection of urns containing red and blue balls, +For example we have a collection of urns containing red and not red balls, and from time to time we draw a ball out of an urn and replace it, whereupon the Beta distribution is our best guess -about the likelihood that it contains a certain ratio of red and blue balls +about the likelihood that it contains a certain ratio of red and not red balls (also assuming the urns are enormously large, -and also always contain at least some red and at least some blue balls) +and also always contain at least some red and at least some not red balls) Suppose, however, the jars contain gravel, the size of each piece of gravel in a jar being normally distributed, and we want to estimate the size and standard deviation of the gravel in an urn, -rather than the ratio of red balls and blue balls. +rather than the ratio of red balls and not red balls. -(Well, the size $s$ cannot be normally distributed, because $s$ is strictly non negative, but perhaps $\ln(s)$, or $s\ln(s)$, or $(s/a -a/s)$ is normally distributed.) +(Well, the size $s$ cannot be normally distributed, because $s$ is strictly non negative, +but perhaps $\ln(s)$, or $s\ln(s)$, or $(s/a -a/s)$ is normally distributed, or +perhaps we should assume a Poisson distribution.) Whereupon our Baysian updates become more complex, and our prior has to contain difficult to justify information (no boulders or dust in the urns), but we are still doing Bayesian updates, -hence the Beta distribution, and its generalization +hence the Beta distribution, or rather its generalization the metalogistic distribution, still applies. + +Typically we want to exclude outliers as evidence, so we assume +a metalogistic distribution that is the sum of two terms, one +narrowly distributed, and one broadly distributed with a long tail. 
diff --git a/docs/libraries/cpp_automatic_memory_management.md b/docs/libraries/cpp_automatic_memory_management.md index df9f41f..c96f32a 100644 --- a/docs/libraries/cpp_automatic_memory_management.md +++ b/docs/libraries/cpp_automatic_memory_management.md @@ -2,7 +2,9 @@ title: C++ Automatic Memory Management ... + # Memory Safety + Modern, mostly memory safe C++, is enforced by:\ - Microsoft safety checker @@ -151,7 +153,7 @@ objects that reference resources, primarily unique_pointer. `std::move(t)` is equivalent to `static_cast<T&&>(t)`, causing move semantics to be generated by the compiler. -`t`, the compiler assumes is converted by your move constructor or move assignment into a valid state where your destructor will not need to anything very costly. +`t`, the compiler assumes, is converted by your move constructor or move assignment into a valid state where your destructor will not need to do anything very costly. `std::forward<T>(t)` causes move semantics to be invoked iff the thing referenced is an rvalue, typically a compiler generated temporary, *conditionally* @@ -348,41 +350,11 @@ use with placement new pmc->~MyClass(); //Explicit call to destructor. aligned_free(p);. -# GSL: Guideline Support Library +# spans -The Guideline Support Library (GSL) contains functions and types that -are suggested for use by the C++ Core Guidelines maintained by the -Standard C++ Foundation. This repo contains [Microsoft’s implementation -of GSL](https://github.com/Microsoft/GSL). +A span is a non owning pointer and count. - git clone https://github.com/Microsoft/GSL.git - cd gsl - git tag - git checkout tags/v2.0.0 - -Which implementation mostly works on gcc/Linux, but is canonical on -Visual Studio. 
- -For usage of spans ([the replacement for bare naked non owning pointers -subject to pointer -arithmetic)](http://codexpert.ro/blog/2016/03/07/guidelines-support-library-review-spant/) - -For usage of string spans ([String -spans](http://codexpert.ro/blog/2016/03/21/guidelines-support-library-review-string_span/) -These are pointers to char arrays. There does not seem to be a UTF‑8 -string_span. - -GSL is a preview of C++20, as boost contained a preview of C++11. - -It is disturbingly lacking in official documentation, perhaps because -still subject to change. - -[Unofficial -documentation](http://modernescpp.com/index.php/c-core-guideline-the-guidelines-support-library) - -It provides an optional fix for C’s memory management problems, while -still retaining backward compatibility to the existing pile of rusty -razor blades and broken glass. +It would perhaps be more useful if we also had owning pointers with counts. # The Curiously Recurring Template Pattern @@ -456,85 +428,54 @@ named functor class. To construct a lambda in the heap: - auto p = new auto([a,b,c](){}) +```c++ +auto p = new auto([a,b,c](){}); +auto q = new auto([](int x) { return x * x; }); +int result = (*q)(5); // Calls the lambda with the argument 5 +delete p; +delete q; +``` -Objects inside the lambda are constructed in the heap. +But if you have pointers to lambdas, `std::function` is more useful than a bare lambda, because then you can declare the pointer type. 
+ + +If you want to declare a pointer type that can point to any lambda with the same signature, or indeed any instance of +a class that supports `operator()`, you can use `std::function`: + +```C++ +#include <functional> + +// Declare a pointer to a std::function holding a lambda that takes an int and returns an int +std::function<int(int)>* lambdaPtr = new std::function<int(int)>([](int x) { return x * x; }); + +// Use the lambda +int result = (*lambdaPtr)(5); + +// Clean up +delete lambdaPtr; +``` + +This way, lambdaPtr is a pointer to a `std::function` that can store any callable object, including a lambda, that takes an int and returns an int. similarly placement `new`, and `unique_ptr`. -To template a function that takes a lambda argument: +## callbacks - template - void myFunction(F&& lambda){ - //some things +In C, a callback is implemented as an ordinary function pointer, and a pointer to void, which is then cast to a data +structure of the desired type. -You can put a lambda in a class using decltype,and pass it around for -continuations, though you would probably need to template the class: +What the heavy C++ machinery of `std::function` does is bind the two together. - templateclass foo { - public: - T func; - foo(T in) :func{ in } {} - auto test(int x) { return func(x); } - }; - .... - auto bar = [](int x)->int {return x + 1; }; - foo<(bar)>foobar(bar); +Sometimes you want to have indefinitely many data structures, which are dynamically allocated +and then discarded. -But we had to introduce a name, bar, so that decltype would have -something to work with, which lambdas are intended to avoid. If we are -going to have to introduce a compile time name, easier to do it as an -old fashioned function, method, or functor, as a method of a class that -is very possibly pod. +Sometimes you want to have a single data structure that gets overwritten frequently. The latter is +preferable when it suffices, since it means that async callback code is more like sync code. 
-If we are sticking a lambda around to be called later, might copy -it by value into a templated class, or might put it on the heap. +In one case, you would allocate the object every time, and when done with it, discard it. - auto bar = []() {return 5;}; - -You can give it to a std::function: - - auto func_bar = std::function(bar); - -In this case, it will get a copy of the value of bar. If bar had -captured anything by value, there would be two copies of those values on -the stack; one in bar, and one in func_bar. - -When the current scope ends, func_bar will be destroyed, followed by -bar, as per the rules of cleaning up stack variables. - -You could just as easily allocate one on the heap: - - auto bar_ptr = std::make_unique(bar); - - std::function increm{[](int arg{return arg+1;}} - -presumably does this behind the scenes - -On reflection we could probably use this method to produce a -templated function that stored a lambda somewhere in a templated class -derived from a virtual base class for execution when the event triggered -by the method fired, and returned a hashcode to the templated object for -the event to use when the event fired. The event gets the event handler -from the hashcode, and the virtual base class in the event handler fires -the lambda in the derived class, and the lambda works as a continuation, -operating in the context wherein it was defined, making event oriented -programming almost as intuitive as procedural programming. - -But then we have a problem, because we would like to store event -handlers in the database, and restore them when program restarts, which -requires pod event handlers, or event handlers constructible from POD -data, which a lambda is not. - -We could always have some event handlers which are inherently not POD -and are never sent to a database, while other event handlers are, but -this violates the dry design principle. 
To do full on functional -programming, use std::function and std::bind, which can encapsulate -lambdas and functors, but are slow because of dynamic allocation - -C++ does not play well with functional programming. Most of the time you -can do what you want with lambdas and functors, using a pod class that -defines `operator(...)` +In the other case it would be a member variable of struct that hangs around and is continually +re-used. # auto and decltype(variable) @@ -553,7 +494,7 @@ procedurally, as by defining a lambda, but by defining a derived type. # Variable length Data Structures -C++ just does not handle them well, except you embed a vector in them, +C++ just does not handle them well, except you embed an `std::vector` in them, which can result in messy reallocations. One way is to drop back into old style C, and tell C++ not to fuck @@ -576,9 +517,6 @@ around. return output; } -Another solution is to work around C++’s inability to handle variable -sized objects by fixing your hash function to handle disconnected data. - # for_each template diff --git a/docs/libraries/mkdocs.sh b/docs/libraries/mkdocs.sh old mode 100644 new mode 100755 diff --git a/docs/libraries/time.md b/docs/libraries/time.md index ccdb671..b63a7a0 100644 --- a/docs/libraries/time.md +++ b/docs/libraries/time.md @@ -2,20 +2,26 @@ title: time sidebar: true notmine: false -... +abstract: >- + We plan to have a network with all parties agreeing on the consensus network time, + peers running on linux systems that claim their tai time is accurate should + stubbornly steer the consensus time to the tai time, but not too stubbornly + because they can have a wrong tai time. +--- -We plan to have a network with all parties agreeing on the consensus network time, -which accommodates leap seconds by being rubbery, -passing at 1001 milliseconds per second, or 999 milliseconds per second, or so, -when a leap second happens. 
+# timekeeping primitives -the C++ library chrono reports the steady time, guaranteed not to jump, +[chrono]:https://en.cppreference.com/w/cpp/chrono {target="_blank"} + +the C++ library [chrono] reports the steady time, guaranteed not to jump, and guaranteed to pass at very very close to one second per actual second, but not guaranteed to have any particular relationship with any other machine, -and also the global official time, with no guarantees -that it is not wildly wrong, and which once in a while jumps by a second. +the global official time, the system time, with no guarantees +that it is not wildly wrong, and which once in a while jumps by a second, +and, on C++20, the tai time, guaranteed to not jump. -To check the global posix time on linux, *and uncertainty in that time* on linux +However there are linux specific calls that report the system time, +the global posix time on linux, *and uncertainty in that time*. ``` c #include @@ -39,6 +45,8 @@ int main() } ``` +# ad hoc consensus time + Those machines that do not have an accurate global time will try to be at the median of all the machines that they have direct connection with while biasing the consensus towards the passage of one second per second, @@ -50,12 +58,14 @@ they will stubbornly drag their heels in moving to the new consensus. Those that do have an accurate global time will try to be nearer to the global time, while remaining inside two thirds of the distribution. If the network -time differs by so from the authoritative time, they will be as -close as they can be to the authoritative time, while remaining inside +time differs by so from the authoritative tai time, they will be as +close as they can be to the authoritative tai time, while remaining inside the majority consensus, thus causing the consensus to drift towards -the authoritative time. +the authoritative tai time. -This describes an ad hoc mechanism for keeping consensus. 
+# Bayesian consensus + +(People agree on a metalog distribution of consensus time) [gamma distribution]:../estimating_frequencies_from_small_samples.html#beta-distribution {target="_blank"} @@ -73,3 +83,10 @@ Or cleverer still and accommodate leap seconds by consensus on both the time and the rate of passage of consensus time relative to steady time by a [delta distribution], in which case we have a three dimensional $α$ and $β$ + +But both distributions suffer from the pathology that outliers will have +a large effect, when they should have little effect. + +So we need a metalog distribution that represents the sum of two +distributions, one narrow, and one wide and long tailed. +So outliers affect primarily the long tailed distribution. diff --git a/docs/manifesto/consensus.md b/docs/manifesto/consensus.md index e52534a..9574f0a 100644 --- a/docs/manifesto/consensus.md +++ b/docs/manifesto/consensus.md @@ -5,4 +5,33 @@ title: >- sidebar: true notmine: false abstract: >- - Consensus is a hard problem, and gets harder when you have shards + consensus is a hard problem, and considerably harder when you have shards. + Nakamoto Satoshi's implementation of Nakamoto consensus, aptly described + as wading through molasses, failed to scale +--- + +# Failure of Bitcoin consensus to scale +Mining pools, asics + +# Monero consensus +RandomX prevented asics, because RandomX mining is best done on a general purpose CPU, +but as Monero got bigger, it came to pass that payouts got bigger. And because +more and more people were trying to mine a block, each miner's wins got rarer and rarer. + +So miners joined in mining pools, so as to get smaller, but more regular and +predictable payouts more often. Which destroys the decentralization objective +of mining, giving the central authority running the mining pool dangerously great +power over the blockchain. + +Monero's workaround for this is P2Pool, a mining pool without centralization. +But not everyone wants to use P2Pool. 
+ +Monero has a big problem with people preferring to mine in big pools, because +of the convenience provided by the dangerously powerful central authority. + +It is easier for the individual miner to let the center make all the decisions that +matter, but many of these decisions matter to other people, and the center could +make decisions that are contrary to everyone else's interests. Even if the +individual miner is better off than mining solo, this could well make everyone +including the individual miner worse off, because he and everyone may be +adversely affected by other people's decision to pool mine. diff --git a/docs/manifesto/mkdocs.sh b/docs/manifesto/mkdocs.sh old mode 100644 new mode 100755 diff --git a/docs/mkdocs.sh b/docs/mkdocs.sh old mode 100644 new mode 100755 diff --git a/docs/names/mkdocs.sh b/docs/names/mkdocs.sh old mode 100644 new mode 100755 diff --git a/docs/notes/big_cirle_notation.md b/docs/notes/big_circle_notation.md similarity index 100% rename from docs/notes/big_cirle_notation.md rename to docs/notes/big_circle_notation.md diff --git a/docs/notes/mkdocs.sh b/docs/notes/mkdocs.sh old mode 100644 new mode 100755 diff --git a/docs/rootDocs/mkdocs.sh b/docs/rootDocs/mkdocs.sh old mode 100644 new mode 100755 diff --git a/docs/setup/mkdocs.sh b/docs/setup/mkdocs.sh old mode 100644 new mode 100755 diff --git a/docs/writing_and_editing_documentation.md b/docs/writing_and_editing_documentation.md index b1ce7b7..fa14a23 100644 --- a/docs/writing_and_editing_documentation.md +++ b/docs/writing_and_editing_documentation.md @@ -222,7 +222,7 @@ This allows multiline, but visual studio code does not like it. Visual Studio C Header Aligned Aligned Aligned ----------- ------- --------------- ------------------------- First row 12.0 Example of a row that - spans multiple lines. + spans multiple lines. Second row 5.0 Here's another one. Note the blank line between