Git submodules leak complexity and surprising, inconvenient behavior
all over the place if one is trying to make a change that affects multiple
modules simultaneously. But having your libraries separate from your git
repository results in non-portable surprises and complexity. It makes it hard
for anyone else to build your project, because they will have to tell your
project, by hand, where the libraries are on their system.
You need an enormous pile of source code, the work of many people over
a very long time, and GitSubmodules allows this to scale, because the
local great big pile of source code references many independent and
sovereign repositories in the cloud. If you have one enormous pile of
source code in one enormous git repository, things get very very slow. If
you rely on someone else's compiled code, things break and you get
accidental and deliberate backdoors, which is a big concern when you are
doing money and cryptography.
GitSubmodules is hierarchical, but source code has strange loops. The Bob
module uses the Alice module and the Carol module, but Alice uses Bob
and Carol, and Carol uses Alice and Bob. How do you make sure that all
your modules are using the same commit of Alice?
Well, if modules have strange loops you make one of them the master, and
the rest of them direct submodules of that master, brother subs to each
other, and they are all using the same commit of Alice as the master. And
you should try to write or modify the source code so that they all call their
brother submodules through the one parent module above them in the
hierarchy. They should use the source code of their brothers through the
source code of their master, rather than directly incorporating the header
files of their brothers at compile time. The header file of the master
that they include may well include the header of their brother, so that they
are indirectly, through the master header file, including the brother header
file.
# Git subtrees
Git subtrees are an alternative to submodules, and many people
recommend them because they do not break the git model the way
submodules do.
But subtrees do not scale. If you have an enormous pile of stuff in your
repository, Git has to check every file to see if it has changed every time,
which rather rapidly becomes painfully slow if one is incorporating a lot
of projects reflecting a lot of work by a lot of people. GitSubmodules
means you can incorporate unlimited amounts of stuff, and Git only has to
check the particular module that you are actually working on.
Maybe subtrees would work better if one was working on a project where
several parts were being developed at once, thus a project small enough
that scaling is not an issue. But such projects, if successful, grow into
projects where scaling is an issue. And if you are a pure consumer of a
library, you don't care that you are breaking the git model, because you are
seldom making synchronized changes in module and submodule.
The submodule model works fine, provided the divisions between one
submodule and the next are such that one is only likely to make changes in
one module at a time.
# Passphrases
All wallets now use random words - but you cannot carry an eighteen word random phrase through an airport in your head.
We should use [grammatically correct passphrases](https://github.com/lungj/passphrase_generator).
Using those dictionaries, the phrase (adjective noun adverb verb adjective
noun) can encode sixty eight bits of entropy. Two such phrases suffice,
being stronger than the underlying elliptic curve. With password
strengthening, we can randomly leave out one of the adjectives or adverbs
from one of the passphrases.
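The entropy claim can be checked by adding up $\log_2$ of the dictionary size for each slot in the phrase template. The sketch below does just that. The function name is mine, and any dictionary sizes you feed it are your own assumption, I have not counted the words in the linked generator's dictionaries. Two sixty eight bit phrases give 136 bits, which exceeds the roughly 128 bit security level of a 256 bit elliptic curve (assuming that is the curve size in use).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Entropy in bits of a phrase built by picking one word uniformly at random
// from each slot's dictionary: the sum of log2(dictionary size) over the slots.
// slot_sizes holds the number of words available for each slot, in order
// (for example adjective, noun, adverb, verb, adjective, noun).
double phrase_entropy_bits(const std::vector<unsigned long long>& slot_sizes) {
    double bits = 0.0;
    for (unsigned long long n : slot_sizes) {
        assert(n > 0);  // an empty dictionary makes no sense
        bits += std::log2(static_cast<double>(n));
    }
    return bits;
}
```

Leaving out a slot simply drops that slot's term from the sum, which is why password strengthening is needed to make up the difference.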
# Polkadot, Substrate and Gitcoin
It has become painfully apparent that building a blockchain is a very large project.
Polkadot is a blockchain ecosystem, and Substrate a family of libraries for
constructing blockchains. It is a lot easier to refactor an existing
blockchain than to start entirely from scratch.
Polkadot is designed to make its ecosystem subordinate to the primary
blockchain, which I do not want - but it also connects its ecosystem to
bitcoin by De-Fi (or promises to do so, I don't know how well it works) so
accepting that subordination is a liquidity event. We can fix things so
that the tail will wag the dog once the tail gets big enough, as China licensed
from ARM, then formed a joint venture with ARM, then hijacked the joint
venture, once it felt it no longer needed to keep buying the latest ARM
intellectual property. Licensing was a fully subordinate relationship, the
joint venture was cooperation between unequal parties, and now ARM
China is a fully independent and competing technology, based on the old
ARM technology, but advancing it separately, independently, and in its
own direction. China forked the ARM architecture.
Accepting a fully subordinate relationship to get connected, and then
defecting on subordination when strong enough, is a sound strategy.
[Gitcoin]:https://gitcoin.co/
"Build and Fund the Open Web Together"
And talking about connections: [Gitcoin]
Gitcoin promises connection to money, and connection to a community of
open source developers. It is Polkadot's money funnel from VCs to
developers. The amount of cash in play is rather meagre, but it provides a
link to the real money, which is ICOs.
I suspect that its git hosting has been co-opted by the enemy, but that is
OK, provided our primary repo is not co-opted by the enemy.
# Installers
Wine to run Windows 10 software under Linux is a bad idea, and
Windows Subsystem for Linux to run Linux software under Windows 10
is a much worse idea – it is the usual “embrace and extend” evil plot by
Microsoft against open source software, considerably less competently
executed than in the past.
## The standard gnu installer
```bash
./configure && make && make install
```
## The standard windows installer
Wix creating an `*.msi` file.
Which `*.msi` file can be wrapped in an executable, but there is no sane
reason for this and you are likely to wind up with installs that consist of an
executable that wraps an msi that wraps an executable that wraps an msi.
To build an `*.msi`, you need to download the Wix toolset, which is referenced in the relevant Visual Studio extensions, but cannot be downloaded from within the Visual Studio extension manager.
The Wix Toolset, however, requires the .NET Framework in order to install it
and use it, which is the cobbler’s children going barefoot. You want a
banana, and have to install a banana tree, a monkey, and a jungle.
There is a [good web page](https://stackoverflow.com/questions/1042566/how-can-i-create-an-msi-setup) on WIX resources.
There is an automatic wix setup: Visual Studio-> Tools-> Extensions&updates ->search Visual Studio Installer Projects
Which is the Microsoft utility for building wix files. It creates a quite adequate wix setup by gui, in the spirit of the skeleton windows gui app.
People who know what they are doing seem to use this install system, and they
write nice installs with it.
To build the setup program:
1. Build both x64 and Win32 Release configs
1. When you construct wallet.nsi in nullsoft, add it to your project.
1. When building a deliverable, right click on the WalletSetup.nsi file in the Visual Studio project and select Properties.
1. Set Excluded from Build to No
1. OK Properties
1. Right click .nsi file again and choose Compile.
1. Set the .nsi file properties back to Excluded from Build.
This manual building of the setup is due to the fact that we need both x64
and Win32 exes for the setup program and Visual Studio doesn’t provide a
way to do this easily.
# Package managers
Lately, however, package managers have appeared: Conan and [vcPkg](https://blog.kitware.com/vcpkg-a-tool-to-build-open-source-libraries-on-windows/). Conan lacks wxWidgets, and has far fewer packages than [vcpkg](https://libraries.io/github/Microsoft/vcpkg).
I have attempted to use package managers, and not found them very useful. It
is easier to deal with each package as its own unique special case. The
uniform abstraction that a package manager attempts to provide invariably
leaks badly, while piling cruft on top of the library. Rather than
simplifying library use, it piles its own idiosyncratic complexification on top
of the complexities of the library, often inducing multiplicative complexity,
as one attempts to deal with the irregularities and particulars of a
particular library through a package manager that is unaware of and incapable
of dealing with the particularity of that particular package, and is
unshakeably convinced that the library is organized in a way that is different
from the way it is in fact organized.
# Multiprecision Arithmetic
I will need multiprecision arithmetic if I represent information in a base or
dictionary that is not a power of two.
[MPIR]:http://mpir.org/
{target="_blank"}
[GMP]:https://gmplib.org
{target="_blank"}
The best libraries are [GMP] for Linux and
[MPIR] for windows. These are reasonably
compatible, and generally only require very trivial changes to produce a Linux
version and a windows version. Boost attempts to make the changes invisible,
but adds needless complexity and overhead in doing so, and obstructs control.
MPIR has a Visual Studio repository on Github, and a separate Linux repository
on Github. GMP builds on a lot of obscure platforms, but is not really supported
on Windows.
For supporting Windows and Linux only, MPIR all the way is the way to go. For
compatibility with little used and obscure environments, you might want to
have your own custom thin layer that maps GMP integers and MPIR integers to
your integers, but that can wait till we have conquered the world.
My most immediate need for MPIR is the extended Euclidean algorithm
for modular multiplicative inverse, which it, of course, supports,
`mpz_gcdext`, greatest common divisor extended, but which is deeply
hidden in the [documentation](http://www.mpir.org/mpir-3.0.0.pdf).
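For reference, `mpz_gcdext(g, s, t, a, b)` sets `g` to the greatest common divisor of `a` and `b`, and `s` and `t` to coefficients satisfying $a s + b t = g$. When $g = 1$, `s` reduced modulo `b` is the multiplicative inverse of `a`. The sketch below shows the same algorithm on 64 bit machine integers rather than `mpz_t`, so that it compiles without MPIR. The names `gcd_ext` and `mod_inverse` are mine, not library functions, and the intermediate products can overflow for very large inputs, which is exactly why the real thing wants multiprecision.

```cpp
#include <cassert>
#include <cstdint>

struct GcdExt {
    std::int64_t g;  // gcd(a, b)
    std::int64_t s;  // coefficient of a
    std::int64_t t;  // coefficient of b, so that a*s + b*t == g
};

// Iterative extended Euclidean algorithm for non-negative a and b.
GcdExt gcd_ext(std::int64_t a, std::int64_t b) {
    std::int64_t old_r = a, r = b;
    std::int64_t old_s = 1, s = 0;
    std::int64_t old_t = 0, t = 1;
    while (r != 0) {
        const std::int64_t q = old_r / r;
        std::int64_t tmp;
        tmp = old_r - q * r; old_r = r; r = tmp;
        tmp = old_s - q * s; old_s = s; s = tmp;
        tmp = old_t - q * t; old_t = t; t = tmp;
    }
    return {old_r, old_s, old_t};
}

// Multiplicative inverse of a modulo m, in the range [0, m),
// or -1 if a and m are not coprime and no inverse exists.
std::int64_t mod_inverse(std::int64_t a, std::int64_t m) {
    assert(m > 1);
    const std::int64_t reduced = ((a % m) + m) % m;  // handle negative a
    const GcdExt e = gcd_ext(reduced, m);
    if (e.g != 1) return -1;
    return ((e.s % m) + m) % m;
}
```

For example, `mod_inverse(3, 7)` is 5, since $3 \times 5 = 15 \equiv 1 \pmod 7$.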
A bitmessage client written in C. Designed to run on a linux mail server
and interface bitmessage to mail. Has no UI, intended to be used with the linux mail UI.
Unfortunately, setting up a linux mail server is a pain in the ass. Needs the Zooko UI.
But its library contains everything you need to share data around a group of people, many of them behind NATs.
Does not implement NAT penetration. Participants behind a NAT are second class unless they implement port forwarding, but participants with unstable IPs are not second class.
A reliable udp library with congestion control which has vastly more development work done on it than any other reliable udp networking library, but which is largely used to work with Steam gaming, and Steam's closed source code. Has no end of hooks to closed source built into it, but works fine without those hooks.
Written in C++. Architecture overly specific and married to Steam. Would
have to be married to Tokio to have massive concurrency. But you don't
need to support hundreds of clients right away.
Well, perhaps I do, because in the face of DDOS attack, you need to keep
a lot of long lived inactive connections around for a long time, any of
which could receive a packet at any time. I need to look at the
GameNetworkingSockets code and see how it listens on lots and lots of
sockets. If it uses [overlapped IO], then it is golden. Get it up first, and put it inside a service later.
on the other hand, is elegant, short, and self explanatory.
The code project has [example code written in C++](https://www.codeproject.com/Articles/13071/Programming-Windows-TCP-Sockets-in-C-for-the-Begin), but it is still mighty intimidating compared to the QT client server example. I have yet to look at the wxWidgets client server examples – but looking for wxWidgets networking code has me worried that it is a casual afterthought, not adequately supported or adequately used.
ZeroMQ is Linux, C, and Cish C++.
Boost Asio is highly praised, but I tried it, and concluded its architecture
is broken, trying to make simplicity and elegance where it cannot be made,
resulting in leaky abstractions which leak incomprehensible complexity the
moment you stray off the beaten path – I feel they have lost control of their
design, and are just throwing crap at it trying to make something that
cannot work, work. I similarly found the Boost time libraries failed, leaking
complexity that they tried to hide, with the hiding merely adding complexity.
[cpp-httplib](https://github.com/yhirose/cpp-httplib) is wonderful in its
elegance, simplicity, and ease of integration. You just include a single
header. Unfortunately, it is strictly http/https, and we need something that
can deal with the inherently messy lower levels.
[Poco](http://pocoproject.org/) does everything, and is C++, but hey, let us first see how far we can get with wxWidgets.
Further, the main reason for doing https is integration with the existing
browser web ecosystem, whose security is fundamentally broken, due to the
state’s capacity to seize names, and the capacity of lots of entities to
intercept ssl. It might well be easier to fork opera or embed chromium. I
notice that Chromium has features supporting payment built into it, a bunch
of “PaymentMethod\*\*\*\*\*Event”.
The best open source browser, and best privacy browser, is Opera, in that it comes from an entity less evil than Google.
[Opera](https://bit.ly/2UpSTFy) needs to be configured with [a bunch of privacy add ons](https://gab.com/PatriotKracker80/posts/c3kvL3pBbE54NEFaRGVhK1ZiWCsxZz09) [HTTPS Everywhere Add-on](https://bit.ly/2ODbPeE),
[uBlock](https://bit.ly/2nUJLqd), [DisconnectMe](https://bit.ly/2HXEEks), [Privacy-Badger](https://bit.ly/2K5d7R1), [AdBlock Plus](https://bit.ly/2U81ddo), [AdBlock for YouTube](https://bit.ly/2YBzqRh), two tracker blockers, and three ad blockers.
It would be great if we could make our software another addon, possibly chatting by websocket to the wallet.
The way it would work would be to add another protocol to the browser:
ro://name1.name2.name3/directory/directory/endpoint. When you connect to such
an endpoint, your wallet, possibly a wallet with no global name, connects to
the named wallet, and gets an IP address, a port, a virtual server name, a cookie
unique for your wallet, and the hash of the valid ssl certificate for that
name, and then the browser makes a connection to that server, ignoring
the CA system and the DNS system. The name could be a DNS name and the
certificate a CA certificate, in which case the connection looks to the
server like any other, except for the cookie which enables it to send
messages, typically a payment request, to the wallet.
We could implement transaction outputs and inputs as a fixed amount of
fungible tokens, limited to $2^{64}-1$ tokens, using [Safeint]. That will be
future proof for a long time, but not forever.
Indeed, anything that does not use Zksnarks is not future proof for the
indefinite future.
Or we could implement decimal floating point with unlimited exponents
and mantissa implemented on top of [MPIR]
Or we could go ahead with the canonical representation being unlimited
decimal exponent and unlimited mantissa, but the wallet initially only
generates, and only can handle, transactions that can be represented by [Safeint], and always converts the mantissa plus decimal exponent to and
from a safeint.
If we rely on safeint, and our smallest unit is the microrho, that is room for
eighteen trillion rho. We can start actually using the unlimited precision of
the exponent and the mantissa in times to come - not urgent, merely
architect it into the canonical format.
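The eighteen trillion figure is just $(2^{64}-1)/10^6 \approx 1.8 \times 10^{13}$. A minimal sketch of what overflow checked amounts in microrho look like, written with plain `std::uint64_t` rather than the SafeInt library so that it stands alone. All the names here (`kMicrorhoPerRho`, `checked_add`, `rho_to_microrho`) are hypothetical, made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumed smallest unit: one microrho, so 10^6 units per rho.
constexpr std::uint64_t kMicrorhoPerRho = 1000000;
constexpr std::uint64_t kMaxMicrorho = std::numeric_limits<std::uint64_t>::max();

// Largest whole number of rho representable in 64 bits of microrho.
constexpr std::uint64_t max_whole_rho() {
    return kMaxMicrorho / kMicrorhoPerRho;
}

// Adds two amounts in microrho. Returns false, leaving sum untouched,
// if the result would not fit in 64 bits.
bool checked_add(std::uint64_t a, std::uint64_t b, std::uint64_t& sum) {
    if (a > kMaxMicrorho - b) return false;
    sum = a + b;
    return true;
}

// Converts whole rho to microrho, refusing amounts that would overflow.
bool rho_to_microrho(std::uint64_t rho, std::uint64_t& out) {
    if (rho > max_whole_rho()) return false;
    out = rho * kMicrorhoPerRho;
    return true;
}
```

`max_whole_rho()` comes to 18446744073709, a little over eighteen trillion.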
From the point of view of the end user, this will merely be an upgrade that
allows nanorho, picorho, femtorho, attorho, zeptorho, yoctorho, and allows a decimal point in yoctorho quantities. And then we go to a new unit, the jim, with one thousand yottajim equals one yoctorho, a billion yottajim equals one attorho, a trillion exajim equals one zeptorho.
To go all the way around to two byte exponents, for testing purposes, will
need some additional new units after the jim. (And we should impose a
minimum unit size of $10^{-195}$ rho or $10^{-6}$ rho, thereby ensuring
that transaction size is bounded while allowing compatibility for future expansion.)
Except in test and development code, any attempt to form a transaction
involving quantities with exponents less than $1000^{-2}$ will cause a
gracefully handled exception, and in all code any attempt to display
or perform calculations on transaction inputs and outputs for which no
display units exist will cause an ungracefully handled exception.
In the first release configuration parameters, the lowest allowed exponent
will be $1000^{-2}$, corresponding to microrho, and the highest allowed
exponent $1000^4$, corresponding to terarho, and machines will be
programmed to vote "incapable" and "no" on any proposal to change those
parameters. However they will correctly handle transactions beyond those
limits provided that when quantities are expressed in the smallest unit of
any of the inputs and outputs, the sum of all the inputs and of all the
outputs remains below $2^{64}$. To ensure that all releases are future
compatible, the blockchain should have some exajim transactions, and
unspent transaction outputs but the peers should refuse to form any more
of them. The documentation will say that arbitrarily small and large new
transaction outputs used to be allowed, but are currently not allowed, to
reduce the user interface attack surface that needs to be security checked
and to limit blockchain bloat, and since there is unlikely to be demand for
this, this will probably not be fixed for a very long time.
Or perhaps it would be less work to support humungous transactions from
the beginning, subject to some mighty large arbitrary limit to prevent
denial of service attack, and eventually implementing native integer
handling of normal sized transactions as an optimization, for transactions where all quantities fit within machine sized words, and rescaled intermediate outputs will be less than $64 - \lceil \log_2 n \rceil$ bits, where $n$ is the number of inputs and outputs.
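The reason for the bit bound is that if there are $n$ quantities and each is below $2^{64-k}$ with $k = \lceil \log_2 n \rceil$, then their sum is below $n \cdot 2^{64-k} \le 2^{64}$, so native addition cannot overflow. A sketch of that test, assuming the amounts have already been rescaled to the smallest unit appearing in the transaction. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// ceil(log2(n)) for n >= 1, computed without floating point.
unsigned ceil_log2(std::uint64_t n) {
    assert(n >= 1);
    unsigned k = 0;
    while (k < 64 && (std::uint64_t{1} << k) < n) ++k;
    return k;
}

// True if every amount fits in 64 - ceil(log2(count)) bits, which
// guarantees that the sum of all the amounts fits in 64 bits.
bool fits_native(const std::vector<std::uint64_t>& amounts) {
    if (amounts.empty()) return true;
    const unsigned bits = 64 - ceil_log2(amounts.size());
    for (std::uint64_t a : amounts) {
        // Shifting by 64 is undefined, and with 64 bits allowed anything fits.
        if (bits < 64 && (a >> bits) != 0) return false;
    }
    return true;
}
```

Transactions that fail this test would fall back to the multiprecision path.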
Which leads me to digress on how we are going to handle protocol updates:
## handling protocol updates
1. Distribute software capable of handling the update.
1. A proposed protocol update transaction is placed on the blockchain.
1. Peers indicate capability to handle the protocol update. Or ignore it,
My experience with Boost is that it is no damned good: They have an over
elaborate pile of stuff on top of the underlying abstractions, which pile has high runtime cost, and specializes the underlying stuff in ways that only
work with boost example programs and are not easily generalized to do what
one actually wishes done.
Their abstractions leak.
[Boost high precision arithmetic `gmp_int`]:https://gmplib.org/
[Boost high precision arithmetic `gmp_int`] A messy pile built on top of
GMP. Its primary benefit is that it makes `gmp` look like `mpir`. It is easier to use [MPIR] directly.
The major benefit of boost `gmp` is that it runs on some machines and
operating systems that `mpir` does not, and is for the most part source code
compatible with `mpir`.
A major difference is that boost `gmp` uses long integers, which are on sixty
four bit windows `int32_t`, where `mpir` uses `mpir_ui` and `mpir_si`, which are
on sixty four bit windows `uint64_t` and `int64_t`. This is apt to induce no
end of major porting issues between operating systems.
Boost `gmp` code running on windows is apt to produce radically different
results to the same boost `gmp` code running on linux. Long `int` is just not
portable, and should never be used. This kind of issue is absolutely typical
of boost.
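The underlying problem is that `long` is thirty two bits on sixty four bit Windows and sixty four bits on sixty four bit Linux, so the same source silently truncates on one and not the other. A small sketch that makes the difference visible, and refuses to narrow silently. The function names are mine, for illustration only. The fix is to use the fixed width types from `<cstdint>` at every interface.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

// Width of `long` on the platform this is compiled for:
// 32 on 64-bit Windows, 64 on 64-bit Linux.
constexpr unsigned long_bits() {
    return static_cast<unsigned>(sizeof(long) * CHAR_BIT);
}

// True if v can be stored in a `long` on this platform without loss.
bool fits_in_long(std::int64_t v) {
    return v >= std::numeric_limits<long>::min() &&
           v <= std::numeric_limits<long>::max();
}

// Narrows to `long`, asserting in debug builds rather than truncating silently.
long narrow_to_long(std::int64_t v) {
    assert(fits_in_long(v));
    return static_cast<long>(v);
}
```

A value such as $2^{40}$ passes `fits_in_long` on Linux and fails it on Windows, which is the porting issue in miniature.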
In addition to the portability issue, it is also a typical example of boost
abstractions denying you access to the full capability of the thing being
abstracted away. It is silly to have a thirty two bit interface between sixty
four bit hardware and unlimited arithmetic precision software.
[QUIC] is UDP with flow control, reliability, and SSL/TLS encryption, but no
DDoS resistance, and total insecurity against CA attack.
## Boost Asynch
Boost implements event oriented multithreading in `io_service`, but I don’t like
it because it fails to interface with Microsoft’s implementation of asynch
internet protocol, WSAAsync, and WSAEvent. Also because it is brittle and
incomprehensible, and its example programs do not easily generalize to
anything other than that particular example.
To the extent that you need to interact with a database, you need to process
connections from clients in many concurrent threads. Connection handlers are
run in the thread that called `io_service::run()`.
You can create a pool of threads processing connection handlers (and waiting
for finalizing database connection), by running `io_service::run()` from
multiple threads. See Boost.Asio docs.
## Asynch Database access
MySQL 5.7 supports the [X Plugin / X Protocol](https://dev.mysql.com/doc/refman/5.7/en/document-store-setting-up.html), which allows asynchronous query execution and NoSQL. But the X DevAPI was created to support node.js and stuff. The basic idea is that you send text messages to mysql on a certain port, and asynchronously get text messages back, in google protobufs, in php, JavaScript, or sql. No one has bothered to create a C++ wrapper for this, it being primarily designed for php or node.js.
SQLite nominally has synchronous access, and the use of one read/write
thread, many read threads is recommended. But under the hood, if you enable
WAL mode, access is asynchronous. The nominal synchrony sometimes leaks into
the underlying asynchrony.
By default, each `INSERT` is its own transaction, and transactions are
excruciatingly slow. WAL mode with `synchronous=NORMAL` fixes this. All writes
go to the write-ahead log file, which gets cleaned up later.
The authors of SQLite recommend against multithreading writes, but we
do not want the network waiting on the disk, nor the disk waiting on the
network, therefore, one thread with asynch for the network, one purely
synchronous thread for the SQLite database, and a few number crunching
threads for encryption, decryption, and hashing. This implies shared
[Libpcap and Win10PCap](https://en.wikipedia.org/wiki/Pcap#Wrapper_libraries_for_libpcap) provide very low level, OS independent, access to packets, OS independent because they are below the OS, rather than above it. [Example code for visual studio.](https://www.csie.nuk.edu.tw/~wuch/course/csc521/lab/ex1-winpcap/)
[Simple sequential procedural socket programming for windows sockets.](https://www.binarytides.com/winsock-socket-programming-tutorial/)
If I program from the base upwards, the bottom most level would be a single
thread sitting on a select statement. Whenever the select fired, it would
execute a corresponding functor transferring data between userspace and system
space.
One thread, and only one thread, responsible for timer events and
transferring network data between userspace and systemspace.
If further work is required in userspace that could take significant time (disk
operations, database operations, cryptographic operations) that functor under
that thread would stuff another functor into a waitless stack, and a bunch
of threads would be waiting for that waitless stack to be signaled, and one
of those other threads would execute that functor.
The reason we have a single userspace thread handling the select and transfers
between userspace and systemspace is that that is a very fast and very common
operation, and we don’t want to have unnecessary thread switches, wherein
one thread does something, then immediately afterwards another thread does
almost the same thing. All quickie tasks should be handled sequentially by
one thread that works a state machine of functors.
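The hand off from the select thread to the worker threads can be sketched as follows. Note that this is not the waitless stack described above: for brevity it uses an ordinary queue guarded by a mutex and a condition variable, which is the simplest thing that shows the shape of the design. The class name `WorkerPool` and its `post` method are hypothetical. The select thread would call `post` with a functor for any slow job, and return at once to its state machine.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class WorkerPool {
public:
    explicit WorkerPool(unsigned thread_count) {
        assert(thread_count > 0);
        for (unsigned i = 0; i < thread_count; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Lets the workers drain whatever is still queued, then joins them.
    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& w : workers_) w.join();
    }

    // Called by the select thread: queue a slow job and return immediately.
    void post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    // Body of each worker thread: sleep until signaled, run one job, repeat.
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping, and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock so other workers can proceed
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

Swapping the guarded queue for a lock free structure changes the insides of `post` and `run`, but not the interface the select thread sees.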
The way to do asynch is to wrap sockets in classes that reflect the intended
use and function of the socket. Call each instance of such a class a
connection. Each connection has its own state machine state and its own
Using wxSockets commits us to having a single thread managing everything. To
get around the power limit inherent in that, have multiple peers under
multiple names accessing the same database, and have a temporary and
permanent redirect facility – so that if you access `peername`, your
connection, and possibly your link, get rewritten to `p2.peername` by peers
trying to balance load.
Microsoft tells us:
> receiving, applications use the WSARecv or WSARecvFrom functions to supply
buffers into which data is to be received. If one or more buffers are posted
prior to the time when data has been received by the network, that data could
be placed in the user’s buffers immediately as it arrives. Thus, it can
avoid the copy operation that would otherwise occur at the time the recv or
recvfrom function is invoked.
Moral is, we should use the sockets that wrap WSA.
# Tcl
Tcl is a really great language, and I wish it would become the language of my new web, as JavaScript is the language of the existing web.
But it has been semi abandoned for twenty years.
It consists of a string (which is implemented under the hood as a copy on
write rope, with some substrings of the rope actually being run time typed
C++ types that can be serialized and deserialized to strings) and a name
table, one name table per interpreter, and at least one interpreter per
thread. The entries in the name table can be strings, C++ functions, or run
time typed C++ types, which may or may not be serializable or deserializable,
but conceptually, it is all one big string, and the name table is used to
find C and C++ functions which interpret the string following the command.
Execution consists of executing commands found in the string, which transform
it into a new string, which in turn gets transformed into a new string,
until it gets transformed into the final result. All code is metacode. If
elements of the string need to be deserialized to and from a C++ run time
type, (because the command does not expect that run time type) but cannot be,
because there is no deserialization for that run time type, you get a run
time error, but most of the time you get, under the hood, C++ code executing
C++ types – it is only conceptually a string being continually transformed
into another string. The default integer is infinite precision, because
integers are conceptually arbitrary length strings of numbers.
To sandbox third party code, including third party gui code, just restrict
the nametable to have no dangerous commands, and to be unable to load C++
modules that could provide dangerous commands.
It is faster to bring up a UI in Tcl than in C. We get, for free, OS
independence.
Tcl used to be the best high level language for attaching C programs to, and for
testing C programs, or it would be if SWIG actually worked. The various C
components of Tcl provide an OS independent layer on top of both Linux and
Windows, and it has the best multithread and asynch system.
It is also a metaprogramming language. Every Tcl program is a metaprogram – you always write code that writes code.
The Gui is necessarily implemented as asynch, something like the JavaScript
dom in html, but with explicit calls to the event/idle loop. Multithreading
is implemented as multiple interpreters, at least one interpreter per thread,
sending messages to each other.
# Time
After spending far too much time on this issue, which has sucked in far
too many engineers and far too much thought, and generated far too many
libraries, I found the solution was C++11 `std::chrono`: For short durations, we
use the steady time in milliseconds, where each machine has its own
epoch, and no two machines have exactly the same milliseconds. For
longer durations, we use the system time in seconds, where all machines
are expected to be within a couple of seconds of each other. For the human
readable system time in seconds to be displayed on a particular machine,
we use the ISO format 2012‑01‑14_15:39:34+10:00 (timezone with 10
hour offset equivalent to Greenwich time 2012‑01‑14_05:39:34+00:00)
[For long durations, we use signed system time in seconds, for short durations unsigned steady time in milliseconds.](./libraries/rotime.cpp)
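A minimal sketch of the two clocks using only `std::chrono`. This is not the contents of `rotime.cpp`, and the function names are assumptions for illustration. It also assumes `system_clock` counts from the Unix epoch, which holds on mainstream implementations but is only guaranteed by the standard from C++20.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Unsigned steady time in milliseconds. Monotonic, never jumps backwards,
// but the epoch is specific to this machine, so only differences between
// two readings taken on the same machine mean anything.
std::uint64_t steady_ms() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count());
}

// Signed system (wall clock) time in seconds since the system_clock epoch.
// Comparable between machines to within their clock error.
std::int64_t system_seconds() {
    using namespace std::chrono;
    return static_cast<std::int64_t>(
        duration_cast<seconds>(system_clock::now().time_since_epoch()).count());
}

// Elapsed milliseconds between two steady_ms() readings on the same machine.
std::uint64_t elapsed_ms(std::uint64_t earlier, std::uint64_t later) {
    assert(later >= earlier);  // steady time never runs backwards
    return later - earlier;
}
```

Timeouts and retransmit intervals would use `steady_ms()`, while anything stored or sent to another machine would use `system_seconds()`.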
Windows and Unix both use time in seconds, but accessed and manipulated in
incompatible ways.
Boost has numerous different and not altogether compatible time libraries,
all of them overly clever and all of them overly complicated.
wxWidgets has OS independent time based on milliseconds past the epoch, which
however fails to compress under Cap\'n Proto.
I was favourably impressed by the approach to time taken in tcp packets,
that the time had to be approximately linear, and in milliseconds or larger,
but they were entirely relaxed about the two ends of a tcp connection
using different clocks with different, and variable, speeds.
It turns out you can go a mighty long way without a global time, and to the
extent that you do need a global time, should be equivalent to that used in
email, which magically hides the leap seconds issue.
# UTF‑8 strings
Are supported by the wxWidgets wxString, which provides conversion to and
from wide character variants and locale variants. (We don't want locale
variants, they are obsolete. The whole world is switching to UTF, but
our software and operating environments lag.)
`wxString::ToUTF8()` and `wxString::FromUTF8()` do what you would expect.
On Visual Studio, you need to set your source files to have a BOM, so that Visual
Studio knows that they are UTF‑8, and you need to set the compiler environment in
Visual Studio to UTF‑8 with `/Zc:__cplusplus /utf-8 %(AdditionalOptions)`.
And you need to set the run time environment of the program to UTF‑8
with a manifest.
You will need to place all UTF‑8 string literals and string constants in a
resource file, which you will use for translated versions.
If you fail to set the compilation and run time environment to UTF‑8 then
for extra confusion, your debugger and compiler will *look* as if they are
handling UTF‑8 characters correctly as single byte characters, while at
least wxString alerts you that something bad is happening by run time
translating to the null string.
Automatic string conversion in wxWidgets is *not* UTF‑8, and if you have
any unusual symbols in your string, you get a run time error and the empty
string. So wxString automagic conversions will rape you in the ass at
runtime, and for double the confusion, your correctly translated UTF‑8
strings will look like errors. Hence the need to make sure that the whole
environment from source code to run time execution is consistently UTF‑8,
which has to be separately ensured in three separate places.
When wxWidgets is compiled using `#define wxUSE_UNICODE_UTF8 1`,
it provides UTF‑8 iterators and caches a character index, so that accessing
a character by index near a recently used character is fast. The usual
iterators `wx.begin()`, `wx.end()`, const and reverse iterators are available.
I assume something bad happens if you advance a reverse iterator after
writing to it.
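The reason a cached character index matters is that characters in UTF‑8 are of variable width, so finding the nth character in a plain byte string is a linear scan from the start. A sketch on `std::string`, assuming well formed UTF‑8 and counting code points, not grapheme clusters. These helper functions are hypothetical and are not part of wxWidgets.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A byte starts a code point unless it is a continuation byte (10xxxxxx).
inline bool is_lead_byte(unsigned char c) {
    return (c & 0xC0) != 0x80;
}

// Number of code points in a well-formed UTF-8 string.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s)
        if (is_lead_byte(c)) ++count;
    return count;
}

// Byte offset at which code point number `index` starts, found by scanning
// from the beginning. index == utf8_length(s) yields s.size().
std::size_t utf8_byte_offset(const std::string& s, std::size_t index) {
    assert(index <= utf8_length(s));
    std::size_t seen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (is_lead_byte(static_cast<unsigned char>(s[i]))) {
            if (seen == index) return i;
            ++seen;
        }
    }
    return s.size();
}
```

Remembering the last `(index, byte offset)` pair and scanning from there is what turns repeated nearby lookups from linear into near constant time.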
wxWidgets compiled with `#define wxUSE_UNICODE_UTF8 1` is the
way of the future, but not the way of the present. It is still a work in progress
and does not build under Windows. Windows now provides UTF8 entries to all
its system functions, which should make it easy.
wxWidgets provides `wxRegEx` which, because wxWidgets provides index
by entity, should just work. Eventually. Maybe the next release.
# [UTF8-CPP](http://utfcpp.sourceforge.net/ "UTF-8 with C++ in a Portable Way")
A powerful library for handling UTF‑8. This somewhat duplicates the
facilities provided by wxWidgets with `wxUSE_UNICODE_UTF8==1`
For most purposes, wxString should suffice, when it actually works with
UTF8. Which it does not yet on windows. We shall see. wxWidgets
recommends not using wxString except to communicate with wxWidgets,
and not using it as general UTF‑8 system. Which is certainly the current
state of play with wxWidgets.
For regex to work correctly, probably need to do it on wxString's native
UTF‑16 (windows) or UTF‑32 (unix), but it supposedly works on `UTF8`,
assuming you can successfully compile it, which you cannot.
# Cap\'n Proto
[Designed for a download from github and run cmake install.](https://capnproto.org/install.html) As all software should be.
But for mere serialization of data to a form invariant between machine
architectures, different compilers, and different compilers on the same
machine, it is overkill for our purposes. Too much capability.
# Awesome C++
[Awesome C++] A curated list of awesome C/C++ frameworks, libraries, resources, and shiny things
[Awesome C++]:https://cpp.libhunt.com
"A curated list of awesome C/C++ frameworks, libraries, resources, and shiny things"
{target="_blank"}
I encountered this when looking at the Wt C++ Web framework, which seems to be mighty cool except I don't think I have any use for a web framework. But [Awesome C++] has a very large pile of things that I might use.
Wt has the interesting design principle that every open web page maps to a
windows class, every widget on the web page, maps to a windows class,
every row in the sql table maps to a windows class. Cool design.