From afd62af3cfb2eaf2668136f095df3e2b743dcd70 Mon Sep 17 00:00:00 2001
From: Cheng
Date: Wed, 29 Jun 2022 11:45:27 +1000
Subject: [PATCH] documenting struggles with submodules

---
 docs/libraries.md | 150 +++++++++++++++++++++++++++-------------------
 1 file changed, 88 insertions(+), 62 deletions(-)

diff --git a/docs/libraries.md b/docs/libraries.md
index ab14d36..815fdd4 100644
--- a/docs/libraries.md
+++ b/docs/libraries.md
@@ -95,34 +95,34 @@ Libraries are best dealt with as [Git submodules].
 
 [build libraries]:https://git-scm.com/book/en/v2/Git-Tools-Submodules
 
-Git submodules leak complexity and surprising and inconvenient behaviour
-all over the place if one is trying to make a change that affects multiple
-modules simultaneously. But having your libraries separate from your git
-repository results in non portable surprises and complexity. Makes it hard
-for anyone else to build your project, because they will have to, by hand,
-tell your project where the libraries are on their system.
+Git submodules leak complexity and surprising and inconvenient
+behaviour all over the place if one is trying to make a change that affects
+multiple modules simultaneously. But having your libraries separate from
+your git repository results in non portable surprises and complexity. Makes
+it hard for anyone else to build your project, because they will have to, by
+hand, tell your project where the libraries are on their system.
 
-When one is developing code, you normally have a git branch. But
-the git commit of the master project in which the submodule is
-contained does not notice its subproject has changed, unless the
-subproject head has changed. And the subject project head will not
-change if it points to a name, rather than to a particular commit. For
-ones changes to a submodule to be reflected in the master project in
-any consistent or predictable way, the submodule has to be in
-detached head mode, with the head pointing directly to a commit,
-rather than pointing to a branch that points to a commit.
+When one is developing code, you normally have a git branch. But the git
+commit of the master project in which the submodule is contained does
+not notice its subproject has changed, unless the subproject head has
+changed. And the subject project head will not change if it points to a
+name, rather than to a particular commit. For ones changes to a submodule
+to be reflected in the master project in any consistent or predictable way,
+the submodule has to be in detached head mode, with the head pointing
+directly to a commit, rather than pointing to a branch that points to a
+commit.
 
-Git commands in master project do not look inside the subproject.
-They just look at the subproject's head.
+Git commands in master project do not look inside the subproject. They
+just look at the subproject's head.
 
-This means that signing off on changes to a submodule is
-irrelevant. One signs off on the master project, which includes the
-hash of that submodule commit.
+This means that signing off on changes to a submodule is irrelevant. One
+signs off on the master project, which includes the hash of that submodule
+commit.
 
-When one is changing submodules for the use of a particular
-project, making related changes in the master project and
-submodules, one should not track the changes by creating and
-updating branch names in the submodule, but by creating and
+When one is changing submodules for the use of a particular project,
+making related changes in the master project and submodules, one should
+not track the changes by creating and updating branch names in the
+submodule, but by creating and
 updating branch names in the containing module, so that the
 commits in the submodule have no name in the submodule, the
 submodule is always in detached head state, albeit the head may be
@@ -146,37 +146,25 @@ the primary project module,and when you have done with a submodule,
 git switch --detach
 ```
 
-Within the submodule, commits are nameless with detached head, except
-when you are working on them, the name in primary module naming a
-group of related commits in several submodules, which commits do not
-usually receive independent names of their own, even though the commits
-have to be made within the submodule, not in the containing module
-which names the complete set of interrelated commits.
+From the point of view of the containing superproject, submodule
+commits are nameless with detached head, except when you are working
+on them, the name in primary module naming a group of related commits
+ in several submodules, which commits do not receive independent names
+ of their own, even though the commits have to be made within the
+ submodule, not in the containing module which names the complete set of
+ interrelated commits.
 
 The submodule commits may well belong to different branches and tags in
-the superproject, but in the submodules, they are nameless in that all the
-submodule commits wind up attached to the same branch, your submodule tracking
-branch.
+the superproject, but the submodules know nothing of superproject
+names, and the superproject knows nothing of submodules names.
 
-In this case, working on submodules as part of a single larger project, you should set
-
-```bash
-git config --local submodule.recurse true
-```
-
-In the primary project, so that you conveniently push and pull a
-group of related changes as one thing, and the build for the whole
-project should treat the submodule libraries as having a
-dependency on module/.git/modules/submodule/HEAD, rather than
-checking every single file in the submodules every time to see
-if one has changed, for there could be an enormous number of
-them. The primary build should invoke the submodule build, which
-*will* check each file in the submodule for changes, only when the
-submodule detached head has changed. And therefore, you want it
-to change, you want the submodule head to be nameless and
-detached, whenever you modify a submodule as part of a larger
-project where you test your changes by rebuilding the whole
-project to make sure all your related changes fit together.
+ The primary build should invoke the submodule build, which *will* check
+ each file in the submodule for changes, only when the submodule
+ detached head has changed. And therefore, you want it to change, you
+ want the submodule head to be nameless and detached, whenever you
+ modify a submodule as part of a larger project where you test your
+ changes by rebuilding the whole project to make sure all your related
+ changes fit together.
 
 When tracking an upstream submodule that has submodules of its own,
 which have their own upstreams
@@ -189,17 +177,33 @@ git pull upstream --recurse-submodules=on-demand «their-latest-release»
 
 Make sure things still work. Get everything working. (You do have unit
 test, right?)
+
+When you are working a submodule, your branch has to have a name, or
+when you push it and pull it, strange things will happen. But the
+superproject pushes and pulls by commit, not by name, so when you are
+done,
+ then:
+git submodule foreach --recursive 'git push`
+
 
 ```bash
 git submodule foreach --recursive 'git switch --detach'
-git submodule foreach --recursive 'git push origin HEAD:«your-tracking-branch»'
+git submodule foreach --recursive 'git push`
 ```
+
+As its own thing, a submodule has branches with names. As a component
+of a superproject, it has nameless commits.
+
+If you are in a submodule directory of the superproject, and you push and
+pull, what you are pushing and pulling had better have a name, or else
+unpleasant surprises will happen. If you are in the superproject directory
+and pushing and pulling the whole thing, that commit better be detached.
+
 
 You pull a named release of the project that is a submodule of your project
 from `upstream`, diddling with it to make it work with your project, then
-you push it to `origin` as a nameless commit, though you probably gave the
-various commits you made while working on it temporary and local names
-with `switch -c yet-another-idea`
+you push it to `origin` under its own name, the you detach it from its name,
+so the superproject will know that the submodule has been changed.
 
 All of which, of course, presupposes you have already set unit tests,
 upstream, origin, and your tracking branch appropriately.
@@ -209,8 +213,7 @@ repository, on your remote submodule repository they need to have a name to
 be pushed to, hence you need to have a tracking branch in each of your
 remote images of each of your submodules, and that tracking branch will
 need to point to the root of a tree of all the nameless commits that the
-names and commits
-in your superproject that contains this submodules point to.
+names and commits in your superproject that contains this submodules point to.
 
 You want `.gitmodules` in your local image of the repository to reflect the
 location and fork of your new remote repository, with
@@ -225,10 +228,33 @@ you rely someone else's compiled code, things break and you get accidental
 and deliberate backdoors, which is a big concern when you are doing money
 and cryptography.
 
-GitSubmodules is hierarchical, but source code has strange loops. The Bob
-module uses the Alice module and the Carol module, but Alice uses Bob
-and Carol, and Carol uses Alice and Bob. How do you make sure that all
-your modules are using the same commit of Alice?
+When your submodules are simply your copy of someone else code, it gets
+ little bit messy. When you change them, it gets messier.
+
+And visual studio's handling of submodules is just broken and buggy. A
+command that works in git-bash will produce unexpected surprising, and
+unpleasant results in visual studio's git. I really need to give up on
+visual studio, it is closed source code, and turning bad.
+
+When one developer makes minor changes in submodule to make it work
+with the whole project on which several developers are working on, no
+end of mysterious grief ensues, because strange and curiously difficult to
+identify differences appear between builds that Git would normally ensure
+are the same build. Submodules are a halfway house between completely
+absorbing the other party's code into your code, and using it as a prebuilt
+library. Instead, we have walls dividing the project into pieces, which is a
+lot less grief than on big pile of code, but managing those walls winds up
+taking a lot of time, and mistakes get made because a git commit in a
+project with submodules that have changed does not mean quite the same
+thing, nor have quite the same behaviour, as git commit in a project with
+unchanging submodules. But then truly integrating a project that is the
+product of a great deal of time by a great many of people, and managing it
+ thereafter, is likely to take up a great deal more time.
+
+Git Submodules is hierarchical, but source code has strange loops. The
+Bob module uses the Alice module and the Carol module, but Alice uses
+Bob and Carol, and Carol uses Alice and Bob. How do you make sure that
+all your modules are using the same commit of Alice?
 
 Well, if modules have strange loops you make one of them the master, and
 the rest of them direct submodules of that master, brother subs to each