documenting struggles with submodules

Cheng 2022-06-29 11:45:27 +10:00
parent 762cfa6f53
commit afd62af3cf
No known key found for this signature in database
GPG Key ID: D51301E176B31828


@ -95,34 +95,34 @@ Libraries are best dealt with as [Git submodules].
[build libraries]:https://git-scm.com/book/en/v2/Git-Tools-Submodules
Git submodules leak complexity and surprising and inconvenient behaviour
all over the place if one is trying to make a change that affects multiple
modules simultaneously. But having your libraries separate from your git
repository results in non portable surprises and complexity. Makes it hard
for anyone else to build your project, because they will have to, by hand,
tell your project where the libraries are on their system.
Git submodules leak complexity and surprising and inconvenient
behaviour all over the place if one is trying to make a change that affects
multiple modules simultaneously. But having your libraries separate from
your git repository results in non-portable surprises and complexity, and
makes it hard for anyone else to build your project, because they will have
to tell your project, by hand, where the libraries are on their system.
When one is developing code, you normally have a git branch. But
the git commit of the master project in which the submodule is
contained does not notice its subproject has changed, unless the
subproject head has changed. And the subject project head will not
change if it points to a name, rather than to a particular commit. For
ones changes to a submodule to be reflected in the master project in
any consistent or predictable way, the submodule has to be in
detached head mode, with the head pointing directly to a commit,
rather than pointing to a branch that points to a commit.
When one is developing code, one normally has a git branch. But the git
commit of the master project in which the submodule is contained does
not notice its subproject has changed, unless the subproject head has
changed. And the subproject head will not change if it points to a
name, rather than to a particular commit. For one's changes to a submodule
to be reflected in the master project in any consistent or predictable way,
the submodule has to be in detached head mode, with the head pointing
directly to a commit, rather than pointing to a branch that points to a
commit.
Git commands in master project do not look inside the subproject.
They just look at the subproject's head.
Git commands in the master project do not look inside the subproject. They
just look at the subproject's head.
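A minimal sketch of what this looks like in practice, assuming a throwaway superproject `super` and a library `lib` created in a temporary directory (both names, and the commit identity, are hypothetical). The superproject's tree holds a single entry of type `commit` for the submodule, and that entry only moves when the submodule head moves and the superproject stages and commits it.

```bash
set -e
work=$(mktemp -d)
cd "$work"

# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A library repository with one commit.
git init -q lib
git -C lib commit -q --allow-empty -m "lib: first commit"

# A superproject that contains the library as a submodule.
git init -q super
cd super
# Recent git versions refuse local-path submodules unless the file
# protocol is explicitly allowed.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git commit -q -m "super: add lib"

# The superproject records lib as one entry: mode 160000, type "commit".
recorded=$(git ls-tree HEAD lib)
sub_head=$(git -C lib rev-parse HEAD)
echo "$recorded"

# Move the submodule head by committing inside the submodule, then
# record the new head in the superproject.
git -C lib commit -q --allow-empty -m "lib: second commit"
new_head=$(git -C lib rev-parse HEAD)
git add lib
git commit -q -m "super: advance lib"
updated=$(git ls-tree HEAD lib)
echo "$updated"
```

Both printed lines start with `160000 commit`, followed by the hash of the submodule head at the time of the superproject commit.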
This means that signing off on changes to a submodule is
irrelevant. One signs off on the master project, which includes the
hash of that submodule commit.
This means that signing off on changes to a submodule is irrelevant. One
signs off on the master project, which includes the hash of that submodule
commit.
When one is changing submodules for the use of a particular
project, making related changes in the master project and
submodules, one should not track the changes by creating and
updating branch names in the submodule, but by creating and
When one is changing submodules for the use of a particular project,
making related changes in the master project and submodules, one should
not track the changes by creating and updating branch names in the
submodule, but by creating and
updating branch names in the containing module, so that the
commits in the submodule have no name in the submodule, the
submodule is always in detached head state, albeit the head may be
@ -146,37 +146,25 @@ the primary project module,and when you have done with a submodule,
git switch --detach
```
Within the submodule, commits are nameless with detached head, except
when you are working on them, the name in primary module naming a
group of related commits in several submodules, which commits do not
usually receive independent names of their own, even though the commits
have to be made within the submodule, not in the containing module
which names the complete set of interrelated commits.
From the point of view of the containing superproject, submodule
commits are nameless with detached head, except when you are working
on them. The name in the primary module names a group of related commits
in several submodules, and those commits do not receive independent names
of their own, even though they have to be made within the
submodule, not in the containing module which names the complete set of
interrelated commits.
The submodule commits may well belong to different branches and tags in
the superproject, but in the submodules, they are nameless in that all the
submodule commits wind up attached to the same branch, your submodule tracking
branch.
the superproject, but the submodules know nothing of superproject
names, and the superproject knows nothing of submodule names.
In this case, working on submodules as part of a single larger project, you should set
```bash
git config --local submodule.recurse true
```
In the primary project, so that you conveniently push and pull a
group of related changes as one thing, and the build for the whole
project should treat the submodule libraries as having a
dependency on module/.git/modules/submodule/HEAD, rather than
checking every single file in the submodules every time to see
if one has changed, for there could be an enormous number of
them. The primary build should invoke the submodule build, which
*will* check each file in the submodule for changes, only when the
submodule detached head has changed. And therefore, you want it
to change, you want the submodule head to be nameless and
detached, whenever you modify a submodule as part of a larger
project where you test your changes by rebuilding the whole
project to make sure all your related changes fit together.
The primary build should invoke the submodule build, which *will* check
each file in the submodule for changes, only when the submodule
detached head has changed. And therefore you want it to change: you
want the submodule head to be nameless and detached, whenever you
modify a submodule as part of a larger project where you test your
changes by rebuilding the whole project to make sure all your related
changes fit together.
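One way to sketch this rule, assuming a POSIX shell and a stamp file kept by the primary build (the function name `submodule_head_changed` and the stamp file are hypothetical, not part of git): the primary build compares the commit currently checked out in the submodule with the commit it last built, and invokes the submodule build only when they differ.

```bash
# Succeeds (exit 0) when the commit checked out in submodule directory $1
# differs from the commit recorded in stamp file $2, and then updates the
# stamp. Returns 1 when the head is unchanged, 2 if git cannot read it.
submodule_head_changed() {
    sub_dir=$1
    stamp=$2
    current=$(git -C "$sub_dir" rev-parse HEAD) || return 2
    if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$current" ]; then
        return 1
    fi
    printf '%s\n' "$current" > "$stamp"
    return 0
}
```

A primary build script might then run something like `if submodule_head_changed libs/foo build/foo.stamp; then make -C libs/foo; fi`, where the paths are placeholders. This sketch updates the stamp before the submodule build runs; a real build would probably update it only after the build succeeds.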
When tracking an upstream submodule that has submodules of its
own, which have their own upstreams
@ -189,17 +177,33 @@ git pull upstream --recurse-submodules=on-demand «their-latest-release»
Make sure things still work. Get everything working. (You do have unit tests, right?)
When you are working on a submodule, your branch has to have a name, or
when you push it and pull it, strange things will happen. But the
superproject pushes and pulls by commit, not by name, so when you are
done,
then:
git submodule foreach --recursive 'git push'
```bash
git submodule foreach --recursive 'git switch --detach'
git submodule foreach --recursive 'git push origin HEAD:«your-tracking-branch»'
git submodule foreach --recursive 'git push'
```
As its own thing, a submodule has branches with names. As a component
of a superproject, it has nameless commits.
If you are in a submodule directory of the superproject, and you push and
pull, what you are pushing and pulling had better have a name, or else
unpleasant surprises will happen. If you are in the superproject directory
and pushing and pulling the whole thing, that commit had better be detached.
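Before pushing or pulling, it can help to check which of the two states a directory is in. A small sketch, assuming a POSIX shell (the helper name `head_state` is hypothetical): `git symbolic-ref` succeeds only when the head points to a branch, so its failure is taken here to mean a detached head.

```bash
# Prints the branch name if HEAD in directory $1 points to a branch,
# or "detached" if HEAD points directly at a commit.
head_state() {
    git -C "$1" symbolic-ref --quiet --short HEAD 2>/dev/null || echo detached
}
```

Called as `head_state path/to/submodule` (placeholder path), it prints either a branch name or `detached`.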
You pull a named release of the project that is a submodule of your project
from `upstream`, diddle with it to make it work with your project, then
you push it to `origin` as a nameless commit, though you probably gave the
various commits you made while working on it temporary and local names
with `switch -c yet-another-idea`
you push it to `origin` under its own name, then you detach it from its name,
so the superproject will know that the submodule has been changed.
All of which, of course, presupposes you have already set unit tests,
upstream, origin, and your tracking branch appropriately.
@ -209,8 +213,7 @@ repository, on your remote submodule repository they need to have a name
to be pushed to, hence you need to have a tracking branch in each of your
remote images of each of your submodules, and that tracking branch will
need to point to the root of a tree of all the nameless commits that the
names and commits
in your superproject that contains this submodules point to.
names and commits in your superproject that contains this submodule point to.
You want `.gitmodules` in your local image of the repository to
reflect the location and fork of your new remote repository, with
@ -225,10 +228,33 @@ you rely someone else's compiled code, things break and you get
accidental and deliberate backdoors, which is a big concern when you are
doing money and cryptography.
GitSubmodules is hierarchical, but source code has strange loops. The Bob
module uses the Alice module and the Carol module, but Alice uses Bob
and Carol, and Carol uses Alice and Bob. How do you make sure that all
your modules are using the same commit of Alice?
When your submodules are simply your copy of someone else's code, it gets
a little bit messy. When you change them, it gets messier.
And Visual Studio's handling of submodules is just broken and buggy. A
command that works in git-bash will produce unexpected, surprising, and
unpleasant results in Visual Studio's git. I really need to give up on
Visual Studio, it is closed source code, and turning bad.
When one developer makes minor changes in a submodule to make it work
with the whole project on which several developers are working, no
end of mysterious grief ensues, because strange and curiously difficult to
identify differences appear between builds that Git would normally ensure
are the same build. Submodules are a halfway house between completely
absorbing the other party's code into your code, and using it as a prebuilt
library. Instead, we have walls dividing the project into pieces, which is a
lot less grief than one big pile of code, but managing those walls winds up
taking a lot of time, and mistakes get made because a git commit in a
project with submodules that have changed does not mean quite the same
thing, nor have quite the same behaviour, as a git commit in a project with
unchanging submodules. But then truly integrating a project that is the
product of a great deal of time by a great many people, and managing it
thereafter, is likely to take up a great deal more time.
Git Submodules is hierarchical, but source code has strange loops. The
Bob module uses the Alice module and the Carol module, but Alice uses
Bob and Carol, and Carol uses Alice and Bob. How do you make sure that
all your modules are using the same commit of Alice?
Well, if modules have strange loops you make one of them the master, and
the rest of them direct submodules of that master, brother subs to each