Maintaining Multiple Python Projects With myrepos
I maintain several open source Python projects, each in its own GitHub repository. I like to keep them all up to date according to a kind of template - similarity increases maintainability.
I have thought about putting them in a monorepo, but this would be unidiomatic and make it harder for contributors to join in. A tool I discovered recently to make working with multiple repositories more like using a monorepo is myrepos.
It allows you to run commands in multiple repositories, group the output by repository, and see a summary of successes/failures. It’s a relatively simple tool - not much more than a complex invocation of GNU parallel - but its integration with VCSs does sometimes come in useful. It let me replace a flaky shell loop I was copy-pasting between uses.
It installs as the mr command, which has a bunch of subcommands. The most useful one is run, to run arbitrary commands in each repository.
For example, to see git status in all repositories, you can run:
    $ mr run git status
    mr run: /Users/chainz/Documents/Projects/apig-wsgi
    On branch master
    Your branch is up to date with 'origin/master'.

    nothing to commit, working tree clean

    mr run: /Users/chainz/Documents/Projects/django-cors-headers
    On branch master
    Your branch is up to date with 'origin/master'.

    nothing to commit, working tree clean

    ...

    mr run: finished (19 ok)
(Output cropped because I have 19 repositories!)
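To make it concrete what mr run saves you from, here is a rough sketch of the kind of hand-rolled loop it replaces. The function name run_in_each and its summary format are made up for illustration; this is not how myrepos itself is implemented.

```shell
#!/bin/sh
# Hypothetical stand-in for `mr run`: run one command in each directory,
# label the output per directory, and print an ok/failed summary.
run_in_each() {
    cmd=$1
    shift
    ok=0
    failed=0
    for repo in "$@"; do
        echo "run: $repo"
        # The subshell keeps the cd (and any failure) local to this repo.
        if (cd "$repo" && sh -c "$cmd"); then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
            echo "run: failed in $repo" >&2
        fi
        echo
    done
    echo "run: finished ($ok ok; $failed failed)"
    # Exit status is non-zero if the command failed anywhere.
    [ "$failed" -eq 0 ]
}

# Example (placeholder paths):
# run_in_each 'git status' ~/Projects/repo-one ~/Projects/repo-two
```

Even this small version needs the list of repositories passed in every time and runs strictly in series, which is where a registered configuration file and parallel jobs start to pay off.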
You can install myrepos from most package managers (see its docs). On macOS I installed it with brew install myrepos.
You then need to register your repositories by cding to their folder and running mr register. This adds them to the default configuration file.
I made one change to my configuration file, to run 8 jobs in parallel by default:
    [DEFAULT]
    jobs = 8
Most of the changes I make are parallel-safe and this gives quite a speed boost. You can switch back to serial for a single command with --jobs 1 before the subcommand.
myrepos has saved me heaps of time and reduced errors for many different changes on my projects. For example, recently I wanted to update the MANIFEST.in files in all my repositories. This file controls which files are added to the built Python packages. I also wanted to run the check-manifest tool in my projects’ testing environments, to ensure their MANIFEST.in files were correct.
I started by cding to one of my smaller projects, and figuring out the steps that would do this.
First, I looked through the files and manually made the necessary edits. I then converted them to equivalent shell commands, like sed. I then copied those into a shell script I called add-check-manifest.sh, placed in my ~/scripts folder, which is on my $PATH so myrepos can use it.
I iterated on the script by resetting the repository and rerunning the script, until it worked all the way through. The final version was this:
    #!/bin/sh
    set -e
    rm MANIFEST.in
    echo 'global-exclude *.py[cod]
    graft tests
    prune __pycache__
    prune requirements
    include HISTORY.rst
    include LICENSE
    include README.rst
    exclude .editorconfig
    exclude pyproject.toml
    exclude pytest.ini
    exclude tox.ini' > MANIFEST.in
    cd requirements
    echo "check-manifest ; python_version == '3.8.*'" >> requirements.in
    sort requirements.in -o requirements.in
    ./compile.sh
    cd ..
    sed -E -i '' -e 's/    multilint/    multilint\
        check-manifest/g' tox.ini
    tox -r -e py38-codestyle
    git switch -c check-manifest
    git add --all
    git commit -m "Test MANIFEST.in with check-manifest"
- I used set -e so the shell script would fail after the first error.
- The first two commands replace the MANIFEST.in file with a new basic template - enough for many of my projects.
- The cd into requirements is for updating the test requirements. I use pip-tools’ pip-compile to manage these.
- The sed command injects the call to check-manifest in the test runs.
- The tox command runs the tests. check-manifest would fail the run on projects where the MANIFEST.in file needed editing.
- The final git commands create a new branch and commit the changes. I left pushing and opening GitHub pull requests for later.
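One portability note on the sed step: -i '' is the BSD/macOS spelling of in-place editing, while GNU sed expects -i without the separate empty argument. If a script like this needs to run on both, one option is to skip in-place editing altogether. The following is only a sketch under that assumption; the helper name add_check_manifest_line is made up, and it assumes the multilint call sits alone on an indented line in tox.ini.

```shell
#!/bin/sh
# Hypothetical helper: insert a "check-manifest" line after each "multilint"
# line in a tox.ini-style file, reusing the indentation of the matched line.
add_check_manifest_line() {
    file=$1
    tmp=$(mktemp)
    awk '
        { print }
        /^[ \t]+multilint[ \t]*$/ {
            indent = $0
            sub(/multilint.*$/, "", indent)   # keep only the leading whitespace
            print indent "check-manifest"
        }
    ' "$file" > "$tmp"
    # Copy back rather than mv, so the original file keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example:
# add_check_manifest_line tox.ini
```

Like the sed version, this is not idempotent: running it twice adds the line twice, so reset the repository between attempts.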
I knew it wouldn’t work perfectly on all the repositories, but it was much faster than making all these changes manually.
I then ran it across all repositories with mr run add-check-manifest.sh.
tox failed on a few, as I expected, so I went and patched them individually.
After each repository was ready, I ran another script I have called pushupr. This pushes a branch and creates a GitHub pull request for it using GitHub’s hub. I then needed CI to pass on all the PRs, to be sure, so I took a break to go get on with life.
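The pushupr script itself isn't shown here, but a helper along those lines could look roughly like the sketch below. The function name push_and_open_pr, the remote name origin, and the guard against master are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a "push and open a PR" helper; this is not the
# actual pushupr script.
push_and_open_pr() {
    branch=$(git rev-parse --abbrev-ref HEAD) || return 1
    # Refuse to open a PR from the default branch (assumed to be master).
    if [ "$branch" = "master" ]; then
        echo "refusing to open a pull request from master" >&2
        return 1
    fi
    # Assumes the GitHub remote is called origin.
    git push --set-upstream origin "$branch" || return 1
    # --no-edit reuses the first commit message as the PR title and body,
    # so no editor opens in each repository.
    hub pull-request --no-edit
}
```

Saved as a script on $PATH, something like this can be run in every repository with mr run, the same way as add-check-manifest.sh.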
When I came back to my computer later, all the PRs had passed. I tabbed through all the pull requests and hit the merge buttons.
I could then run a few myrepos commands to pull those locally and clean up the branches:
    $ mr run git checkout master
    $ mr run git pull --prune
    $ mr run git branch -D check-manifest
Writing the script took a little initial investment, but saved me an hour or two of work. In fact, without it, I would probably never have bothered to do such a mundane task across all my projects. A great example of automation as an enabler!