Git: count files in a repository

When you want to count the files in a Git repository, it's generally best to use Git itself, because it lists only tracked files and so skips any generated or downloaded files that are ignored or otherwise untracked. Here is a command chain that counts all tracked files within the current directory:
$ git ls-files -z | tr -d -c '\0' | wc -c
6947
Run within your repository root to count the whole repository, or within a subdirectory to count files within that subdirectory.
A breakdown:
- git ls-files -z lists all files. It uses the -z option to separate file names with null bytes, rather than newlines.
- tr -d -c '\0' deletes all characters except null bytes, leaving one null byte per file.
- wc -c counts the incoming characters, one null byte per file, to produce a total.
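To see the last two stages in isolation, here is a minimal sketch that swaps Git out for a fixed stream. The fake_ls_files function and its three file names are made up for illustration, and count_nul_terminated is a hypothetical helper name, not something Git provides. One of the names deliberately contains a newline.

```shell
# Stand-in for `git ls-files -z`: three NUL-terminated names.
# The names are invented; the second one contains a newline.
fake_ls_files() {
    printf 'README.md\0odd\nname.txt\0src/app.py\0'
}

# Keep only the NUL bytes (one per name), then count them.
# The final tr strips the padding that some wc implementations
# (for example on BSD/macOS) print before the number.
count_nul_terminated() {
    tr -d -c '\0' | wc -c | tr -d ' '
}

fake_ls_files | count_nul_terminated   # prints 3
```

The newline inside the second name is deleted by tr along with every other non-null byte, so it has no effect on the total.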
Why use null byte separators?
Using null byte separators increases the accuracy of the count. By default, git ls-files separates filenames with newlines. Usually that is fine, but since filenames may contain newlines (usually by mistake), separating filenames with newlines could possibly overcount files.
The quicker-but-less-accurate version using newline separators is:
$ git ls-files | wc -l
6947
In my sample repository, as in most, there are no filenames with newlines, so this command produces the same result as seen before.
Count particular file types
To count files matching a particular pattern, you can provide a pathspec to git ls-files. For example, to count Python files:
$ git ls-files '*.py' | wc -l
2853
Alternatively, to exclude a particular pattern, prefix it with :!. For example, to count all files except .po translation files:
$ git ls-files ':!*.po' | wc -l
5675
The pathspec syntax has a bunch of features, as covered in my Git book as well as the Git documentation.
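If you count with pathspecs often, the null-byte pipeline and the pathspec arguments can be combined in a small shell function. This is only a sketch: count_tracked is a hypothetical name, and the tests/ directory in the last usage comment is an assumption about your repository layout.

```shell
# Hypothetical helper: count tracked files matching the given pathspecs,
# using null-byte separators so unusual file names are counted once.
# With no arguments it counts every tracked file under the current directory.
count_tracked() {
    git ls-files -z -- "$@" | tr -d -c '\0' | wc -c | tr -d ' '
}

# Usage, run inside a repository:
#   count_tracked                       # all tracked files
#   count_tracked '*.py'                # Python files only
#   count_tracked ':!*.po'              # everything except .po files
#   count_tracked '*.py' ':!tests/'     # Python files outside tests/
```

The -- before "$@" tells Git that everything after it is a pathspec, so a pattern that happens to start with a dash is not mistaken for an option.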