Find files that are not in .gitignore
I have a find command that displays the files in my project:
find . -type f -not -path './node_modules*' -a -not -path '*.git*' \
    -a -not -path './coverage*' -a -not -path './bower_components*' \
    -a -not -name '*~'
How can I filter the output so it doesn't show the files that are ignored by .gitignore?
I thought I could use:
while read file; do
grep $file .gitignore > /dev/null && echo $file;
done
but a .gitignore file can contain glob patterns (and this also will not work with paths when the file is listed in .gitignore). How can I filter files based on patterns that may contain globs?
You can load the patterns into an array and let bash perform the glob expansion.
Given files like these:
touch file1 file2 file3 some more file here
and an ignore file like this:
cat <<EOF >ignore
file*
here
EOF
running
arr=($(cat ignore));declare -p arr
results in this:
declare -a arr='([0]="file" [1]="file1" [2]="file2" [3]="file3" [4]="here")'
You can then use any technique to manipulate those data.
I personally prefer something like this:
awk 'NR==FNR{a[$1];next}(!($1 in a))' <(printf '%s\n' "${arr[@]}") <(find . -type f -printf '%f\n')
#Output
some
more
ignore
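Note that the array approach expands each pattern against the files in the current directory only. As a rough alternative, here is a minimal sketch that compares names against the patterns as strings, without touching the filesystem. The function name filter_by_patterns is made up for this illustration, and plain shell glob matching is only an approximation of the real .gitignore rules (no negation, no directory-only patterns, and * also matches a slash here).

```bash
# filter_by_patterns PATTERN...: copy stdin to stdout, dropping every line
# that matches one of the glob patterns. The body runs in a subshell, so
# its variables do not leak into the caller.
filter_by_patterns() (
  while IFS= read -r name; do
    skip=0
    for pattern in "$@"; do
      # An unquoted expansion in a case pattern is matched as a glob,
      # against the string only, not against files on disk.
      case $name in
        $pattern) skip=1; break ;;
      esac
    done
    if [ "$skip" -eq 0 ]; then printf '%s\n' "$name"; fi
  done
)

# Usage with the sample names from above; prints: some, more, ignore
printf '%s\n' file1 file2 file3 some more file here ignore |
  filter_by_patterns 'file*' 'here'
```

The patterns are passed quoted so that the calling shell does not expand them before the function sees them.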
Git provides git-check-ignore to check whether a file is excluded by .gitignore.
So you could use:
find . -type f -not -path './node_modules*' \
    -a -not -path '*.git*' \
    -a -not -path './coverage*' \
    -a -not -path './bower_components*' \
    -a -not -name '*~' \
    -exec sh -c '
        for f do
            git check-ignore -q "$f" ||
            printf "%s\n" "$f"
        done
    ' find-sh {} +
Note that this is expensive, because git check-ignore is run once for every file.
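One way to avoid the per-file cost is to hand the whole list to a single git check-ignore process through its --stdin option and then subtract the paths it reports. The following is only a sketch: list_unignored is a hypothetical helper name, and it assumes file names without newlines or other characters that Git would quote in its output.

```bash
# list_unignored [DIR]: print the regular files under DIR (default: .)
# that are not excluded by the ignore rules. Runs in a subshell, so the
# cd and the variables stay local to the function.
list_unignored() (
  cd "${1:-.}" || exit
  all=$(mktemp) && ignored=$(mktemp) || exit
  find . -type f -not -path './.git/*' > "$all"
  # A single git process checks the whole list. Exit status 1 only means
  # that no path was ignored, so it is deliberately not treated as an error.
  git check-ignore --stdin < "$all" > "$ignored" || true
  if [ -s "$ignored" ]; then
    # -x: whole-line match, -F: fixed strings, -v: keep non-matching lines.
    grep -vxF -f "$ignored" "$all"
  else
    cat "$all"
  fi
  rm -f "$all" "$ignored"
)
```

Called as list_unignored /path/to/repo, it starts two extra processes (git and grep) regardless of how many files there are.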
There is a git command for doing exactly this, e.g.:
my_git_repo % git grep --line-number TODO
desktop/includes/controllers/user_applications.sh:126: # TODO try running this without sudo
desktop/includes/controllers/web_tools.sh:52: TODO: detail the actual steps here:
desktop/includes/controllers/web_tools.sh:57: TODO: check if, at this point, the menurc file exists. i.e. it was created
It behaves like a basic grep and accepts most of the normal grep options, but by default it searches only the files tracked by Git, so it skips .git and, in practice, the files and folders covered by your .gitignore file.
For more details, see man git-grep
Submodules:
If you have other git repos inside this git repo (they should be in submodules), you can use the --recurse-submodules flag to search in the submodules as well.
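If the search should also cover new files that have not been added yet, git grep has an --untracked option, which still honours the ignore rules unless --no-exclude-standard is given. A small sketch, with search_project as a made-up wrapper name:

```bash
# search_project PATTERN: search tracked files plus untracked files that
# are not ignored, printing file:line:match for every hit.
search_project() {
  git grep --line-number --untracked -e "$1"
}
```

Run inside a repository, search_project TODO would then report matches in tracked and not-yet-added files, but not in files matched by .gitignore.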
To show files that are in your checkout and that are tracked by Git, use
$ git ls-files
This command has a number of options for showing, e.g., cached files, untracked files, modified files, ignored files, etc. See git ls-files --help.
I think this works well:
git ls-files --cached --modified --other --exclude-standard
If you also want to recurse into submodules, add --recurse-submodules.
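A possible refinement, sketched here with a hypothetical helper name (list_project_files): modified files are already part of the --cached listing, so combining the options can print a path more than once, and --cached also lists tracked files that have been deleted from the working tree. The sketch removes duplicates and keeps only paths that still exist. It assumes file names without newlines or characters that Git would quote.

```bash
# list_project_files: print each tracked or untracked-but-not-ignored
# path once, skipping entries that no longer exist on disk.
list_project_files() {
  git ls-files --cached --others --exclude-standard |
    sort -u |
    while IFS= read -r path; do
      if [ -e "$path" ]; then printf '%s\n' "$path"; fi
    done
}
```

Run it from the top of the working tree; from a subdirectory, git ls-files only lists paths below that directory.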
This solution uses metaprogramming heavily, but it is much faster than the answer marked correct, because it reduces the number of repeated shell-command calls.
Replace the sed deletion patterns with a substantial subset of your own known top-level excludes (the ones starting with ./). Put the ones starting with * into the sed substitute pattern, as you would in a normal find invocation (with proper escaping, please).
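FILES="$(
  # Top-level entries only (the original used the BSD-style "find . -depth 1").
  find . -mindepth 1 -maxdepth 1 \
  | sed -e '/\/.git/d' -e '/\/node_modules/d' \
        -e 's/.*/find '"'&'"' -type f/' \
  | sh
)"
# >&2 echo debug: $FILES
# Build one sed program of the form /^(path1|path2|...)$/d from the ignored paths.
HUGE_SED_COMMAND="$(
  echo '/^('$(
    git check-ignore --stdin <<<"$FILES" \
    | sed 's/\./\\./g; s/\//\\\//g'
  )')$/d' \
  | sed 's/ /|/g'
)"
# >&2 echo debug: $HUGE_SED_COMMAND
sed -E "${HUGE_SED_COMMAND}" <<<"$FILES"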