Find files that are not in .gitignore

I have a find command that displays the files in my project:

find . -type f -not -path './node_modules*' -a -not -path '*.git*' \
       -a -not -path './coverage*' -a -not -path './bower_components*' \
       -a -not -name '*~'

How can I filter the files so that it doesn't show the ones that are in .gitignore?

I thought I could use:

while read file; do
    grep $file .gitignore > /dev/null && echo $file;
done

but a .gitignore file can contain glob patterns (it also will not work with paths if the file is in .gitignore). How can I filter files based on patterns that may contain globs?

Asked By: jcubic

||

You can use an array, in which Bash glob expansion will be performed.

Given files like this:

touch file1 file2 file3 some more file here

And an ignore file like this:

cat <<EOF >ignore
file*
here
EOF

Using

arr=($(cat ignore));declare -p arr

will result in this:

declare -a arr='([0]="file" [1]="file1" [2]="file2" [3]="file3" [4]="here")'

You can then use any technique to manipulate that data.

I personally prefer something like this:

awk 'NR==FNR{a[$1];next}(!($1 in a))'  <(printf '%s\n' "${arr[@]}") <(find . -type f -printf '%f\n')
#Output
some
more
ignore
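
A variation on the same idea, as a minimal sketch: instead of letting the shell expand the patterns against the current directory, each name can be tested against each pattern with case. The function name is_ignored is made up for illustration, and this only mimics plain shell globs; it does not implement real .gitignore semantics (no negation with !, no directory anchoring).

```shell
#!/bin/sh
# is_ignored NAME PATTERN_FILE   (hypothetical helper, not part of git)
# Succeeds if NAME matches any glob pattern listed in PATTERN_FILE.
is_ignored() {
    name=$1
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        # Skip blank lines and comment lines.
        case $pattern in
            ''|'#'*) continue ;;
        esac
        # An unquoted expansion in a case pattern is treated as a glob.
        case $name in
            $pattern) return 0 ;;
        esac
    done < "$2"
    return 1
}
```

With the ignore file from above, is_ignored file1 ignore succeeds and is_ignored some ignore fails, so it can be used as a filter inside a while read loop.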
Answered By: George Vasiliou

git provides git-check-ignore to check whether a file is excluded by .gitignore.

So you could use:

find . -type f -not -path './node_modules*' \
       -a -not -path '*.git*'               \
       -a -not -path './coverage*'          \
       -a -not -path './bower_components*'  \
       -a -not -name '*~'                   \
       -exec sh -c '
         for f do
           git check-ignore -q "$f" ||
           printf "%s\n" "$f"
         done
       ' find-sh {} +

Note that this is costly, because git check-ignore is run once for every file.
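
One way to avoid running git check-ignore once per file is to feed it the whole list through --stdin and then subtract the paths it reports. The following is a rough sketch under some assumptions: the function name list_unignored is made up, it must be run from inside a Git working tree, and file names are assumed to contain no newlines or characters that Git would quote in its output.

```shell
#!/bin/sh
# list_unignored   (hypothetical helper)
# Prints the files under the current directory that .gitignore does not exclude.
list_unignored() {
    all=$(mktemp) || return 1
    ignored=$(mktemp) || { rm -f "$all"; return 1; }

    # Collect every regular file except Git's own metadata.
    find . -type f ! -path './.git/*' > "$all"

    # A single git process checks all paths; it echoes back the ignored ones.
    git check-ignore --stdin < "$all" > "$ignored"

    # Keep the lines of $all that do not appear verbatim in $ignored.
    grep -vxFf "$ignored" "$all"
    status=$?

    rm -f "$all" "$ignored"
    return "$status"
}
```

The other -not -path filters from the question can be added to the find call inside the function as needed.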

Answered By: cuonglm

There is a git command for doing exactly this, e.g.:

my_git_repo % git grep --line-number TODO
desktop/includes/controllers/user_applications.sh:126:  # TODO try running this without sudo
desktop/includes/controllers/web_tools.sh:52:   TODO: detail the actual steps here:
desktop/includes/controllers/web_tools.sh:57:   TODO: check if, at this point, the menurc file exists. i.e. it  was created

As you stated, it does a basic grep with most of the normal grep options, but it will not search .git or any of the files or folders in your .gitignore file.
For more details, see man git-grep.

Submodules:

If you have other git repos inside this git repo (they should be submodules), you can use the --recurse-submodules flag to search in the submodules as well.

Answered By: the_velour_fog

To show files that are in your checkout and that are tracked by Git, use

$ git ls-files

This command has a number of options for showing, e.g., cached files, untracked files, modified files, ignored files, etc. See git ls-files --help.

Answered By: Kusalananda on strike

I think this works well:

git ls-files --cached --modified --other --exclude-standard

If you also want to recurse into submodules, add --recurse-submodules.
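
A small sketch of how this could be wrapped for scripting. As far as I can tell, combining --cached with --modified can list a changed file more than once, and files that were deleted from the working tree but are still in the index are listed too, so this version de-duplicates and keeps only paths that exist. The function name list_project_files is made up for illustration, and file names are assumed not to contain newlines.

```shell
#!/bin/sh
# list_project_files   (hypothetical helper)
# Tracked and untracked-but-not-ignored files that exist on disk, each listed once.
list_project_files() {
    git ls-files --cached --modified --others --exclude-standard |
        sort -u |
        while IFS= read -r f; do
            # Drop entries that are in the index but missing from the working tree.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
}
```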

Answered By: Mitar

This solution uses metaprogramming heavily, but it’s much faster than the answer marked correct, because it reduces repeated shell-command calls.

Replace the sed deletion patterns with a substantial subset of your own known working-directory-level (./-starting) exclude list. Put the *-starting ones in the sed substitute pattern, as in a normal find invocation (with proper escaping, please).

FILES="$( \
  find . -depth 1 \
  | sed -e '/\/.git/d' -e '/\/node_modules/d' \
      -e 's/.*/find '"'&'"' -type f/' \
  | sh \
)"
# >&2 echo debug: $FILES

HUGE_SED_COMMAND="$( \
  echo '/^('$( \
    git check-ignore --stdin <<<"$FILES" \
    | sed 's/\./\\./g; s/\//\\\//g' \
    )')$/d' \
  | sed 's/ /|/g' \
)"
# >&2 echo debug: $HUGE_SED_COMMAND

sed -E "${HUGE_SED_COMMAND}" <<<"$FILES"
Answered By: rain-n-thunder