5

I'd like to list all distinct extensions of the files tracked by git in a given repo, in order to create an appropriate .gitattributes file.

Example output expected:

bat
gitignore
gradle
html
jar
java
js
json
md
png
properties
py
svg
webp
xml
yml

What command can I use for that?

jakub.g

1 Answer

5
git ls-tree -r HEAD --name-only | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u 

When you declare it as an alias inside double quotes, you have to escape $1 so that the shell does not expand it:

alias gitFileExtensions="git ls-tree -r HEAD --name-only | perl -ne 'print \$1 if m/\.([^.\/]+)$/' | sort -u"

This is better than a naive find, because:

  • it excludes untracked files (including gitignored ones)
  • it excludes the .git directory, which usually contains hundreds or thousands of files and hence slows down the search

(inspired by "How can I find all of the distinct file extensions in a folder hierarchy?")

Community
jakub.g
  • Any reason you used double quotes around the perl part? Also maybe it'd be worth using a hash rather than piping to sort - you could add `&& !$a{$1}++` (with single quotes around the whole command) to only print the first occurrence of each result. – Tom Fenech Dec 04 '15 at 14:36
  • If I use single quotes, it prints `SCALAR(0xa031e3c)SCALAR(0xa031e3c)...`. I have to remove the escape before `$1` to make it work again. But then when I declare an alias, I have to add the escape back. Updated. – jakub.g Dec 04 '15 at 15:23
  • Yeah, `$` needs escaping from the shell inside double quotes. Better to just use a function in my opinion. – Tom Fenech Dec 04 '15 at 15:28
  • Re `!$a{$1}++`: I tested on a huge git repo (Chrome's blink) and the speed difference is negligible in practice (0.8s vs 1.0s on my machine). I think I'll leave `sort` for readability :) – jakub.g Dec 04 '15 at 15:34