
Assume a project where nothing has been added or committed for a long time. Running "git add ." takes too much time, and I would like to estimate which files or directories are the most expensive in the current state. I have a good .gitignore file that works well enough, but I still sometimes end up with too many files, or files too large, to add and commit to Git.

I often have directories that are 300 GB to 2 TB in size. Even though I exclude them with directory/* and directory/ patterns in .gitignore, the add is still slow.
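One quick sanity check (a sketch; the path below is a placeholder) is to confirm that those patterns actually match, since a single unmatched multi-terabyte tree would explain the slowness:

    # -v shows which .gitignore rule (file and line) matched the path:
    git check-ignore -v directory/some-file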

How can I estimate which directories or files are too expensive to add and commit?
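For reference, one rough way to measure this (a sketch using Git plumbing plus GNU coreutils; the head count of 20 is arbitrary) is to list everything "git add ." would have to process, sized and sorted:

    # Untracked, non-ignored files, largest first;
    # --exclude-standard honors .gitignore:
    git ls-files --others --exclude-standard -z \
      | xargs -0 du -h \
      | sort -rh \
      | head -n 20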

Léo Léopold Hertz 준영

2 Answers


Git slowness generally comes from large binary files. This isn't because they're binary as such; it's because binary files tend to be large and are more expensive to compress and diff.

Based on your edit indicating the file sizes, I suspect this is your problem.
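To verify that, you can rank the blobs already stored in the repository by size (a sketch assembled from standard Git plumbing; the head count is arbitrary):

    # List every blob in history with its size and path, largest first:
    git rev-list --objects --all \
      | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
      | awk '/^blob/ {print $3, $4}' \
      | sort -rn \
      | head -n 20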

The answers to the linked question offer a few solutions: removing the large files from source control, manually running git gc, etc.
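A minimal version of those two remedies might look like this (a sketch; "big-assets/" is a placeholder directory name):

    # Stop tracking a large directory while keeping it on disk,
    # then ignore it going forward:
    git rm -r --cached big-assets/
    echo 'big-assets/' >> .gitignore
    git commit -m "Stop tracking big-assets"

    # Repack and prune loose objects by hand:
    git gc --prune=now

Note that "git rm --cached" only stops tracking from the next commit on; blobs already in history keep their space until the history is rewritten (for example with a tool such as git filter-repo) and garbage-collected.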

Aaron Brager

"git add" needs to internally run "diff-files" equivalent,

With Git 2.20 (Q4 2018), that codepath learned the same optimization "diff-files" already had: running lstat(2) in parallel to find which paths have been updated in the working tree.

See commit d1664e7 (02 Nov 2018) by Ben Peart (benpeart).
(Merged by Junio C Hamano -- gitster -- in commit 9235a6c, 13 Nov 2018)

add: speed up cmd_add() by utilizing read_cache_preload()

During an "add", a call is made to run_diff_files() which calls check_removed() for each index-entry.
The preload_index() code distributes some of the costs across multiple threads.

Because the files checked are restricted to pathspec, adding individual files makes no measurable impact but on a Windows repo with ~200K files, 'git add .' drops from 6.3 seconds to 3.3 seconds for a 47% savings.
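To check whether your setup benefits from this (a sketch; core.preloadIndex has defaulted to true since Git 2.1, so the last line is usually a no-op):

    # The cmd_add() optimization above requires Git >= 2.20:
    git version
    # Force-enable the parallel lstat(2) index preload:
    git config core.preloadIndex true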

VonC