The git hash-object command somehow detects whether the content of a blob is text or binary.
There is also the question of the git configuration context (https://help.github.com/articles/dealing-with-line-endings/). If you configure git to treat certain types of files as binary content, git will act differently. Without knowing that context you may generate the wrong hash. Right?
I think the safest way is to call git hash-object some_file in the context of your project; then you can be 100% sure that it gives the correct result.
Am I right, or am I missing something?
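To make "calling git hash-object in the context of your project" concrete, here is a minimal sketch of what I mean. The helper name gitHashObject and its parameters are hypothetical, and I assume that git is on the PATH and that workDir is a directory inside the project, so that the project's configuration is picked up. --no-filters is the git hash-object option that hashes the bytes exactly as they are on disk, without line-ending conversion or other filters.

```groovy
// Hypothetical helper: run "git hash-object" from inside workDir so that the
// configuration of that project applies to the file being hashed.
String gitHashObject(File workDir, File file, boolean applyFilters = true) {
    def cmd = ['git', 'hash-object']
    if (!applyFilters) {
        cmd << '--no-filters'   // hash the raw bytes, skip line-ending conversion
    }
    cmd << file.absolutePath

    def proc = new ProcessBuilder(cmd)
            .directory(workDir)
            .redirectErrorStream(true)   // fold stderr into stdout for the error message
            .start()
    def output = proc.inputStream.text.trim()   // read everything before waiting
    if (proc.waitFor() != 0) {
        throw new IllegalStateException("git hash-object failed: ${output}")
    }
    return output
}
```

Calling gitHashObject(projectDir, someFile) and gitHashObject(projectDir, someFile, false) on the same file and comparing the two results should show whether the project's settings change the hash of that file.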
Below is code that reproduces the situation.
import org.apache.commons.codec.digest.DigestUtils
import org.apache.commons.lang3.ArrayUtils

class Test3 {
    public static void main(String[] args) {
        // LF line ending: hash "blob <size>\0<content>" by hand, then ask git
        def bytesU = "this \n is a text".bytes
        def fileU = File.createTempFile("someFileU", ".tmp")
        fileU << bytesU
        println DigestUtils.sha1Hex(ArrayUtils.addAll("blob ${bytesU.length}\0".bytes, bytesU))
        println "git hash-object ${fileU.absolutePath}".execute().text

        // CRLF line ending: the same comparison
        def bytesW = "this \r\n is a text".bytes
        def fileW = File.createTempFile("someFileW", ".tmp")
        fileW << bytesW
        println DigestUtils.sha1Hex(ArrayUtils.addAll("blob ${bytesW.length}\0".bytes, bytesW))
        println "git hash-object ${fileW.absolutePath}".execute().text

        // Reference values: an empty blob and a fixed header plus content
        println DigestUtils.sha1Hex(ArrayUtils.addAll("blob 0\0".bytes, [] as byte[]))
        println DigestUtils.sha1Hex(ArrayUtils.addAll("blob 7\0foobar".bytes, [] as byte[]))
    }
}
Below is the output of the program. The third line (the manually computed hash of the CRLF content) differs from the fourth (the result of git hash-object for fileW), apparently because of the line endings.
- 792e2834867278884eeb8b5ff5f1954e1aa68660
- 792e2834867278884eeb8b5ff5f1954e1aa68660
- 7005d7429c4d219c73900f1a02e7980004614ac3
- 792e2834867278884eeb8b5ff5f1954e1aa68660
- e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
- 3a9f0b1970d7ed8d742dc3b9b36736eb03150766
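For comparison, the manual computation can also be sketched with only the JDK, together with a rough stand-in for line-ending normalization. The names blobSha1 and crlfToLf are hypothetical, and crlfToLf simply replaces every CRLF with LF; I am assuming that this approximates what git does for a text file when line-ending conversion is enabled, not claiming it is git's actual algorithm.

```groovy
import java.security.MessageDigest

// Hypothetical helper: SHA-1 over "blob <size>\0" followed by the raw content
String blobSha1(byte[] content) {
    def md = MessageDigest.getInstance('SHA-1')
    md.update("blob ${content.length}\0".toString().getBytes('US-ASCII'))
    md.update(content)
    return md.digest().encodeHex().toString()
}

// Hypothetical helper: drop every CR that is immediately followed by LF
byte[] crlfToLf(byte[] content) {
    def out = new ByteArrayOutputStream(content.length)
    for (int i = 0; i < content.length; i++) {
        boolean crBeforeLf = content[i] == (byte) 13 &&
                i + 1 < content.length && content[i + 1] == (byte) 10
        if (!crBeforeLf) {
            out.write(content[i])
        }
    }
    return out.toByteArray()
}

def lf = 'this \n is a text'.bytes
def crlf = 'this \r\n is a text'.bytes

println blobSha1(lf)               // hash of the LF content
println blobSha1(crlf)             // hash of the raw CRLF content
println blobSha1(crlfToLf(crlf))   // hash after normalizing CRLF to LF
```

The first and third printed values should be identical, while the second should differ from both. If the assumption above holds, that would explain why git can report the LF hash for a file that contains CRLF.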
There is an older post on this which is locked for me, so I've decided to create a separate question. Please merge this into "How to assign a Git SHA1's to a file without Git?"