Find the difference between two text files with one item per line

Question

I have two files:

file 1

dsf
sdfsd
dsfsdf

file 2

ljljlj 
lkklk 
dsf
sdfsd
dsfsdf

I want to display what is in file 2 but not in file 1, so file 3 should look like

ljljlj 
lkklk

score 160 · Answer 1 · answered Nov 02 '10 at 15:19

grep -Fxvf file1 file2

What the flags mean:

-F, --fixed-strings
              Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.    
-x, --line-regexp
              Select only those matches that exactly match the whole line.
-v, --invert-match
              Invert the sense of matching, to select non-matching lines.
-f FILE, --file=FILE
              Obtain patterns from FILE, one per line.  The empty file contains zero patterns, and therefore matches nothing.
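
As a quick check of how these flags combine, here is a small sketch that recreates the question's two files in a temporary directory and adds one extra line, `dsfblah`, to file2. The directory and the variable names `only_in_file2` and `substring_match` are just for illustration. It shows why `-F` and `-x` matter: without them the patterns match as substrings, so `dsfblah` is dropped because it contains `dsf`.

```shell
# Recreate the sample files in a scratch directory (hypothetical paths).
tmpdir=$(mktemp -d)
printf '%s\n' dsf sdfsd dsfsdf > "$tmpdir/file1"
printf '%s\n' ljljlj lkklk dsf sdfsd dsfsdf dsfblah > "$tmpdir/file2"

# Whole-line, fixed-string comparison: lines of file2 that are not in file1.
only_in_file2=$(grep -Fxvf "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$only_in_file2"

# Without -F and -x the patterns match anywhere in a line,
# so dsfblah is wrongly treated as already present in file1.
substring_match=$(grep -vf "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$substring_match"

rm -rf "$tmpdir"
```

The first command reports `dsfblah` as new, while the second one silently loses it.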

answered Nov 02 '10 at 15:19

dogbane

Option `-n` could be added to number the differing lines – boczniak767 Nov 27 '14 at 12:59
Any way to highlight the non-matching part of each line? – PeterVermont May 06 '15 at 19:56
With this you can find the first difference only and print its line number too: `grep -m 1 -Fnxvf file1 file2` – Paolo M Oct 20 '15 at 15:50
Very inefficient on large files. – Ain Tohvri Dec 07 '15 at 11:14

score 59 · Accepted Answer · edited Jun 01 '15 at 19:14

You can try

grep -v -f file1 file2

or, to compare whole lines as fixed strings rather than as regular expressions,

grep -v -F -x -f file1 file2

edited Jun 01 '15 at 19:14

jopasserat

answered Nov 02 '10 at 15:05

krico

This won't work. Try adding `dsfblah` to file2. – dogbane Nov 02 '10 at 15:22
You can fix it with `grep -F -x` – tripleee May 30 '13 at 14:50
I think your suggestion was worth editing into the answer, @tripleee – jopasserat Jun 01 '15 at 19:15
Note that the ordering of the files matters. I'm trying to detect a new addition to a file. I have to write `grep -v -f oldfile newfile` or else it will output nothing. – Marvo Feb 06 '18 at 21:31
Imagine me: git add file1. git commit. cat file2 > file1. git diff. – joshuamabina Dec 02 '18 at 02:44
@krico Can you add an explanation for the parameters that you pass? – Raghvendra Jan 30 '19 at 06:24

score 47 · Answer 3 · answered Nov 02 '10 at 15:29

You can use the comm command, which compares two sorted files line by line:

comm -13 <(sort file1) <(sort file2)
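
For context, `comm` prints three columns (lines only in the first file, lines only in the second, and lines common to both), and `-13` suppresses columns 1 and 3. The sketch below uses temporary files instead of process substitution so that it also runs in a plain POSIX shell. The file names, the variable `comm_result`, and the `LC_ALL=C` setting are assumptions made to keep the example reproducible.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' dsf sdfsd dsfsdf > "$tmpdir/file1"
printf '%s\n' ljljlj lkklk dsf sdfsd dsfsdf > "$tmpdir/file2"

# comm needs both inputs sorted with the same collation order.
LC_ALL=C sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -1 hides lines unique to file1, -3 hides lines common to both,
# leaving only the lines unique to file2.
comm_result=$(LC_ALL=C comm -13 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$comm_result"

rm -rf "$tmpdir"
```

Unlike the grep approach, the result comes back in sorted order rather than in the original order of file2.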

answered Nov 02 '10 at 15:29

dogbane

FYI, it is actually `comm -1 -3 file1 file2`. The two flags `1` and `3` are merged into one. – cevaris Feb 19 '15 at 20:30
comm -23 – user1213320 Mar 31 '17 at 03:07

score 12 · Answer 4 · edited Jun 16 '16 at 13:55

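I successfully used

diff "${file1}" "${file2}" | grep '^<' | sed 's/^< //' > "${diff_file}"

to write the difference to a file. The lines marked with < are the ones diff reports for ${file1} only, so pass the files in the order that puts the lines you want on that side.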

edited Jun 16 '16 at 13:55

answered Nov 02 '12 at 18:57

Luca Borrione

What better way to find differences than to use a diff tool haha. Is there higher overhead with using this versus grep? – Allison Jul 31 '17 at 15:26

score 9 · Answer 5 · answered Nov 02 '10 at 15:09

If the lines are expected to be in the same order in both files, you can just use diff:

diff file1 file2 | grep '^>'

answered Nov 02 '10 at 15:09

Nate

score 7 · Answer 6 · answered Nov 02 '10 at 15:48

join -v 2 <(sort file1) <(sort file2)
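
Here `-v 2` prints the lines of the second input that have no partner in the first. One caveat worth knowing: by default `join` compares only the first whitespace-separated field, which suits the question's one-word lines but not lines that contain spaces. The sketch below illustrates this with made-up, already sorted data; the file names and variables are hypothetical, and using a tab as separator assumes the data contains no tabs.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'apple pie' > "$tmpdir/file1.sorted"
printf '%s\n' 'apple tart' 'banana split' > "$tmpdir/file2.sorted"

# Default: only the first field ("apple") is compared, so "apple tart"
# counts as matched and is not reported.
join_default=$(LC_ALL=C join -v 2 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$join_default"

# With a tab as the field separator, each whole line becomes the join field.
tab=$(printf '\t')
join_whole_line=$(LC_ALL=C join -t "$tab" -v 2 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$join_whole_line"

rm -rf "$tmpdir"
```

The first call reports only `banana split`; the second also reports `apple tart`, which is what a whole-line comparison should give.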

answered Nov 02 '10 at 15:48

Dennis Williamson

score 4 · Answer 7 · edited May 23 '17 at 12:26

I tried a slight variation on Luca's answer and it worked for me.

diff file1 file2 | grep '^>' | sed 's/^> //' > diff_file

Note that the pattern searched for in sed is a > followed by a space.

edited May 23 '17 at 12:26

Community

answered Jan 30 '14 at 11:14

Riccardo Cicuttini

score 3 · Answer 8 · answered Jan 24 '14 at 13:13

file1 
m1
m2
m3

file2 
m2
m4
m5

>awk 'NR == FNR {file1[$0]++; next} !($0 in file1)' file1 file2
m4
m5

>awk 'NR == FNR {file1[$0]++; next} ($0 in file1)' file1 file2
m2

What is the awk command to get m1 and m3, i.e. the lines that are in file1 but not in file2?
m1
m3
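
One way to answer the open question above, offered as a sketch: the awk one-liner loads whichever file is named first and filters the second, so swapping the arguments should yield the lines that are in file1 but not in file2. The array name `seen`, the variable `only_in_file1`, and the temporary paths are arbitrary choices for this example.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' m1 m2 m3 > "$tmpdir/file1"
printf '%s\n' m2 m4 m5 > "$tmpdir/file2"

# Load file2 into the array first, then print file1 lines not found in it.
only_in_file1=$(awk 'NR == FNR {seen[$0]++; next} !($0 in seen)' "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$only_in_file1"

rm -rf "$tmpdir"
```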

score 1 · Answer 9 · answered Nov 02 '10 at 16:01

If you want to use loops, you can try something like this (diff and comm are much more efficient):

# For every line of file2, scan file1 for an identical line.
while IFS= read -r line
do
    flag=0
    while IFS= read -r line2
    do
        if [ "$line" = "$line2" ]
        then
            flag=1
        fi
    done < file1
    # Print the line only if no match was found in file1.
    if [ "$flag" -eq 0 ]
    then
        printf '%s\n' "$line"
    fi
done < file2 > file3

Note: this program is only meant to give a basic insight into what can be done if you don't want to use external commands such as diff and comm.

score 1 · Answer 10 · answered Nov 02 '10 at 17:26

An awk answer:

awk 'NR == FNR {file1[$0]++; next} !($0 in file1)' file1 file2
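
To unpack the one-liner: `NR` counts records across all inputs while `FNR` restarts at 1 for each file, so `NR == FNR` is true only while the first file is being read. The sketch below is the same program spread over several lines with comments; the temporary paths and the variable `awk_result` are assumptions for illustration.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' dsf sdfsd dsfsdf > "$tmpdir/file1"
printf '%s\n' ljljlj lkklk dsf sdfsd dsfsdf > "$tmpdir/file2"

awk_result=$(awk '
    # First file: remember every line as an array key, then skip to the next record.
    NR == FNR { file1[$0]++; next }
    # Second file: print lines that were never stored as a key.
    !($0 in file1)
' "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$awk_result"

rm -rf "$tmpdir"
```

This keeps file2's original order and needs no sorting, at the cost of holding file1's lines in memory. Note that if file1 is empty, `NR == FNR` stays true while file2 is read, so nothing is printed.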

answered Nov 02 '10 at 17:26

glenn jackman

score 0 · Answer 11 · edited May 18 '16 at 12:04

With GNU sed:

sed 's#[^^]#[&]#g;s#\^#\\^#g;s#^#/^#;s#$#$/d#' file1 | sed -f- file2

How it works:

The first sed produces an output like this:

/^[d][s][f]$/d
/^[s][d][f][s][d]$/d
/^[d][s][f][s][d][f]$/d

Then it is used as a sed script by the second sed.

edited May 18 '16 at 12:04

answered May 18 '16 at 07:07

Jahid
