23

As part of some Python tests using the unittest framework, I need to compare two relatively short text files, where one is a test output file and the other is a reference file.

The immediate approach is:

import filecmp
...
self.assertTrue(filecmp.cmp(tst_path, ref_path, shallow=False))

It works fine if the test passes, but in the event of failure the output is not much help:

AssertionError: False is not true

Is there a better way of comparing two files as part of the unittest framework, so some useful output is generated in case of mismatch?

EquipDev

4 Answers

21

To get a report of which line differs, along with a printout of that line, use assertListEqual on the contents, e.g.:

import io

self.assertListEqual(
    list(io.open(tst_path)),
    list(io.open(ref_path)))
Alan W. Smith
Ethan Bradford
  • As I understand it, this will leave the files open until the garbage collector notices, which keeps the files locked for too long on Windows. Consider using context managers to limit the time the files are open. – Oddthinking Sep 04 '21 at 06:13
  • @Oddthinking Probably something like: with open(...) as tst, open(...) as ref: ... Open both files in a with statement; list() works on them as well, so there is no need for io.open and such. The files should be closed once execution leaves the with block. – MolbOrg Oct 09 '21 at 18:01
  • Yes, it's not a lot more complicated to include the auto-closing, e.g. with io.open(tst_path) as tst_f, io.open(ref_path) as ref_f: self.assertListEqual(list(tst_f), list(ref_f)) – Ethan Bradford Oct 12 '21 at 16:42
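
Putting the answer and the comments above together, a minimal sketch of a test that closes both files before comparing could look like the following. The names read_lines, FileLinesTest and assertFileLinesEqual are made up for illustration, and UTF-8 encoding is an assumption about the files, not something stated in the question.

```python
import os
import tempfile
import unittest


def read_lines(path):
    """Return the lines of a text file; the file is closed on return."""
    # Assumption: the files are UTF-8 text.
    with open(path, encoding="utf-8") as f:
        return list(f)


class FileLinesTest(unittest.TestCase):
    def assertFileLinesEqual(self, tst_path, ref_path):
        # Hypothetical helper: both files are read and closed first,
        # then assertListEqual reports where the two lists differ.
        self.assertListEqual(read_lines(tst_path), read_lines(ref_path))

    def test_identical_files(self):
        # Self-contained demo using throwaway files.
        with tempfile.TemporaryDirectory() as tmp:
            tst_path = os.path.join(tmp, "tst.txt")
            ref_path = os.path.join(tmp, "ref.txt")
            for path in (tst_path, ref_path):
                with open(path, "w", encoding="utf-8") as f:
                    f.write("alpha\nbeta\n")
            self.assertFileLinesEqual(tst_path, ref_path)
```

In a real test, tst_path and ref_path would point at the test output file and the reference file from the question.
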
11

All you need to do is pass your own message as the second argument; it is shown when the assertion fails (see the msg parameter in the unittest documentation).

self.assertTrue(filecmp.cmp(...), 'You error message')

Dan
  • A reminder for those who care: if the two files are different, it prints 'You error message' ONLY. – Tengerye Jun 18 '21 at 08:09
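
Since a fixed message says nothing about what actually differs, one option is to build the message from the files themselves. The sketch below uses difflib.unified_diff from the standard library; file_diff, DiffMessageTest and assertFilesMatch are hypothetical names, and UTF-8 encoding is assumed.

```python
import difflib
import unittest


def file_diff(tst_path, ref_path):
    """Return a unified diff of two text files ('' when the lines match)."""
    # Assumption: both files are UTF-8 text.
    with open(ref_path, encoding="utf-8") as ref_f:
        ref_lines = ref_f.readlines()
    with open(tst_path, encoding="utf-8") as tst_f:
        tst_lines = tst_f.readlines()
    diff = difflib.unified_diff(
        ref_lines, tst_lines, fromfile=ref_path, tofile=tst_path
    )
    return "".join(diff)


class DiffMessageTest(unittest.TestCase):
    def assertFilesMatch(self, tst_path, ref_path):
        # Hypothetical helper: fail with the diff as the message.
        diff = file_diff(tst_path, ref_path)
        if diff:
            self.fail("Files differ:\n" + diff)
```

Lines prefixed with "-" come from the reference file and lines prefixed with "+" from the test output, so the failure message shows every changed line rather than a bare "False is not true".
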
2

Comparing the files as lists of lines yields meaningful assertion errors:

assert [row for row in open(actual_path)] == [row for row in open(expected_path)]

You could use that each time you need to compare files, or put it in a function. You could also compare the files as text strings instead of lists.

Adrien H
  • In the event of multiple rows with mismatches, this will only report the first one. Not ideal. – Clint Eastwood Aug 23 '21 at 21:28
  • @ClintEastwood You can always join them I guess. Depending on your use case, it might be enough to fail with only one reported line. – Adrien H Aug 25 '21 at 07:52
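
For the case raised in the comments, where several lines differ, one possible approach within unittest is to compare the files as whole strings with assertMultiLineEqual, whose failure message contains a line-by-line diff. This is only a sketch: read_text, FileTextTest and assertFileTextEqual are illustrative names, and UTF-8 encoding is assumed.

```python
import unittest


def read_text(path):
    """Return the full contents of a text file as one string."""
    # Assumption: the file is UTF-8 text.
    with open(path, encoding="utf-8") as f:
        return f.read()


class FileTextTest(unittest.TestCase):
    # Do not truncate long diffs in the failure message.
    maxDiff = None

    def assertFileTextEqual(self, actual_path, expected_path):
        # Hypothetical helper: on mismatch, the message includes a diff
        # of the two strings, so all changed lines are visible.
        self.assertMultiLineEqual(
            read_text(actual_path), read_text(expected_path)
        )
```

Setting maxDiff to None matters mainly for longer files, where the default limit would otherwise cut the diff short.
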
1

Isn't it better to compare the contents of the two files? For example, if they are text files, compare their text; this will produce a more meaningful error message.

Bart
  • The intention is to compare the contents, so I added ', shallow=False' to 'filecmp.cmp' to make that clear. – EquipDev Feb 28 '17 at 15:31