What is the difference between samtools mark duplicates and remove duplicates? Is it necessary to mark duplicates before removing them with samtools?
2 Answers
samtools rmdup and samtools markdup -r both remove duplicates. Without the -r flag, samtools markdup only flags the duplicates.
To prepare the file for markdup, you'll have to run samtools fixmate -m, which adds the ms and MC tags, and then sort the output by coordinate.
According to the current documentation, rmdup is obsolete. Please do not use rmdup.
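The preparation steps above can be sketched as a small shell function. This is only a sketch: the function name mark_duplicates, its arguments, and the intermediate file names are hypothetical, and the initial collate step is an assumption taken from the example in the samtools markdup manual (fixmate expects reads grouped by name).

```shell
# Hypothetical helper: mark_duplicates IN.bam OUT.bam [remove]
# Runs collate -> fixmate -m -> sort -> markdup, keeping intermediates in a temp dir.
mark_duplicates() {
    in_bam="$1"
    out_bam="$2"
    mode="${3:-mark}"
    tmp="$(mktemp -d)" || return 1

    # 1. Group reads by name so that mates sit next to each other.
    samtools collate -o "$tmp/collated.bam" "$in_bam" || return 1
    # 2. Add the ms (mate score) and MC (mate CIGAR) tags that markdup uses.
    samtools fixmate -m "$tmp/collated.bam" "$tmp/fixmate.bam" || return 1
    # 3. Sort by coordinate, which markdup expects.
    samtools sort -o "$tmp/sorted.bam" "$tmp/fixmate.bam" || return 1
    # 4. Flag duplicates, or drop them when the third argument is "remove".
    if [ "$mode" = "remove" ]; then
        samtools markdup -r "$tmp/sorted.bam" "$out_bam" || return 1
    else
        samtools markdup "$tmp/sorted.bam" "$out_bam" || return 1
    fi

    rm -rf "$tmp"
}
```

Calling `mark_duplicates input.bam marked.bam` would only flag duplicates, while `mark_duplicates input.bam dedup.bam remove` would correspond to markdup -r.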
TL;DR: just use markdup.
rmdup removes duplicates from the BAM file, while markdup, like Picard's MarkDuplicates, only marks duplicates by default and does not hard-remove them; marking is usually the desired behavior. In addition, markdup implements a better algorithm that handles more corner cases and gives more consistent results.
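One reason marking is usually preferable is that the decision stays reversible: marked reads carry the duplicate flag (0x400) and can be counted or filtered later. A minimal sketch, assuming a file already processed by markdup; the function names and file arguments are hypothetical.

```shell
# Hypothetical helpers operating on a BAM file produced by samtools markdup.

# Count the reads that carry the duplicate flag (0x400).
count_duplicates() {
    samtools view -c -f 0x400 "$1"
}

# Write a copy without the flagged reads; the marked input file is left untouched.
drop_duplicates() {
    samtools view -b -F 0x400 -o "$2" "$1"
}
```

For example, `count_duplicates marked.bam` reports how many reads were flagged, and `drop_duplicates marked.bam dedup.bam` produces a duplicate-free copy only when a downstream tool needs one.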
PS: to emphasize, I see no point in using samtools rmdup nowadays. It has been declared obsolete, as @kohlkopf said.