After aligning with BWA, I removed duplicates using samtools (version 1.9).

My procedure is as follows:

bwa mem -k 32 -M ref.fa read1 read2 > out.sam
samtools view -@ 0 -b -T ref.fa -o out.bam out.sam
samtools sort -n -o out.nameSrt.bam out.bam
samtools fixmate -r -m out.nameSrt.bam out.fixmate.bam
samtools sort -o out.fixmate.sort.bam out.fixmate.bam
samtools markdup -r -S -s out.fixmate.sort.bam out.markdup.bam
samtools flagstat out.markdup.bam > out.markdup.flagstat

The flagstat output is as follows:

21611397 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
21611397 + 0 mapped (100.00% : N/A)
21422330 + 0 paired in sequencing
10711165 + 0 read1
10711165 + 0 read2
19797684 + 0 properly paired (92.42% : N/A)
21422330 + 0 with itself and mate mapped
0 + 0 singletons (0.00% : N/A)
1306000 + 0 with mate mapped to a different chr
727043 + 0 with mate mapped to a different chr (mapQ>=5)

Why does samtools flagstat still report more reads in total than paired in sequencing after the duplicates have been removed? Is there anything wrong with my procedure?

llrs

1 Answer

BWA-MEM may produce chimeric alignments, where different parts of a read are mapped to distinct loci. flagstat counts each of those alignments as a separate read, so one chimeric read is counted as two.
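
To see how large the gap is, here is a minimal sketch that subtracts the "paired in sequencing" count from the "in total" count. The two lines are copied from the flagstat output in the question; the variable names are only for illustration.

```shell
# Two lines copied from the flagstat output in the question.
flagstat_output='21611397 + 0 in total (QC-passed reads + QC-failed reads)
21422330 + 0 paired in sequencing'

# The first field of each line is the QC-passed count for that category.
total=$(printf '%s\n' "$flagstat_output" | awk '/in total/ {print $1}')
paired=$(printf '%s\n' "$flagstat_output" | awk '/paired in sequencing/ {print $1}')

# Records counted in the total but not counted as paired in sequencing.
extra=$((total - paired))
echo "$extra"
```

On the real file, something like `samtools view -c -F 1 out.markdup.bam` (records without the paired flag) or `samtools view -c -f 2048 out.markdup.bam` (supplementary records) should show which kind of record makes up that difference, assuming the BAM is named as in the question.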

user172818