Our BAM files are created following the "lossless" alignment procedure [1] from the Broad Institute GATK documentation, which involves merging the unaligned/unmapped reads back into the aligned BAM using Picard's MergeBamAlignment.
The resulting BAM files contain both the mapped and the unmapped reads. These files are then sorted with SortSam [2], so that the sort order in the header becomes:
@HD VN:1.6 SO:coordinate
How does MarkDuplicates handle the unmapped reads in a BAM file that contains both unmapped and mapped reads?
Note that MarkDuplicates normally seems to take the BAM's ordering into account: it accepts arguments such as --ASSUME_SORT_ORDER X. However, it is not specified whether reads without a position are ignored or have to be compared against all other reads.
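One way to probe this empirically would be to tally the SAM FLAG bits in the MarkDuplicates output. Below is a minimal Python sketch, assuming the output has been converted to SAM text (for instance with samtools view -h). The function name tally_flags and the demo records are made up for illustration. It relies only on the SAM specification's FLAG bits 0x4 (segment unmapped) and 0x400 (PCR or optical duplicate); it does not show how MarkDuplicates treats these reads internally, only what ends up flagged.

```python
from collections import Counter

# FLAG bits defined by the SAM specification.
UNMAPPED = 0x4      # segment unmapped
DUPLICATE = 0x400   # PCR or optical duplicate


def tally_flags(sam_lines):
    """Count records by (mapped/unmapped, duplicate/not_duplicate).

    `sam_lines` is any iterable of SAM text lines; header lines
    (starting with '@') and blank lines are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second SAM column
        mapping = "unmapped" if flag & UNMAPPED else "mapped"
        dup = "duplicate" if flag & DUPLICATE else "not_duplicate"
        counts[(mapping, dup)] += 1
    return counts


# Hypothetical records for illustration only (not real data).
DEMO_RECORDS = [
    "@HD\tVN:1.6\tSO:coordinate",
    "\t".join(["r1", "99", "chr1", "100", "60", "4M", "=", "200", "104", "ACGT", "FFFF"]),
    "\t".join(["r1", "147", "chr1", "200", "60", "4M", "=", "100", "-104", "ACGT", "FFFF"]),
    "\t".join(["r2", "1123", "chr1", "100", "60", "4M", "=", "200", "104", "ACGT", "FFFF"]),
    "\t".join(["r3", "77", "*", "0", "0", "*", "*", "0", "0", "ACGT", "FFFF"]),
    "\t".join(["r3", "141", "*", "0", "0", "*", "*", "0", "0", "ACGT", "FFFF"]),
]

for (mapping, dup), n in sorted(tally_flags(DEMO_RECORDS).items()):
    print(mapping, dup, n, sep="\t")
```

On a real file, the same function could be fed an open SAM file handle instead of DEMO_RECORDS. If the ("unmapped", "duplicate") count comes out as zero, that would at least suggest the unmapped reads are not being flagged, though it would not say whether they were compared against other reads along the way.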
Disclaimer: I initially posted this question on the GATK forum [3], but I'm also asking here in the hope of reaching a broader audience.
Citations: