There are nicer genomics visualization tools available, but the samtools tview command is almost always my go-to for a quick first look at read alignments. I just brought up the following locus in tview.

81055951  81055961  81055971  81055981                                                81055991  81056001  81056011  81056021
GTGGAGGTCGGCAGCAGAGCCTGTGGCAGCT**********************************************CTGAGGGTCGGCGTGGCCTCCTGGTGGAAGTTGCACCTGGTTTGCTT
...............................                                              ...............................................
...............................**********************************************..       ......................................
   ,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,       .....T................................
.....                  ,,,,,,,,**********************************************,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
.......                      ,,**********************************************,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,                                                                    ...............................................
.............                                                                .........................................A.....
........................                                                     ...............................................
...............................**********************************************..         ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,           ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...............................**********************************************.A......                 ,,,,,,,,,,,,,,,,,,,,,,
...............................**********************************************..                         ....................
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,                         ....................
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,,,,,,,,,,,,,,,,,,,,,,,,,,,    ,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,        ,,,,,
...............................**********************************************..                               ,,,,,,,,,,,,,,
...............................**********************************************..                                             
...............................**********************************************..                                             
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,                                             
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,                                             
..................T............**********************************************...............................................
...............................**********************************************...............................................
...............................**********************************************...............................................
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...............................**********************************************...............................................
...............................**********************************************...............................................
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,                                             
..............G................**********************************************...............................................
,,c,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,                                             
...............................CTAAGCAAGTAACTGTGGTGTTAGAACCTACCTCCCTCGCCGGGCA...............................................
...............................**********************************************...............................................
...............................**********************************************........................................A......
...............................**********************************************...........................................T...
,,,,,,,,,,,,,,,,,,,,,g,,,,,,,,,**********************************************,,                                             
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,**********************************************,,                                             
                                                                             ...............................................
                                                                             ...............................................
                                                                             ...............................................
                                                                             ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
                                                                             ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,a, 
                                                                             ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,                
                                                                             ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
                                                                             ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
                                                                             ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

At first glance there appears to be only a single read spanning this insertion, although the abrupt end of many reads at the insertion breakpoints made me suspicious. Indeed, upon closer inspection, 19 additional reads have primary alignments with soft clipping at the insertion breakpoints.
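For anyone who wants to repeat that closer inspection, here is a minimal sketch of how soft clipping near a breakpoint could be detected from the POS and CIGAR fields alone. The function names, the 2 bp tolerance, and the example read are hypothetical; the breakpoint coordinate (the insertion sits after 81055981) is read off the tview display above.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (leading, trailing) soft-clip lengths for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips, so ignore them here.
    core = [(n, op) for n, op in ops if op != "H"]
    lead = core[0][0] if core and core[0][1] == "S" else 0
    trail = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return lead, trail


def reference_end(pos, cigar):
    """1-based position of the last reference base covered by the alignment."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")
    return pos + ref_len - 1


def clipped_near(pos, cigar, breakpoint, tolerance=2):
    """True if the read is soft-clipped within `tolerance` bp of `breakpoint`.

    `breakpoint` is the last reference base before the insertion (assumption).
    """
    lead, trail = soft_clips(cigar)
    if trail and abs(reference_end(pos, cigar) - breakpoint) <= tolerance:
        return True
    if lead and abs(pos - (breakpoint + 1)) <= tolerance:
        return True
    return False


# Hypothetical read: starts at 81055951, 33 aligned bases, 67 clipped bases.
print(clipped_near(81055951, "33M67S", 81055981))
```

POS and CIGAR are columns 4 and 6 of each record printed by samtools view, so the same functions can be applied to every read overlapping the locus to count clipped reads.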

By default, soft-clipped portions of the reads are not displayed. Is there any way to toggle this behavior?

Daniel Standage
  • How would they be displayed? There's nothing in the CIGAR strings to align the soft-clipped bits to the reference – gringer Mar 30 '19 at 04:11
  • They could be displayed in the same way that igv does it: as a string of unassigned bases at the end of each read. – winni2k Aug 19 '20 at 14:52

1 Answer

A program called gridss does this. It can be found here. It extracts the clipped bases and repeatedly realigns them to the reference with the aligner of your choice (bwa by default), until there are no further realignments. After installation, a simple command line for this tool is:

java -Xmx512M -cp gridss-VERSION-with-dependencies.jar gridss.SoftClipsToSplitReads I=your_input.bam O=your_output.bam REFERENCE_SEQUENCE=your_reference.fa
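To make the "extracts the clipped bases" step concrete, here is a rough sketch of how the soft-clipped portions of a read could be pulled out of its SEQ field using the CIGAR string. This is only an illustration of the idea, not how gridss implements it; the function name and example reads are hypothetical.

```python
import re


def clipped_bases(cigar, seq):
    """Return (leading, trailing) soft-clipped sequence for one read.

    `cigar` and `seq` are the CIGAR and SEQ fields of a SAM record.
    """
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    # Hard-clipped bases are not present in SEQ, so drop H operations.
    ops = [(n, op) for n, op in ops if op != "H"]
    lead = ops[0][0] if ops and ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    # Slice from the end using an explicit index so trail == 0 yields "".
    return seq[:lead], seq[len(seq) - trail:]


# Hypothetical read with 2 clipped bases on the left and 3 on the right.
print(clipped_bases("2S4M3S", "AACCCCGGG"))
```

The clipped strings returned here are the pieces a realigner would then try to place elsewhere on the reference.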

I hope this helps!

d_kennetz
  • Could you please explain why you downvoted? – d_kennetz Apr 04 '19 at 22:45
  • I haven't voted up yet since I haven't had a chance to try out gridss. But yeah, that's an unfair downvote. – Daniel Standage Apr 05 '19 at 03:15
  • This answer does not appear to answer the OP's question? – winni2k Aug 19 '20 at 14:51
  • @winni2k would it be more beneficial for me to have posted an answer that says, "No"? OP's question boiled down to, "Is there any way to toggle this behavior?" There is no way to toggle this behavior, so I proposed an alternate program as a solution. Seems reasonable to me. – d_kennetz Aug 19 '20 at 16:22
  • Just explaining why someone might downvote. I think it's great that you tried to help though. – winni2k Aug 20 '20 at 20:05