I’m trying to create a CRAM file that stores the path to its FASTA reference as a relative path, rather than an absolute path, so that I can move the files around. Unfortunately I can’t get this to work. I expected the following command to do it:
⟩⟩⟩ samtools view -C -T ../reference/ref.fa -o output.cram input.bam
However, the resulting file contains an absolute path in its header:
⟩⟩⟩ samtools view -H output.cram
…
@SQ SN:1 LN:249250621 M5:hash UR:/absolute/path/to/data/mapped/../reference/ref.fa
…
As a result, I can’t open the file through a different mount point, where the absolute paths differ, and I can’t move the file (and its reference) to another location or another machine.
I know that I could set the REF_PATH environment variable or specify -T when reading the file, but I would like to avoid this: the resulting file needs to be readable by IGV, launched by users who don’t know how to set environment variables.
Is there a way of creating a CRAM file that stores a relative path to its reference?
M5 hash, which is far more useful in an archival context than a file path on someone else's potentially long-gone filesystem.
If there's an issue to be raised, it's against htslib, which unconditionally makes the file path absolute when adding UR from a -T argument. You would indeed be able to set it arbitrarily when reheadering. – John Marshall Oct 23 '18 at 09:49
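The reheadering route mentioned in the comment could look roughly like the sketch below. It is not taken from the thread: the function names rewrite_ur and reheader_cram and the temporary file name are made up for illustration, and it assumes samtools is on the PATH. Whether htslib or IGV then resolves a relative UR against the CRAM file's location rather than the current working directory is not confirmed here, so treat that part as something to test.

```shell
#!/bin/sh
# Rewrite the UR: field of every @SQ line in a SAM header read from stdin.
# $1 is the replacement value (e.g. ../reference/ref.fa).
# rewrite_ur is a hypothetical helper name, not part of samtools.
rewrite_ur() {
    awk -v ur="$1" 'BEGIN { FS = OFS = "\t" }
        /^@SQ/ { for (i = 2; i <= NF; i++) if ($i ~ /^UR:/) $i = "UR:" ur }
        { print }'
}

# Dump the header, rewrite UR, and write a reheadered copy of the CRAM.
# $1 = input CRAM, $2 = new UR value, $3 = output CRAM
# reheader_cram is likewise a hypothetical wrapper around real samtools calls.
reheader_cram() {
    samtools view -H "$1" | rewrite_ur "$2" > "$1.header.sam" &&
    samtools reheader "$1.header.sam" "$1" > "$3"
}
```

After sourcing the functions, usage would be something like:
⟩⟩⟩ reheader_cram output.cram ../reference/ref.fa output.relative.cram
followed by samtools view -H output.relative.cram to check the UR field.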