
I have a VCF file with SNPs from a bacterial genome and want to find out whether the SNPs are located inside genes. Is there a CLI tool that takes a VCF file and a GFF or GBK file and returns the names of the genes?

haegglund
    Please show us an example of your inputs. What format are your SNPs in? Are they in a proper VCF file? Are they in some other format, one per line? Many per line? Do you want a command-line solution? An online graphical tool? Please edit your question and clarify. – terdon Jun 21 '17 at 11:16

1 Answer


Via BEDOPS:

$ gff2bed < annotations.gff > annotations.bed
$ vcf2bed < snps.vcf > snps.bed
$ bedmap --echo --echo-map-id-uniq snps.bed annotations.bed > answer.bed

This can be reduced to a one-liner if you're using bash:

$ bedmap --echo --echo-map-id-uniq <(vcf2bed < snps.vcf) <(gff2bed < annotations.gff) > answer.bed
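
To show what each output line looks like: as far as I know, bedmap by default separates the echoed SNP line from the mapped IDs with `|` and joins multiple IDs with `;`. Below is a toy awk sketch of that overlap logic only. It is not how BEDOPS is implemented, and the file names, coordinates, gene names and the `map_ids` helper are all made up for illustration.

```shell
# Toy BED inputs (0-based, half-open intervals); every value here is hypothetical.
printf 'chr1\t99\t100\tsnp1\nchr1\t499\t500\tsnp2\nchr1\t899\t900\tsnp3\n' > toy_snps.bed
printf 'chr1\t50\t200\tgeneA\nchr1\t90\t150\tgeneB\nchr1\t450\t600\tgeneC\n' > toy_genes.bed

# map_ids SNPS.bed GENES.bed
# For each SNP, print its line, then "|", then the ";"-joined IDs of overlapping genes.
# Two half-open intervals overlap when snp_start < gene_end and gene_start < snp_end.
map_ids() {
  awk 'BEGIN { FS = "\t" }
       # First file (genes): remember every interval and its ID.
       NR == FNR { chr[FNR] = $1; s[FNR] = $2 + 0; e[FNR] = $3 + 0; id[FNR] = $4; n = FNR; next }
       # Second file (SNPs): collect the IDs of all genes that overlap this SNP.
       {
         hits = ""
         for (i = 1; i <= n; i++) {
           if (chr[i] == $1 && ($2 + 0) < e[i] && s[i] < ($3 + 0)) {
             sep = (hits == "" ? "" : ";")
             hits = hits sep id[i]
           }
         }
         print $0 "|" hits
       }' "$2" "$1"
}

map_ids toy_snps.bed toy_genes.bed
```

With this toy data, snp1 is reported with `geneA;geneB`, snp2 with `geneC`, and snp3 with nothing after the `|`, i.e. it does not fall inside any annotated interval. The sketch compares every SNP against every gene and keeps IDs in file order, so it is only meant for reading the output format; for real data, use the bedmap commands above on sorted input.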
Alex Reynolds