I need software or a Python program for converting VCF (.vcf) to FASTA (.fa) format using a reference (.fa)
Not quite a dupe since it asks for an R or Python approach, but very relevant: Converting a VCF into a FASTA given a reference with Python, R. – terdon Sep 22 '19 at 18:27
https://bioinformatics.stackexchange.com/questions/2825/converting-a-vcf-into-a-fasta-given-a-reference-with-python-r – M__ Sep 23 '19 at 01:46
Also vcftools. – M__ Sep 23 '19 at 01:47
1 Answer
I have used GATK's FastaAlternateReferenceMaker for this before. You will need to download GATK first. Then prepare your reference genome (reference.fasta) and your VCF file (input.vcf), and call the tool like this:
java -jar GenomeAnalysisTK.jar \
-T FastaAlternateReferenceMaker \
-R reference.fasta \
-o output.fasta \
-V input.vcf
output.fasta will contain the new FASTA, with SNPs inserted at the sites specified by the VCF file.
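Since the question also asks about a Python program, here is a minimal standard-library sketch of the same idea: read the reference, substitute the ALT base at each SNP position, and write the result. It is not how GATK implements this. It assumes single-base REF/ALT records only, always takes the first ALT allele, ignores genotypes, and skips indels; the function names (`read_fasta`, `apply_snps`, `write_fasta`) are made up for illustration.

```python
import io


def read_fasta(handle):
    """Parse FASTA text into {contig_name: list_of_bases}."""
    seqs = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the ID before any whitespace, as most tools do.
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].extend(line)
    return seqs


def apply_snps(seqs, vcf_handle):
    """Substitute single-base ALT alleles into seqs in place.

    Returns the number of SNPs applied.
    """
    applied = 0
    for line in vcf_handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        alt = fields[4].split(",")[0]  # assumption: first ALT allele only
        if len(ref) != 1 or len(alt) != 1 or alt.upper() not in "ACGTN":
            continue  # skip indels, symbolic alleles and "."
        if chrom not in seqs:
            continue  # contig missing from the reference
        idx = pos - 1  # VCF positions are 1-based
        if idx < 0 or idx >= len(seqs[chrom]):
            continue
        if seqs[chrom][idx].upper() != ref.upper():
            raise ValueError(f"REF mismatch at {chrom}:{pos}")
        seqs[chrom][idx] = alt
        applied += 1
    return applied


def write_fasta(seqs, handle, width=60):
    """Write sequences as FASTA, wrapped at `width` bases per line."""
    for name, bases in seqs.items():
        handle.write(f">{name}\n")
        seq = "".join(bases)
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


if __name__ == "__main__":
    # Tiny in-memory demo; with real data, pass open file handles instead.
    reference = io.StringIO(">chr1 demo\nACGTACGT\n")
    vcf = io.StringIO("#CHROM\tPOS\tID\tREF\tALT\nchr1\t2\t.\tC\tT\n")
    sequences = read_fasta(reference)
    apply_snps(sequences, vcf)
    out = io.StringIO()
    write_fasta(sequences, out)
    print(out.getvalue(), end="")
```

The REF check is there on purpose: if the base in the reference does not match the REF column, the VCF was probably called against a different assembly, and silently substituting would produce a wrong sequence.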
StupidWolf
I got this error when I used GATK on a Linux platform:
A USER ERROR has occurred: Input files reference and features have incompatible contigs: No overlapping contigs found.
– David Enoma May 14 '21 at 11:41
How do I resolve this, please?
Additionally, the VCF files come from an external source, and I subset them to get a specific sample. My reference is https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/GRCh38_latest_genomic.fna.gz
– David Enoma May 14 '21 at 12:13
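The error says that the reference and the VCF have no contig names in common. A likely cause, though not confirmed for these particular files, is a naming mismatch: RefSeq FASTA files name chromosomes by accession (for example NC_000001.11), while many VCFs use chr1 or 1. The sketch below is one way to check this by listing which names appear in each file; the helper names (`open_text`, `fasta_contigs`, `vcf_contigs`, `compare_contigs`) are hypothetical, and it uses only the standard library.

```python
import gzip
import io


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_contigs(handle):
    """Collect contig IDs (text up to the first whitespace) from FASTA headers."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def vcf_contigs(handle):
    """Collect the CHROM values used by the VCF records."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def compare_contigs(fasta_handle, vcf_handle):
    """Return the contig names shared by, and unique to, each file."""
    ref = fasta_contigs(fasta_handle)
    vcf = vcf_contigs(vcf_handle)
    return {
        "shared": ref & vcf,
        "only_in_reference": ref - vcf,
        "only_in_vcf": vcf - ref,
    }


if __name__ == "__main__":
    # In-memory demo; for real files use open_text("reference.fna.gz") etc.
    fasta = io.StringIO(">NC_000001.11 Homo sapiens chromosome 1\nACGT\n")
    vcf = io.StringIO("#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\n")
    for key, names in compare_contigs(fasta, vcf).items():
        print(key, sorted(names))
```

If "shared" comes back empty, the contigs in one of the files need renaming so that both use the same convention. For the VCF side, bcftools annotate with --rename-chrs and a two-column mapping file is one option. The alternative is to use a reference whose contig names already match the VCF, provided it is the same assembly the variants were called against.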