7

I have a FASTA file that I would like to convert to FASTQ format, because the tool I want to run on my data only accepts FASTQ input. Dummy quality scores are fine.

Note: I am not using any functionality of the tool that will require the quality score.

Michael Hall

5 Answers

6

A simple Biopython solution:

from Bio import SeqIO

for r in SeqIO.parse("myfile.fa", "fasta"):
    r.letter_annotations["solexa_quality"] = [40] * len(r)
    print(r.format("fastq"), end='')

Example:

$ cat << EOF > myfile.fa 
>1
ATG
>2
ATGTAGA
EOF
$ python3 myscript.py 
@1
ATG
+
III
@2
ATGTAGA
+
IIIIIII
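
The I characters in the output are how a score of 40 appears in Sanger-style FASTQ, which stores each quality value as the ASCII character with code score + 33. A minimal sketch of that encoding using only the standard library (the helper names encode_quality and decode_quality are made up for illustration and are not part of Biopython):

```python
# Sanger-style FASTQ stores each quality score as chr(score + 33).
PHRED_OFFSET = 33


def encode_quality(scores):
    """Turn a list of integer quality scores into a FASTQ quality string."""
    return "".join(chr(q + PHRED_OFFSET) for q in scores)


def decode_quality(quality_string):
    """Turn a FASTQ quality string back into a list of integer scores."""
    return [ord(c) - PHRED_OFFSET for c in quality_string]


print(encode_quality([40] * 3))  # III
print(decode_quality("III"))     # [40, 40, 40]
```

Any printable character works as a dummy value if the downstream tool ignores qualities; 40 is simply a conventional "high quality" choice.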
Chris_Rands
4

One way of doing this is with two subcommands from the pyfastaq suite.

fastaq to_fake_qual in.fasta - | fastaq fasta_to_fastq in.fasta - out.fastq

The first tool, to_fake_qual, creates a fake quality score (default 40) for each base; the - argument sends the resulting .qual file to stdout. The second tool, fasta_to_fastq, reads the original FASTA file together with the quality scores arriving on stdin (again given as -) and combines them into a FASTQ file.

Michael Hall
2

I think the Python package bioconvert does what you want if you don't provide qualities: https://github.com/bioconvert/bioconvert/blob/master/bioconvert/fasta2fastq.py

The following outputs lots of warnings for me, but does generate a fastq file test.fq from an existing fasta file test.fa (using dummy quality "I", and adding "None" to sequence headers that do not have a comment part in the original fasta):

bioconvert fasta2fastq test.fa test.fq

(This requires installing bioconvert using pip beforehand: pip install bioconvert.)

bli
2

Here's a quick one-liner with no dependencies other than the UNIX core utils and Perl.

cat seqs.fasta | paste - - | perl -ne 'chomp; s/^>/@/; @v = split /\t/; printf("%s\n%s\n+\n%s\n", $v[0], $v[1], "B"x length($v[1]))' > seqs.fastq

Caveat: this assumes your FASTA sequences are not wrapped, i.e. each record occupies exactly two lines (header, then sequence).
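
If the sequences are wrapped over several lines, the same idea can be sketched in plain Python with no third-party packages. This is an illustrative sketch, not a tested tool: the function names read_fasta and fasta_to_fastq are made up here, and the dummy quality character B simply mirrors the one-liner above.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_fastq(in_handle, out_handle, quality_char="B"):
    """Write each FASTA record as a four-line FASTQ record with dummy qualities."""
    for header, seq in read_fasta(in_handle):
        out_handle.write(f"@{header}\n{seq}\n+\n{quality_char * len(seq)}\n")


# Small demonstration with a wrapped second record.
demo_in = io.StringIO(">1\nATG\n>2\nATGT\nAGA\n")
demo_out = io.StringIO()
fasta_to_fastq(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For real files, pass open file handles instead of the StringIO objects, e.g. open("seqs.fasta") and open("seqs.fastq", "w").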

Daniel Standage
0

I have just uploaded a Python tool to convert fasta to fastq: https://github.com/Mostafa-MR/A_tool_to_convert_fasta_to_fastq

Please note that this is provided as a Jupyter notebook, i.e. it is an .ipynb file.

M__