The Vertebrate Genomes Project (VGP) has a lot of interesting publications, such as this one.

The rough pipeline is outlined below:

Here is the pipeline in more detail:

While the paper describes all the steps of the iterative assembly pipeline, I cannot find any information on its runtime.

I was excited to check out their GitHub page to find out more in the documentation. Unfortunately, I did not find any information on the runtime in the https://github.com/VGP/vgp-assembly repo.

I have written some e-mails to VGP collaborators, but I have not received any answers yet.

Has any of you applied this assembly pipeline? Could anyone tell me its runtime?

I hope this question is in the appropriate Stack Exchange site; if not, please point me to the one you think fits best. Thanks for your suggestions, and please be gentle, I'm new to all of this.

For more info on the VGP check out this link

ilam engl
  • Just a comment (please don't turn this into an answer): like much work in genomics, the VGP paper was already outdated when it was published. These days, 10x is discontinued and few are using CLR. When budget allows, go for HiFi. HiFi assembly is much faster (half a day for mammals) and is of much higher quality. – user172818 Aug 25 '21 at 08:15
  • Could you please be more specific, e.g. by posting a publication and the brand of a platform you'd recommend? Thanks in advance – ilam engl Aug 25 '21 at 12:36
  • From what I understand the VGP pipeline includes HiFi on the PacBio platform anyway - so I'm not sure if you mean something else and newer... – ilam engl Aug 25 '21 at 12:41
  • Sorry for the late response. I haven't seen their v2.0 pipeline. Its paper, many of its repos and your original figure are about obsolete data types. For PacBio HiFi, you mostly need an assembler only. No need to apply the VGP pipeline. – user172818 Aug 28 '21 at 01:08
  • @172818 Looking at papers I've seen only publications using a 'hybrid' approach... You say that if I were to use PacBio HiFi I would need an 'assembler only'. What workflow would you suggest? If I were to produce a high-quality assembly, should I still use several platforms, e.g. including Illumina for polishing? – ilam engl Sep 08 '21 at 12:06

1 Answer

Note to OP that your tag of @user172818 was not successful in your last comment. That's probably why they didn't respond.

Because no one has gone there so far, I will simply link to hifiasm. See the linked Nature Methods paper. This tool accepts HiFi reads, which are now standard, and gives you a GFA that you can quite easily convert to FASTA.
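To make the GFA-to-FASTA step concrete, here is a minimal sketch in Python. It assumes the GFA stores each contig's sequence on its `S` (segment) lines, as the GFA format specifies; the function names and the toy input are my own and purely illustrative, not part of any tool.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple


def gfa_segments(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) for every S (segment) line of a GFA file."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Segment lines look like: S <name> <sequence> [optional tags...]
        # A sequence of "*" means "not stored in this file", so skip it.
        if len(fields) >= 3 and fields[0] == "S" and fields[2] != "*":
            yield fields[1], fields[2]


def gfa_to_fasta(gfa: Iterable[str], fasta: TextIO, width: int = 80) -> int:
    """Write all segments as FASTA records; return the number written."""
    count = 0
    for name, seq in gfa_segments(gfa):
        fasta.write(f">{name}\n")
        # Wrap the sequence at `width` characters per line.
        for i in range(0, len(seq), width):
            fasta.write(seq[i:i + width] + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny made-up GFA, only for illustration.
    demo = io.StringIO("H\tVN:Z:1.0\nS\tctg1\tACGTACGT\tLN:i:8\nS\tctg2\tGGCC\n")
    out = io.StringIO()
    gfa_to_fasta(demo, out, width=4)
    print(out.getvalue(), end="")
```

For a real assembly you would pass open file handles instead of the `StringIO` objects, e.g. `gfa_to_fasta(open("asm.gfa"), open("asm.fa", "w"))`, where both file names are placeholders.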

Polishing with Illumina would actively harm the assembly, as it would remove real heterozygosity or add errors where Illumina reads map ambiguously.

Run "out of the box" with default options on HiFi reads, it will probably give you much better, haplotype-resolved results than the VGP pipeline.

If you also add paired-end Hi-C reads with no additional options, it will give you something much, much better than VGP.
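As a rough illustration of the two ways of running it, here is a small Python sketch that only builds the command line (it does not run anything). The flags `-o`, `-t`, `--h1` and `--h2` are the ones I understand the hifiasm README to document; please check `hifiasm -h` for your version. The helper name, file names and thread count are hypothetical.

```python
import shlex
from typing import List, Optional


def hifiasm_command(
    hifi_reads: str,
    prefix: str,
    threads: int = 16,
    hic_r1: Optional[str] = None,
    hic_r2: Optional[str] = None,
) -> List[str]:
    """Build a hifiasm command line, optionally with paired-end Hi-C reads."""
    # Hi-C reads only make sense as a pair: both mates or neither.
    if (hic_r1 is None) != (hic_r2 is None):
        raise ValueError("Hi-C reads must be given as a pair (R1 and R2)")
    cmd = ["hifiasm", "-o", prefix, "-t", str(threads)]
    if hic_r1 is not None and hic_r2 is not None:
        # Assumed flags for Hi-C integration: --h1 / --h2.
        cmd += ["--h1", hic_r1, "--h2", hic_r2]
    cmd.append(hifi_reads)
    return cmd


if __name__ == "__main__":
    # HiFi only, default options (file names are placeholders).
    print(shlex.join(hifiasm_command("hifi.fq.gz", "asm")))
    # HiFi plus paired-end Hi-C reads.
    print(shlex.join(hifiasm_command("hifi.fq.gz", "asm",
                                     hic_r1="hic_R1.fq.gz",
                                     hic_r2="hic_R2.fq.gz")))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.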

It's great. See the docs for practical info on running it.

There is no reason to follow the VGP workflows, as they are very outdated, as pointed out in the comments.

Maximilian Press