A problem I regularly face is that URLs for downloading bioinformatics data (e.g., RefSeq releases or NCBI genome releases) disappear.

Does anyone have a good solution for this?

Manuel
  • Do you have examples of specific URLs that have stopped working? Some projects try very hard to provide permalinks. – Kusalananda Jun 01 '17 at 02:50
  • Genome releases are versioned, so the URL may move but the data is still there. – Devon Ryan Jun 01 '17 at 06:47
  • Could you specify for which platform or kind of data you need a stable URL? Many sites provide an FTP server for big downloads. – llrs Jun 01 '17 at 07:19
  • I think this is a good question, maybe with minor modifications: URIs for big databases rarely disappear, but stable identifiers aren’t always obviously marked (e.g. Ensembl defaults to the “current” release rather than a stable URI). – Konrad Rudolph Jun 01 '17 at 09:45
  • The question should be flavoured with some examples, but it raises a very good point. – Kamil S Jaron Jun 01 '17 at 12:15
  • I was referring to NCBI stuff, e.g., dbSNP, where the latest release for GRCh37 has disappeared and has to be interpolated from a UCSC table. – Manuel Jun 01 '17 at 19:59

1 Answer

The persistent uniform resource locator (PURL) is one such solution. PURLs are designed to be more robust than permalinks insofar as they are supposed to survive a change of domain name. The bio-ontology community already uses them: http://purl.bioontology.org/docs/index.html
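
To illustrate the indirection a PURL relies on, here is a minimal sketch in Python. It is a toy in-memory model, not how any real PURL service is implemented (those typically answer with an HTTP redirect). The class `PurlResolver` and the hosts `purl.example`, `old-host.example`, and `new-host.example` are hypothetical names made up for this example.

```python
from urllib.parse import urlsplit


class PurlResolver:
    """Toy resolver: maps a stable path to wherever the data currently lives."""

    def __init__(self):
        self._targets = {}

    def register(self, purl_path, target_url):
        # Point (or re-point) a persistent path at its current location.
        self._targets[purl_path] = target_url

    def resolve(self, purl):
        # Only the path is used as the key, so the lookup does not depend
        # on where the data is hosted today.
        path = urlsplit(purl).path
        try:
            return self._targets[path]
        except KeyError:
            raise LookupError(f"no target registered for {path!r}") from None


# Hypothetical persistent URL that users would cite or put in their scripts.
PURL = "https://purl.example/data/example-release"

resolver = PurlResolver()
resolver.register("/data/example-release",
                  "https://old-host.example/releases/example.tar.gz")
first = resolver.resolve(PURL)

# The data moves to another host; only the resolver entry is updated.
resolver.register("/data/example-release",
                  "https://new-host.example/archive/example.tar.gz")
second = resolver.resolve(PURL)

print(first)
print(second)
```

The URL that users cite never changes; only the mapping behind it does. This only helps if the data provider, or someone else, keeps the mapping up to date.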

Matt Bashton