I've converted dog coordinates to human coordinates using UCSC LiftOver. The regions are 200 bp intergenic regions that are differentially methylated between normal dogs and dogs with cancer. After the conversion, I found that many of them overlap differentially methylated regions that we found in the human model.

  1. Is this okay to do?
  2. Which stringency settings should I check or modify, for example the minimum ratio of bases that must overlap?
  3. Lastly, is LiftOver taking each 200 bp dog sequence and requiring that at least 10% of it align to a region of the human genome?

I could not find an answer online.

Maximilian Press

1 Answer

According to this old answer, liftOver is not recommended for this use. For between-species liftovers, the claim is that you are likely to want a different tool, such as pslMap. For a rather detailed discussion of how liftOver operates in this context and how to parameterize it properly, see here.

See also this old Biostars question, which addresses mapping between entirely different genomes rather than between versions of the same assembly. That thread also notes that using liftOver this way can be dangerous. That isn't to say that you can't learn something interesting, just that you might make mistakes.

pslMap apparently drops non-syntenic regions, which may lose a lot of data, but it supposedly makes fewer assumptions about how the assemblies relate to each other.

orthoMap is apparently a similar tool to pslMap, but I can't find a way to run it except for a very old and restricted web interface. There are other tools with very similar names ("orthomap") but somewhat different uses, which may increase confusion.

A counterpoint to all of the foregoing is that this rather detailed summary of liftover tools does not seem very concerned about using tools like liftOver between species. It specifically uses a dog-to-human liftover as its example, which is exactly what you want to do. Note that the page assumes a high-quality chain file exists, which does seem to be true in your case.
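If you do go ahead with liftOver, here is a minimal sketch of how the command-line call could be assembled from Python. The file names are hypothetical placeholders, the chain file name assumes a canFam3-to-hg38 conversion (substitute whichever assemblies you actually used), and the 0.1 is simply the 10% figure from your question, not a recommendation. The argument order follows liftOver's usage line (old file, chain, new file, unmapped file), and `-minMatch` is its "minimum ratio of bases that must remap" option.

```python
import shlex


def build_liftover_command(bed_in, chain, bed_out, unmapped, min_match=0.1):
    """Return the liftOver argument list for one BED file.

    min_match is passed to liftOver's -minMatch option (minimum ratio of
    bases that must remap). The 0.1 default here is an assumption taken
    from the 10% mentioned in the question.
    """
    if not 0.0 < min_match <= 1.0:
        raise ValueError("min_match must be in (0, 1]")
    return [
        "liftOver",
        f"-minMatch={min_match}",
        bed_in,    # regions in dog coordinates
        chain,     # dog-to-human chain file
        bed_out,   # regions that were converted
        unmapped,  # regions that failed, with the reason for each
    ]


# Hypothetical file names, for illustration only.
cmd = build_liftover_command(
    "dog_dmrs.bed",
    "canFam3ToHg38.over.chain.gz",
    "dog_dmrs.hg38.bed",
    "dog_dmrs.unmapped.bed",
)
print(shlex.join(cmd))
# To actually run it (requires the liftOver binary on your PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Keeping the unmapped file is worthwhile: it tells you how many regions were dropped and why, which is a quick sanity check on whichever threshold you pick.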

To go through your questions:

  1. It depends; see above.
  2. For suggested parameters for running blat, chain generation, and liftOver for this application, see here.
  3. I believe this is derived in part from the chain file (http://genomewiki.ucsc.edu/index.php/LiftOver_Howto) rather than from liftOver parameters alone. Overall, it is set by the parameters used to generate the chain file, which probably boils down to the parameters of the workflow here. liftOver itself just relies on the best-scoring chain alignment for each region.
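To make the "ratio of bases" idea in questions 2 and 3 concrete, here is a simplified sketch. It is not liftOver's actual implementation; it only illustrates what a minimum-remap ratio means for a single 200 bp region. The aligned blocks stand in for the ungapped blocks of one chain, are assumed not to overlap each other, and all coordinates are made up. For reference, command-line liftOver does expose this threshold as `-minMatch`, while which bases can remap at all is fixed by the chain file.

```python
def remapped_fraction(region_start, region_end, aligned_blocks):
    """Fraction of a half-open region [start, end) covered by aligned blocks.

    aligned_blocks is a list of (start, end) half-open intervals in the same
    coordinate system as the region, assumed to be non-overlapping.
    """
    length = region_end - region_start
    if length <= 0:
        raise ValueError("region must have positive length")
    covered = 0
    for block_start, block_end in aligned_blocks:
        # Clip each block to the region before counting its bases.
        overlap = min(region_end, block_end) - max(region_start, block_start)
        if overlap > 0:
            covered += overlap
    return covered / length


def passes_min_match(region_start, region_end, aligned_blocks, min_match):
    """True if enough of the region is covered to meet the threshold."""
    return remapped_fraction(region_start, region_end, aligned_blocks) >= min_match


# A made-up 200 bp region with three partially overlapping aligned blocks:
# 10 + 30 + 10 = 50 aligned bases out of 200.
blocks = [(900, 1010), (1050, 1080), (1190, 1300)]
fraction = remapped_fraction(1000, 1200, blocks)
print(fraction)                                       # 0.25
print(passes_min_match(1000, 1200, blocks, 0.10))     # True
print(passes_min_match(1000, 1200, blocks, 0.95))     # False
```

In this toy case the region would survive a 10% threshold but not a 95% one, even though only a quarter of its bases have an aligned counterpart. That is the sort of region worth looking at by hand before trusting an overlap with a human DMR.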
Maximilian Press