10

If we have a PDB structrure, how can we find residues physically interacting with each other in space? I know that we must find the distance between residues and if the distance is less than 5-6 Angstrom, we say that residues are physically interacting. But how can we find the distance between all residues and how can we finally determine the distances between all residues? Is there a software or webserver for that?

Sara
  • 777
  • 1
  • 6
  • 18
  • These interaction may be already annotated in the PDB, see Connectivity Section and Connectivity Annotation Section in the PDB spec. – marcin Jun 16 '17 at 14:47
  • One thing you're going to have to define for yourself is what you mean by "distance between residues". Do you mean closest atom pair? Closest heavyatom pair? Distance between C-alphas? The exact mechanism is probably going to be similar, regardless, but the details and the results will differ based on the definition. – R.M. Jun 16 '17 at 21:18
  • I mean the distance between C-alphas – Sara Jun 16 '17 at 21:33
  • Is the Ring 2.0 webserver (http://protein.bio.unipd.it/ring/) good enough for this purpose? – Sara Jun 17 '17 at 10:56
  • If anyone is interested in speed benchmarks of distance calculations using various software, there are some at https://github.com/jgreener64/pdb-benchmarks. There are also example scripts for each software in that repository. – jgreener Sep 10 '19 at 21:29

5 Answers5

10

If you need to process multiple files, you could use Biopython to parse a PDB structure.

from Bio.PDB import PDBParser

# create parser
parser = PDBParser()

# read structure from file
structure = parser.get_structure('PHA-L', '1fat.pdb')

model = structure[0]
chain = model['A']

# this example uses only the first residue of a single chain.
# it is easy to extend this to multiple chains and residues.
for residue1 in chain:
    for residue2 in chain:
        if residue1 != residue2:
            # compute distance between CA atoms
            try:
                distance = residue1['CA'] - residue2['CA']
            except KeyError:
                ## no CA atom, e.g. for H_NAG
                continue
            if distance < 6:
                print(residue1, residue2, distance)
        # stop after first residue
        break

If you need to look at one structure, using a viewer perhaps would be easier. You could try PyMOL: (how to measure distance). There are other PDB viewers, some of which can work even through a browser.

Iakov Davydov
  • 2,695
  • 1
  • 13
  • 34
  • Dear lakov, I installed Anaconda and Biopython packages and run the above code in PyCharm 5.0 and got the following error: – Sara Jun 16 '17 at 16:27
  • Traceback (most recent call last): File "C:/Users/VN7-592/PycharmProjects/untitled/sara.py", line 7, in structure = parser.get_structure('PHA-L', '1FAT.pdb') – Sara Jun 16 '17 at 16:29
  • File "C:\Users\VN7-592\Desktop\Anaconda\lib\site-packages\Bio\PDB\PDBParser.py", line 81, in get_structure with as_handle(file, mode='rU') as handle: File "C:\Users\VN7-592\Desktop\Anaconda\lib\contextlib.py", line 17, in enter return self.gen.next() File "C:\Users\VN7-592\Desktop\Anaconda\lib\site-packages\Bio\File.py", line 88, in as_handle with open(handleish, mode, **kwargs) as fp: IOError: [Errno 2] No such file or directory: '1FAT.pdb' – Sara Jun 16 '17 at 16:29
  • 1
    You don't have the 1FAT.pdb file in the working directory, or your operating system is case sensitive. – Ajasja Jun 16 '17 at 20:01
  • But this gives only intermolecular distances? BioPython ignores crystal symmetry? In general, to check for contacts between residues X and Y one would need to calculate distance between X and the nearest symmetrical image of Y. – marcin Jun 17 '17 at 19:36
4

Could you use CCP4's NCONT program? There's a GUI and a command line interface, whatever suits. You can specify which chains you want to target and interact with and set a cut off for distance. The bonus here is once you're in you have a nice suite of other structural tools to use.

If you're just doing it once, the GUI is friendly enough to work things out, if you're doing a batch then you can run it across several files via the command line.

Download

TW93
  • 449
  • 3
  • 11
  • I have downloaded the program but I haven't been able to work with it yet. I'm trying to run it – Sara Jun 17 '17 at 02:03
  • @Sara what is it you're having a problem with? – TW93 Jun 17 '17 at 18:24
  • It doesn't run with an error. The program says choose a project name and even when I have chosen a project name, again says choose a project name and stops running – Sara Jun 18 '17 at 08:03
2

As part of a project me and some teammates did a script that outputs visual maps of distances between residues. It uses Biopython.

The module contact_map.py does what you are looking for. As an example, if you want to find the residues whose CA are below 5 you can run the following command:

python3 contact_map.py pdb1cd8.ent -a CA -CA 5 

This will produce three files:

distance_map_pdb1cd8_CA.png # Heatmap of the distance between the residues
contact_map_pdb1cd8_CA.png # Black/White heatmap: If it is at that min distance
contact_map.log  # The actions taken

If you don't have downloaded already the pdb structue you can use the main module:

python3 cozmic.py real 1cd8 -a CA -CA 5 
llrs
  • 4,693
  • 1
  • 18
  • 42
  • How it calculates distances? Inside the asymmetric unit only, or as the shortest distance between the two residues in the crystal (or in biological assembly)? – marcin Jun 20 '17 at 16:28
  • @marcin It calculates the distance between the C alpha (or beta if you chose so) between each residue in the file. – llrs Jun 21 '17 at 07:29
  • Usually, the file == asymmetric unit. If you'd like to find all contacts in biological assembly or in crystal you'd need to take the symmetry into account. – marcin Jun 21 '17 at 08:12
  • Thanks for the tips, maybe you want to contribute to the modules. – llrs Jun 21 '17 at 08:53
  • I'm working now on a project that in the future may be competitive to Bio.PDB, but probably not this year yet :-) – marcin Jun 21 '17 at 18:52
  • I think Bio.PDB is being actively modified, but good luck with your work – llrs Jun 22 '17 at 07:26
0

You can use MDtraj. The package is easy to install using Anaconda.

You can get the interacting residues using the following snippet (taken from http://mdtraj.org/1.6.2/examples/native-contact.html)

heavy_pairs = np.array(
    [(i,j) for (i,j) in combinations(heavy, 2)
        if abs(native.topology.atom(i).residue.index - \
               native.topology.atom(j).residue.index) > 3])

# compute the distances between these pairs in the native state
heavy_pairs_distances = md.compute_distances(native[0], heavy_pairs)[0]
# and get the pairs s.t. the distance is less than NATIVE_CUTOFF
native_contacts = heavy_pairs[heavy_pairs_distances < NATIVE_CUTOFF]
print("Number of native contacts", len(native_contacts))
Ajasja
  • 101
  • 1
0

Distance between C-alphas can also be found using the BioStructures.jl package in Julia:

using BioStructures
struc = read("1AKE.pdb", PDB)
calphas = collectatoms(struc['A'], calphaselector)
dm = DistanceMap(calphas)

Then dm[5, 10] gives 12.19677604943208 etc.

Other variations also work, see the docs.

jgreener
  • 941
  • 4
  • 9