If we have a PDB structrure, how can we find residues physically interacting with each other in space? I know that we must find the distance between residues and if the distance is less than 5-6 Angstrom, we say that residues are physically interacting. But how can we find the distance between all residues and how can we finally determine the distances between all residues? Is there a software or webserver for that?
-
These interaction may be already annotated in the PDB, see Connectivity Section and Connectivity Annotation Section in the PDB spec. – marcin Jun 16 '17 at 14:47
-
One thing you're going to have to define for yourself is what you mean by "distance between residues". Do you mean closest atom pair? Closest heavyatom pair? Distance between C-alphas? The exact mechanism is probably going to be similar, regardless, but the details and the results will differ based on the definition. – R.M. Jun 16 '17 at 21:18
-
I mean the distance between C-alphas – Sara Jun 16 '17 at 21:33
-
Is the Ring 2.0 webserver (http://protein.bio.unipd.it/ring/) good enough for this purpose? – Sara Jun 17 '17 at 10:56
-
If anyone is interested in speed benchmarks of distance calculations using various software, there are some at https://github.com/jgreener64/pdb-benchmarks. There are also example scripts for each software in that repository. – jgreener Sep 10 '19 at 21:29
5 Answers
If you need to process multiple files, you could use Biopython to parse a PDB structure.
from Bio.PDB import PDBParser
# create parser
parser = PDBParser()
# read structure from file
structure = parser.get_structure('PHA-L', '1fat.pdb')
model = structure[0]
chain = model['A']
# this example uses only the first residue of a single chain.
# it is easy to extend this to multiple chains and residues.
for residue1 in chain:
for residue2 in chain:
if residue1 != residue2:
# compute distance between CA atoms
try:
distance = residue1['CA'] - residue2['CA']
except KeyError:
## no CA atom, e.g. for H_NAG
continue
if distance < 6:
print(residue1, residue2, distance)
# stop after first residue
break
If you need to look at one structure, using a viewer perhaps would be easier. You could try PyMOL: (how to measure distance). There are other PDB viewers, some of which can work even through a browser.
- 2,695
- 1
- 13
- 34
-
Dear lakov, I installed Anaconda and Biopython packages and run the above code in PyCharm 5.0 and got the following error: – Sara Jun 16 '17 at 16:27
-
Traceback (most recent call last): File "C:/Users/VN7-592/PycharmProjects/untitled/sara.py", line 7, in
structure = parser.get_structure('PHA-L', '1FAT.pdb') – Sara Jun 16 '17 at 16:29 -
File "C:\Users\VN7-592\Desktop\Anaconda\lib\site-packages\Bio\PDB\PDBParser.py", line 81, in get_structure with as_handle(file, mode='rU') as handle: File "C:\Users\VN7-592\Desktop\Anaconda\lib\contextlib.py", line 17, in enter return self.gen.next() File "C:\Users\VN7-592\Desktop\Anaconda\lib\site-packages\Bio\File.py", line 88, in as_handle with open(handleish, mode, **kwargs) as fp: IOError: [Errno 2] No such file or directory: '1FAT.pdb' – Sara Jun 16 '17 at 16:29
-
1You don't have the 1FAT.pdb file in the working directory, or your operating system is case sensitive. – Ajasja Jun 16 '17 at 20:01
-
But this gives only intermolecular distances? BioPython ignores crystal symmetry? In general, to check for contacts between residues X and Y one would need to calculate distance between X and the nearest symmetrical image of Y. – marcin Jun 17 '17 at 19:36
Could you use CCP4's NCONT program? There's a GUI and a command line interface, whatever suits. You can specify which chains you want to target and interact with and set a cut off for distance. The bonus here is once you're in you have a nice suite of other structural tools to use.
If you're just doing it once, the GUI is friendly enough to work things out, if you're doing a batch then you can run it across several files via the command line.
- 449
- 3
- 11
-
I have downloaded the program but I haven't been able to work with it yet. I'm trying to run it – Sara Jun 17 '17 at 02:03
-
-
It doesn't run with an error. The program says choose a project name and even when I have chosen a project name, again says choose a project name and stops running – Sara Jun 18 '17 at 08:03
As part of a project me and some teammates did a script that outputs visual maps of distances between residues. It uses Biopython.
The module contact_map.py does what you are looking for. As an example, if you want to find the residues whose CA are below 5 you can run the following command:
python3 contact_map.py pdb1cd8.ent -a CA -CA 5
This will produce three files:
distance_map_pdb1cd8_CA.png # Heatmap of the distance between the residues
contact_map_pdb1cd8_CA.png # Black/White heatmap: If it is at that min distance
contact_map.log # The actions taken
If you don't have downloaded already the pdb structue you can use the main module:
python3 cozmic.py real 1cd8 -a CA -CA 5
- 4,693
- 1
- 18
- 42
-
How it calculates distances? Inside the asymmetric unit only, or as the shortest distance between the two residues in the crystal (or in biological assembly)? – marcin Jun 20 '17 at 16:28
-
@marcin It calculates the distance between the C alpha (or beta if you chose so) between each residue in the file. – llrs Jun 21 '17 at 07:29
-
Usually, the file == asymmetric unit. If you'd like to find all contacts in biological assembly or in crystal you'd need to take the symmetry into account. – marcin Jun 21 '17 at 08:12
-
-
I'm working now on a project that in the future may be competitive to Bio.PDB, but probably not this year yet :-) – marcin Jun 21 '17 at 18:52
-
You can use MDtraj. The package is easy to install using Anaconda.
You can get the interacting residues using the following snippet (taken from http://mdtraj.org/1.6.2/examples/native-contact.html)
heavy_pairs = np.array(
[(i,j) for (i,j) in combinations(heavy, 2)
if abs(native.topology.atom(i).residue.index - \
native.topology.atom(j).residue.index) > 3])
# compute the distances between these pairs in the native state
heavy_pairs_distances = md.compute_distances(native[0], heavy_pairs)[0]
# and get the pairs s.t. the distance is less than NATIVE_CUTOFF
native_contacts = heavy_pairs[heavy_pairs_distances < NATIVE_CUTOFF]
print("Number of native contacts", len(native_contacts))
- 101
- 1
Distance between C-alphas can also be found using the BioStructures.jl package in Julia:
using BioStructures
struc = read("1AKE.pdb", PDB)
calphas = collectatoms(struc['A'], calphaselector)
dm = DistanceMap(calphas)
Then dm[5, 10] gives 12.19677604943208 etc.
Other variations also work, see the docs.
- 941
- 4
- 9