I wrote a small script to generate the 3D structures of the amino acids because I wan't to investigate how well the MMFF94 force field can calculate the charge of them. I've done this before for entire proteins, but here I just want to do it one amino acid at a time. To challenge myself, I cannot rely on the excellent (o)babel tool of the Open Babel package when generating the structures. All code is on github, but have shared it on the blog too via select gists.
My SMILES for the amino acids are saved in a dictionary:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def getAminoAcids(include_protonated=False): | |
molecules = dict() | |
molecules['GLY'] = 'NCC(=O)O' | |
molecules['ALA'] = 'NC(C)C(=O)O' | |
molecules['SER'] = 'NC(CO)C(=O)O' | |
molecules['THR'] = 'NC(C(C)O)C(=O)O' | |
molecules['CYS'] = 'NC(CS)C(=O)O' | |
molecules['VAL'] = 'NC(C(C)C)C(=O)O' | |
molecules['LEU'] = 'NC(CC(C)C)C(=O)O' | |
molecules['ILE'] = 'NC(C(C)CC)C(=O)O' | |
molecules['MET'] = 'NC(CCSC)C(=O)O' | |
molecules['PRO'] = 'N1C(C(=O)O)CCC1' | |
molecules['PHE'] = 'NC(Cc1ccccc1)C(=O)O' | |
molecules['TYR'] = 'NC(Cc1cc(O)ccc1)C(=O)O' | |
molecules['TRP'] = 'NC(Cc1c2ccccc2nc1)C(=O)O' | |
molecules['ASP'] = 'NC(CC(=O)O)C(=O)O' | |
molecules['GLU'] = 'NC(CCC(=O)O)C(=O)O' | |
molecules['ASN'] = 'NC(CC(=O)N)C(=O)O' | |
molecules['GLN'] = 'NC(CCC(=O)N)C(=O)O' | |
molecules['HIS'] = 'NC(C(=O)O)Cc1cncn1' | |
molecules['LYS'] = 'NC(CCCCN)C(=O)O' | |
molecules['ARG'] = 'NC(CCCNC(=N)N)C(=O)O' | |
if include_protonated is True: | |
# add protonated versions of bases | |
molecules['HISp'] = 'NC(C(=O)O)Cc1c[nH+]cn1' | |
molecules['LYSp'] = 'NC(CCCC[NH3+])C(=O)O' | |
molecules['ARGp'] = 'NC(CCCNC(=[H2+])N)C(=O)O' | |
# add deprotonated versions of acids | |
molecules['ASPd'] = 'NC(CC(=O)[O-])C(=O)O' | |
molecules['GLUd'] = 'NC(CCC(=O)[O-])C(=O)O' | |
return molecules |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
def OBMolMinimize(mol): | |
"""Minimize a molecule | |
""" | |
ff = openbabel.OBForceField.FindForceField("MMFF94") | |
ff.Setup(mol) | |
ff.ConjugateGradients(100, 1.0e-5) | |
ff.GetCoordinates(mol) | |
return mol | |
def OBStructureFromSmiles(smilesstring, filename=None): | |
mol = openbabel.OBMol() | |
obConversion = openbabel.OBConversion() | |
obConversion.SetInAndOutFormats("smi", "pdb") | |
obConversion.ReadString(mol, smilesstring) | |
mol.AddHydrogens() #False, True, 7.4) | |
builder = openbabel.OBBuilder() | |
builder.Build(mol) | |
mol = OBMolMinimize(mol) | |
if filename is None: return mol | |
# save structures in subfolder molecules | |
obConversion.WriteFile(mol, "molecules/%s.pdb" % filename) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import openbabel | |
from obutil import OBStructureFromSmiles | |
from aminoacids import getAminoAcids | |
if __name__ == '__main__': | |
# for now just default to amino acids from the amino acids library | |
AA = getAminoAcids() | |
for k in AA: | |
OBStructureFromSmiles(AA[k], k) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/usr/bin/env python | |
""" | |
molecule_charge.py - Determine the charge of a molecule | |
""" | |
import sys | |
if( len(sys.argv) != 2 ): | |
print "Usage: molecule_charge <in.pdb>" | |
sys.exit() | |
file_in = sys.argv[1] | |
import openbabel | |
obConversion = openbabel.OBConversion() | |
obConversion.SetInFormat("pdb") | |
mol = openbabel.OBMol() | |
obConversion.ReadFile(mol, file_in) | |
# get the charges of the atoms | |
charge_model = openbabel.OBChargeModel.FindType("MMFF94") | |
charge_model.ComputeCharges(mol) | |
partial_charges = charge_model.GetPartialCharges() | |
print "%i" % int(sum(partial_charges)) |
In this case, all charges are zero on all amino acids. Notice that I did not make use of Open Babel trying to assign Hydrogen for a specific pH as that is not what I want to test. Instead, notice in the aminoacids.py file listed above there are specific patterns for obtaining (de)protonated states of a few select amino acids: ASP, GLU, HIS, LYS and ARG.
Running the (de)protonated molecules above gives the following table
The disappointing thing here is that the protonated Arginine is incorrectly assigned a charge of 0 (zero). Edit: As noted by Jan, the depiction of the protonated ARG, ASP and GLU is also wrong.
Edit2: It appears the problem is also present in the SVG depicter.
Maybe it's time to submit a bug-report?
Running the (de)protonated molecules above gives the following table
Amino Acid | SMILES | Charge | Depiction |
---|---|---|---|
ASP- | NC(CC(=O)[O-])C(=O)O | -1 | |
GLU- | NC(CCC(=O)[O-])C(=O)O | -1 | |
HIS+ | NC(C(=O)O)Cc1c[nH+]cn1 | +1 | |
LYS+ | NC(CCCC[NH3+])C(=O)O | +1 | |
ARG+ | NC(CCCNC(=[NH2+])N)C(=O)O | 0 |
The disappointing thing here is that the protonated Arginine is incorrectly assigned a charge of 0 (zero). Edit: As noted by Jan, the depiction of the protonated ARG, ASP and GLU is also wrong.
Edit2: It appears the problem is also present in the SVG depicter.
Maybe it's time to submit a bug-report?
While you are making bug reports, the depiction of ASP-, GLU-, and ARG+ is also incorrect
ReplyDeleteNicely spotted Jan. The post has been updated to reflect your comment.
DeleteYou're also missing ff.GetCoordinates(mol) after minimizing mol. :)
ReplyDeleteHow do you make sure, that only L- amino acids are generated?
ReplyDeleteMaciek, thanks for posting that. The code has been updated to reflect this change.
ReplyDeleteAnders have a look at the first table here: http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html where an example of chirality is listed
What file have I to convert with OpenBabel after I made the python scripts?? Help me...thank you
ReplyDeleteok i will try it thankyou
ReplyDelete