Tutorial: Visualization of Macromolecules

Computer-Aided Drug Design Tutorials: 4.1. Docking with DOCK 6

Background

Docking is a term that covers a large class of computer algorithms that attempt to find an optimal placement of a rigid or flexible ligand in the receptor binding site. The ligand is typically a small molecule; peptide-protein and protein-protein docking algorithms are currently under active development. Docking algorithms also generate a score that attempts to distinguish molecules that bind strongly in their optimal placement from those that bind weakly. In this tutorial, you will learn to use the program DOCK, which was originally developed in Irwin Kuntz's group at UCSF.

UCSF Chimera Molecular Modeling System

It would be nice if one could just feed a docking program the name of a PDB file as stored in the RCSB Protein Data Bank and an SDF file defining nearly a million lead-like compounds from a database such as ZINC, and sit back while the computer docks every compound to every binding pocket in the protein. In reality, meaningful docking calculations require careful preparation of receptor and ligand structures before the docking programs can do their work. This preparation is greatly simplified by molecular modeling programs such as UCSF Chimera that provide a powerful, menu-driven graphical user interface. You will use UCSF Chimera to prepare input files for the program UCSF DOCK. You can also use Chimera to analyze the results of docking. UCSF Chimera has a nice tutorial for beginners.

DOCK Tutorial

The DOCK website has a nice tutorial prepared by P. Therese Lang, a graduate student at UCSF. You will find details on how to accomplish each of the tasks listed below by carefully reading her tutorial. The UCSF tutorial is rather comprehensive and sometimes offers multiple ways of performing a task. The list below serves as a guide to the overall workflow and suggests which option to follow.

Preparing Molecules for DOCKing

Using UCSF Chimera, obtain and examine the structure of L-arabinose-binding protein (1ABE)
Using UCSF Chimera, prepare the structure of the receptor for docking. This involves deleting ligand and solvent molecules, eliminating alternate locations of residues, changing selenomethionines to methionines, adding hydrogen atoms, and assigning charges to protein atoms. UCSF Chimera also checks for incomplete residues and automatically changes these to glycines. Thus steps e) and f) in the UCSF tutorial are obsolete and you can proceed directly to writing the MOL2 file. Note that if a side chain with missing atoms is close to the binding site, its replacement with Gly will affect the results; consider manual replacement with Ala or manual rebuilding of the residue.
Using UCSF Chimera, delete hydrogen atoms from the protein and save this receptor structure as a PDB file. This file will be used in the next part for the generation of the surface and spheres. Close the Chimera session.
Using UCSF Chimera, re-obtain the structure of the complex (1ABE). Prepare the ligand by adding hydrogen atoms (the defaults are OK) and assigning charges using the Chimera Add Charge tool (the default AMBER force field is OK). Save the ligand file in MOL2 format after unchecking Sybyl-style hydrogen naming. Close the Chimera session. Examine this file with a text editor and verify that the charges on the ligand are consistent with your chemical intuition. For example, hydroxyl oxygens should be somewhat negatively charged and hydroxyl hydrogens somewhat positively charged.
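
If you prefer a script to a text editor for the charge check, the following minimal Python sketch may help. It assumes the standard Tripos MOL2 layout, in which each record of the @<TRIPOS>ATOM section carries the partial charge in its ninth column. The embedded methanol fragment and its charges are invented for illustration; they are not output from Chimera.

```python
def read_mol2_charges(text):
    """Return (atom_name, atom_type, charge) tuples from the ATOM section of MOL2 text."""
    atoms = []
    in_atoms = False
    for line in text.splitlines():
        if line.startswith("@<TRIPOS>"):
            # Track whether we are inside the ATOM section.
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        fields = line.split()
        # An ATOM record with a charge has at least 9 columns:
        # id name x y z type subst_id subst_name charge
        if in_atoms and len(fields) >= 9:
            atoms.append((fields[1], fields[5], float(fields[8])))
    return atoms


# Hypothetical methanol record with made-up charges (illustration only).
SAMPLE = """\
@<TRIPOS>MOLECULE
methanol
6 5 1
SMALL
USER_CHARGES

@<TRIPOS>ATOM
      1 C1          0.000    0.000    0.000 C.3       1 LIG     0.1200
      2 O1          1.430    0.000    0.000 O.3       1 LIG    -0.6500
      3 HO1         1.750    0.900    0.000 H         1 LIG     0.4200
      4 H1         -0.360    1.030    0.000 H         1 LIG     0.0400
      5 H2         -0.360   -0.510    0.890 H         1 LIG     0.0400
      6 H3         -0.360   -0.510   -0.890 H         1 LIG     0.0300
@<TRIPOS>BOND
     1    1    2 1
     2    2    3 1
     3    1    4 1
     4    1    5 1
     5    1    6 1
"""

atoms = read_mol2_charges(SAMPLE)
total_charge = round(sum(charge for _, _, charge in atoms), 4)
oxygen_charges = [charge for _, atom_type, charge in atoms if atom_type.startswith("O")]

for name, atom_type, charge in atoms:
    print(f"{name:4s} {atom_type:4s} {charge:+.4f}")
print("total charge:", total_charge)
```

For your own ligand, read the saved MOL2 file into a string and pass it to read_mol2_charges; the total should match the formal charge you expect for the molecule.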

Generating Spheres

Open the prepared receptor structure without hydrogens in UCSF Chimera. Generate the molecular surface of the target (Actions -> Surface -> Show) and write the molecular surface in the DMS format (Tools -> Structure Editing -> Write DMS). If you are using an older version of Chimera, run the program dms using the prepared receptor without hydrogens as an input and examine the result with UCSF Chimera.
Download the text file named INSPH using your browser or the Unix command wget followed by the link location. Examine this file with a text editor; the meaning of each line is explained in the UCSF tutorial. Generate spheres by running the program sphgen, which reads its parameters from the INSPH file. Notice that after a successful run, a file with the extension .sph has been created.
Select spheres within 10 Angstroms of the active site (OPTION 2) by running the program sphere_selector with the receptor spheres file, the prepared ligand file, and the radius as inputs. Notice that a file named selected_spheres.sph has been generated.
Generate the PDB file with binding site sphere coordinates by running the program showsphere in interactive mode. Input the filename of the selected spheres file that was generated in the previous step. You will have one cluster to process and there is no need to generate the surface. Give a new, unique file name, such as selected_spheres.pdb, for the PDB file that will hold the coordinates of the selected spheres.
Open the prepared receptor structure file, the prepared ligand structure file, and the sphere coordinate PDB file with UCSF Chimera. Select residue SPH and with Actions -> Atoms/Bonds -> ball & stick display the selected spheres. Verify that the selected spheres surround the ligand binding pocket.
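
The idea behind the sphere selection step can be illustrated with a short Python sketch. It assumes that a sphere is kept when its center lies within the cutoff of at least one ligand atom; the actual implementation of sphere_selector may differ in detail, and the coordinates below are hypothetical.

```python
import math


def select_spheres(sphere_centers, ligand_atoms, cutoff=10.0):
    """Keep sphere centers within `cutoff` Angstroms of at least one ligand atom."""
    return [
        center
        for center in sphere_centers
        if any(math.dist(center, atom) <= cutoff for atom in ligand_atoms)
    ]


# Hypothetical coordinates in Angstroms (illustration only).
ligand_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
sphere_centers = [(3.0, 4.0, 0.0), (11.0, 0.0, 0.0), (0.0, 0.0, 12.0)]

selected = select_spheres(sphere_centers, ligand_atoms)
print(selected)
```

The third sphere is more than 10 Angstroms from both ligand atoms and is therefore dropped, which is the behavior you should see reflected when you display the selected spheres around the ligand in Chimera.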

Generating the Grid

Create the box that encloses the selected spheres and defines the grid region by running the program showbox. The values suggested in the UCSF tutorial (Construct Automatically, 5 Å margins, cluster number 1) are fine. Examine the box and spheres in the binding site with UCSF Chimera.
Download a text file named grid.in and examine it with a text editor. Your charged receptor structure is likely in the current directory (rec_charged.mol2); the van der Waals parameter file that you need to specify is /opt/dock6/parameters/vdw_AMBER_parm99.defn. Practice generating the energy-scoring grid following the instructions in the UCSF tutorial. The grid generation calculation takes about 15 minutes and consumes significant CPU resources. Thus, once you are sure that the grid generation calculation has started successfully, stop it (Ctrl-C). You can copy the grid file from the course directory: cp ../grid.nrg .
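
To see what the automatic box construction with a margin amounts to geometrically, consider the following Python sketch. It is an illustration under the assumption that the box is the axis-aligned bounding box of the sphere centers, extended by the margin on every side; it does not reproduce showbox itself, and the coordinates are hypothetical.

```python
def bounding_box(points, margin=5.0):
    """Return (lower corner, upper corner, edge lengths) of an axis-aligned box
    that encloses all points with `margin` Angstroms added on every side."""
    xs, ys, zs = zip(*points)
    lower = (min(xs) - margin, min(ys) - margin, min(zs) - margin)
    upper = (max(xs) + margin, max(ys) + margin, max(zs) + margin)
    size = tuple(hi - lo for lo, hi in zip(lower, upper))
    return lower, upper, size


# Hypothetical sphere centers in Angstroms (illustration only).
sphere_centers = [(1.0, 2.0, 3.0), (4.0, -1.0, 6.0), (2.0, 5.0, 0.0)]

lower, upper, size = bounding_box(sphere_centers)
print("lower corner:", lower)
print("upper corner:", upper)
print("edge lengths:", size)
```

A larger margin gives ligands more room to extend beyond the sphere cluster, but the grid volume, and with it the grid generation time, grows accordingly.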

Docking

Perform Rigid Ligand Docking. Download a text file named rigid.in and examine it with a text editor. The meaning of the parameters is explained in the UCSF tutorial. Change the locations of the receptor structure, spheres, grid, and parameter file appropriately and save the file. Start the docking calculation by issuing the command dock6 -i rigid.in -o rigid_ligA.out. The output is saved to the file rigid_ligA.out. Wait for the calculation to finish (less than a minute) and examine the output file by issuing the command more rigid_ligA.out. Record the DOCK score. By default, the best pose from docking is written into a mol2 file that is always named rigid_scored.mol2. Rename this file by issuing the command mv rigid_scored.mol2 rigid_scored_ligA.mol2. Visually examine the receptor, the original ligand, and the best docked pose with UCSF Chimera.
Perform Flexible Ligand Docking. Download a text file named anchor_and_grow.in and edit it appropriately; the meaning of the parameters is explained in the UCSF tutorial. Start the calculation by issuing the command dock6 -i anchor_and_grow.in -o anchor_and_grow_ligA.out. The output is saved to the file anchor_and_grow_ligA.out. Wait for the calculation to finish; flexible docking takes longer (a few minutes in this case). Examine the output file by issuing the command more anchor_and_grow_ligA.out. Record the DOCK score. By default, the best pose from docking is written into a mol2 file that is always named flex_scored.mol2. Rename this file by issuing the command mv flex_scored.mol2 flex_scored_ligA.mol2. Visually examine the receptor, the original ligand, and the best docked pose with UCSF Chimera.
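
If you dock more than one ligand, the run-and-rename pattern used in both steps above can be wrapped in a small script. The Python sketch below is one possible way to do so; the helper names (dock_command, tagged_pose_name, run_and_rename) are hypothetical, and it assumes that dock6 is on your PATH and that the default pose file names are the ones given above.

```python
import subprocess
from pathlib import Path


def dock_command(input_file, tag):
    """Build the dock6 command line, tagging the output file with a ligand label."""
    stem = Path(input_file).stem
    return ["dock6", "-i", input_file, "-o", f"{stem}_{tag}.out"]


def tagged_pose_name(default_pose, tag):
    """Turn e.g. rigid_scored.mol2 into rigid_scored_ligA.mol2."""
    pose = Path(default_pose)
    return f"{pose.stem}_{tag}{pose.suffix}"


def run_and_rename(input_file, default_pose, tag):
    """Run dock6, then rename the default pose file so the next run does not overwrite it."""
    subprocess.run(dock_command(input_file, tag), check=True)
    renamed = Path(default_pose).with_name(tagged_pose_name(default_pose, tag))
    Path(default_pose).rename(renamed)
    return renamed


# Only the command construction is shown here; nothing is executed.
print(" ".join(dock_command("rigid.in", "ligA")))
print(tagged_pose_name("flex_scored.mol2", "ligA"))
```

In a directory that contains the prepared input files, calling run_and_rename("rigid.in", "rigid_scored.mol2", "ligA") would perform the rigid docking step and the renaming in one go.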

Analysis and Final Comments

Check the Grid Scores from rigid and flexible docking. Notice that the Grid Score from flexible docking is more negative, indicating better binding. Also notice that for this ligand, the difference arises from the electrostatic energy. It is common for flexible docking to give better scores. If you are docking a series of compounds, it may be helpful to monitor how each subscore changes as the ligand structure changes. For example, if you replace a hydrogen in the lead compound with a chlorine or a methyl group during optimization and the van der Waals score does not change much, the substituted analog is likely to bind better because it is more hydrophobic without creating steric clashes in the binding pocket.
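
Monitoring subscores is easy to do in a few lines of Python once you have recorded the numbers. The sketch below compares two runs component by component. The values are placeholders invented for illustration, not results from this exercise; substitute the scores you recorded from your own output files.

```python
def score_differences(reference, other):
    """Return other - reference for every score component (negative means better)."""
    return {key: round(other[key] - reference[key], 3) for key in reference}


# Placeholder scores in kcal/mol (illustration only; use your recorded values).
rigid = {"grid_score": -20.0, "vdw": -15.0, "es": -5.0}
flexible = {"grid_score": -24.5, "vdw": -15.2, "es": -9.3}

diffs = score_differences(rigid, flexible)
# The component with the most negative change contributes most to the improvement.
dominant = min(("vdw", "es"), key=diffs.get)

for key, value in diffs.items():
    print(f"{key:10s} {value:+.3f}")
print("largest improvement from:", dominant)
```

The same function can be applied to a lead compound and its substituted analog to see whether a change in the total score comes from the van der Waals or the electrostatic term.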

Visually examine the structures of the receptor, the original ligand, the best rigid pose, and the best flexible pose with UCSF Chimera. Under Favorites, activate the Model Panel, which allows you to select which structures are shown. For the moment, hide the receptor. Notice that the flexible and rigid docking results differ slightly in the positions of the hydrogen atoms. Keep in mind that while the best rigid pose appears closer to the original ligand, the hydrogen positions in the original structure were not experimentally observable; the hydrogen atoms were added by UCSF Chimera. Show the receptor and examine the reasons behind the better electrostatic energy from flexible docking. Pay attention to the interactions of the hydroxyl at C₂ with residues Glu14 and Lys10.
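
Because the hydrogen positions were not observed experimentally, a numerical comparison of a docked pose with the crystal ligand is usually restricted to heavy atoms. The following Python sketch computes such a heavy-atom RMSD. It assumes that both structures are in the same coordinate frame and list their atoms in the same order; the coordinates are hypothetical.

```python
import math


def heavy_atom_rmsd(reference, pose):
    """RMSD in Angstroms over non-hydrogen atoms.

    Both arguments are lists of (element, (x, y, z)) in the same atom order.
    No superposition is performed: docking keeps the receptor frame fixed.
    """
    pairs = [(ref[1], pos[1]) for ref, pos in zip(reference, pose) if ref[0] != "H"]
    if not pairs:
        raise ValueError("no heavy atoms to compare")
    squared = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(squared / len(pairs))


# Hypothetical three-atom fragment (illustration only).
crystal = [("C", (0.0, 0.0, 0.0)), ("O", (1.4, 0.0, 0.0)), ("H", (1.8, 0.9, 0.0))]
docked = [("C", (0.3, 0.4, 0.0)), ("O", (1.4, 0.0, 0.5)), ("H", (2.5, -0.5, 0.3))]

rmsd = heavy_atom_rmsd(crystal, docked)
print(f"heavy-atom RMSD: {rmsd:.2f} Angstroms")
```

The hydrogen in this example moves by more than an Angstrom, yet it does not enter the result, which mirrors the caution above about judging poses by hydrogen positions.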

Docking the ligand that was originally present in the crystal structure back into the same protein is one of the easiest docking tasks. Many docking programs correctly identify a ligand pose very similar to the experimentally observed pose as the top-scoring one. The success of such re-docking does not always indicate that the docking of novel compounds will be reliable with the same program. First, comparison of novel compounds involves scoring these compounds against each other; this step is not needed for docking a single compound. Second, the shape of the binding pocket is usually not allowed to change during docking to accommodate a novel ligand, and thus ligands larger than the original ligand may not fit into the pocket according to the docking program. In reality, proteins are flexible, and it is likely that the active site will change slightly to accommodate slightly larger ligands that otherwise make good contacts with the receptor.

The success of docking depends on several factors. First, a sufficiently large number of orientations should be tried, and the required number tends to grow with the complexity of the molecule. Unfortunately, the computational time also grows as the number of orientations increases. Second, a sufficiently accurate force field should be used to evaluate the intermolecular electrostatic and van der Waals energies. In the case of DOCK 6, which uses the AMBER force field, this means relying on Chimera's Dock Prep to provide good partial charges and Lennard-Jones parameters based on its perception of the molecular structure. Oftentimes, quantum mechanical calculations are needed to parameterize molecules with unusual chemical structures. Because the AMBER force field uses a fixed point-charge model, effects such as polarization or π-π stacking are not rigorously described by DOCK 6. Finally, in many experimental structures, the ligand is bound to the receptor via bound water molecules. Such water molecules are usually not present during docking, and if their contribution is significant, misleading results may be obtained.
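
To make the fixed point-charge picture concrete, the Python sketch below evaluates a generic 12-6 Lennard-Jones term and a Coulomb term for a single pair of atoms. It is a textbook form given under stated assumptions, not the exact expression used by DOCK 6, whose exponents and dielectric treatment are set in the grid input file. The conversion factor of about 332 kcal·Å/(mol·e²) is the usual approximate value, and the atom parameters are hypothetical.

```python
COULOMB_CONSTANT = 332.0  # approximate, kcal*Angstrom/(mol*e^2)


def pair_energy(r, q_i, q_j, r_min, epsilon, dielectric=1.0):
    """Return (van der Waals, electrostatic) energies in kcal/mol for two atoms.

    r        distance between the atoms in Angstroms
    q_i, q_j fixed partial charges in units of e
    r_min    distance at the Lennard-Jones minimum in Angstroms
    epsilon  well depth in kcal/mol
    """
    ratio6 = (r_min / r) ** 6
    vdw = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    es = COULOMB_CONSTANT * q_i * q_j / (dielectric * r)
    return vdw, es


# Hypothetical hydroxyl oxygen / hydrogen-like pair (illustration only).
vdw_at_min, _ = pair_energy(r=3.5, q_i=0.0, q_j=0.0, r_min=3.5, epsilon=0.15)
_, es_contact = pair_energy(r=2.0, q_i=-0.5, q_j=0.4, r_min=3.5, epsilon=0.15)

print(f"vdW energy at the minimum: {vdw_at_min:.3f} kcal/mol")
print(f"electrostatic energy at 2 Angstroms: {es_contact:.1f} kcal/mol")
```

Note that the charges enter only as fixed numbers: nothing in this expression lets a charge respond to its environment, which is why polarization is outside the scope of such a model.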