#####################################
Bio 5476 Protocol for Lab Exercise #2
#####################################

(1) Choose one of the following small, fast folding proteins, and
    download its PDB file from the Protein Data Bank at
    http://www.rcsb.org/.

        Protein                       PDB ID

        Albumin Binding Domain        1PRB
        Engrailed Homeodomain         1ENH
        Tryptophan Cage               1L2Y
        Villin Headpiece              1VII
        WW Domain of FBP28            1E0L  (1E-zero-L)

    All of these proteins are mentioned in an interesting review
    article on the mechanism of protein folding from Bill Eaton's lab
    at the NIH. The article, Current Opinion in Structural Biology,
    14:76-88 2004, is available on the ftp site for the course.

(2) Create a keyfile for your protein that points to the OPLS AA/L
    force field parameters (ie, the file oplsaal.prm in the /params
    directory). Then run the "pdbxyz" program to convert the PDB file
    to a TINKER coordinates .xyz file.

(3) Use the "xyzedit" program to move the center-of-mass of the
    protein molecule to the coordinate system origin at (0,0,0). This
    can be done using option 11 of the xyzedit program, and will
    create a new version of the .xyz file.

(4) Copy the file "waterbig.xyz" from the /test directory into your
    working directory for the lab. This is a large, pre-equilibrated
    TIP3P water box. Use either a text editor, or option 6 of the
    "xyzedit" program to change the atom type numbers of the oxygen
    atoms from 1 to 164, and the hydrogens from 2 to 165. Confirm that
    164 and 165 are the TIP3P atom types in the oplsaal.prm file.

(5) Run the "xyzedit" program on your protein mass-centered .xyz file.
    Use option 17 to embed the protein into the pre-equilibrated water
    box. This will write out a new .xyz file for the protein contained
    in the waterbox, with all overlapping water molecules removed. Be
    sure to add the size of the box to your keyfile. The waterbig
    system is a cube of size 36.342 Ang on a side.
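
If you would rather script the atom type change in step (4) than use a text editor or xyzedit, the following is a minimal Python sketch. It assumes the usual TINKER .xyz layout (a header line, then one record per atom with index, name, x, y, z, atom type, and bonded atom indices); check this against your copy of waterbig.xyz before relying on it. The function name retype_water is made up for this illustration and is not part of TINKER.

```python
def retype_water(lines, type_map):
    """Return a copy of TINKER .xyz lines with atom types remapped.

    Assumed record layout (verify against your file):
        index  name  x  y  z  atom_type  bonded_atom_indices...
    Only the sixth column (the atom type) is changed, so the bonded
    atom indices 1 and 2 that follow it are left alone.
    """
    if not lines:
        return []
    out = [lines[0]]  # header line (atom count and title) is kept as-is
    for line in lines[1:]:
        parts = line.split()
        # Skip anything that is not an atom record, such as an optional
        # box-dimension line or a blank line.
        is_atom = len(parts) >= 6 and parts[0].isdigit() and parts[5].isdigit()
        if is_atom and int(parts[5]) in type_map:
            parts[5] = str(type_map[int(parts[5])])
            head = "{:>6s}  {:<3s}{:>12s}{:>12s}{:>12s}{:>6s}".format(*parts[:6])
            tail = "".join("{:>6s}".format(p) for p in parts[6:])
            line = head + tail
        out.append(line)
    return out


if __name__ == "__main__":
    # Tiny made-up example: a single water molecule.
    sample = [
        "     3  Single Water",
        "     1  O       0.000000    0.000000    0.000000     1     2     3",
        "     2  H       0.957200    0.000000    0.000000     2     1",
        "     3  H      -0.239988    0.926627    0.000000     2     1",
    ]
    for row in retype_water(sample, {1: 164, 2: 165}):
        print(row)
```

To apply it to the real file, read waterbig.xyz with splitlines(), pass the lines through retype_water with the map {1: 164, 2: 165}, and write the result to a new file. A blind search-and-replace in a text editor can also hit the bonded atom indices, which is the main reason to restrict the change to the type column.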

(6) Use the "minimize" program to energy minimize your protein-water
    system to a final rms gradient of approximately 1 kcal/mol/Ang
    (the exact value is unimportant; the minimization may require
    several hundred iterations and many minutes of CPU time).

(7) Add the "EWALD" and "RATTLE WATER" keywords to your keyfile prior
    to running dynamics. Now, start a series of molecular dynamics
    runs using the "dynamic" program. Use the Canonical statistical
    mechanical ensemble, ie, constant temperature or NVT. First,
    perform a preliminary equilibration run at some low temperature
    (perhaps 100K). Your final target is to perform a production MD
    run of a few nanoseconds at room temperature. If you use 2fs time
    steps for dynamics, then you should be able to run about one ns
    per day on an unloaded lab machine.

(8) Make plots of the backbone and whole protein rms deviation from
    the crystal structure as a function of simulation time. Do you
    think your MD run is equilibrated?

(9) Discuss the stability of the secondary structural features of
    your protein. For example, if your protein contains an alpha
    helix or a hairpin of beta sheet, what fluctuations does this
    structure undergo on the time scale of your simulation?

(10) Pick a couple of residues in your protein (a buried aromatic
     side chain and a solvent exposed surface residue might be good
     choices) and compute the time correlation function of the
     rotation of these side chains (for example, the chi2 angle of an
     aromatic residue). Correlation functions are discussed in
     Section 7.6 of Leach.

(11) What is the concentration of your protein in your periodic
     simulation system? Is this reasonable? What do you estimate
     would be the CPU requirements for using TINKER to run a
     1 microsecond simulation of your protein in a water box large
     enough to afford a reasonable protein concentration?

(12) [OPTIONAL] Analyze the water structure in the first layer of
     water around your protein. Are there any very tightly bound
     waters (perhaps hydrogen bonds to one or more highly polar or
     charged protein groups)? Can you estimate the average residence
     time of some bound waters?

(13) [OPTIONAL] Run an implicit solvent simulation of your protein.
     Use the Generalized Born (GBSA) solvation model as in Lab 1. How
     does the ensemble of structures and RMSD from the crystal
     compare between the explicit water and implicit solvent
     simulations?
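
For the question above on protein concentration and CPU cost, the arithmetic can be set up as in the Python sketch below. It treats the periodic box as holding a single protein molecule, takes the 36.342 Ang box edge and the rough one ns per day rate from this protocol, and assumes (crudely) that CPU cost grows in proportion to box volume. That last assumption is only a first guess, and the function names are invented for this illustration.

```python
AVOGADRO = 6.02214076e23             # molecules per mole
LITERS_PER_CUBIC_ANGSTROM = 1.0e-27  # 1 Ang^3 = 1e-30 m^3 = 1e-27 L


def molar_concentration(box_edge_ang, n_molecules=1):
    """Molar concentration of n_molecules in a cubic box of given edge (Ang)."""
    volume_liters = box_edge_ang ** 3 * LITERS_PER_CUBIC_ANGSTROM
    return n_molecules / (AVOGADRO * volume_liters)


def box_edge_for_concentration(molar, n_molecules=1):
    """Cubic box edge (Ang) that gives the requested molar concentration."""
    volume_liters = n_molecules / (AVOGADRO * molar)
    return (volume_liters / LITERS_PER_CUBIC_ANGSTROM) ** (1.0 / 3.0)


def cpu_days(sim_ns, ns_per_day=1.0, volume_ratio=1.0):
    """CPU days for sim_ns of dynamics.

    ASSUMPTION: cost scales linearly with box volume (volume_ratio is
    the new box volume divided by the reference box volume).
    """
    return sim_ns / ns_per_day * volume_ratio


if __name__ == "__main__":
    edge = 36.342  # Ang, the waterbig box from this protocol
    conc = molar_concentration(edge)
    print("concentration in the waterbig box: %.1f mM" % (conc * 1000.0))

    target = 1.0e-3  # 1 mM, an arbitrary example target, not a recommendation
    big_edge = box_edge_for_concentration(target)
    ratio = (big_edge / edge) ** 3
    print("box edge for 1 mM: %.1f Ang" % big_edge)
    print("CPU days for 1 microsecond: %.0f" % cpu_days(1000.0, 1.0, ratio))
```

One molecule in the waterbig box works out to roughly 35 mM. Replace the 1 mM target with whatever concentration you decide is reasonable, and treat the CPU figure as an order-of-magnitude estimate only.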