Refinement of Triclinic Lysozyme at 1.1 A

Copyright by
Thomas Schneider

Scenario: We managed to collect atomic-resolution data on a protein for which we have a model at 2 A in a different space group that we can use for molecular replacement.
We have: a SCALEPACK output file p1lys.sca and a model from the PDB: 2lym.pdb
We want: a model at atomic resolution with an R factor below 10 percent and an Rfree below 13 percent

Preliminaries

In this tutorial, we will refine the structure of lysozyme in space group P1 to a resolution of 1.1 A. The data were collected at 120 K on beamline X11 at EMBL c/o DESY to a resolution of 0.92 A. The structure has in fact been refined to this resolution by Walsh et al. (Acta Cryst. D54:522 (1998)), and these coordinates are available from the PDB under entry code 3LZT. For our exercise, the data were cut at 1.1 A. To make things more interesting, we also pretend that we do not have a starting model of the protein in space group P1. Instead, we take a model of lysozyme in space group P43212 as the search model for molecular replacement. A good one can be found under PDB code 2LYM (Kundrot et al., J. Mol. Biol. 193:157-170 (1987)).

In the following, user input is marked in red and program output in blue. All SHELXL input and output files are available in a gzipped file p1lys.tar.gz. To look at intermediate models and maps as we go along, it is useful to have Xtalview installed on your computer.

The rest of the document describes the following steps:


Preparing the Molecular Replacement Model

To prepare the model for molecular replacement, the original PDB file has to be modified, i.e. waters, hydrogens, and hetero compounds should be removed and B-factors reset. All this can be done using the program SHELXPRO. This program is started from the command line by typing shelxpro. It comes up with a menu and you have to answer some questions:


trs/p1lys> shelxpro

 SHELXPRO - SHELX interface for protein applications - Version 97-2
 Copyright(C) George M. Sheldrick 1996-7

 [F] New output filename                   [V] R(free) files
 [A] Anisotropic scaling (Hope & Parkin)   [I] .ins from PDB file
 [P] Progress of LS refinement diagram     [L] Luzzati plot
 [T] Thermal displacement analysis         [E] Esd analysis
 [U] Update .res (and .pdb) to .ins file   [N] NCS analysis
 [R] Ramachandran Phi-Psi plot             [K] Kleywegt NCS plot
 [M] Map file for O from .fcf              [O] PDB file for O
 [H] .hkl file from other data formats     [Y] X-PLOR/CNS .fob to .hkl
 [D] Convert DENZO/SCALEPACK .sca to .hkl  [C] Color plots (now on) 
 [X] Write XtalView map coefficients       [W] Write Turbo-Frodo map
 [S] Reflection statistics from .fcf       [Z] Least-squares fit
 [J] Generate restraints from model        [B] PDB deposition
 [G] Generate PDB file from .res or .pdb   [Q] Quit

 Enter option: G

 Reads a .ins, .res or .pdb format file and generates a new PDB format file.
 This file may be used for input to standard protein programs such as AMoRe,
 or re-read by SHELXPRO for least-squares fitting.  B-values may be reset to
 typical values, disorder, solvent and H-atoms removed, chain IDs created,
 and multiple copies of chains generated by (non-)crystallographic symmetry.
 In the new PDB file all atoms are isotropic.

 Enter N to abort option, <Enter> to continue:  <Enter>

 Read PDB (P) or SHELX .ins or .res (S) file [S]: P
 Name of file to read [shelxpro.pdb]:  2lym.pdb
 Replace B-values with standard values (Y or N) ? [N]:  y
 Remove hydrogen atoms (Y or N) ? [Y]:  y
 Reset PART 1 occ. to 1, delete other disorder components (Y or N) ? [Y]: y
 Remove all residues except standard amino-acids (Y or N) ? [N]:  y

  1001 atoms stored

 PDB file to write (may be same as read) [shelxpro.pdb]:  2lym_mod.pdb
 Name of protein for PDB file: 
tetragonal lysozyme
 Now the atoms are written to the PDB file, starting with chains, followed
 by the remaining atoms. In both cases residues may be selected by number;
 symmetry transformations may also be applied.

 Select chain ('$' if chain ID blank, <Enter> to exit):  $
 New ID for this chain in PDB file [ ]: <Enter>
 The symmetry operator may be specifed using decimals or fractions
 Symmetry operator [x,y,z]: <Enter>
 First and last residues to process [1 129]: <Enter>
 New residue number for the first of these [1]: <Enter>
 1001 atoms written to PDB file
 Select chain ('$' if chain ID blank, <Enter> to exit): <Enter>

 SHELXPRO - SHELX interface for protein applications - Version 97-2
 Copyright(C) George M. Sheldrick 1996-7

 [F] New output filename                   [V] R(free) files
 [A] Anisotropic scaling (Hope & Parkin)   [I] .ins from PDB file
 [P] Progress of LS refinement diagram     [L] Luzzati plot
 [T] Thermal displacement analysis         [E] Esd analysis
 [U] Update .res (and .pdb) to .ins file   [N] NCS analysis
 [R] Ramachandran Phi-Psi plot             [K] Kleywegt NCS plot
 [M] Map file for O from .fcf              [O] PDB file for O
 [H] .hkl file from other data formats     [Y] X-PLOR/CNS .fob to .hkl
 [D] Convert DENZO/SCALEPACK .sca to .hkl  [C] Color plots (now on) 
 [X] Write XtalView map coefficients       [W] Write Turbo-Frodo map
 [S] Reflection statistics from .fcf       [Z] Least-squares fit
 [J] Generate restraints from model        [B] PDB deposition
 [G] Generate PDB file from .res or .pdb   [Q] Quit

 Enter option: Q
The resulting file 2lym_mod.pdb needs a short massage to keep the molecular replacement program EPMR from crashing: we have to put some spaces at the end of all lines. After loading the file into my favourite editor vi, this can be done by telling the editor to:

:%s/.$/&        /

Handy, isn't it?

Now we have the search model ready in the file 2lym_mod.pdb.
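For readers who prefer a script to an editor command, the padding step can be sketched in Python. This is only a minimal sketch and not part of the original workflow: the function name pad_pdb_lines and the target width of 80 columns (the nominal PDB record length) are assumptions, since the text above only says that some trailing spaces are needed.

```python
def pad_pdb_lines(text, width=80):
    """Right-pad every line of `text` with spaces to at least `width` columns.

    Lines that are already longer than `width` are left unchanged.
    The width of 80 is an assumption (nominal PDB record length).
    """
    padded = [line.ljust(width) for line in text.splitlines()]
    return "\n".join(padded) + "\n"
```

To use it, read the contents of 2lym_mod.pdb into a string, pass the string through pad_pdb_lines, and write the result back to a file of your choice.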


Preparing the reflection file for molecular replacement

EPMR needs a file containing structure factors, not intensities. The easiest way to make F's out of I's is to use the program XPREP (available from Bruker AXS).

Alternatively, you can use programs from the CCP4 suite. A script to do the conversion and its output are available here: sca2hkl3.csh and sca2hkl3.out. In any case, we will end up with a file p1lys.hkl3 containing h, k, l, F, sig(F). The CCP4 program truncate will also tell us that the Wilson B factor of the data is 5.3 A^2.
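To illustrate what "making F's out of I's" means in the simplest case, here is a hedged Python sketch of the naive conversion F = sqrt(I) with first-order error propagation. It is not what truncate does: truncate treats weak and negative intensities statistically, whereas this sketch simply gives up on them. The function name is hypothetical.

```python
import math


def intensity_to_amplitude(intensity, sigma_i):
    """Naive conversion of an intensity and its sigma to an amplitude.

    F = sqrt(I) and, by first-order error propagation, sig(F) = sig(I) / (2 F).
    Returns None for I <= 0, where this simple formula is undefined.
    """
    if intensity <= 0.0:
        return None
    amplitude = math.sqrt(intensity)
    return amplitude, sigma_i / (2.0 * amplitude)
```

For example, an intensity of 100 with a sigma of 4 corresponds to an amplitude of 10 with a sigma of 0.2.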


Solving the structure using EPMR

EPMR is a very easy to use molecular replacement program. To solve this structure we have to put the unit cell and the space group number into a file p1lys.cell. Then we run the program:


trs/p1lys> epmr p1lys.cell 2lym_mod.pdb p1lys.hkl3 > epmr.out &
After about 1 minute it comes up with a solution which is dumped to a pdb file: epmr.1.best.pdb. The logfile of this EPMR run can be found here: epmr.out. For the solution found, the correlation coefficient and the R value for data between 4.0 and 15.0 A are 45.6% and 44.5% respectively.

Now we have a starting model. The next step is to convert the pdb file to something useful for SHELXL and to put Rfree flags onto the reflections which also have to be converted from SCALEPACK format to SHELX format.


Creating the first SHELXL ins file

The file epmr.1.best.pdb is a pretty normal pdb file. We use SHELXPRO to convert this pdb file to a SHELXL ins file, which contains both instructions and coordinates for a refinement job:


trs/p1lys> shelxpro

 SHELXPRO - SHELX interface for protein applications - Version 97-2
 Copyright(C) George M. Sheldrick 1996-7

 [F] New output filename                   [V] R(free) files
 [A] Anisotropic scaling (Hope & Parkin)   [I] .ins from PDB file
 [P] Progress of LS refinement diagram     [L] Luzzati plot
 [T] Thermal displacement analysis         [E] Esd analysis
 [U] Update .res (and .pdb) to .ins file   [N] NCS analysis
 [R] Ramachandran Phi-Psi plot             [K] Kleywegt NCS plot
 [M] Map file for O from .fcf              [O] PDB file for O
 [H] .hkl file from other data formats     [Y] X-PLOR/CNS .fob to .hkl
 [D] Convert DENZO/SCALEPACK .sca to .hkl  [C] Color plots (now on) 
 [X] Write XtalView map coefficients       [W] Write Turbo-Frodo map
 [S] Reflection statistics from .fcf       [Z] Least-squares fit
 [J] Generate restraints from model        [B] PDB deposition
 [G] Generate PDB file from .res or .pdb   [Q] Quit

 Enter option: I

 Reads a PDB file and generates a SHELXL .ins file.  The PDB file is assumed
 to conform strictly to the PDB format as defined by the Protein Data Bank,
 but closely related non-standard formats (e.g. CCP4 and XPLOR) can usually
 be understood.  The program will ask for the missing cell and symmetry
 information etc.  Engh and Huber restraints are included in the .ins file
 for standard residues, and extra restraints are added for disulfide bridges
 and C-terminal carboxyl groups.  A summary of the residue and atom names is
 written to the .pro file for subsequent reference.

 ** The 'I' option is intended for initial input of a structure to SHELXL,
 NOT for updating between refinement jobs, for which 'U' should be used. **

 Enter N to abort option, <Enter> to continue: I

 Enter name of .ins file [shelxpro.ins]: p1lys_mr.ins
 Enter name of PDB file [shelxpro.ent]: epmr.1.best.pdb
 Enter title [shelxpro]: P1 lysozyme after mol rep
 CELL in Angstroms and deg. [ ]:
  26.650    30.800    33.630    89.300    72.600    67.800
 Enter Z (number of molecules per cell) [4]: 1
 Enter space group in PDB or XPREP notation [P212121]: P1
 Enter wavelength in Angstroms [1.54178]: 0.927
 SCALE instructions not found in PDB file - standard transformation
 applied using current cell

 Enter old residue numbers (modified by chain ID, if any) for all N-terminii
 (<Enter> if none). To continue on the next line, put "=" at the end of the line
 : 1

 Enter old residue numbers for all C-terminii in the same way: 129

 Enter old residue numbers in the same way at which renumbering of a block of
 residues should start. The block continues until the next residue specified
 here (<Enter> if none): 

 New residue number for first solvent water [1001]: 

 Reset water occupancies to unity (Y or N) ? [Y]: 

 HKLF code (3 for F, 4 for F-squared) [4]: 

 The .ins file has been written successfully.  The U option in SHELXPRO may
 be used for further checking of occupancies etc.

 <Enter> to continue: 

 .
 . 

 Main Menu to Quit
 .

The file p1lys_mr.ins now contains a SHELXL instruction file.

The only thing missing now is an Rfree-flagged list of reflections in format suitable for SHELXL.


Preparing the reflection file for refinement

Again, we run SHELXPRO:

trs/p1lys> shelxpro
 

 SHELXPRO - SHELX interface for protein applications - Version 97-2
 Copyright(C) George M. Sheldrick 1996-7

 [F] New output filename                   [V] R(free) files
 [A] Anisotropic scaling (Hope & Parkin)   [I] .ins from PDB file
 [P] Progress of LS refinement diagram     [L] Luzzati plot
 [T] Thermal displacement analysis         [E] Esd analysis
 [U] Update .res (and .pdb) to .ins file   [N] NCS analysis
 [R] Ramachandran Phi-Psi plot             [K] Kleywegt NCS plot
 [M] Map file for O from .fcf              [O] PDB file for O
 [H] .hkl file from other data formats     [Y] X-PLOR/CNS .fob to .hkl
 [D] Convert DENZO/SCALEPACK .sca to .hkl  [C] Color plots (now on) 
 [X] Write XtalView map coefficients       [W] Write Turbo-Frodo map
 [S] Reflection statistics from .fcf       [Z] Least-squares fit
 [J] Generate restraints from model        [B] PDB deposition
 [G] Generate PDB file from .res or .pdb   [Q] Quit

 Enter option: D

 Reads DENZO/SCALEPACK .sca file created with or without the "anomalous"
 option and writes SHELX .hkl file for input to SHELXS or SHELXL with HKLF 4.
 If the .sca file was created with the "anomalous" option, an anomalous
 delta-F file may be created for heavy-atom location with SHELXS.

 Enter N to abort option, <Enter> to continue: 
 

 Name of .sca file created using DENZO/SCALEPACK [shelxpro.sca]: p1lys.sca
 Cell:    26.650    30.800    33.630    89.300    72.600    67.800
 Space group: P1                            
 Enter name of .hkl output file [shelxpro.hkl]: p1lys.hkl
 Copy all data including Friedel opposites (C), merge Friedel opposites
 if any (M) or prepare anomalous delta-F file (A) [M]: 
   35836 Reflections written in HKLF 4 format to file p1lys.hkl

 <Enter> to continue: 
 

 SHELXPRO - SHELX interface for protein applications - Version 97-2
 Copyright(C) George M. Sheldrick 1996-7

 [F] New output filename                   [V] R(free) files
 [A] Anisotropic scaling (Hope & Parkin)   [I] .ins from PDB file
 [P] Progress of LS refinement diagram     [L] Luzzati plot
 [T] Thermal displacement analysis         [E] Esd analysis
 [U] Update .res (and .pdb) to .ins file   [N] NCS analysis
 [R] Ramachandran Phi-Psi plot             [K] Kleywegt NCS plot
 [M] Map file for O from .fcf              [O] PDB file for O
 [H] .hkl file from other data formats     [Y] X-PLOR/CNS .fob to .hkl
 [D] Convert DENZO/SCALEPACK .sca to .hkl  [C] Color plots (now on) 
 [X] Write XtalView map coefficients       [W] Write Turbo-Frodo map
 [S] Reflection statistics from .fcf       [Z] Least-squares fit
 [J] Generate restraints from model        [B] PDB deposition
 [G] Generate PDB file from .res or .pdb   [Q] Quit

 Enter option: 
This time we stay in the program, as the freshly written file p1lys.hkl is not yet what we want. We have to do one more round to set the Rfree flags:


 Enter option: V

 Reads a file in SHELX HKLF 3 or 4 format and creates a new .hkl file in
 which P% of the data are flagged for use in an R(free) test by SHELXL using
 CGLS N -1 or L.S. N -1. These reflections may be chosen either at random or
 in thin resolution shells. The latter option is recommended when NCS (non-
 crystallographic symmetry) or twinning is present. CGLS or L.S. without the
 second parameter may be used for the final refinement against all data.
 See A.T. Brunger, Nature 355 (1992) 472-475 for a discussion of R(free).

 Enter N to abort option, <Enter> to continue: 
 

 Input reflection data file [shelxpro.hkl]: p1lys.hkl
 Filename for .hkl file to write [shelxprot.hkl]: p1lys_rf.hkl
 Percentage of data to be flagged for R(free) [5]: 
 R(free) reflections random (R) or in thin shells (S) [R]: 

   35836 Reflections copied, of which  1795 flagged for R(free)

 <Enter> to continue: 

 .
 .

 Main Menu to Quit
 .

Finally, we have a file containing Rfree-flagged intensities in SHELXL format: p1lys_rf.hkl. We should make a backup of this file and, to play it safe, make it not writeable:

trs/p1lys> cp p1lys_rf.hkl ../backup
trs/p1lys> chmod -w p1lys_rf.hkl 
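The idea behind the random flagging can be sketched in Python. This is only an illustration under stated assumptions, not SHELXPRO's actual algorithm: it assumes the fixed-column HKLF 4 layout (h, k, l in three 4-character fields, then intensity and sigma in two 8-character fields) and assumes that a free reflection is marked by -1 in a following 4-character batch field. The function name and the seed parameter are hypothetical.

```python
import random


def flag_rfree(reflection_lines, percent=5.0, seed=0):
    """Append a -1 flag to a random `percent` of HKLF 4 reflection lines.

    Only the first 28 columns (h, k, l, Fo^2, sigma) of each line are kept.
    The terminating "0 0 0" record of a real .hkl file must not be passed in.
    """
    rng = random.Random(seed)
    n_total = len(reflection_lines)
    n_free = round(n_total * percent / 100.0)
    free_indices = set(rng.sample(range(n_total), n_free))
    flagged = []
    for index, line in enumerate(reflection_lines):
        record = line.rstrip("\n")[:28].ljust(28)
        if index in free_indices:
            record += "%4d" % -1  # assumed free-set marker in the batch column
        flagged.append(record)
    return flagged
```

Whatever tool is used, the important point is the one made above: the free set is chosen once, and the flagged file is then protected so that the same reflections stay excluded throughout the refinement.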

First round of refinement: rigid body

To adjust the model to the data beyond 4 A, a rigid body refinement is a good thing to do. First we copy the original ins file:

trs/p1lys> cp p1lys_mr.ins p1lys_0.ins
The following changes have to be made:

To save disk space, we do not copy the hkl file but create a link:

trs/p1lys> ln p1lys_rf.hkl p1lys_0.hkl
And start our first refinement job:

trs/p1lys> shelxl p1lys_0
and the program immediately starts to complain:

trs/p1lys> shelxl p1lys_0

 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 +  SHELXL-97 - CRYSTAL STRUCTURE REFINEMENT - UNIX VERSION  +
 +  Copyright(C) George M. Sheldrick 1993-7    Release 97-2  +
 +  p1lys_0              started at 09:50:21 on 27-Jun-2000  +
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 Read instructions and data
 ** Warning: no match for    2 atoms in  DFIX  **
 ** Warning:   118 distances involving residues not restrained **
 ** Warning:   11 bad CHIV instructions ignored **
It is always important to understand the warnings that SHELXL gives you; normally they point to something interesting or worrying. In this case, the unrestrained distances are due to symmetry clashes, which put atoms that should have nothing to do with each other at short distances. SHELXL then thinks that these atoms should be bonded to each other, cannot find a restraint for this bond, and complains. The symmetry clashes will also invoke a lot of anti-bumping restraints, which may substantially confuse the minimizer. So we should fix the symmetry clashes.

To find out what is going on, we have a look at the following table in the lst-file p1lys_0.lst:


Following 1,2- or 1,3-distances involving residues not restrained

 CA_2 NE_73$5   C_2 NE_73$5   C_2 NE_73$5   C_2 CD_73$5   O_2 NE_73$5   N_3 NE_73$5   N_3 NH1_73$5
 N_3 NE_73$5   N_3 CZ_73$5   CA_3 NE_73$5   C_3 NH1_73$5   C_3 NE_73$5   C_3 CZ_73$5   CB_3 NH1_73$5
 CB_3 NE_73$5   CB_3 CZ_73$5   CB_3 NH1_73$5   CD1_3 NH1_73$5   CD2_3 NH1_73$5   CZ_5 O_101$5   CD_7 NH2_73$5
 CZ_23 NH2_68$4   CA_68 CZ_125$7   CA_68 NH2_125$7   CA_68 NE_125$7   C_68 NH2_125$7   C_68 NH2_125$7
 C_68 CZ_125$7   C_68 CZ_125$7   O_68 NH2_125$7   O_68 NE_125$7   O_68 CZ_125$7   CZ_68 OH_23$2   N_69 NH2_125$7
 N_69 CZ_125$7   N_69 NE_125$7   N_69 CZ_125$7   N_69 NH1_125$7   CA_69 NH2_125$7   CA_69 CZ_125$7
 CA_69 NH1_125$7   C_69 CZ_125$7   C_69 NH1_125$7   C_69 NH1_125$7   O_69 NH1_125$7   CB_69 CZ_125$7
 CB_69 NH1_125$7   N_70 NH1_125$7   N_70 NH1_125$7   CA_70 NH1_125$7   CG_70 NH1_125$7   CD_70 NH1_125$7
 CG_73 O_2$1   CD_73 CA_3$1   CD_73 O_2$1   CD_73 N_3$1   CD_73 C_2$1   NE_73 O_2$1   NE_73 CA_3$1
 CZ_73 CA_3$1   CZ_73 CA_3$1   CZ_73 CG_3$1   CZ_73 N_3$1   CZ_73 OE1_7$1   CZ_73 O_2$1   CZ_73 C_2$1
 NH1_73 CA_3$1   NH2_73 CA_3$1   C_101 NH2_5$1   N_109 NH2_128$8   CA_109 NE_128$8   CA_109 NH2_128$8
 CA_109 CZ_128$8   C_109 NH2_128$8   CB_109 NH2_128$8   CB_109 NH2_128$8   CB_109 NE_128$8   CB_109 CZ_128$8
 CG1_109 NE_128$8   CG1_109 NH2_128$8   CG1_109 CZ_128$8   CG2_109 NH2_128$8   CG2_109 CZ_128$8   CG2_109 NE_128$8
 CD_125 C_68$3   NE_125 CA_69$3   NE_125 N_69$3   NE_125 O_68$3   NE_125 C_68$3   CZ_125 C_68$3   CZ_125 CD_70$3
 CZ_125 N_70$3   CZ_125 CA_69$3   CZ_125 C_69$3   CZ_125 N_69$3   CZ_125 C_68$3   CZ_125 O_68$3   NH1_125 CA_69$3
 NH1_125 C_68$3   NH1_125 O_68$3   NH1_125 N_69$3   NH2_125 C_68$3   NH2_125 O_68$3   NH2_125 N_69$3
 NH2_125 CA_69$3   CD_128 CB_109$6   CD_128 CG2_109$6   NE_128 CG2_109$6   NE_128 CB_109$6   CZ_128 CG2_109$6
 CZ_128 CA_109$6   CZ_128 CG2_109$6   CZ_128 CB_109$6   CZ_128 CB_109$6   NH1_128 CG2_109$6   NH1_128 CB_109$6
 NH2_128 CB_109$6   NH2_128 CG2_109$6
Most of these are Arg sidechains making trouble. The easiest solution is to simply cut all the bad ones after CB, i.e. to delete the respective atoms from the ins file using an editor. Doing this for residues 5, 68, 73, 125, and 128, and running the edited file as p1lys_0a, gets rid of most of the complaints:

 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 +  SHELXL-97 - CRYSTAL STRUCTURE REFINEMENT - UNIX VERSION  +
 +  Copyright(C) George M. Sheldrick 1993-7    Release 97-2  +
 +  p1lys_0a             started at 10:07:03 on 27-Jun-2000  +
 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 Read instructions and data
 ** Warning: no match for  167 atoms in  DFIX RTAB FLAT CHIV  **
 Data:    3174 unique,      0 suppressed   R(int) = 0.0000   R(sigma) = 0.0278
 Systematic absence violations:    0    Bad equivalents:    0
 wR2 =  0.8144 before cycle   1 for    699 data and     9 /   980 parameters
 GooF = S =    99.999;     Restrained GooF =     47.287  for   3042 restraints
 Mean shift/esd =   1.532    Maximum = -10.240 for  OSF            at 10:07:26
 Max. shift = 0.207 A for CB_70
The remaining warning concerning 'no match' for some restraints now corresponds to atoms missing from the model. At this stage, this warning can be safely ignored, as we do not expect our model to be complete right now anyway. The program happily finishes, producing an Rwork of 46.4 and an Rfree of 47.4 percent. The result of the refinement is stored in the file p1lys_0a.res. Diagnostic output can be found in p1lys_0a.lst.
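As a reminder of what the two numbers measure, here is a minimal Python sketch of the conventional R factor, computed separately for the working set and for the Rfree-flagged set. It is a simplification under stated assumptions: the amplitudes are taken as plain lists, a single scale factor is supplied by the caller, and the function names are hypothetical. SHELXL's own bookkeeping is more involved.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R = sum| |Fo| - k |Fc| | / sum |Fo| for one set of reflections."""
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def r_work_and_free(f_obs, f_calc, is_free, scale=1.0):
    """Return (Rwork, Rfree), splitting the reflections by their free flag."""
    work_obs = [fo for fo, flag in zip(f_obs, is_free) if not flag]
    work_calc = [fc for fc, flag in zip(f_calc, is_free) if not flag]
    free_obs = [fo for fo, flag in zip(f_obs, is_free) if flag]
    free_calc = [fc for fc, flag in zip(f_calc, is_free) if flag]
    return (r_factor(work_obs, work_calc, scale),
            r_factor(free_obs, free_calc, scale))
```

Because the free reflections never enter the minimization, Rfree is expected to stay somewhat above Rwork, as it does here (47.4 versus 46.4 percent).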

Second round of refinement: coordinates and isotropic B-factors

We copy the res file to become the new ins file:

trs/p1lys> cp p1lys_0a.res p1lys_1.ins
and apply the following changes:

Now our model is more or less happy at 1.5 A. We should have a look at the maps and do some clean-up. The list file is here: p1lys_1.lst.


First rebuilding and subsequent refinement

At this stage we should only fix major problems, as we still have to include all data. It does not make sense to spend a lot of time on disordered sidechains now; it will be much easier to sort out this kind of thing after the data to the full resolution have been included.

Have a look at the models and the maps now, if you want !

A good criterion for finding problematic places is to look at the Max(SIMU) deviation given in the list of reliability criteria towards the end of the lst file. If this number is larger than 0.15 (i.e. the B factors of two neighbouring atoms differ by more than 0.15 * 8 * pi^2 = 11.8, or roughly 12 A^2), there is a good chance that the respective residues need some rebuilding. Based on this criterion, the following residues were identified and rebuilt: Lys1, Lys13, Arg14, Arg21, Asn44, Arg45, Asn46, Thr47, Arg61, Thr62, Thr69, Pro70, Leu75, Ser85, Lys97, Lys116. Some residues were removed altogether: Asp48, Val99-Asp101, Leu129. The previously deleted sidechains of Arg5 and Arg73 were built.
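The conversion between the U values used by SHELXL and the B factors familiar from other programs is B = 8 * pi^2 * U. A small Python sketch (the function names are hypothetical) makes the arithmetic behind the 0.15 threshold explicit:

```python
import math


def u_to_b(u):
    """Convert an isotropic displacement parameter U (A^2) to a B factor (A^2)."""
    return 8.0 * math.pi ** 2 * u


def b_to_u(b):
    """Convert a B factor (A^2) to an isotropic displacement parameter U (A^2)."""
    return b / (8.0 * math.pi ** 2)
```

With these, u_to_b(0.15) is about 11.8 A^2, the threshold quoted above, and b_to_u(5.3) is about 0.067 A^2 for the Wilson B factor of these data.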

As there were drastic changes, the 'I' option of SHELXPRO was used to create a new ins-file:



 Menu ...
 . 

 ** The 'I' option is intended for initial input of a structure to SHELXL,
 NOT for updating between refinement jobs, for which 'U' should be used. **

 Enter N to abort option, <Enter> to continue: 

 Enter name of .ins file [shelxpro.ins]: p1lys_2.ins
 Enter name of PDB file [shelxpro.ent]: p1lys_1_mod.pdb
 Enter title [shelxpro]: triclinic lysozyme after first rebuilbing
 CELL in Angstroms and deg. [26.650 30.800 33.630 89.30 72.60 67.80]:
 
 Enter Z (number of molecules per cell) [4]: 1
 Enter space group in PDB or XPREP notation [P212121]: P1
 Enter wavelength in Angstroms [1.54178]: 0.927
 Generate atom coordinates using SCALE instructions from PDB file (P) or use
 current cell to calculate transformation matrix (C) [C]: 

 Enter old residue numbers (modified by chain ID, if any) for all N-terminii
 (<Enter> if none). To continue on the next line, put "=" at the end of the line
 : 1

 Enter old residue numbers for all C-terminii in the same way:  

 Enter old residue numbers in the same way at which renumbering of a block of
 residues should start. The block continues until the next residue specified
 here (<Enter> if none): 1

 New residue number for first solvent water [1001]: 

 Reset water occupancies to unity (Y or N) ? [Y]: 

 Current old residue number is    1.  Enter new residue number.  This defines the
 offset to be applied to residue numbers for the rest of this block: 

 Current old residue number is    1.  Enter new residue number.  This defines the
 offset to be applied to residue numbers for the rest of this block: 1

 HKLF code (3 for F, 4 for F-squared) [4]: 

 The .ins file has been written successfully.  The U option in SHELXPRO may
 be used for further checking of occupancies etc.

 <Enter> to continue: 

 .

 Main Menu to Quit
 .

Two small changes have to be made in the ins file produced by SHELXPRO:

The refinement is unproblematic and reaches (Rwork, Rfree) = (26.2, 31.4). The list file is here: p1lys_2.lst.


Automatic Building of a Preliminary Water Structure

After quickly checking that there are no major problems with the model, we now build a preliminary water structure to improve the phases. Some waters will probably sneak into the regions where we deleted some of the residues, but in terms of improving the phases, it is better to interpret the electron density with a water molecule than with no atom at all.

To prepare for solvent divining using SHELXWAT, the .res file from the previous run needs only minor changes:

  • only five instead of ten cycles of refinement after automatic updating of the model (otherwise running shelxwat will take ages): CGLS 10 -1 -> CGLS 5 -1
  • we have to add one water as residue 1001 to have something for SHELXWAT to start with. For this water, we simply take the coordinates of the first difference peak, Q1:
    
    RESI 1001 HOH
    O     4  -0.6579  0.9624  0.6135  11.00000  0.1
    

    Then we run shelxwat with the following parameters:

     -n10 Number of overall cycles
     -s4 Scattering factor number for oxygen
     -u0.100 Starting isotropic U for waters
     -r0.200 Water rejected or halved if U exceeds this value
     -m50 Maximum number of waters to be added in one cycle
     -w4.000 Minimum height/sigma for added water
     -f Full occupancies only [use -h for full and half occupancies]
    
    The corresponding command line is:
    
    shelxwat -n10 -s4 -u0.1 -r0.2 -m50 p1lys_3 > p1lys_3.out
    

    If you want to compare the previous res file and the new ins file you have to run the UNIX diff command on the new bak file (this file is a copy of the initial ins file - the actual ins file is overwritten all the time when shelxwat is running):

    
    diff p1lys_2.res p1lys_3.bak
    
    Using the 'P' option of SHELXPRO, we can make a nice plot monitoring the progress made: p1lys_3.ps

    Overall, some 90 plus waters were added lowering Rwork and Rfree to 20.2 and 24.6%, respectively. We keep the waters and continue.


    Improving and finishing the model at 1.5 A

    Again we check the largest SIMU outliers and rebuild the following residues: Glu7, Asn19, Thr47, Ile78, Ser86, Lys116.

    The following residues are possibly disordered. As we are not using all data yet, we simply ignore them: Lys13, Arg21, Leu25, Arg112, Ile124

    The following residues were missing and we can build them: Asp48, Val99, Asn103, Gly102.

    We save the resulting model from within XtalView as p1lys_3_mod.pdb and this time, use the 'U' option of SHELXPRO to produce the next ins file:

    
     .
     .
     [S] Reflection statistics from .fcf       [Z] Least-squares fit
     [J] Generate restraints from model        [B] PDB deposition
     [G] Generate PDB file from .res or .pdb   [Q] Quit
    
     Enter option: U
    
     Converts SHELXL .res file to a new .ins file by including new or changed atoms
     from PDB format files such as those written by the graphics program "O".  All
     other SHELXL commands are retained unchanged.  This instruction also provides
     for setting up disorder refinement and updating the list of solvent molecules.
     The .res file should not contain instructions other than RESI, AFIX, PART and
     atoms between FVAR AND HKLF, and both FVAR and HKLF must be present. Note that
     although it is possible to set up threefold or multiple disorders in this way,
     the necessary SUMP restraints must be edited into the .ins file later by hand.
     This option may also be used without a .pdb file to update .res to .ins and
     apply various checks.
    
     Enter N to abort option, <Enter> to continue: <Enter>
    
    
     Name of .res (or .ins) file to read [shelxpro.res]: p1lys_3.res
      1.0  0.0  0.0    0.0  1.0  0.0    0.0  0.0  1.0    0.00000  0.00000  0.00000
    
      1027 atoms and    0 peaks read
    
    I don't really know what the 1.0 0.0 0.0 etc. mean... but they do no harm. We simply continue...
    
     There are now two alternative approaches to updating the atom list.  If a
     graphics program such as XtalView that understands disorder and anisotropy
     has been used to prepare a PDB format file, ALL atoms may be taken from
     this file.  With other graphics programs such as O it is better to start
     with atoms from a .res file and update individual residues interactively.
    
     Replace ALL atoms and peaks with atoms from a PDB file (Y/N)? [Y]: y
     Name of PDB file to read [shelxpro.pdb]: p1lys_3_mod.pdb
    
     Renumber residues (other than waters) ? [N]: N
    
     Add, halve or delete waters (Y or N) ? [Y]: <Enter>
     Should occupancies be halved for waters with high U-values (Y or N) ? [N]: <Enter>
     Ueq-threshold for rejecting waters [0.8]: <Enter>
    
     Renumber residues for waters ? [Y]: yy
     Starting residue number for waters [1001]: <Enter>
    
        84 full and     0 partly occupied waters plus    960 other atoms in list
    
     Transform waters to equivalent nearest to a non-water (Y or N) ? [Y]: Enter
    
     Emulate WHAT-IF bug of ignoring PART numbers greater than 1? This only works
     if PART 1 atoms come before PART 2 etc.! [N]: 
    
     3.096  O_1016 NH1_Arg73
     3.003 #O_1016 NZ_Lys33
     Transform water to symmetry equivalent # [Y]: 
    
     3.349  O_1018 CE1_Phe3
     3.211 #O_1018 CB_Asp48
     Transform water to symmetry equivalent # [Y]: 
     . 
     . 
    
    ... more waters to be put into the right place ...
    
     .
     .
     3.459  O_1078 NZ_Lys13
     3.424 #O_1078 CE_Lys33
     Transform water to symmetry equivalent # [Y]: 
    
     3.696  O_1082 ND2_Asn37
     3.646 #O_1082 NH1_Arg61
     Transform water to symmetry equivalent # [Y]: 
    
     Repeat water reorganization (Y or N) ? [N]: N
    
     .ins file to write (may be same as read) [shelxpro.ins]: p1lys_4.ins
     
    
     SHELXPRO - SHELX interface for protein applications - Version 97-2
     Copyright(C) George M. Sheldrick 1996-7
    
     [F] New output filename                   [V] R(free) files
     [A] Anisotropic scaling (Hope & Parkin)   [I] .ins from PDB file
     [P] Progress of LS refinement diagram     [L] Luzzati plot
     [T] Thermal displacement analysis         [E] Esd analysis
     .
     . 
    
    We start the next job:
    
    /home/trs> shelxl p1lys_4
    
    and get some warnings:
    
     Following 1,2- or 1,3-distances involving residues not restrained
    
     CG_7 O_1045   CD_7 O_1045   OE1_7 O_1045   OE2_7 O_1045   CA_99 O_1064   CB_99 O_1064   CB_99 O_1064
     CB_99 O_1064   CG1_99 O_1064   CG1_99 O_1064   CG2_99 O_1064   CG2_99 O_1064
    
    These warnings are caused by waters that SHELXWAT placed to interpret electron density that belongs to a sidechain. As we forgot to delete them when modelling the sidechain, they are now causing problems. After deleting HOH1045 and HOH1064, we call the new ins file p1lys_4a.ins and restart the job, again using shelxwat:
    
    shelxwat -n10 -s4 -u0.1 -r0.2 -m50 p1lys_4a
    

    Including all data

    Before including all data, we have a quick look at p1lys_4a.pdb and its maps: Arg21 is clearly in the wrong rotamer, Ser100 and Asp101 are missing. Thr43 is definitely disordered. We ignore all these. As we also do not find any real problems in 'Disagreeable restraints':

    
     Disagreeable restraints before cycle    6
    
       Observed   Target    Error     Sigma     Restraint
    
                            1.8777    0.5000    FLAT O_3 CA_3 N_4 CA_4
                            1.6648    0.5000    FLAT O_53 CA_53 N_54 CA_54
                            2.5451    0.5000    FLAT O_62 CA_62 N_63 CA_63
    
    , we can safely include all data into the refinement by copying p1lys_4a.res to p1lys_5.ins and applying some small changes:

    Going Anisotropic

    Now that we have adjusted the model to all the data, we could, in principle, start to build disordered sidechains etc. But I prefer to first switch on anisotropic B-factors to improve the phases, in order to have the best possible maps before I start thinking about complicated disorder.

    To include anisotropic displacement parameters in the refinement, the following changes have to be made to p1lys_6.ins (which is a copy of p1lys_5.res).

    At this stage you also may want to play with the HOPE statement (see Manual) to model overall anisotropy before starting to use ADP's. I tried it for this structure as well. The drop in Rfree was only 0.4 percent, so to not complicate things I decided to not explicitly include the HOPE parameters into the refinement.

    Including ADP's in the refinement more than doubled the number of parameters in the model (from 4204 to 9453). This new parametrization caused a significant drop in both Rwork (3.1%) and Rfree (2.5%) and is therefore justified.
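
    As a rough plausibility check on these numbers, here is a minimal Python sketch (not SHELXL output; the function names are made up for illustration). It assumes that an isotropic atom carries 4 parameters (x, y, z, U) and an anisotropic atom 9 (x, y, z and six Uij), and it counts only reflections as observations, ignoring restraints.

```python
# Rough parameter bookkeeping for the isotropic -> anisotropic switch.
# Assumption: every atom is switched; restraints are not counted as data.

ISO_PARAMS_PER_ATOM = 4    # x, y, z, Uiso
ANISO_PARAMS_PER_ATOM = 9  # x, y, z, U11, U22, U33, U23, U13, U12


def expected_growth():
    """Factor by which the parameter count grows if every atom gets ADPs."""
    return ANISO_PARAMS_PER_ATOM / ISO_PARAMS_PER_ATOM


def data_to_parameter_ratio(n_obs, n_par):
    """Reflections per refined parameter."""
    return n_obs / n_par


# Numbers quoted in this tutorial (all data to 1.1 A, work set only).
n_obs = 33993
par_iso, par_aniso = 4204, 9453

print(f"expected growth factor: {expected_growth():.2f}")
print(f"observed growth factor: {par_aniso / par_iso:.2f}")
print(f"data/parameter before : {data_to_parameter_ratio(n_obs, par_iso):.1f}")
print(f"data/parameter after  : {data_to_parameter_ratio(n_obs, par_aniso):.1f}")
```

    The observed growth is close to the factor 9/4 expected for switching all atoms, and the data-to-parameter ratio drops accordingly. This is why the Rfree drop matters as a justification.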


    First building and refinement at 1.1 A resolution

    Now the bit of the refinement that is really different from lower resolution refinement starts: this mostly concerns the identification and modelling of disordered regions of the molecule. To find out where to improve the model, we will look at the following diagnostics: (1) the list of disagreeable restraints, (2) the highest peaks in the 1Fo-1Fc difference density, (3) residues with missing atoms.

    1. The list of disagreeable restraints

    The list of disagreeable restraints shows restraints where the model is off by more than 3 sigma from the value imposed by that restraint. This list is printed after every cycle of refinement and is a very good way to monitor the progress of the refinement and to spot problematic parts of the structure. In our case, the list of disagreeable restraints at the end of 20 cycles of CGLS-refinement looks like:
    
     Disagreeable restraints before cycle   21
    
       Observed   Target    Error     Sigma     Restraint
    
        2.6298    2.4620    0.1678    0.0400    DANG C_103 N_103
        2.3186    2.4710   -0.1524    0.0400    DANG CG_21 NE_21
        2.3656    2.5040   -0.1384    0.0400    DANG C_103 CB_103
        2.3818    2.5040   -0.1222    0.0400    DANG CG1_99 CG2_99
                            1.9181    0.5000    FLAT O_3 CA_3 N_4 CA_4
                            2.3906    0.5000    FLAT O_62 CA_62 N_63 CA_63
                           -0.3100    0.1000    SIMU U33 CD_112 NE_112
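
    To see quickly which restraints are violated most strongly, one can divide the Error column by the Sigma column. The following Python snippet is a minimal sketch of this, assuming the column layout shown above (four numeric columns for DANG-type lines, two for FLAT and SIMU). The function name parse_restraint_line is made up for illustration and is not part of SHELXL.

```python
# Minimal sketch: rank 'Disagreeable restraints' lines by |Error| / Sigma.
# Assumes the column layout of the listing above; not part of SHELXL itself.


def parse_restraint_line(line):
    """Return a dict for one restraint line, or None for headers and blanks."""
    tokens = line.split()
    numbers = []
    # Leading numeric columns: Observed, Target, Error, Sigma
    # or (for FLAT, SIMU, ...) just Error, Sigma.
    while tokens:
        try:
            numbers.append(float(tokens[0]))
        except ValueError:
            break
        tokens.pop(0)
    if len(numbers) not in (2, 4) or not tokens:
        return None
    error, sigma = numbers[-2], numbers[-1]
    return {
        "restraint": " ".join(tokens),
        "error": error,
        "sigma": sigma,
        "n_sigma": abs(error) / sigma,
    }


listing = """
   Observed   Target    Error     Sigma     Restraint

    2.6298    2.4620    0.1678    0.0400    DANG C_103 N_103
    2.3186    2.4710   -0.1524    0.0400    DANG CG_21 NE_21
                        2.3906    0.5000    FLAT O_62 CA_62 N_63 CA_63
                       -0.3100    0.1000    SIMU U33 CD_112 NE_112
"""

parsed = [r for r in map(parse_restraint_line, listing.splitlines()) if r]
for r in sorted(parsed, key=lambda r: r["n_sigma"], reverse=True):
    print(f"{r['n_sigma']:5.1f} sigma  {r['restraint']}")
```

    All entries come out above 3 sigma, as expected from the way the list is defined.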
    
    Now we will look at the places indicated in the list. Please open Xtalview and display the model p1lys_6.pdb and the corresponding electron density using the file p1lys_6.fcf. In my opinion, the best choice of maps is a SIGMAA-weighted 2mFo-DFc map at 1.0 sigma (in blue) together with a straight 1Fo-1Fc difference map at +/- 2.5 sigma (in green and red, respectively).

    Asn103: This place is basically a mess. Center on N_Gly104 to see what is going on. The difference density implies a possible second backbone conformation starting at N_Gly104 and going backwards. If this were a complete conformation, we would see larger difference peaks. We make a note to decrease the occupancy starting with N_Gly104. I would also include Asn103 and Gly102 (maybe this is the other hinge) in this.
    Val99-Gly102: Checking Gly102 we find beautiful difference density for Ser100 and Asp101, which we can now easily build. It was indeed the right choice not to bother about these two residues before including all data. To build this piece in Xtalview I first delete all peaks and waters in this density - then the real space fitting of Xtalview gets less confused. Ser100 and Asp101 are easy to build; I don't put too much effort into placing them nicely into the density - this is the job of SHELXL.
    Arg21: Another stunning example. Remove the peaks and the waters and rotate the sidechain into the green density.

    Disagreeable FLAT restraints on residues 3 and 62: There is nothing special to be seen in the electron density. It is actually quite normal to see violated FLATs for omega angles, as these are not as flat as people thought (MacArthur & Thornton, J.Mol.Biol. 264:1180-1195 (1996), [MEDLINE]).

    Disagreeable SIMU restraint for Arg112: There is no clear indication on what to do, we leave this one as it is

    2. Peaks in the Fo-Fc difference density map

    There are a number of different ways to go through the list of difference density peaks. One way is to go to the part of the lst-file that has the complete list of positive peaks and center on the atom closest to each peak. Here are the first five from p1lys_6.lst:
    
     Fourier peaks appended to .res file
    
                  x       y       z       sof     U      Peak   Distances to nearest atoms (including symmetry equivalents)
    
     Q1    1  -0.3243  0.8430  1.0186   1.00000  0.05    1.50   2.17 O_1004  3.02 NE1_62  3.11 O_1081  3.27 NH1_73
     Q2    1  -0.7717  0.6415  0.3616   1.00000  0.05    1.44   1.42 C_99  2.23 O_99  2.52 CA_99  2.75 O_96
     Q3    1  -0.6273  0.6219  0.1878   1.00000  0.05    1.31   2.26 N_102  2.58 O_1073  2.84 CA_102  3.37 OD1_103
     Q4    1  -0.5204  0.4281  0.1610   1.00000  0.05    1.22   2.72 O_102  2.95 O_67  3.12 O_125  3.18 NH2_5
     Q5    1  -0.2483  0.7984  0.4732   1.00000  0.05    1.19   2.56 OE1_35  2.73 O_1028  2.76 O_1023  3.48 CD_35
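
    If you prefer to go through the peaks with a script, the following Python snippet is a minimal sketch that reads such Q-peak lines and reports the nearest atom for each peak. It assumes the column layout shown above (name, scattering factor number, x, y, z, sof, U, peak height, then distance/atom pairs). The function name parse_peak_line is made up for illustration and is not part of SHELXL.

```python
# Minimal sketch: read Q-peak lines from a .lst excerpt and list the nearest atom.
# Assumes the column layout of the listing above; not part of SHELXL itself.


def parse_peak_line(line):
    """Return a dict for one Q-peak line, or None for any other line."""
    tokens = line.split()
    if len(tokens) < 8 or not tokens[0].startswith("Q"):
        return None
    try:
        height = float(tokens[7])
        # The remaining columns come in (distance, atom name) pairs.
        neighbours = [(float(tokens[i]), tokens[i + 1])
                      for i in range(8, len(tokens) - 1, 2)]
    except ValueError:
        return None
    return {"name": tokens[0], "height": height, "neighbours": neighbours}


listing = """
 Q1    1  -0.3243  0.8430  1.0186   1.00000  0.05    1.50   2.17 O_1004  3.02 NE1_62  3.11 O_1081  3.27 NH1_73
 Q2    1  -0.7717  0.6415  0.3616   1.00000  0.05    1.44   1.42 C_99  2.23 O_99  2.52 CA_99  2.75 O_96
 Q5    1  -0.2483  0.7984  0.4732   1.00000  0.05    1.19   2.56 OE1_35  2.73 O_1028  2.76 O_1023  3.48 CD_35
"""

peaks = [p for p in map(parse_peak_line, listing.splitlines()) if p]
for p in sorted(peaks, key=lambda p: p["height"], reverse=True):
    if p["neighbours"]:
        dist, atom = min(p["neighbours"])
        print(f"{p['name']:4s} height {p['height']:.2f}  nearest {atom} at {dist:.2f} A")
```

    The atom names printed this way are the ones to center on in XtalView.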
    
    Q1,Q7: We center on O_1004. Voila, an acetate or a nitrate ion from the crystallization buffer, which was partly modelled by water number 1004. It is difficult to decide at this stage whether it is an acetate or a nitrate, even if we take the chemical environment into account, so we first try a nitrate and decide later based on the bond lengths and on the relative B-values of the atoms. We steal a nitrate molecule from somewhere in the pdb (actually, there is a good one in 3lzt.pdb ...) and load it into XtalView's dictionary. Then we can simply mutate Hoh1004 into No3 and move the nitrate into the density.

    Now the other peaks:

    Q2-Q4,Q6: close to Val99-Gly102. Done before.

    Q5: a water molecule. ignore in this phase

    Q8: another nitrate, apply the same procedure as above

    and some more waters that we ignore.

    3. Missing residues

    Now we look at the missing residues (the list of 'Following atoms could not be matched for particular residues for DFIX' is a good place to identify the missing bits):
    Arg68: Acceptable density, put it in.
    Arg125: Nice density, put it in.

    Residues Arg128 and Leu129 are still a mess.

    So far, we have not modelled any disorder. We write the model from XtalView to a pdb file called p1lys_6_mod.pdb. Then we use the 'U' option in SHELXPRO to update p1lys_6.res using p1lys_6_mod.pdb and obtain the file p1lys_7.ins. Some small modifications are necessary before we start the next job:

    Finally, we start the next SHELXWAT job:
    
    shelxwat -n5 -s4 -u0.15 -r0.3 -m50 p1lys_7a > p1lys_7a.out
    
    The job converges at R-values of 15.8 (free) and 13.5 (work).

    Second building and refinement at full resolution

    Here is the list of Disagreeable restraints
        2.5139    2.3730    0.1409    0.0400    DANG OG1_43 CG2_43
        2.6230    2.4620    0.1610    0.0400    DANG C_103a N_103a
        2.5872    2.4660    0.1212    0.0400    DANG CD_68 CZ_68
        2.6218    2.4970    0.1248    0.0400    DANG C_99 CB_99
        2.3396    2.5040   -0.1644    0.0400    DANG CG1_99 CG2_99
                            1.9074    0.5000    FLAT O_3 CA_3 N_4 CA_4
                            2.4837    0.5000    FLAT O_62 CA_62 N_63 CA_63
                            1.5070    0.5000    FLAT O_101 CA_101 N_102 CA_102a
                           -0.3195    0.1000    SIMU U33 CB_114 CG_114
    
    Thr43: This is clearly a double conformation. There is negative density at the oxygen atom and a bit of positive density on the CG carbon. Probably there is a second rotamer related to the first by a 120 deg. clockwise rotation around the Chi1 angle.
    Asn103: Although we decreased the occupancy of Asn103 and N_Gly104 in the previous cycle, we still see some red density on the carbonyl group of Asn103. The symmetric density around N_Gly104 indicates a peptide flip where both possibilities are about half occupied. This is a bit tricky to model; if you want, load the final model into Xtalview and have a look.
    Arg68: Not a very clear situation. We only decrease the occupancy of the first conformer, hoping for a clearer picture after the next round of refinement.
    I also looked at the difference density peaks (found some more nitrates) and at the missing residues (nothing exciting). All this is skipped here to keep things from getting boring.

    Some of the more interesting places to look at are: a wrong Chi1 rotamer for Val99, a potential second conformation for Arg114 (a little bit of density for every atom, very nice).


    More building and refinement at full resolution

    The next couple of cycles of model building and refinement are documented in the files corresponding to p1lys_8 to p1lys_13. This is the main part of the refinement and involves a lot of fiddling. Only some important points will be discussed here. The basic strategy is always the same:

    1. Run the refinement job specifying PLAN 100 -1.0 0.1 to get most of the difference density peaks dumped to the pdb-file. With XtalView you can then simply go through all the peaks by hitting <space>
    2. Load the resulting pdb-file into XtalView and display maps based on the fcf-file.
    3. Systematically check all places mentioned in the list of 'Disagreeable restraints'. Violations of FLATs can normally be ignored.
    4. Systematically check all difference density peaks higher than, let's say, 5.0 sigma.
    5. Take notes on all the changes you made - you will need these notes later to not get lost when you are editing the .ins file for next round.
    6. To create the .ins-file for the next round,
      (a) Write out the modified model from XtalView and convert it to a new ins-file using the 'U' option of shelxpro. Call this file e.g. p1lys_9_mod.ins, as it is a modified version of p1lys_9.pdb.
      (b) copy the last res file to become the new ins file (e.g. cp p1lys_9.res p1lys_10.ins)
      (c) modify the new .ins file (p1lys_10.ins) using an editor and by cutting and pasting bits and pieces from the ins-file made from the modified pdb file (here p1lys_9_mod.ins).
      This is not a very elegant way, but it works for me ...

    At places where you have to modify the model, do the following:

    1. Remove (water-)residues corresponding to peaks that are in protein density
    2. Adjust sidechains if necessary and take a note
    3. If you see signs of a second conformation for a currently fully occupied sidechain, take a note about this and follow the instructions given in the SHELX-FAQ.
    4. If a difference density peak corresponds to a water, mutate the water to become an HOH-residue (this will effectively only change the name of oxygen from 'W' to 'O'). This way one can easily distinguish new waters and left-over peaks when creating the next .ins files.

    For p1lys_8 to p1lys_13 we now look at some interesting situations:

    p1lys_8, Arg45: We had set the occupancy of the first conformation to a fixed value of 0.65 in the previous round of refinement. This has produced some green density for atoms CB to CD and clarified the situation for the rest of the sidechain. We can now put a second conformation.
    Here is the piece of the next ins-file that describes Arg45 in two conformations whose occupancies are tied to free variable number 4 (the fourth value on the FVAR line):

    FVAR  0.14409   0.60127 0.47703 0.5 0.5 0.5 0.5 0.5
    
    .
    .
    
    RESI   45   ARG
    N     3   -0.253501    1.101474    0.452285    11.00000    0.07022    0.07846 =
             0.09258    0.04009   -0.03593   -0.04887
    CA    1   -0.252002    1.124127    0.413177    11.00000    0.07121    0.07364 =
             0.07850    0.03471   -0.03840   -0.05241
    C     1   -0.196467    1.096687    0.379008    11.00000    0.05700    0.09054 =
             0.09636    0.04182   -0.03807   -0.04355
    O     4   -0.151361    1.094247    0.383487    11.00000    0.06484    0.15748 =
             0.15236    0.04420   -0.05401   -0.06088
    PART 1 41.0
    CB      1 -0.261554  1.175634  0.423002  10.50000   0.1
    CG      1 -0.303197  1.196700  0.466914  10.50000   0.1
    CD      1 -0.320214  1.249627  0.478206  10.50000   0.1
    NE      3 -0.383742  1.274377  0.498092  10.50000   0.1
    CZ      1 -0.411667  1.288025  0.538303  10.50000   0.1
    NH1     3 -0.384939  1.279152  0.566438  10.50000   0.1
    NH2     3 -0.467895  1.310270  0.550692  10.50000   0.1
    PART 2 -41.0
    CB      1 -0.261124  1.177588  0.424570  10.50000   0.1
    CG      1 -0.318260  1.204679  0.458320  10.50000   0.1
    CD      1 -0.333151  1.257347  0.468953  10.50000   0.1
    NE      3 -0.384572  1.277852  0.507407  10.50000   0.1
    CZ      1 -0.438420  1.290365  0.509540  10.50000   0.1
    NH1     3 -0.452909  1.287381  0.475916  10.50000   0.1
    NH2     3 -0.478846  1.307200  0.546176  10.50000   0.1
    PART 0
    
    If you look at the model you will notice a small confusion. The first known conformation has become PART 2 and the new one PART 1 - this is the way XtalView splits residues (somewhat confusing, but it works).
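
    To make the occupancy codes 41.0 and -41.0 on the PART lines less mysterious, here is a minimal Python sketch of how I understand the SHELXL free variable convention described in the manual: a value is coded as 10*m + p, where m = 1 means 'fixed at p', m > 1 means p times free variable m, and m < -1 means p times (free variable |m| minus 1). The function name decode_sof is made up for illustration. As far as I understand the manual, a value given on the PART line replaces the sof on the atom lines that follow it.

```python
# Minimal sketch of the SHELXL '10*m + p' coding for site occupation factors.
# Assumption: convention as described in the SHELXL manual; decode_sof is a
# hypothetical helper, not a SHELX function.


def decode_sof(code, fvar):
    """Return the occupancy implied by a coded sof and the FVAR values."""
    m = round(code / 10.0)   # index of the free variable (sign is meaningful)
    p = code - 10.0 * m      # multiplier
    if m in (0, 1):
        return p             # m = 0: refined from p, m = 1: fixed at p
    if m > 1:
        return p * fvar[m - 1]
    if m < -1:
        return p * (fvar[-m - 1] - 1.0)
    raise ValueError(f"code {code} is not covered by this sketch")


# FVAR line from the excerpt above; free variable 4 is the fourth value.
fvar = [0.14409, 0.60127, 0.47703, 0.5, 0.5, 0.5, 0.5, 0.5]

print("PART 1 ( 41.0):", decode_sof(41.0, fvar))   # occupancy = fv(4)
print("PART 2 (-41.0):", decode_sof(-41.0, fvar))  # occupancy = 1 - fv(4)
print("main chain (11.0):", decode_sof(11.0, fvar))
```

    With this coding the two conformations always add up to full occupancy, whatever value free variable 4 refines to.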

    p1lys_9: Nothing special.

    p1lys_10, Lys145: Another clear double conformation. Although only one atom is disordered, I prefer to split the entire sidechain. This costs some parameters but in most cases gives a more stable refinement. And, for practical reasons, it is much easier to do the analysis in the end if all disordered residues are treated the same way. Look at p1lys_11.ins if you want to see how the disorder is described.

    If you want to know what happened in p1lys_12 and p1lys_13, have a look at the ins files and the maps yourself.


    Putting hydrogens

    Now that we are more or less finished with refinement, we put the hydrogens. In SHELXL, this is very easy. Simply remove the REM statements in front of the HFIX statements.

    Make sure that you do not add the hydrogens you want to see in the density (e.g. those defining the protonation of histidines). For histidines, I also do not activate the generation of hydrogens for CE1 and CD2 - these hydrogens must be there and the corresponding electron density will give you a nice calibration for what you can expect for the hydrogens on the nitrogen atoms. For histidine we have:

    HFIX_HIS 13 CA
    HFIX_HIS 23 CB
    HFIX_HIS 43 N 
    REM HFIX_HIS 43 N ND1 CE1 CD2
    

    Also do not activate the generation of hydroxyl hydrogens on Thr, Ser, and Tyr residues (i.e. keep the REM cards in front of e.g. HFIX_TYR 83 OH). Placing these hydrogens is not trivial and automatic placement will often put them in the wrong place.
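
    If you do not want to edit the REM cards by hand, the following Python snippet is a minimal sketch of the idea: it removes the leading REM from HFIX lines but leaves lines with HFIX code 83 (the hydroxyl case mentioned above) commented out. The function name activate_hfix and the example line HFIX_ALA 13 CA are made up for illustration. The histidine lines still have to be edited by hand as described above.

```python
# Minimal sketch: activate 'REM HFIX...' cards, but keep hydroxyl ones disabled.
# Assumption: cards look like 'REM HFIX_TYR 83 OH'; activate_hfix is a
# hypothetical helper, not part of SHELX.


def activate_hfix(lines, keep_codes=("83",)):
    """Return a copy of lines with REM removed from allowed HFIX cards."""
    result = []
    for line in lines:
        parts = line.split()
        is_rem_hfix = (
            len(parts) >= 3
            and parts[0].upper() == "REM"
            and parts[1].upper().startswith("HFIX")
        )
        if is_rem_hfix and parts[2] not in keep_codes:
            result.append(" ".join(parts[1:]))  # drop the leading REM
        else:
            result.append(line)                 # leave everything else alone
    return result


example = [
    "REM HFIX_ALA 13 CA",   # illustrative card, will be activated
    "REM HFIX_TYR 83 OH",   # hydroxyl hydrogen, stays commented out
    "CGLS 10 -1",           # not an HFIX card, untouched
]
for line in activate_hfix(example):
    print(line)
```

    Run this on a copy of the ins file and check the result before starting the job.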

    The N-terminal NH3 group needs special treatment. Simply put HFIX 33 N_1 before any other HFIX statements.

    For incomplete residues (in our case Arg121 and Arg128 are cut after CB), the easiest way to stop SHELXL from complaining is by cheating a bit and making the CB methyl groups, in this case:

    HFIX 33 CB_121
    HFIX 33 CG_128
    
    Again, not elegant, but works. You must take a note about this trick in order to not submit wrong hydrogens to the PDB.

    There are some more notes on hydrogens in the SHELXL-FAQ.

    The final ins file is p1lys_14.ins. The job runs without any major problems and finishes at R(work,free) = (9.74,12.81). Note that we did not include a single extra parameter in the refinement to achieve this gain. The job runs about 40 percent longer than an equivalent job without hydrogens. This is due to the larger number of atoms that have to be included in the structure factor calculation.

    The inclusion of hydrogens has changed the model and we get quite a long list of disagreeable restraints:

     
       Observed   Target    Error     Sigma     Restraint
    
        2.7274    2.8000   -0.0726    0.0200    BUMP O_18 CB_19a
        1.9863    2.1000   -0.1137    0.0200    BUMP HA_19a HD2B_19a
        1.9670    2.1000   -0.1330    0.0200    BUMP HD2A_19a HA_81
        1.9552    2.1000   -0.1448    0.0200    BUMP HD2B_19a HD1A_84
        2.0308    2.1000   -0.0692    0.0200    BUMP HB2_15 HG1B_92
        1.8944    2.1000   -0.2056    0.0200    BUMP HA_93 HD2B_93
        2.1057    2.2450   -0.1393    0.0400    DANG OD1_37 ND2_37
        2.5222    2.3930    0.1292    0.0400    DANG CB_106b OD1_106b
        2.1403    2.4190   -0.2787    0.0400    DANG CB_19a ND2_19a
        2.2771    2.4300   -0.1529    0.0400    DANG CA_43a OG1_43a
        2.3127    2.4350   -0.1223    0.0400    DANG C_71 CA_72b
        2.3252    2.4550   -0.1298    0.0400    DANG CB_19a N_19
        2.5889    2.4660    0.1229    0.0400    DANG CD_68b CZ_68b
        2.3488    2.4710   -0.1222    0.0400    DANG CG_114b NE_114b
        2.6482    2.5040    0.1442    0.0400    DANG C_19 CB_19a
                            0.0725    0.0200    SAME/SADI O1_504 O3_504a O1_505 O2_505
                            2.0640    0.5000    FLAT O_3 CA_3 N_4 CA_4
                            2.4586    0.5000    FLAT O_62 CA_62 N_63 CA_63
                            1.5779    0.5000    FLAT O_71 CA_71 N_72b CA_72b
                            1.5126    0.5000    FLAT O_101 CA_101 N_102b CA_102b
                           -0.3327    0.1000    ISOR U22 O_1149
                            0.3206    0.1000    ISOR U33 O_1149
    
    There are still things to do:

    Sorry, but I ran out of steam here ...


    Finishing the refinement

    My personal criteria to consider a refinement as finished are:

    The final job should include all data (i.e. work and free set). This can be done by removing the -1 from the CGLS statement. In this case, the job blows up (see p1lys_15.out and p1lys_15.lst) due to the badly modelled double conformation of Arg68. Normally we would go back to the previous step and build a better model. Because I am lazy and this is only for the tutorial, I simply deleted the respective atoms from the .ins file and made a new one: p1lys_15a.ins. This refinement happily runs through (p1lys_15.out) and finishes with an R-value of 9.84 percent for all reflections higher than 4 sigma.


    Calculation of standard uncertainties

    I am still working on this one. For the time being you have to live with what you can find in the SHELXL-FAQ


    How to continue

    Once the refinement is finished, you have to see what you can learn from the structure. In addition to what you can get from lower resolution structures, here are a few things you can look for. This list is not meant to be comprehensive.

    Summary of the refinement

    #par is the number of parameters in the model
    #obs is the number of reflections used in refinement
    Rw is Rwork in percent
    Rf is Rfree in percent
    Rd is the difference between Rw and Rf in percent
    CPU is approximate CPU time normalized to a Pentium III running at 500 MHz
    step                       data used   #par   #obs    Rw    Rf   Rd   CPU   job#
    ---------------------------------------------------------------------------------
    molecular replacement       15.0-4.0      3    775  44.5     -    -    30  
    rigid body                  10.0-2.5      9   2997  46.4  47.4  1.0   140  
    first round                 10.0-1.5   3887  13818  30.2  34.3  4.1   500  
    after first rebuild 10cgls         "   3735      "  26.2  31.4  5.2   360  
    SHELXWAT                           "   4091      "  20.2  24.6  4.4  2000  
    bld + SHELXWAT                     "   4231      "  18.7  22.8  4.1  2000 
    include all data 20cgls     10.0-1.1   4203  33993  19.1  21.7  2.6  1450     5
    ANIS 20cgls                        "   9453      "  16.0  19.2  3.2  2850     6
    Rebuild 10 SHELXWAT                "   9557      "  13.4  15.8  2.4  1480     7a
    rebuild 10 cgls                    "  10481      "  12.3  15.3  3.0  1610     8
    rebuild 10 cgls                    "  10819      "  12.1  15.1  3.0  1650     9
    rebuild 10 cgls                    "  10838      "  11.6  14.6  3.0  1800    10
    rebuild 10 cgls                    "  11494      "  11.3  14.4  3.1  1800    11
    rebuild 10 cgls                    "  11576      "  11.0  14.0  3.0  1800    12
    rebuild 10 cgls                    "  11774      "  10.7  13.8  3.1  1800    13
    put hydrogens               10.0-1.1  11765      "   9.7  12.8  3.1  2500    14
    include test set            10.0-1.1  11693  35786   9.8     -    -  2600    15a
    

    Good Bye

    Thanks for following the tutorial. Let me know what you think about it (thomas.schneider@shelx.uni-ac.gwdg.de). More links related to SHELXL can be found on the SHELX Homepage.