SHELX FAQ

General SHELXL questions
SHELXL small molecule refinement
SHELXC/D/E macromolecular phasing
SHELXL macromolecular refinement

Q1: I have moved to another location. Do I need to register again?
A:
Yes, please register again! Then send me an email giving sufficient details so that I can remove your old affiliation from the users' list.

Q2: My computer self-destructed and I have lost the programs. Do I need to register again?
A:
No, if you kept the confirmation email you can use it to download the programs again (you may do this as often as you wish). If you have really lost the password, the quickest solution is to register again (so that you immediately receive the downloading instructions by email) and send me an email to ask me to ignore the duplicate registration.

Q3: What about updates and bug-fixes?
A:
The time between major SHELX releases is measured in decades, not days. Improved versions will however be announced on the SHELX homepage as soon as they become available for download. Although the programs do not have expiry dates, it is a good idea to click on "Recent changes" from time to time to see if important changes or bug-fixes have been made.

Q4: Who needs to license SHELX?
A:
All users, even though a license for academic use is free. 'For-profit' firms are expected to pay one annual license fee per site, irrespective of the number of users or computers. A SHELX-97 (or SHELX76) license does not cover the use of the current version!

Q5: How can I verify whether I or my firm is a licensed user of the current SHELX?
A:
Look at the users' list, but there may be up to two weeks' delay in adding new names to this list. Commercial users are allowed a two-month free trial to cover the time for arranging payment, but will lose their licenses if their annual contributions are appreciably in arrears.

Q6: When I start SHELX on my PC the disk rattles loudly for several hours and smoke comes out of the back. Is this a bug?
A:
You must be trying to run SHELX under some version of Windows! The best solution is to reformat the hard disk and install Linux. However the current release should produce less smoke.

Q7: When I double-click on the filename under Windows, the program doesn't start!?
A:
SHELX is intended to be run from a command prompt. If you think that this is too old-fashioned, you can call the programs from a GUI such as shelXle, WinGX, Olex2, Oscail or hkl2map.


General SHELXL questions

Q8: I get the error message ** ATOM NAMES NOT ALLOWED **?
A:
There is probably an unexpected character in one of the instructions. Look at the .lst file to see which instruction caused the problem. The character could be invisible!

Q9: I get the message ** UNSET FREE VARIABLE FOR ATOM ... ** but I have not used any 'free variables'?
A:
There is a typo in your atom coordinates, e.g. a decimal point missing or replaced by a comma. Alternatively you may have really referenced a free variable that wasn't defined by a FVAR instruction!

Q10: What should I do about 'may be split' warnings?
A:
Probably nothing. The program prints out this warning whenever it might be possible to interpret the anisotropic displacement of an atom in terms of two discrete sites, and estimates the coordinates of these sites. Such atoms should be checked (e.g. with the help of an ORTEP plot) but in many cases the single-site anisotropic description is still eminently suitable.

Q11: What does NPD mean?
A:
The U or Uij values of an atom have become Non Positive Definite. This can be prevented with XNPD 0.02, in which case the eigenvalues of the ADP-tensor would be constrained to be 0.02 Å² or larger, but it might be better to find the cause. A common cause is that the scattering factor number has been specified wrongly, e.g. an oxygen atom is being refined with a hydrogen atom scattering factor. Another common cause is that two atoms (e.g. disorder components) are on (almost) the same position, but neither EADP constraints nor SIMU restraints have been applied to make their ADP values equal. DELU or RIGU do not do this when the atoms have different PART numbers.
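
As an illustration of what "non positive definite" means, the following minimal Python sketch tests a symmetric 3x3 displacement tensor with Sylvester's criterion (all leading principal minors must be positive). It is not part of SHELXL; the function name and the example values are hypothetical, and the six components are taken in the order U11 U22 U33 U23 U13 U12 and assumed to refer to a Cartesian frame.

```python
def is_positive_definite(u11, u22, u33, u23, u13, u12):
    """Return True if the symmetric tensor
        | u11 u12 u13 |
        | u12 u22 u23 |
        | u13 u23 u33 |
    is positive definite (Sylvester's criterion)."""
    minor1 = u11
    minor2 = u11 * u22 - u12 * u12
    # Full determinant, expanded along the first row.
    minor3 = (u11 * (u22 * u33 - u23 * u23)
              - u12 * (u12 * u33 - u23 * u13)
              + u13 * (u12 * u23 - u22 * u13))
    return minor1 > 0 and minor2 > 0 and minor3 > 0


# Hypothetical example values in square Angstroms.
print(is_positive_definite(0.020, 0.030, 0.025, 0.001, -0.002, 0.003))
print(is_positive_definite(0.020, 0.030, -0.005, 0.0, 0.0, 0.0))
```

The second call has one negative principal component, which is the situation that SHELXL flags as NPD.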

Q12: How do I refine against neutron diffraction data?
A:
Insert a NEUT instruction before a SFAC instruction that just uses element names. This will use neutron scattering lengths that are weighted sums over all isotopes at natural abundance for all atoms except for D, which is assumed to be a pure isotope. It also switches off the special treatment of H and D when interpreting instructions (e.g. ANIS will make all atoms anisotropic, including H and D), but AFIX instructions can still be used for placing and refining hydrogens. FMAP -2 can be used to generate positive and negative 'Q-peaks'. See also HKLF 2 and LAUE.


SHELXL questions primarily for small molecules

Q13: Why have the CIF files from SHELXL become so much larger?
A:
Because they now include embedded .hkl, .res and possibly .fab files. This is also a good way to archive the structure. If you only intend to use the .cif file for input to another program, you can use ShredCIF to break it up into its components. This may also be used to verify the checksums to check that a CIF file has been transmitted correctly.

Q14: Should I do a final refinement with TWIN and BASF to obtain a definitive value for the Flack x parameter?
A:
No! The Parsons' quotient method almost always gives a better estimate and is even valid for twinned structures, but does require that (a) Friedel pairs are present in the data, and (b) the structure has been fully refined. TWIN (either without a matrix or with a matrix that has a negative determinant) disables the Parsons' method. There are good reasons to believe that methods based on quotients or differences of Friedel opposites give better estimates than the classical inclusion of the Flack parameter in the full-matrix refinement [Parsons, Flack and Wagner, Acta Cryst. B69 (2013) 249-259].

Q15: My non-centrosymmetric structure has no atom heavier than oxygen and the referee insists that I use MERG 4 rather than MERG 2, however this now appears to be incompatible with ACTA. What should I do?
A:
Read Thompson and Watkin, Tetrahedron Asymmetry 20 (2009) 712-717 and Parsons, Flack and Wagner, Acta Cryst. B69 (2013) 249-259, then send your paper to Acta Cryst. instead! Hardware and software have now progressed so far that it is quite possible to determine absolute structure even with MoKα radiation when the heaviest atom is oxygen (and for at least one case with CuKα data for a pure hydrocarbon)! So even with MoKα data, the latest IUCr recommendation is never to average Friedel opposites. A further advantage of this is that more comprehensive data statistics can be obtained (and written to .cif) if the unmerged data are used. The problem of a possible artificial reduction in the parameter esds has been elegantly eliminated by calculating them differently (suggested by Ton Spek). MERG 2 is not necessary for macromolecules; MERG 4 can still be used for them because the ACTA instruction is not required, and because the saving in computer time is more significant.

Q16: I get a Flack parameter of 0.5 with a small esd. Is my structure racemically twinned?
A:
Probably not! In many such cases, the true space group was centrosymmetric. Racemic twinning is possible but relatively rare.

Q17: My structure could only be solved in P1, not P-1, but on refinement some of the bond lengths and U-values are wildly different for the two molecules. If I use SAME the geometries of the two molecules become similar but what should I do about the Uij?
A:
You could try RIGU, but it might be better to look for the inversion center instead, otherwise you will probably be marshed!

Q18: How can I produce nice tables from the final .cif file to pad out my thesis?
A:
Run CIFTAB and specify the formats 'rta' (Å) or 'rtm' (SI units). This will produce a Rich text format (rtf) file that can be read directly into Microsoft Word or Open Office. The tables will then be formatted, and it is easy to add a personal touch with the word processor. However nice pictures are even better for padding out theses!

Q19: How can I squeeze my structure?
A:
"Squeezing" is understood to mean estimating the complex scattering factors for severely disordered solvent sites and adding them into the structure factor calculation. This facility can easily be misused and so should be used with care and only when there are compelling reasons, e.g. because it is not possible to define an atomic model for an infinite river of disordered solvent density along a crystallographic axis. The program PLATON is able to calculate the appropriate partial structure factors for input to SHELXL using the ABIN instruction. ABIN also specifies a scale factor and overall B-value that can be refined for the partial structure factors, and may also be used for twinned structures, in which case the partial structure factors should correspond to the untwinned structure. For further details see the PLATON documentation.


Questions primarily for phasing macromolecules

Q20: I have solved my structure by MR but am not able to make further progress. What should I do?
A:
If you are lucky enough to have good resolution (2.5 Å or better) and a high solvent content, an easy (but slow) way to check whether the solution is correct is to rename the PDB file from MR to name.pda and feed it into SHELXE:
shelxe name.pda -s0.55 -a20 -q -t4
where -s sets the solvent content. If the CC against the native data rises to above 25%, the structure is almost certainly solved, and the poly-Ala trace in name.pdb may help you to interpret it. In borderline cases the new options -o and -O in SHELXE may help by optimizing the starting model. If anomalous data are also available, MRSAD may improve the phases, e.g.
shelxe name.pda name_fa -h -z -q -a5 -t4
In this case, SHELXC (or hkl2map) should be run first to make the file name_fa.hkl.

Q21: What is the best way to input the intensity data to solve a structure with the SAD or MAD methods?
A:
Input XDS_ASCII.HKL, name.sca or name.hkl type files into hkl2map. If you processed the data with XDS, it is much better to input XDS_ASCII.HKL files directly rather than maul the data with other programs first! The use of unmerged data in this way also gives better statistics, e.g. CC1/2. This file can be renamed, e.g. to XDS_PEAK.HKL, XDS_INFL.HKL and XDS_HREM.HKL for a typical MAD experiment, because SHELXC will recognise it whatever it is called. If you are using a Bruker system, it is best to use unmerged .hkl files. The Bruker XPREP can read, write and convert various formats, scale datasets together, determine the space group and transform the cell and data to the appropriate conventional setting, and add or transfer free-R flags. It also provides an alternative to SHELXC for making the name_fa.hkl and .ins files.

Q22: Why has sulfur-SAD not become the standard method for phasing new protein structures?
A:
Sulfur-SAD requires extremely accurate data. Many beamlines are simply not stable enough to produce such data, and radiation damage can easily swamp the 1-2% anomalous signal. Highly redundant in-house data collected with a kappa or three-circle goniometer often gives better phase information, but combining it with higher resolution native synchrotron data (which does not require such a high accuracy or redundancy) can still make a big difference to map quality and the ease of autotracing.

Q23: How can I fine-tune the substructure solution?
A:
Vary the resolution cutoff, do many more trials (some substructures are solved only once in 50000 trials), use the disulfide option (DSUL) in SHELXD and try different space groups. If you collected MAD data, try the SMAD option in SHELXC to use the combined data for SAD. The new -z option in SHELXE to optimize and extend the substructure is strongly recommended.

Q24: How can I fine-tune the phasing with SHELXE?
A:
Vary the solvent content (-s), the number of density modification cycles (-m), include the specific alpha-helix search (-q) and give it more time for the helix and peptide search (e.g. -t4). However these last two options are expensive in terms of CPU time. MRSAD (see Q20) may be useful if you have a possible search fragment and weak anomalous data.
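
Varying the solvent content lends itself to a small script. The Python sketch below only builds the command lines for a scan of -s values; it does not run SHELXE. The command form is the one shown in Q20 (shelxe name.pda -s0.55 -a20 -q -t4); the function name, the scanned values and the default extra flags are assumptions made for illustration.

```python
def shelxe_commands(pda_file, solvent_fractions, extra=("-a20", "-q", "-t4")):
    """Build one SHELXE argument list per solvent fraction.

    pda_file          -- MR model renamed to name.pda, as in Q20
    solvent_fractions -- values to try for the -s flag
    extra             -- remaining flags, copied from the Q20 example
    """
    return [["shelxe", pda_file, f"-s{s:.2f}", *extra]
            for s in solvent_fractions]


# Hypothetical scan around the value used in Q20.
for cmd in shelxe_commands("name.pda", [0.45, 0.50, 0.55, 0.60]):
    print(" ".join(cmd))
```

Each printed line can be pasted at the command prompt. It is probably safest to run each job in its own copy of the working directory so that the output files of one run are not overwritten by the next.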

Q25: What is the best way to make a nice picture of the SHELXE trace and map?
A:
Use Tim Grüne's shelx2map to convert the .phs into a CCP4 format map file, then input this file and the .pdb from SHELXE into PYMOL. You may also wish to add e.g. -e1.0 to the SHELXE command line to improve the final map by using the free lunch algorithm.


Questions primarily for refining macromolecules with SHELXL

Q26: What is the best way to create the .hkl input file for SHELXL?
A:
Many commercial systems (e.g. Bruker APEX2/PROTEUM2) can create this file directly. If you use SHELXC/D/E or hkl2map to solve the structure you get this file as a byproduct. If you only have .mtz files, be sure to keep the intensities and set the free-R flags in CCP4. Then you can use Tim Grüne's mtz2hkl to make the .hkl file. If the intensities are not present in the .mtz file, mtz2hkl can make a .hkl file containing F-values rather than intensities, but then you must be careful to specify this when reading the file (in SHELXL, HKLF 3 rather than HKLF 4). See also Q21.

Q27: How do I set up the first .ins file for SHELXL?
A:
You will have to use the 'I' option in SHELXPRO (part of SHELX-97) until a better program is ready. You will then need to add restraints for residues other than the 20 standard amino-acids yourself.

Q28: Where can I find suitable geometrical restraints?
A:
A good start is the CCP4 monomer library. Even better, Dale Tronrud has set up a cdl_shelxl web server that can read a PDB file and output improved conformation-dependent DFIX, DANG and CHIV restraints for SHELXL refinements of proteins. For further details see Tronrud and Karplus, Acta Cryst. D67 (2011) 699-706. For small molecules, the PRODRG web server can generate SHELXL format geometrical restraints. The Global Phasing GRADE web server probably gives the most accurate restraints and can also create them directly in SHELXL format (DFIX, DANG and FLAT). These restraints can either be inserted into the .ins file or written to a separate file and read in by using the '+filename' or '++filename' options in the SHELXL .ins file.

Q29: How can I look at the electron density after a SHELXL refinement?
A:
Run a SHELXL refinement with LIST 6, then newer versions of Coot will be able to read the resulting .res and .fcf files and display the maps. Unfortunately if you edit the model with Coot and try to write it to an .ins file for the next SHELXL refinement job, some hand editing will almost certainly be required. Tim Grüne's shelx2map can be used to convert the .fcf file into a CCP4 format map that PYMOL can read (together with the .pdb file from SHELXL). This is the recommended way of obtaining publication quality diagrams that include the electron density.

Q30: How can I calculate standard deviations?
A:
This requires full-matrix refinement. Usually only the esds in the coordinates and derived geometrical parameters are needed. After refining to convergence with CGLS, a single cycle of full-matrix refinement (L.S. 1) is performed for the coordinates only (BLOC 1), with all geometrical restraints removed and zero shift multipliers (DAMP 0 0). It may be necessary to fine-tune the memory allocation with the -a and -b run-time flags. For very large structures it may be necessary to do several cycles of refinement with overlapping blocks defined by BLOC instructions. To obtain esds in various geometrical parameters taking the full correlation matrix into account, RTAB, MPLA, BOND and HTAB may be used. The number that appears under the atom name in the .lst file is the radial atomic positional esd in Ångstroms.

Q31: How do I generate a restraint across a symmetry element?
A:
You will need to specify an EQIV instruction and then refer to the symmetry generated atom(s) using _$1 etc. For example, if a disulfide bond involving SG of Cys29 and its symmetry equivalent is bisected by a crystallographic twofold axis at (0.5, y, 0.5), you need to specify:
EQIV $1 1-x, y, 1-z 
DFIX_29 2.031 SG SG_$1
DANG_29 3.035 CB SG_$1 SG CB_$1
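
To see why the operator 1-x, y, 1-z corresponds to a twofold axis at (0.5, y, 0.5), the following Python sketch applies it to fractional coordinates. The atom position used is hypothetical; only the operator itself is taken from the EQIV line above.

```python
def apply_eqiv(xyz):
    """Apply the symmetry operator 1-x, y, 1-z to fractional coordinates."""
    x, y, z = xyz
    return (1.0 - x, y, 1.0 - z)


sg = (0.42, 0.137, 0.55)        # hypothetical fractional coordinates of SG
sg_sym = apply_eqiv(sg)         # the atom referred to as SG_$1

# The midpoint of SG and SG_$1 lies on the axis x = 0.5, z = 0.5.
midpoint = tuple((a + b) / 2.0 for a, b in zip(sg, sg_sym))
print(sg_sym, midpoint)

# A point on the axis is mapped onto itself.
print(apply_eqiv((0.5, 0.3, 0.5)))
```

Applying the operator twice returns the original position, as expected for a twofold axis.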

Q32: When should I add hydrogens?
A:
Adding hydrogens using HFIX or AFIX improves the model and does not add any extra parameters. It is best to model disorder before adding the hydrogens because the program will then add the disordered hydrogens correctly too (!). However adding hydrogens in -OH groups (Tyr, Ser and Thr) is not recommended because (a) they have little effect on the R-values, (b) it is difficult for the program to predict their positions accurately, and (c) if the program accidentally assigns two hydrogens to the same H-bond and they are then refined using the riding model (AFIX 83) with anti-bumping restraints switched on (BUMP), the repulsion between the hydrogens can introduce mechanical distortions into the structure. Hydrogens should be added individually to the ring N atoms of histidines taking possible H-bonds and metal coordination into account (or left out). Adding hydrogens to C and N atoms in proteins usually reduces R1 and R1(free) by the same amount, typically 0.5 to 1%. With accurate small molecule data, the drop can be much larger.

Q33: When I add the hydrogens, the program says:
** BAD AFIX CONNECTIVITY: CB_21 BONDS TO CA_21 **. What should I do?

A:
The program was not able to add the hydrogens to CB_21 because the rest of the side-chain was missing. This is the recommended procedure when there is no density for the rest of the side-chain, but it is necessary to add an instruction: HFIX 0 CB_21 before the offending HFIX instruction to switch off the addition of the CB hydrogens for this particular residue only. Similar error messages can also be caused by bad geometry resulting in errors in the connectivity array; these can be corrected by FREE and/or BIND instructions.

Q34: After adding hydrogens, SHELXL complains:
** BAD AFIX CONNECTIVITY: N_1 BONDS TO CA_1 **. What is happening?

A:
The program is trying to make the terminal N into an amide instead of -NH3+, which doesn't work because it only bonds to one atom. Include HFIX 33 N_1 before the other HFIX instructions. The program always applies the first HFIX that is appropriate to a given atom.

Q35: Are special restraints necessary for the C-terminus?
A:
Yes. Usually the C-terminus is a carboxylate (-CO2-) with equal C-O bond lengths, like the end of a Glu or Asp side-chain. Then it is best to name these atoms OT1 and OT2 (so that the usual restraints are not applied) and to add suitable extra restraints, e.g.
DFIX_129 1.249 C OT1 C OT2 
DANG_129 2.194 OT1 OT2
DANG_129 2.379 CA OT1 CA OT2
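
The DFIX and DANG targets together fix a bond angle, because the two bond lengths and the 1,3-distance define a triangle. As a worked check, this Python sketch uses the law of cosines to recover the O-C-O angle implied by the values above (1.249 Å for both C-O bonds, 2.194 Å for OT1...OT2). The function name is hypothetical.

```python
import math


def angle_from_distances(d12, d23, d13):
    """Angle in degrees at atom 2 of a triangle 1-2-3, given the two
    bond lengths d12, d23 and the 1,3-distance d13 (law of cosines)."""
    cos_theta = (d12 ** 2 + d23 ** 2 - d13 ** 2) / (2.0 * d12 * d23)
    return math.degrees(math.acos(cos_theta))


# OT1-C-OT2 angle implied by DFIX 1.249 and DANG 2.194.
print(round(angle_from_distances(1.249, 1.249, 2.194), 1))
```

The result is close to 123 degrees, which is reasonable for a carboxylate group.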

Q36: What is the best way to model a disordered side-chain?
A:
Look at the map with Coot to see which is the first disordered atom (say CG). Then go back one atom (to CB) and put a PART 1 10.66 before it and a PART 0 after the last atom of the side-chain. Run another refinement job and look at the density again; the second conformation should now be much clearer and it should be possible to fit it with Coot. You may have to remove one or two waters that were trying to emulate the second conformation. Then add 0.66 to the end of the last FVAR instruction to introduce a new free variable for the occupancy of the first component (say fv(5)), change the PART instruction before the first component to PART 1 51 and insert an instruction PART 2 -51 before the second component (which should also start with CB). The atoms in the second component then all have an occupancy of 1−fv(5). fv(5) will be refined from a starting value of 0.66, which is likely to be a good first approximation. It is best to do this before adding hydrogens in the normal way with HFIX so that disorder is automatically taken into account. Normally, the atoms in the different PARTs should have the same names and the restraints will be applied to both conformations, taking PART into account. If the two (or more) components differ chemically, they should be given different names. These names may then be used to add appropriate restraints for just one component.
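
The occupancy codes 10.66, 51 and -51 used above follow the SHELXL convention of writing a parameter as 10*m + p. The Python sketch below decodes such a code; it is only an illustration of the convention, not code from SHELXL. The function name and the FVAR values other than the fifth (0.66, as in the text) are hypothetical.

```python
def decode_sof(code, fvar):
    """Interpret a parameter code written as 10*m + p.

    fvar is the list of free variables, with fvar[0] holding fv(1).
    Returns (starting value, short description of the treatment).
    """
    m = round(code / 10.0)      # which free variable (or 0 / 1)
    p = code - 10.0 * m         # the remaining part, |p| < 5
    if m == 0:
        return p, "refined freely, starting from p"
    if m == 1:
        return p, "fixed at p"
    if m > 1:
        return p * fvar[m - 1], f"p * fv({m})"
    if m < -1:
        return p * (fvar[-m - 1] - 1.0), f"p * (fv({-m}) - 1)"
    raise ValueError("m = -1 is not handled in this sketch")


fvar = [1.0, 0.5, 0.5, 0.5, 0.66]    # hypothetical FVAR line; fv(5) = 0.66
for code in (10.66, 51, -51):
    print(code, decode_sof(code, fvar))
```

With fv(5) = 0.66 the codes 51 and -51 give occupancies of 0.66 and 0.34, so the two components always sum to one however fv(5) refines.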

Q37: I have a cyclic protein in which N_1 forms a normal peptide bond with C_99. How can I persuade the program to apply the peptide restraints to these residues?
A:
Simply change the RESI instruction for residue 1 (say RESI Met 1) to RESI Met 1 100. This second residue number is an alias. The _+ and _- operators used by restraints that link atoms in adjacent residues apply to both the residue numbers and their aliases (if any). The same technique can be used to bridge deletions in the sequence; insertions are possible but more tricky (not just for SHELXL!).

Q38: Rfree only goes down by 0.2% on going anisotropic, but the maps are much better. Surely I should continue with anisotropic refinement?
A:
No! Go back to isotropic. A drop of 0.2% in Rfree is usually not significant because relatively few reflections are used for Rfree. It is also important to keep the gap between R and Rfree as small as possible. Your maps look 'better' because they look more like your model - that's what overfitting (or model bias) is all about!

Q39: Is it possible to read in partial structure factors, e.g. for a more sophisticated solvent model than the primitive Babinet model that SHELXL offers?
A:
Yes. Use ABIN to read in a .fab file that contains h, k, l and the real and imaginary parts of the calculated structure factors for a better solvent model (you may need to write a little program to make this file). ABIN allows separate scale and B-factors to be refined for these contributions. This even works for twinned structures (the .fab file should be calculated for one component only).
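
The "little program" could look something like the Python sketch below. It writes one line per reflection containing h, k, l and the real and imaginary parts A and B. The whitespace-separated layout, the column widths and the function names are assumptions made for illustration; please check the SHELXL documentation for the exact layout that ABIN expects (including any terminating line) before using such a file.

```python
def fab_lines(reflections):
    """Yield one text line per reflection.

    reflections -- iterable of (h, k, l, a, b), where a and b are the
                   real and imaginary parts of the partial structure factor.
    The column layout is an assumption, not taken from the SHELXL manual.
    """
    for h, k, l, a, b in reflections:
        yield f"{h:4d}{k:4d}{l:4d} {a:12.4f} {b:12.4f}"


def write_fab(path, reflections):
    """Write the lines produced by fab_lines() to the file 'path'."""
    with open(path, "w") as handle:
        for line in fab_lines(reflections):
            handle.write(line + "\n")


# Hypothetical partial structure factors from a solvent model.
example = [(1, 0, 0, 12.5, -3.25), (0, 2, -1, -4.0, 0.5)]
for line in fab_lines(example):
    print(line)
```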

Q40: How can I produce a histogram of omega angles?
A:
A pragmatic solution is to take the table that starts after 'Dihedral angle OMEG' in your last .lst file, paste it into another file and then use gnuplot or some office software to make the histogram. Not very elegant, but it works.
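
If a script is preferred, a minimal Python sketch along the following lines could bin the pasted table. The layout of the table is not assumed to be known: the caller says which whitespace-separated column holds the angle, and lines that do not contain a number in that column are skipped. The function names and the example text are hypothetical.

```python
import math


def parse_column(text, column):
    """Collect the numbers found in the given column of each line."""
    values = []
    for line in text.splitlines():
        fields = line.split()
        try:
            values.append(float(fields[column]))
        except (IndexError, ValueError):
            continue            # header, blank or otherwise unusable line
    return values


def omega_histogram(angles, bin_width=5.0):
    """Count angles per bin after mapping them onto 0..360 degrees,
    so that the trans peak around 180 is not split between +180 and -180."""
    counts = {}
    for angle in angles:
        lower = math.floor((angle % 360.0) / bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


pasted = "A_1 179.2\nsome header\nA_2 -178.1\nA_3 176.4\n"
for lower, n in omega_histogram(parse_column(pasted, 1)).items():
    print(f"{lower:6.1f} {'*' * n}")
```

Cis peptides would show up as a separate small peak near 0 (or 360) degrees.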

Q41: Molprobity and the PDB do not like the SHELXL names for hydrogen atoms and complain about the 'chirality' of -CH2- groups!
A:
Deposit the structure without the hydrogens (WPDB 2)! If you need the hydrogens for some other purpose, you can use MolProbity to regenerate them with conventional names. SHELXL does not like atom names that begin with a digit and so is not able to work with these atom names.

Q42: Coot works fine with the .fcf files from SHELXL-97 but cannot read those from the new SHELXL!?
A:
An IUCr CIF committee deprecated '_symmetry_equiv_pos_as_xyz' and replaced it with '_space_group_symop_operation_xyz', so the new name has to be used in both .cif and .fcf files. Many programs that read these files (including Coot, CIFTAB/XCIF, PLATON, CheckCIF etc.) were promptly modified by their authors to understand both names. Of course older versions of Coot etc. do not understand the new names. Please download an up-to-date version of Coot and this problem will go away. This is just one example of the enormous chaos and waste of time that unnecessary changes in the CIF definitions have caused!

I am particularly grateful to many SHELX users for asking these questions, and to Thomas R. Schneider for helping me to answer them.