SHELXC is usually started from a script or a GUI.
On the command line, 'shelxc' should be followed by the filename stem
'name' that defines the three files that it should write, which are:
name.hkl - merged native reflection data h, k, l, I and σ(I) in SHELX HKLF4 format
name_fa.hkl - h, k, l, FA and σ(FA) in SHELX HKLF3 format
name_fa.ins - instruction file for substructure solution with SHELXD
The last two of these are needed for input to SHELXD
to determine the substructure and the first two are input to SHELXE
for phasing. The native reflection data are also in a suitable
format for SHELXL, but will need the free-R reflections flagged
(e.g. by XPREP). SHELXC reads keywords from standard input.
The keywords may be given in any order, and only the first four
characters are significant, so 'SIRA' is the same as 'SIRAS'.
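The four-character rule can be illustrated with a minimal Python sketch. The function names normalize_keyword and split_instruction are hypothetical helpers for this illustration and are not part of SHELXC; the sketch does not attempt to reproduce how SHELXC itself parses its input.

```python
def normalize_keyword(word):
    """Return the significant part of a keyword: its first four characters."""
    return word[:4]


def split_instruction(line):
    """Split one instruction line into (significant keyword, remaining items)."""
    items = line.split()
    if not items:
        raise ValueError("empty instruction line")
    return normalize_keyword(items[0]), items[1:]
```

With these helpers, 'SIRAS derivative.sca' and 'SIRA derivative.sca' give the same keyword 'SIRA'.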
Keywords to identify the input reflection data
At least one data input file must be named, but
there will often be more. The input data files can be in SHELX .hkl,
SCALEPACK .sca or XDS XDS_ASCII.HKL format. In order to
read more than one file in XDS format they should either be read from
different folders or they should be renamed, e.g. to XDS_PEAK.HKL
etc. SHELXC decides on the file format by reading the first few lines,
not by the filename extension. The XDS files have the advantage that they
are always unmerged; otherwise 'OUTPUT POLISH UNMERGED'
(SCALA) or 'NO MERGE ORIGINAL INDEX' (HKL2000 / SCALEPACK) should be used
to make the .sca files. If a SHELX HKLF3 format .hkl file
is read in, it should be followed by '-f' on the same line to
indicate that it contains F-values rather than intensities. If only
.mtz files are available, Tim Grüne's mtz2sca can be used
to convert them to .sca format. In general, none of the data files
input to SHELXC need to be merged or sorted, and they may or may not contain
systematically absent reflections, because SHELXC does all the necessary
scaling etc. itself. If at all possible, unmerged data
should be input. It is best not to allow other programs to maul the
data first and upset the statistics! Examples of data input keywords for
a SAD experiment are:
- anomalous data (used as native too unless NAT is also specified)
NAT native -f - native data (optional)
The native data can be very useful for getting good
maps if the resolution is higher than for the SAD data. In this example
the default extension .hkl is appended to 'native' to give native.hkl, and
-f specifies that F is read rather than intensity. Friedel pairs
must be present in the SAD data but are not required for the native data.
For a SIR experiment the NAT file is essential:
NAT nat.sca - native data
SIR derivative.sca - derivative data
In practice, the derivative (e.g. an iodide soak)
will give an anomalous signal, so SIRAS is normally better (the files
were renamed from XDS_ASCII.HKL here to avoid a clash of names):
- native data
- derivative data
For SHELXC, a MAD experiment is restricted to at most four
wavelengths, identified by the keywords PEAK, INFL, HREM
and LREM, plus optionally NAT. If only
two wavelengths are specified, they must include peak or inflection
point (or both). HREM stands for 'high energy remote' and LREM for
'low energy remote'.
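The MAD restrictions above can be written down as a small check. This is only a sketch of the rules as stated in the text; the function name check_mad_datasets and its error messages are hypothetical and do not correspond to anything inside SHELXC.

```python
MAD_KEYWORDS = ("PEAK", "INFL", "HREM", "LREM")


def check_mad_datasets(keywords):
    """Check dataset keywords against the MAD rules described in the text.

    Returns a list of problems; an empty list means the combination is
    acceptable under those rules.
    """
    # Only the first four characters of a keyword are significant.
    given = {k[:4] for k in keywords}
    wavelengths = given & set(MAD_KEYWORDS)
    unknown = given - set(MAD_KEYWORDS) - {"NAT"}  # NAT is optional

    problems = []
    if unknown:
        problems.append("not a MAD dataset keyword: " + ", ".join(sorted(unknown)))
    # With only two wavelengths, one must be the peak or the inflection point.
    if len(wavelengths) == 2 and not wavelengths & {"PEAK", "INFL"}:
        problems.append("two wavelengths must include PEAK or INFL")
    return problems
```

For example, the pair HREM and LREM is rejected, whereas PEAK with HREM, or all four wavelengths plus NAT, pass the check.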
The SHELXD instructions SHEL, SFAC, ESEL, FIND, MIND and DSUL
may be given to SHELXC, which passes them on to SHELXD;
see the SHELXD keywords for more information about them.
The CELL and SPAG keywords are always required for SHELXC.
CELL - the unit-cell parameters a, b, c,
alpha, beta, gamma. If there are seven items, the first is assumed to
be the wavelength (to be compatible with other SHELX programs).
SPAG - the name of the space group. Only Sohncke space
groups are permitted, but some common non-standard settings are allowed,
e.g. 'P22121'. Embedded spaces are ignored. SPAG is used to generate
the LATT and SYMM instructions that are written to name_fa.ins.
If the space group is specified as 'R3' or 'R32' the program checks
the cell dimensions to see whether the hexagonal or primitive
rhombohedral setting is required.
MAXM - allocates working space for reflections (for all
datasets); e.g. the default 'MAXM 2' reserves space for 2000000 reflections.
DSCA - the factor (default 0.98) by which to multiply
the native data for SIR and SIRAS or the AFTER or RIPAS data for
RIP after the data have been put onto a common scale (this allows for
the extra scattering power of the heavy atoms etc.).
ASCA - a scale factor (default 1.0) that is applied to
the anomalous signal in a MAD experiment. To apply MAD to a small
molecule, ASCA and DSCA should both be between 0 and 1; the best
values have to be found by trial and error.
SMAD - (without a number) sets the dispersive term to
zero in a MAD experiment. This is equivalent to
SAD using weighted mean anomalous differences from all the MAD
datasets. This can be useful when MAD appears to fail (especially
if the wavelengths were labeled wrongly).
REM - lines beginning with
these three letters are ignored.
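The CELL convention (six numbers, or seven with the wavelength first) and the R3/R32 setting check described above can be sketched in Python as follows. The function names and the tolerance tol are assumptions made for this illustration; the text does not say how SHELXC performs the check internally.

```python
import math


def parse_cell(line):
    """Parse a CELL line into (wavelength or None, [a, b, c, alpha, beta, gamma])."""
    items = line.split()
    if not items or items[0][:4] != "CELL":
        raise ValueError("not a CELL instruction")
    values = [float(x) for x in items[1:]]
    if len(values) == 7:
        # Seven items: the first is taken to be the wavelength.
        return values[0], values[1:]
    if len(values) == 6:
        return None, values
    raise ValueError("CELL needs six or seven numbers")


def rhombohedral_setting(cell, tol=0.01):
    """Guess the setting for R3 or R32 from the cell (tol is a hypothetical tolerance)."""
    a, b, c, alpha, beta, gamma = cell

    def close(x, y):
        return math.isclose(x, y, abs_tol=tol)

    # Hexagonal axes: a = b, alpha = beta = 90, gamma = 120.
    if close(a, b) and close(alpha, 90.0) and close(beta, 90.0) and close(gamma, 120.0):
        return "hexagonal"
    # Primitive rhombohedral axes: a = b = c, alpha = beta = gamma.
    if close(a, b) and close(b, c) and close(alpha, beta) and close(beta, gamma):
        return "rhombohedral"
    return None
```

A cell that fits neither pattern returns None, which a calling script could treat as an inconsistent CELL and SPAG combination.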
MAD phasing example
SHELXC can be called in different ways, but in
this example we will store the instructions in a separate file
gere_mad (so that it is also Windows compatible). SHELXC is then
run with this file redirected to its standard input. Linux or Mac
users might prefer to use:
shelxc <gere_mad | tee gere.lis
so that they have a permanent record
of the console output. The file gere_mad (a well-known CCP4
test structure) could contain:
CELL 109.02 61.75 71.74 90.00 97.08 90.00
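For scripted use, the same kind of call can be made from Python. The sketch below assumes that an executable called 'shelxc' is on the search path and that the filename stem is passed as its first argument, as described at the start of this section; the stem 'gere' used in the usage note is an assumption inferred from the name gere.lis. The helper names are hypothetical.

```python
import subprocess


def shelxc_outputs(stem):
    """Names of the three files SHELXC writes for a given filename stem."""
    return [stem + ".hkl", stem + "_fa.hkl", stem + "_fa.ins"]


def run_shelxc(stem, instruction_file, log_file, executable="shelxc"):
    """Run SHELXC with the instructions on standard input.

    The console output is echoed and also written to log_file, similar to
    piping through tee. Unlike a plain pipe, stderr is merged into the log.
    """
    with open(instruction_file) as stdin, open(log_file, "w") as log:
        proc = subprocess.Popen(
            [executable, stem],
            stdin=stdin,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        for line in proc.stdout:
            print(line, end="")
            log.write(line)
        return proc.wait()
```

A call such as run_shelxc("gere", "gere_mad", "gere.lis") would then be expected to leave the files listed by shelxc_outputs("gere") in the working directory.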
Special instructions for RIP phasing
BEFORE or NAT - the dataset collected before UV or X-ray radiation damage.
AFTER or RIP - the dataset collected after UV or X-ray radiation damage.
In this case RIP phasing is applied analogously to SIR, without the
use of anomalous scattering. DSCA (see
above) can be critical for RIP experiments and various values around
0.98 should be tried.
RIPAS - the dataset collected after UV or X-ray radiation damage,
for data processing similar to SIRAS (but see RIPW).
RIPW - gives the weight w
(default 0.6) to be assigned to the NAT or BEFORE data in the
estimation of the anomalous signal (a weight of 1-w is applied to
the 'RIPAS' data).
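The RIPW weighting can be pictured as a weighted mean of the two anomalous estimates. This is only a sketch: the text gives the weights w and 1-w but not the exact expression used by SHELXC, so the simple linear combination, the function name and the restriction of w to the range 0 to 1 are all assumptions.

```python
def ripas_weighted_anomalous(before_anom, ripas_anom, w=0.6):
    """Combine two anomalous estimates with weights w and 1 - w.

    before_anom: estimate from the NAT or BEFORE data (weight w)
    ripas_anom:  estimate from the RIPAS data (weight 1 - w)
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w is assumed to lie between 0 and 1")
    return w * before_anom + (1.0 - w) * ripas_anom
```

With the default w of 0.6, the NAT or BEFORE data contribute slightly more than the RIPAS data to the combined estimate.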