CCL: CIF format
- From: vandestreek*|*avmatsim.de
- Subject: CCL: CIF format
- Date: Sun, 26 Sep 2010 16:13:47 +0200
Sent to CCL by: vandestreek%x%avmatsim.de
Quoting "Grigoriy Zhurko reg_zhurko,+,chemcraftprog.com"
I know about this program but I need not a program but algorithm
for computing the coordinates of all atoms. I want to implement CIF
visualization in Chemcraft.
As a former Mercury developer I think I can give you some hints.
The simple answer is that you have to generate additional atoms from
the asymmetric unit via the symmetry operators, then you have to check
which (if any) of the newly generated atoms form bonds to your
There are several implementation details that you have to be careful with:
1. When determining bonds, you must take the 3D periodicity into
account. If an atom has fractional coordinates (-0.344, 1.225,
0.8790), then symmetry-related copies of that atom must also be
present at (-0.344, 0.225, 0.8790), (0.656, 1.225, 0.8790), (0.656,
-1.775, 0.8790), (-0.344, 1.225, 6.8790) etc. In other words, you can
always add or subtract an integer from any fractional x, y or z
coordinate. This is especially important if one or more of the atoms
of the asymmetric unit are outside the unit cell, because in that case
symmetry operators like inversions or mirror planes (in their standard
forms as they appear in cifs) will generate atoms with coordinates at
the other side of the unit cell (i.e. very far apart), even if some of
their periodic copies may be close enough together to form a bond.
2. Compounds such as polymers, catena compounds or zeolites form
infinite networks rather than discrete molecules. Since infinite
structures cannot be displayed, you must choose an arbitrary cutoff.
There is no "right" or "wrong", so your target of 251 atoms for the
zeolite structure is arbitrary. One full unit cell, or one full unit
cell + the first atom outside the unit cell along each chemical bond,
are both reasonable choices.
3. Cifs may contain more than one molecule, for example because the
crystal structure incorporated a solvent molecule (or several solvent
molecules) or because the compound crystallised as a co-crystal or as
a salt. The compound may also have multiple molecules in the
4. Some atoms are on special positions: one or more of the symmetry
elements produce the same atom at the same position (for example if
the atom is sitting on a mirror plane). These must be detected and
removed: imagine the energies a QM program would produce for two atoms
on top of each other. The user cannot see the second atom because it
is in the exact same place as the first, so leaving in these atoms
would be a very annoying "feature" of your algorithm. Because
coordinates in cif files are not exact, rounding errors must be taken
into account. E.g. an atom with coordinates (0, 0.333, 0) is probably
on a three-fold axis and its exact coordinates are probably (0, 1/3,
0). The difference between 3 * 1/3 = 1 and 3 * 0.333 = 0.999 is a
rounding error that your algorithm must cater for, otherwise you will
have atoms that are only, say, 0.01 A apart. You should probably
*first* convert to Cartesian coordinates, because that makes it easier
to judge what chemically reasonable tolerance values are: a fractional
difference of 0.001 can be a genuine interatomic distance if the
unit-cell parameter for that coordinate is 1000 A, because
0.001 * 1000 A = 1 A, which is a C-H bond length.
5. All information about the entire crystal structure is contained in
one unit cell: if you first normalise all fractional x, y and z
coordinates to lie within [0,1) and you then apply all symmetry
operators to all atoms in the asymmetric unit and you then normalise
all fractional x, y and z coordinates of all symmetry-generated atoms
to [0,1) and you then remove all duplicates, then you are guaranteed
to have found all relevant atoms. Now you have to find bonds and to
remove all molecules that are symmetry-related to other molecules, and
you have to expand the problem cases from point 2 to something
"chemically reasonable". When normalising to [0,1), bear in mind
rounding errors: you are probably better off keeping all atoms between
[-d,1+d], where d is something small like 0.0001, and then removing
duplicates where you allow for 3D periodicity and rounding errors
again: so -0.000003258, 0.000001825, 0.9999988234 and 1.00000578 are
all equal within rounding errors.
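
Points 1, 4 and 5 can be put together in a short Python sketch. This is only an illustration under stated assumptions, not Mercury's implementation: the function names (cell_matrix, periodic_distance, expand), the (R, t) representation of a symmetry operator and the 0.05 A tolerance are all made up for the example. Duplicates are detected in Cartesian space, allowing for 3D periodicity, so rounding errors near 0 and 1 do not matter.

```python
import itertools
import numpy as np

def cell_matrix(a, b, c, alpha, beta, gamma):
    """Return M such that cartesian = M @ fractional (lengths in A, angles in degrees)."""
    al, be, ga = np.radians([alpha, beta, gamma])
    cx = c * np.cos(be)
    cy = c * (np.cos(al) - np.cos(be) * np.cos(ga)) / np.sin(ga)
    cz = np.sqrt(c * c - cx * cx - cy * cy)
    return np.array([[a, b * np.cos(ga), cx],
                     [0.0, b * np.sin(ga), cy],
                     [0.0, 0.0, cz]])

def periodic_distance(f1, f2, M):
    """Shortest Cartesian distance between two fractional positions,
    taking all periodic copies into account."""
    d = np.asarray(f1, float) - np.asarray(f2, float)
    d -= np.round(d)  # bring the difference close to zero
    # Also try the neighbouring cells, which matters for skewed cells.
    return min(np.linalg.norm(M @ (d + np.array(shift)))
               for shift in itertools.product((-1, 0, 1), repeat=3))

def expand(asym, ops, M, tol=0.05):
    """Apply every operator (R, t) to every atom (label, frac) of the
    asymmetric unit and drop copies that coincide within tol (in A).
    The tolerance value is an assumption for this example."""
    atoms = []
    for label, frac in asym:
        for R, t in ops:
            g = (R @ np.asarray(frac, float) + t) % 1.0  # wrap into the unit cell
            duplicate = any(l == label and periodic_distance(g, h, M) < tol
                            for l, h in atoms)
            if not duplicate:
                atoms.append((label, g))
    return atoms
```

For example, with a 10 A cubic cell and the operators x,y,z and -x,-y,-z, an atom at (0.1, 0.2, 0.3) gives two atoms, while an atom at (0.5, 0.5, 0.5) sits on the inversion centre and is kept only once. Bond detection and the assembly of molecules would come after this step and are not shown.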
Most, if not all, of this is probably described somewhere, you may try
the CCL archives or a google search.
Besides that, Mercury 2.3 was unable to open my CIF files from the
zeolite database ("Could not read symmetry operator" is
The symmetry operators in the file that you attached look like this:
' +x +y +z '
According to the cif specification, commas should be used as
delimiters, so the line should have looked like
' +x, +y, +z '
(Actually, it is more usual to write it like:
The single quotes are only necessary if whitespace in the form of
spaces or tabs is present in the string.)
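
To make the role of the commas concrete, here is a minimal Python sketch of a parser for such operator strings. It is an assumption-laden illustration, not how Mercury reads cifs: the name parse_symop and the (R, t) return convention are invented for the example. It splits on commas, so a string without commas is rejected.

```python
import re
from fractions import Fraction
import numpy as np

# One term: optional sign, then an axis letter or a number (1/2, 0.5, 1).
TERM = re.compile(r"([+-]?)(?:([xyz])|(\d+/\d+|\d+(?:\.\d+)?))")

def parse_symop(text):
    """Turn a string such as '-x+1/2, y, -z' into a 3x3 matrix R and a
    translation vector t, so that new_frac = R @ frac + t."""
    parts = text.strip().strip("'\"").split(",")
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated components: {text!r}")
    R = np.zeros((3, 3))
    t = np.zeros(3)
    for row, part in enumerate(parts):
        expr = part.replace(" ", "").lower()
        for sign, axis, number in TERM.findall(expr):
            s = -1.0 if sign == "-" else 1.0
            if axis:
                R[row, "xyz".index(axis)] += s
            else:
                t[row] += s * float(Fraction(number))
    return R, t
```

Calling parse_symop("' +x, +y, +z '") returns the identity matrix and a zero translation, whereas parse_symop("' +x +y +z '") raises a ValueError because no commas are present. A more lenient reader could fall back to splitting on whitespace, but that is a design choice and goes beyond what the specification asks for.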
But even when I change that, I still get two more error messages.
First, the cif that you attached contains element symbol
this can be used to specify tritium, tritium is not recognised by
Mercury and even if it was, this is not what is meant in your crystal
structure: the atoms are meant to be Silicon (element symbol
Second, the space-group name and the space-group symmetry operators in
your file are not consistent: the space-group name is Imma, but the
symmetry operators specify space group Pmmb, a non-standard setting of
space group Pmma. So which of these two is correct? If you try both in
Mercury, you will see that symmetry operators in space group Pmmb only
generate enough additional atoms from the asymmetric unit to fill half
the unit cell: the other half is left empty. With the symmetry
operators from space group Imma, which has twice as many symmetry
operators as Pmmb, the entire unit cell is filled. This makes Imma the
correct space group. (I looked at the symmetry operators and the
I-centring has been omitted.)
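
As an illustration of what the missing I-centring means in practice: body centring adds the translation (1/2, 1/2, 1/2) to every operator, which doubles the operator list. The Python sketch below assumes operators are stored as (R, t) pairs of a 3x3 matrix and a translation vector; the name add_centring is hypothetical.

```python
import numpy as np

def add_centring(ops, centring=(0.5, 0.5, 0.5)):
    """Return the operators plus a copy of each one shifted by the
    centring translation (default: I-centring)."""
    c = np.asarray(centring, float)
    shifted = [(R, (np.asarray(t, float) + c) % 1.0) for R, t in ops]
    return list(ops) + shifted
```

Applied to a primitive operator list, this yields twice as many operators, which is consistent with only half of the unit cell being filled when the centring is left out.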
So in the cif file that you attached, the format of the symmetry
operators is incorrect, half of the symmetry operators are missing and
the cif contains a non-existing element.
Dr Jacco van de Streek
Avant-garde Materials Simulation
Freiburg im Breisgau, Germany