CCL: CIF format

 Sent to CCL by:
Quoting "Grigoriy Zhurko":
I know about this program but I need not a program but algorithm for computing the coordinates of all atoms. I want to implement CIF visualization in Chemcraft.
 As a former Mercury developer I think I can give you some hints.
The simple answer is that you have to generate additional atoms from the asymmetric unit via the symmetry operators, and then check which (if any) of the newly generated atoms form bonds to your molecule.
 There are several implementation details that you have to be careful with:
1. When determining bonds, you must take the 3D periodicity into account. If an atom has fractional coordinates (-0.344, 1.225, 0.8790), then symmetry-related copies of that atom must also be present at (-0.344, 0.225, 0.8790), (0.656, 1.225, 0.8790), (0.656, -1.775, 0.8790), (-0.344, 1.225, 6.8790) etc. In other words, you can always add or subtract an integer from any fractional x, y or z coordinate. This is especially important if one or more of the atoms of the asymmetric unit are outside the unit cell, because in that case symmetry operators like inversions or mirror planes (in their standard forms as they appear in cifs) will generate atoms with coordinates at the other side of the unit cell (i.e. very far apart), even if some of their periodic copies may be close enough together to form a bond.
2. Compounds such as polymers, catena compounds or zeolites form infinite networks rather than discrete molecules. Since infinite structures cannot be displayed, you must choose an arbitrary cutoff. There is no "right" or "wrong", so your target of 251 atoms for the zeolite structure is arbitrary. One full unit cell, or one full unit cell + the first atom outside the unit cell along each chemical bond, are reasonable choices.
3. Cifs may contain more than one molecule, for example because the crystal structure incorporated a solvent molecule (or several solvent molecules) or because the compound crystallised as a co-crystal or as a salt. The compound may also have multiple molecules in the asymmetric unit.
4. Some atoms are on special positions: one or more of the symmetry elements produce the same atom at the same position (for example if the atom is sitting on a mirror plane). These duplicates must be detected and removed: imagine the energies a QM program would produce for two atoms on top of each other. The user cannot see the second atom because it is in the exact same place as the first, so leaving these atoms in would be a very annoying "feature" of your algorithm.
Because atomic coordinates in cif files are not exact, rounding errors must be taken into account. E.g. an atom with coordinates (0, 0.333, 0) is probably on a three-fold axis and its exact coordinates are probably (0, 1/3, 0). The difference between 3 * 1/3 = 1 and 3 * 0.333 = 0.999 is a rounding error that your algorithm must cater for, otherwise you will end up with atoms that are only, say, 0.01 A apart. You should probably *first* convert to Cartesian coordinates, because that makes it easier to judge what chemically reasonable tolerance values are: a fractional difference of 0.001 can be an entirely genuine separation if the unit-cell parameter for that coordinate is 1000 A, because 0.001 * 1000 A = 1 A, which is roughly a C-H bond length.
5. All information about the entire crystal structure is contained in one unit cell. If you (a) normalise all fractional x, y and z coordinates to lie within [0,1), (b) apply all symmetry operators to all atoms in the asymmetric unit, (c) normalise all fractional x, y and z coordinates of all symmetry-generated atoms to [0,1) and (d) remove all duplicates, then you are guaranteed to have found all relevant atoms. Now you have to find bonds, remove all molecules that are symmetry-related to other molecules, and expand the problem cases from point 2 to something "chemically reasonable". When normalising to [0,1), bear in mind rounding errors: you are probably better off keeping all atoms within [-d,1+d], where d is something small like 0.0001, and then removing duplicates while allowing for 3D periodicity and rounding errors again: so -0.000003258, 0.000001825, 0.9999988234 and 1.00000578 are all equal within rounding errors.
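The recipe in points 1, 4 and 5 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how Mercury does it: the function names (cell_matrix, periodic_distance, expand), the representation of symmetry operators as plain functions, and the 0.1 A duplicate tolerance are all hypothetical choices made for the example. Note also that taking the nearest image in fractional space is only guaranteed to give the shortest Cartesian distance for cells that are not strongly skewed; that is good enough for duplicate removal, but a bond search in a very oblique cell should also check the neighbouring images.

```python
import math

def cell_matrix(a, b, c, alpha, beta, gamma):
    """Matrix M such that cartesian = M * fractional (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    cy = (ca - cb * cg) / sg
    cz = math.sqrt(max(0.0, 1.0 - cb * cb - cy * cy))
    # Columns are the a, b and c cell vectors (a along x, b in the xy-plane).
    return [[a, b * cg, c * cb],
            [0.0, b * sg, c * cy],
            [0.0, 0.0, c * cz]]

def periodic_distance(f1, f2, m):
    """Cartesian distance between two fractional positions, using the
    periodic image that is nearest in fractional space (point 1)."""
    d = [x - y for x, y in zip(f1, f2)]
    d = [x - round(x) for x in d]  # add/subtract integers freely
    cart = [sum(m[i][j] * d[j] for j in range(3)) for i in range(3)]
    return math.sqrt(sum(v * v for v in cart))

def expand(asym, ops, m, tol=0.1):
    """Apply every operator to every atom, wrap into [0, 1) and drop copies
    that coincide (within tol, in A) with an already generated copy of the
    same atom. tol = 0.1 A is an assumed value, not a recommendation."""
    atoms = []
    for label, frac in asym:
        for op in ops:
            new = tuple(v % 1.0 for v in op(frac))
            if all(lbl != label or periodic_distance(new, old, m) > tol
                   for lbl, old in atoms):
                atoms.append((label, new))
    return atoms

# Hypothetical example: a 10 A cubic cell with identity and inversion only.
m = cell_matrix(10.0, 10.0, 10.0, 90.0, 90.0, 90.0)
ops = [lambda f: f, lambda f: (-f[0], -f[1], -f[2])]
asym = [("Si1", (0.0, 0.0, 0.0)),        # on the inversion centre
        ("O1", (0.1, 0.2, 0.3)),         # general position
        ("O2", (0.9999988, 0.5, 0.5))]   # special position, with rounding error
print(len(expand(asym, ops, m)))  # 4: Si1 once, O1 twice, O2 once
```

Because the duplicate test is done on a Cartesian distance between periodic images, the atom at 0.9999988 and its inversion copy at 0.0000012 are recognised as the same atom even though they sit at opposite faces of the cell.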
Most, if not all, of this is probably described somewhere, you may try the CCL archives or a google search.
Besides that, Mercury 2.3 was unable to open my CIF files from the zeolite database ("Could not read symmetry operator" is shown).
 The symmetry operators in the file that you attached look like this:
 '   +x       +y       +z  '
According to the cif specification, commas should be used as delimiters, so the line should have looked like this:
 '   +x,       +y,       +z  '
 (Actually, it is more usual to write it like:
 x,y,z
The single quotes are only necessary if whitespace in the form of spaces or tabs is present in the string.)
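For an implementation, a small parser for such operator strings might look like the sketch below. It is only an illustration: the function name parse_symop and the (rotation rows, translation) return format are assumptions made for the example, it only understands terms of the form x, y, z and plain numbers or fractions like 1/2, and it deliberately rejects strings without commas.

```python
import re
from fractions import Fraction

# One term: an optional sign followed by a number (optionally n/d) or an axis.
_TERM = r"[+-]?(?:\d+(?:\.\d+)?(?:/\d+)?|[xyz])"

def parse_symop(text):
    """Parse e.g. '-x+1/2, y, -z' into (rotation rows, translation)."""
    parts = [p.replace(" ", "").replace("\t", "").lower()
             for p in text.strip().strip("'\"").split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated components: {text!r}")
    rot, trans = [], []
    for part in parts:
        if not re.fullmatch(f"(?:{_TERM})+", part):
            raise ValueError(f"cannot read component {part!r} in {text!r}")
        row, shift = [0, 0, 0], Fraction(0)
        for sign, num, den, axis in re.findall(
                r"([+-]?)(?:(\d+(?:\.\d+)?)(?:/(\d+))?|([xyz]))", part):
            s = -1 if sign == "-" else 1
            if axis:
                row["xyz".index(axis)] += s          # coefficient of x, y or z
            else:
                shift += s * Fraction(num) / (int(den) if den else 1)
        rot.append(row)
        trans.append(shift)
    return rot, trans

rot, trans = parse_symop("'   +x,       +y,       +z  '")
print(rot)    # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Keeping the translation as an exact Fraction rather than a float avoids introducing additional rounding errors before the operators are applied to the atomic coordinates.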
 But even when I change that, I still get two more error messages.
First, the cif that you attached contains element symbol "T". Although "T" can be used to specify tritium, tritium is not recognised by Mercury, and even if it were, this is not what is meant in your crystal structure: the atoms are meant to be silicon (element symbol "Si").
Second, the space-group name and the space-group symmetry operators in your file are not consistent: the space-group name is Imma, but the symmetry operators specify space group Pmmb, a non-standard setting of space group Pmma. So which of these two is correct? If you try both in Mercury, you will see that symmetry operators in space group Pmmb only generate enough additional atoms from the asymmetric unit to fill half the unit cell: the other half is left empty. With the symmetry operators from space group Imma, which has twice as many symmetry operators as Pmmb, the entire unit cell is filled. This makes Imma the correct space group. (I looked at the symmetry operators and the I-centring has been omitted.)
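To illustrate what "the I-centring has been omitted" means in practice: every operator of the primitive set needs a second copy with (1/2, 1/2, 1/2) added to its translation, which doubles the number of operators. The sketch below is a hypothetical helper, not code from Mercury; it assumes each operator is stored as a (rotation, translation) pair, with the rotation as a tuple of tuples of integers and the translation as a tuple of Fractions.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def add_centring(ops, centring=(HALF, HALF, HALF)):
    """Return ops plus a copy of each operator shifted by the centring
    vector (translations reduced modulo 1); existing operators are kept."""
    out = list(ops)
    for rot, trans in ops:
        shifted = tuple((t + c) % 1 for t, c in zip(trans, centring))
        if (rot, shifted) not in out:
            out.append((rot, shifted))
    return out

# Hypothetical two-operator example: identity and inversion.
zero = (Fraction(0),) * 3
identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
inversion = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
primitive = [(identity, zero), (inversion, zero)]
print(len(add_centring(primitive)))  # 4
```

Comparing the length of the operator list in a file against what the stated space-group name implies is a cheap consistency check that would have flagged the file discussed here.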
So in the cif file that you attached, the format of the symmetry operators is incorrect, half of the symmetry operators are missing and the cif contains a non-existent element.
 Best wishes,
 Dr Jacco van de Streek
 Senior Scientist
 Avant-garde Materials Simulation
 Freiburg im Breisgau, Germany