15 File Formats

 

15.1 PCM File format

 

            The PCM file format is a free format, context driven grammar for the description of molecular structures.  This gives the user more and easier control of their structure files for additions and corrections, and leaves room for expansion of the definition of the force field.  The context also allows for greater error correction within the program. 

            At the end of this section is an example of a PCM file which describes the structure in Figure 6.1.  Key words have been printed in bold for clarity. In this structure, we have included substructures, pi atoms, hydrogen bonding, a metal, atomic charge, and coordinate bonds. The features of the file are fairly straightforward. At the top is the file type marker {PCM, which indicates that the file is a PCM type file.  This marker is required in all PCM files so that PCMODEL and MMX will recognize them.  Immediately following the file type marker, on the same line, is the name of the structure.  This name may be up to 58 characters long.

            On the next line appears NA followed by a number.  This gives the number of atoms in the structure.  The number is optional and is used only to check for consistency.

            Next is SS 1, which introduces substructure number 1 and gives it the name that follows.  The name can be up to 20 characters long.  This record is also optional. Currently, this statement only gives the substructure a name, though at some point it may be used to describe other aspects of the substructure.

            The fourth line contains FLags: EINT UV.  Currently, the following flags are defined:

EINT   0:         dipole interaction energy will be calculated excluding interactions between dipoles with a common carbon.

            1:         as above but with all interactions.

            2:         dipole interaction calculation suppressed.

            3:         charge interaction energy will be calculated (excluding interactions between atoms bound to a common atom). If this option is used you must read in atomic charges.

            4:         charge interaction energy used.  Charges calculated from bond dipoles with additions for pi atom charges.

            UV       0:         if a pi calculation is requested, the user will be prompted before minimization:

                                    Number of electrons?

                                    Singlet, doublet, triplet, ?

                                    RHF or UHF calculation ?

                                    if singlet, RHF then restrictions on wave function?

                                    # of identical pairs of atoms?

                                    atom numbers of pairs?

                                    # of sets of identical bonds?

                                    atom numbers of bonds of bond pairs ?

                        1: default to singlet RHF pi calculation - no questions.

                        2: default doublet UHF pi calculation (for mono radicals) no questions.

 

            DIELC            Floating point value of the new dielectric constant.  Negative values give a distance dependent dielectric constant (nonsense with dipole-dipole interactions).

 

            Line 5 contains the first atom record.  After the atom specifier AT, the index number appears.  This is the atom number, which the bond lists of other atoms use to refer to this atom.  Next, after some separator (usually a comma, though a colon or a space will do), is the atom type number (see chapter 7 for a description of atom types). In the case of metals, the metal symbol is given in this position instead of the atom type number, as in the sixth atom record.

            Next appears another separator; we use a colon here for clarity.  Then come X, Y, and Z, in that order, in angstroms.  After the next separator comes the bond marker, B, followed by the list of bonds attached to the atom.  Each bond is a pair of numbers: 2,1 means a single bond (type 1) to atom 2.  In atom 6, you will notice 10,9, which indicates a metal coordinate bond (type 9) to atom 10.

            At the end of the first few lines is a C followed by a number.  C specifies the atomic charge.  Looking again at atom 6, an M appears.  Following M is the electronic state of the metal, in this case 3 for low spin.  R introduces the covalent radius.  This number is used to determine the bond lengths of all ligand attachments to that metal.

            The record for atom 8 contains an H, which indicates that it is a hydrogen bonding hydrogen. Atom 9 contains a P, which marks it as a pi atom, and an S, which introduces the list of substructures of which this atom is a member.  The comma after the S is optional but is added for clarity.

            Finally, the structure file is closed with a }. This is mandatory so PCMODEL will know when to stop reading.
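As a minimal sketch of how an atom record of the kind described above might be read, the following Python fragment tokenizes one AT line and collects its fields. The function name parse_atom_record and the dictionary keys are illustrative assumptions and are not part of PCMODEL; exponent notation and the optional bond count of the formal grammar are not handled.

```python
import re

# Letters form one token; numbers (optionally signed, with or without a
# leading digit, e.g. "-.29483") form another. Separators are skipped.
TOKEN = re.compile(r"[A-Za-z]+|[-+]?(?:\d+\.\d*|\.\d+|\d+)")


def is_number(tok):
    return tok[0] in "+-.0123456789"


def parse_atom_record(line):
    """Parse one PCM 'AT' record into a dictionary (illustrative sketch)."""
    toks = TOKEN.findall(line)
    if not toks or toks[0] != "AT":
        raise ValueError("not an atom record")
    atom = {
        "index": int(toks[1]),
        # numeric atom type, or a metal symbol such as "Fe"
        "type": int(toks[2]) if toks[2].isdigit() else toks[2],
        "xyz": tuple(float(t) for t in toks[3:6]),
        "bonds": [], "substructures": [], "flags": set(),
    }
    i = 6
    while i < len(toks):
        key = toks[i]
        i += 1
        values = []
        while i < len(toks) and is_number(toks[i]):
            values.append(toks[i])
            i += 1
        if key == "B":        # pairs: attached atom, bond type
            nums = [int(v) for v in values]
            atom["bonds"] = list(zip(nums[0::2], nums[1::2]))
        elif key == "S":      # substructure membership list
            atom["substructures"] = [int(v) for v in values]
        elif key == "C":      # atomic charge
            atom["charge"] = float(values[0])
        elif key == "R":      # covalent radius of a metal
            atom["radius"] = float(values[0])
        elif key == "M":      # electronic state of a metal
            atom["metal_state"] = int(values[0])
        elif key in ("H", "P"):
            atom["flags"].add(key)
    return atom


record = ("AT 6,Fe:3.37201,4.37477,5.09242 B 5,1 4,9 9,9 10,9  11,9  "
          "12,9  13,9 M3 R1.26000 C1.00000")
atom = parse_atom_record(record)
print(atom["type"], atom["bonds"][:2], atom["metal_state"], atom["charge"])
```

The record used here is the sixth atom record of the example file; the same function reads the pi atom and hydrogen bonding records, since H and P are treated as flags that take no value.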

 

 

 

 

Figure 6.1  Structure of molecule described in PCM file below. 


PCM Example File

 

{PCM  example pcm file

NA  31

SS 1cyclo pentadiene                 

FL EINT4 UV1 PIPL1

AT 1,8:5.00395,5.41685,4.14865 B  2,1  5,1  8,1  14,1 C .04109

AT 2,1:5.35786,3.99136,4.43375 B  1,1  3,1  15,1  16,1 C .04758

AT 3,3:3.89305,3.05742,4.72868 B  2,1  4,2  7,1 C .31634

AT 4,7:2.81165,3.08691,5.38735 B  3,2  6,9  17,1  18,1 C-.29483

AT 5,1:4.01102,5.78059,5.13175 B  1,1  6,1  19,1  20,1 C .00583

AT 6,Fe:3.37201,4.37477,5.09242 B 5,1 4,9 9,9 10,9  11,9  12,9  13,9 M3 R1.26000 C1.00000

AT 7,1:4.34527,1.64177,4.88598 B  3,1  21,1  22,1  23,1 C .04175

AT 8,23:5.80025,6.06569,4.07001 B  1,1 H C .15724

AT 9,48:2.72317,5.71177,3.69643 B  10,1  13,1  6,9  24,1 S, 1 P C-.03818

AT 10,2:2.74283,5.08259,2.49706 B  9,1  11,2  6,9  25,1 S, 1 P C-.03815

AT 11,2:2.37909,3.77508,2.58554 B  10,2  12,1  6,9  26,1 S, 1 P C-.03815

AT 12,2:2.02517,3.57846,4.02085 B  11,1  13,2  6,9  27,1 S, 1 P C-.03815

AT 13,2:2.22179,4.80733,4.59105 B  12,2  9,1  6,9  28,1 S, 1 P C-.03815

AT 14,20:4.70902,5.38735,3.60795 B  1,1 C-.21000

AT 15,5:5.98704,3.92254,5.39718 B  2,1

AT 16,5:5.92805,3.54897,3.55880 B  2,1

AT 17,20:2.37909,2.71334,4.92530 B  4,1 C-.05250

AT 18,20:2.75266,2.61503,5.84941 B  4,1 C-.05250

AT 19,5:3.46049,6.68504,4.76800 B  5,1

AT 20,5:4.51240,6.04603,6.10501 B  5,1

AT 21,5:3.51947,.88478,4.78767 B  7,1

AT 22,5:5.12192,1.35667,4.14865 B  7,1

AT 23,5:4.80733,1.51396,5.89856 B  7,1

AT 24,5:2.94928,6.78335,3.89305 B  9,1 S, 1 C .03818

AT 25,5:3.04759,5.59380,1.55329 B  10,1 S, 1 C .03815

AT 26,5:2.31027,3.01810,1.76957 B  11,1 S, 1 C .03815

AT 27,5:1.67126,2.64452,4.53206 B  12,1 S, 1 C .03815

AT 28,5:2.02517,5.05310,5.65279 B  13,1 S, 1 C .03815

AT 29,21:6.64571,6.42943,4.91547 B  30,1

AT 30,6:6.48842,6.06569,4.91547 B  29,1  31,1

AT 31,21:6.90132,5.87890,4.91547 B  30,1

}

 

The PCM file syntax, or grammar, is given below.

 

PCMFILE        --> (

{PCM [NAME(60)]

(ENTRY)*

})

 

ENTRY                       --> { SSNAME | ATOM_REC | NATOM | FLAGS | CONST | FIX_REC}

 

SSNAME                    --> SS [SP] NUM(16) [SP] [NAME(20)]

NATOM                      --> NA NUM()

FLAGS                        --> FL [SP] ({ PR | UV | EINT | PIPL } [NUM()] SP) *

CONST                       --> CO {

ATOM_REC   --> AT [SP] NUM() [SP] ATYPE SP X,Y,Z ([SP] ATFIELD)*

FIX_REC        --> FIX {         ATOM NUM() ( X Y Z ) |

                                                DIS NUM() NUM() ( R FLOAT K FLOAT) }

 

ATYPE                        --> { NUM(79) | NAME(2) }

X,Y,Z              --> FLOAT SP FLOAT SP FLOAT

ATFIELD        --> { BONDS | SUBSTR | ATOMFLG }

BONDS           --> B [SP] NUMBND SP (SP ATNUM,BNDORD )NUMBND

NUMBND       --> NUM(10)

ATNUM,BNDORD   --> NUM() (SP) NUM(10)

SUBSTR                      --> S (NUM(16) SP)*

ATOMFLG     --> FL { P | M NUM(3) | H | C FLOAT | R FLOAT } *

 

 

NAME(n)        --> letter{letter|digit|punc}*  for size <= n

NUM(n)                       --> {digit}                                for value <= n

SP                    --> {comma|blank|colon|tab}

FLOAT                        --> [-](digit)*[.](digit)(digit)*[{e|E}[{-|+}](digit)(digit)*]

 

letter                 --> character a-z A-Z $ _

digit                  --> character 0-9

 

{ } - choose exactly one: all choices separated by |

( ) - grouping: multiple tokens treated as one token

*   - zero or more occurrences of previous token

[ ] - optional (one or zero occurrences)

BOLD characters appear exactly

 

Starter token MUST appear in column one

 

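NOTE ON OPTIONAL SP:  If removing an SP (separator) from a string would place two members of the same token primitive class adjacent to each other, the SP is not optional.  For example, the separator between the atom number and the atom type in AT 1,8 is required, because without it the two numbers would read as the single number 18.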
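The framing rules of the format (the {PCM marker, the closing }, starter tokens in column one, and the optional NA consistency check) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name read_pcm and the returned structure are hypothetical, entries are kept as raw text, and the small sample structure with its coordinates is made up for the demonstration.

```python
def read_pcm(text):
    """Split PCM text into its name and entry lines (illustrative sketch)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("{PCM"):
        raise ValueError("missing {PCM file type marker")
    name = lines[0][4:].strip()
    entries = {"NA": [], "SS": [], "FL": [], "AT": [], "CO": [], "FIX": []}
    closed = False
    for ln in lines[1:]:
        if ln.startswith("}"):
            closed = True
            break
        # Starter tokens must begin in column one of the line.
        for token in ("FIX", "NA", "SS", "FL", "AT", "CO"):
            if ln.startswith(token):
                entries[token].append(ln[len(token):].strip())
                break
        else:
            raise ValueError("unrecognized entry: " + ln)
    if not closed:
        raise ValueError("missing closing }")
    # NA is optional; when present it is only a consistency check.
    if entries["NA"] and int(entries["NA"][0]) != len(entries["AT"]):
        raise ValueError("NA does not match the number of AT records")
    return name, entries


# Made-up three-atom structure, used only to exercise the function.
sample = """{PCM  water
NA  3
AT 1,21:0.0,0.0,0.0 B 2,1
AT 2,6:0.96,0.0,0.0 B 1,1 3,1
AT 3,21:1.2,0.93,0.0 B 2,1
}
"""
name, entries = read_pcm(sample)
print(name, len(entries["AT"]))
```

Keeping the entries as raw text leaves the choice of record parser open; each AT string could then be handed to whatever routine interprets atom records.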


15.2 MMX INPUT FILES

            It is a pleasure to acknowledge the contributions of Professor Allinger and his students to the algorithms and constants used in MMX whose progenitors are MM2 and the pi routines from MMP1.

 

PCMODEL can read:

 A) a single structure file or

 B) a file containing multiple but separate structure files.

 

For option B the first three characters on the first line of each new structure must be MMX. This tag is used by PCMODEL to recognize the start of a new structure. Each structure contains complete information about that structure.

 

The file formats for MMX and MM2 files are similar; they differ only in that the MM2 format lacks the line of logical variables that marks pi atoms. The formats which follow are for Fortran formatted files; that is, column placement is critical if the file is to be interpreted correctly. The format descriptors used in Fortran are:

                        a#                    character string of # length

                        i#                     integer value up to # digits in size

                        f                       fixed floating point number

 

Thus the format for line one (see below) says to read 30 fields of two characters each (30a2), a series of integers that are 5, 2, 3, 2, and 3 digits in length (i5,i2,i3,i2,i3), and a floating point number that is 5 columns wide with no digits beyond the decimal point (i.e. 10. ). For a more complete description please see any Fortran programming guide. The most important idea to remember is that data must be in specific columns in a Fortran formatted file.
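To make the column bookkeeping concrete, here is a small Python sketch that expands a flat Fortran format string into 1-based column ranges. The function name field_columns is an illustrative assumption; only the a, i, f, l and x items are handled, and parenthesized repeat groups such as 2(3f10.5,2i5) are not expanded.

```python
import re

# optional repeat count, descriptor letter, optional width, optional ".d"
ITEM = re.compile(r"(\d*)([aiflx])(\d*)(?:\.\d+)?", re.IGNORECASE)


def field_columns(fmt):
    """Return (descriptor, first_col, last_col) for each field, 1-based.

    Sketch only: flat formats built from a, i, f, l and x items.
    """
    fields = []
    col = 1
    for item in fmt.replace(" ", "").split(","):
        m = ITEM.fullmatch(item)
        if m is None:
            raise ValueError("unsupported format item: " + item)
        count = int(m.group(1) or 1)
        kind = m.group(2).lower()
        if kind == "x":                 # nx skips n columns
            col += count
            continue
        width = int(m.group(3))
        for _ in range(count):
            fields.append((kind, col, col + width - 1))
            col += width
    return fields


first = field_columns("30a2, i5,i2,i3,i2,i3,f5.0")
for kind, lo, hi in first[30:]:
    print(kind, lo, hi)
```

For the first-line format the thirty a2 fields fill columns 1 to 60 and the remaining fields run through column 80, so a right-justified integer ends in the last column of its field.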

 

 

First line:         format (30a2, i5,i2,i3,i2,i3,f5.0)

columns

1‑60                 ID name, title, or information

 

61‑65               N number of atoms including lone pairs.

 

67                    IPRINT print control

 

Initial Calculation Only: 0 or 1 simple; 2 or 3 full printout.

 

                        Part 1-Minimize            Part 3-Final

            0:         minimum                    full           (can be 100+kb)

            1:         minimum                    simple         (useful)*

            2:         full                       full           (can be 100+kb)

            3:         simple                     simple

            4:         minimum                    minimum        (bare bones)

            5:         no print                   (Driver has its own out files)

 

* simple print omits VDW interactions <.1 kcal/mol

 

70        NRSTR            0:         no restricted atom data 1:         yes (see below) 

72        INIT                0:         standard minimization    1:         initial calculation only

 

74        IDOCK  docking. If zero, no docking is performed. MMX provides other options, including simulated annealing of intermolecular interactions. These require the beginning atom number of the substructure in the atom list, which is given on the NCON line (3rd line).

 

75        NCONST        0:         no added constants                   1:         added constants

 

If NCONST = 1, added constants may be appended to the input file (see below) or supplied in a separate file, which will be read if there are no input lines after the 13th line as defined below.  WARNING: if there are added constants, the use of multi coordinate input files requires that the added constants be in the input file, NOT in an external file. Since PCMODEL now uses a text file of added constants, this part of the MMX file format is not used.

 

76‑80   time - ignored in all versions

 

Second line     format(60l1,i2,2x,i2,2x,2i2,4i1,2i2)

 

1‑60     LOGARY:  t or f in each column indicates whether or not the atom number represented by the column number is to be considered a pi atom in a VESCF calculation.   If no atom numbers are marked t, no pi calculation will be performed. If there are more than 60 atoms, add more lines with these logical variables up to 4 lines total.

 

62   NPLANE  0:         planar pi calculation 1:   non planar

 

            In a planar pi calculation, the first calculation is on a Huckel matrix whose elements depend on the atom types and geometry. Diagonalization gives a density matrix, which is used along with Slater orbital exponents and calculated overlap integrals to give the elements of the Fock matrix, which is diagonalized to give a new density matrix.   Subsequent iterations give a self-consistent field with an invariant energy.

            In a non planar calculation two SCF calculations are required: one on the non planar system and one on a planar projection.  The bond orders of the planar projection are used to weight the natural distances, force constants, and two-fold torsional barrier terms; see below (ADDED CONSTANTS).

 

66   JPRINT:  print out for pi calculation

 

            1st SCF                       during minimization

0:         energy only                   energy only

1:         energy + matrices         energy only

2:         energy + matrices         energy + matrices on each iteration

3:         ditto                             energy + matrices on each iteration

 

69‑70   LIMIT limit for self-consistency: 10^(-LIMIT) eV, where LIMIT is this integer. The default is 10^-5 eV.

 

72        ITER  maximum number of SCF iterations to be run. (ignored in this version with open shell pi calculation)

 

73   IHBD

 

0:         no read of hydrogen bonding logical array-all OH, NH, & SH hydrogens participate in hydrogen bonding.

            1:         read hydrogen bond logical array after line 7 below; first line for 80 atoms; additional lines for up to total atoms; array marks which hydrogen atoms will hydrogen bond.

 

74   ICOV

 

            0:         no metal atoms are present in input file

            1:         read line containing metal type and radius (3 different metals maximum) then read logical array of atoms coordinated to each  metal; lines must be after the hydrogen bond array or after line 7.

 

75   IUV

 

            0:         if pi calculation, MMX will ask for options before minimization:

                        number of electrons?

                        singlet, doublet, triplet, etc.?

                        RHF or UHF calculation?

                        if singlet, RHF then restrictions on wave function?

                        # of identical pairs of atoms?

                        atom numbers of pairs?

                        # of sets of identical bonds?

                        # of identical bonds and # of atom pairs?

            1:         default to singlet RHF pi calculation - no questions.

            2:         default doublet UHF pi calculation (for mono radicals) no questions.

 

76   IHUCK

 

            0:         for full VESCF calculation before first MM energy minimization

            1:         for Huckel calculation (of planar systems) before first MM; subsequent minimizations will be full SCF

 

78   NSETAT  - number of sets of equivalent pi atoms

 

80   NSETBD  - number of sets of equivalent pi bonds

 

The previous two options are overridden by IUV greater than zero.

 

Then read the lines of LOGARY (in 80l1) for atoms beyond 60.

 

            if NSETAT .gt. 0, must read ISETA(I) in 10i4, number of equivalent atoms in each set; then must read nsetat lines in (20i4) which contain the atom numbers in each set.

 

            if NSETBD .gt. 0, must read ISETB(I) in 10i4, number of equivalent bonds in each set; then must read lines in (20i4) which contain the atom numbers, in pairs, for each bond.   A new set must start a new line.
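The logical arrays described above (LOGARY on the second line, followed by the continuation lines for atoms beyond 60) map a column position to an atom number. A minimal Python sketch of that decoding is given below; the function name marked_atoms and its arguments are illustrative assumptions, and blank columns are treated as false.

```python
def marked_atoms(lines, widths, n_atoms):
    """Return the atom numbers whose column holds 't' (sketch only).

    lines   -- the text lines holding the logical array
    widths  -- number of logical columns used on each of those lines
    n_atoms -- total number of atoms, so trailing columns are ignored
    """
    marked = []
    atom = 0
    for line, width in zip(lines, widths):
        # pad short lines so that missing columns read as blank (false)
        for ch in line[:width].ljust(width):
            atom += 1
            if atom > n_atoms:
                return marked
            if ch in "tT":
                marked.append(atom)
    return marked


# 62 atoms: atoms 3 and 4 are marked on the first line (60 logical columns,
# followed here by an NPLANE value in columns 61-62), atom 61 on the next.
line1 = "fftt" + "f" * 56 + " 1"
line2 = "tf"
print(marked_atoms([line1, line2], [60, 80], 62))
```

The same helper applies to arrays written entirely in (80l1) lines by passing a width of 80 for every line.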

 

Third line        format(i5,1x,i1,2x,f5.1,5x,9i5,5X,2i5)

 

(if no pi atoms in molecule with more than 60 atoms — otherwise this is the 4th line)  

1‑5  NCON # of connected atom lists (not atoms) to read in 30 Max.

 

7    IDYN

            0:         no dynamics

            1:         dynamics program is run interactively — After filename is given, user will be asked:

                        Timestep in femtoseconds (1.0 fsec is default)

                        Viscosity in cp  (0.000 is default)

                        Temperature of Bath (300.K is default, initial sample temperature is 300.K so implies no heat in or out)

                        Time constant for heat transfer (10^15 femtoseconds is default; this implies no coupling of bath to sample).

 

            After 20 cycles, the user can alter the values and the number of cycles. Dynamics can be used to remove kinetic energy, by increasing the viscosity or by cooling with a time constant ~ timestep, so that an unstable structure can be optimized, slowly, invariably to a local minimum.  At a given temperature kinetic and potential energy are conserved and provide ranges of motion. A minimized structure can be heated to escape a local minimum, then cooled, perhaps to a global minimum.  Global optimization can also be performed by minimizing many initial structures generated by MODEL in MULTIC mode or PCMODEL in MULTOR mode; see NDRIVE.

 

11‑15   DIELE  dielectric constant.

                        If 0., default to 1.5.

            If negative, a distance dependent dielectric constant is used in the charge-charge interactions (DIELC=Rij). If a negative dielectric is used with the dipole-dipole calculations, nonsense will result.

 

16‑20   JSTART atom number beginning the substructure in the atom list. For docking only; IDOCK on the first line (col 74) must be set to >0.

 

25   IBUT        0: no

                        1: yes, cyclopropanes or cyclobutanes present

 

26‑30   NATTCH # atoms attached to previously defined atoms.

 

31‑35   NSYMM number of symmetry matrices to read in.

            PCMODEL writes 0 for this variable.

 

36‑40   NX number of coordinate replacement lines to read — this option is unnecessary with structure file generation programs like MODEL, and PCMODEL.  See QCPE 395 documentation for its use.  The coordinate replacement lines must immediately follow the cartesian coordinate input.

 

45‑50   LABEL

0:         names and weights of atoms defined in program are used.

                        #:         number of different atom types whose names and weights will be changed below.

            PCMODEL writes 0 for this variable.

 

55   NDC

 

            0:         dipole interaction energy will be calculated excluding interactions between dipoles with a common carbon.

            1:         as above but with all interactions.

            2:         dipole interaction calculation suppressed.

            3:         charge interaction energy will be calculated (excluding interactions between atoms bound to a common atom). If this option is used you must read in atomic charges below. 

            4:         charge interaction energy used.  Charges calculated from bond dipoles with additions for pi atom charges.

            PCMODEL writes 0 or  4 (default).

 

60   NCALC determines crystal conversion and/or orientations to be performed on input coordinates.

 

            0:         no crystal options.

            1:         input coordinates are reduced crystal coordinates.  They will be converted to cartesian coordinates (see below).

            2:         input coordinates will be oriented according to instructions below.

3:         Combination of 1 and 2 above.

            PCMODEL writes 0 for this variable.

 

 

65   HFORM 

            0:         no

            1:         yes  heat of formation calculated

            2:         read new values for partition function contribution to the heat of formation.

            PCMODEL writes 1 for this variable.

 

80   NDRIVE

             0:        no

             1:        endocyclic dihd. driver

            ‑1:        side chain dihd.driver

 

For options ‑1 and 1, a data line must be added before an added constants list at the end of the input file. See below.

 

            In MMX (a separate batch processing program) if NDRIVE >= 10000: sets of coordinates can be read in after all other input.  Each set must consist of one line in format i5 which indicates the number of the set (starting from 1), and the coordinates must be in the standard format (see SIXTH LINE below).  The previously read connectivity information will be used in the calculation.  Files with the filename + sequentially  numbered extension will be generated. A brief output file, *.SUM, with the filenames and energies will also be generated; an energy ordered file MMXE.ORD is also generated.

 

            In MMX: if  NDRIVE=‑9999:  global minimum finder by simulated annealing. Requires additional input lines after the loghbd and logmet lines of logicals.

 

OPTIONS FROM LINE ABOVE:

 

If LABEL .ne. 0, add LABEL lines in format (i5,3x,a2,f10.5)

 

4‑5                   ITY(i)  atom type whose name and weight are to be redefined.

9‑10                 NAM(i)  two character name for new atom type.

11‑20   WGHT(i) Atomic weight for the atom.

 

If NDC = 3 add lines whose format is 8f10.5 listing the charge on each atom-for N total atoms including lone pairs.

 

Fourth line(s)  NCON lines each in format (16i5)

 

ICONN(i,j)      i=number of the line (1 to NCON), j=number of entry on line (from 2 to 16)

 

Each list contains up to 16 CONTIGUOUS atoms and each list can repeat atoms in a previous list.

 

Cyclohexane would have one list with values 1,2,3,4,5,6,1

 

Bicyclo[3.2.1]octane would have two lists — the cyclohexane list above and a second list with values 1,7,8,3

 

There are no restrictions on lists other than to avoid noncontiguous atoms within the sequence.

 

Fifth line(s)     NATTCH atom pairs in format (16i5)

As many lines as are needed to list all NATTCH atom pairs, each line in format (16i5), that is, up to eight pairs per line.  The pairs are:

 

            JATTCH(i), KATTCH(i)  where i equals 1 to NATTCH.

 

JATTCH(i)       is an atom already defined either in NCON or previously defined as a KATTCH atom.

            KATTCH(i) is an atom attached to JATTCH(i).

 

For cyclohexane NATTCH = 12 which would require two lines.  The values are:

 

1,7,1,8,2,9,2,10,3,11,3,12,4,13,4,14

5,15,5,16,6,17,6,18
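The connected atom lists and the attached atom pairs together define the bonds of the structure. A hedged Python sketch of how they might be turned into a bond list follows; the function name bonds_from_input is hypothetical, and it assumes that consecutive atoms in a connected list are bonded and that unused i5 fields read as zero.

```python
def bonds_from_input(conn_lists, attach_values):
    """Build a sorted bond list from NCON lists and NATTCH pairs (sketch)."""
    bonds = set()
    for lst in conn_lists:
        atoms = [a for a in lst if a != 0]     # unused i5 fields read as 0
        for a, b in zip(atoms, atoms[1:]):     # neighbours in a list are bonded
            bonds.add((min(a, b), max(a, b)))
    values = [v for v in attach_values if v != 0]
    # attached atoms come as JATTCH, KATTCH pairs
    for j, k in zip(values[0::2], values[1::2]):
        bonds.add((min(j, k), max(j, k)))
    return sorted(bonds)


# Cyclohexane, using the list and the attached pairs given in the text.
ring = [[1, 2, 3, 4, 5, 6, 1]]
hydrogens = [1, 7, 1, 8, 2, 9, 2, 10, 3, 11, 3, 12, 4, 13, 4, 14,
             5, 15, 5, 16, 6, 17, 6, 18]
cyclohexane = bonds_from_input(ring, hydrogens)
print(len(cyclohexane))

# Carbon skeleton of bicyclo[3.2.1]octane: the ring list plus a second list.
bicyclo = bonds_from_input([[1, 2, 3, 4, 5, 6, 1], [1, 7, 8, 3]], [])
print(len(bicyclo))
```

Storing each bond with the lower atom number first means a bond that appears in two lists is counted only once.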

 

Sixth line(s)    COORDINATES  format(2(3f10.5,2i5))

 

1‑10     X(1)     Coordinates in Angstroms of atom 1-These may be reduced crystal coordinates if NCALC =1 or 3.

11‑20   Y(1)

21‑30   Z(1)

31‑35   ITYPE(1)    MMX atom type( see Table of Atom types)

36‑40   MATOM(1)

            36‑38 atom number this atom is multiply bonded to.

            39    (for metal atoms only

                        0:coordinately saturated;

                        1:unsaturated

                        2:>18 electrons

                        3:square planar).

            40        Number of bonds to atom defined in 36‑38 minus 1;      Non metal atoms singly bonded to others will have matom = 00000

41‑50   X(2)

51‑60   Y(2)

61‑70   Z(2)

71‑75   ITYPE(2)

76‑80   MATOM(2)

 

            Repeat lines until all coordinates are included.
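As an illustration of the coordinate line layout, the Python sketch below slices one line into its two atoms and splits the MATOM field into its three parts. The function names are hypothetical; the code assumes that the coordinates are written with explicit decimal points and that MATOM can be decoded by integer arithmetic on the five-column field, which is one reading of the column description above.

```python
def decode_matom(matom):
    """Split MATOM into (partner atom, metal flag, bond order)."""
    partner = matom // 100            # columns 36-38
    metal_flag = (matom // 10) % 10   # column 39
    bond_order = matom % 10 + 1       # column 40 holds number of bonds minus 1
    return partner, metal_flag, bond_order


def parse_coordinate_line(line):
    """Read up to two atoms from one line in format 2(3f10.5,2i5)."""
    atoms = []
    line = line.rstrip("\n").ljust(80)
    for start in (0, 40):
        chunk = line[start:start + 40]
        if not chunk.strip():         # second half may be empty on the last line
            continue
        x, y, z = (float(chunk[i:i + 10]) for i in (0, 10, 20))
        itype = int(chunk[30:35].strip() or 0)
        matom = int(chunk[35:40].strip() or 0)
        atoms.append({"xyz": (x, y, z), "type": itype,
                      "matom": decode_matom(matom)})
    return atoms


# Build a test line with Python formatting that mimics the Fortran fields.
line = (f"{1.0:10.5f}{2.0:10.5f}{3.0:10.5f}{2:5d}{201:5d}"
        f"{-0.5:10.5f}{0.25:10.5f}{0.0:10.5f}{5:5d}{0:5d}")
for atom in parse_coordinate_line(line):
    print(atom)
```

A MATOM of zero decodes to partner 0, which corresponds to the singly bonded, non metal case described above.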

 

Seventh line(s)           NX > 0  COORDINATE REPLACEMENT or ATOM ADDITION

 

See documentation to QCPE 395 or QCPE MMP2(85) for options and data structure.

 

Seventh line(s)           If NCALC = 1 or 3 CRYSTAL CONVERSION in format (6f10.5)

 

            If reduced crystal coordinates were read in above instead of cartesian coordinates, the following parameters will convert them to cartesian.

 

1‑10     A         Dimensions of unit cell in Angstroms (a,b,c)

11‑20   B

21‑30   C

31‑40   ALPHA   Angles associated with unit cell.

41‑50   BETA

51‑60   GAMMA

 

Eighth line(s)  Hydrogen Bonding flags

 

If IHBD=1  format(80l1).  If N >80, more lines in format (80l1) are read.

            These should be "f"s unless a particular hydrogen, whose atom number is represented by the column number, will participate in hydrogen bonding-then a "t" should be in the column.

 

Ninth line(s)    If ICOV=1, Metal Names, Covalent Radii and Charge

 

            Program checks number of metal atoms present  (itypes 44,56,57) and reads line of metal characters, covalent radii, and the charge on each metal (necessary for isolated ions & might be useful for organometallics if electrostatics are used) in format (3(a2,f8.6),9f5.2); these variables are amet,rcova,bmet,rcovb,cmet,rcovc, and chrgmet(i), i=1,9. [Changing rcova will change the covalent radius used for all bonds involving the metal amet.]

            Then the number of metal logical lines of format(80l1) are read. A t or f indicates whether or not the atom number represented by the column number is coordinated (not bound as would appear in the NCON or NATTCH lists) to the first (or second etc. 9 Max) metal atom in the atom  number list.

            If more than 80 atoms are present more lines should follow - in (80l1) format-to mark the rest of the atoms.

 

Tenth line(s)   If NSYMM .ne. 0  read NSYMM lines in format(2i4,9f8.5)

 

3‑4       Is         number of the independent atom in the symmetry-related pair — it will be moved during the minimization.

 

5‑8       Ks        number of the dependent atom in the pair. Its coordinates are calculated from a symmetry matrix operating on IS.

 

  ELEMENTS OF SYMMETRY MATRIX - ON SAME LINE

 

9‑16     SXX    Amount by which X(K) must be changed when X(I) is moved by 1 Å in order to maintain molecular symmetry.

17‑24   SXY    Change in Y(K) for X(I) change =1.

25‑32   SXZ     Change in Z(K) for X(I) change =1.

33‑40   SYX    Change in X(K) for Y(I) change =1.

41‑48   SYY    Change in Y(K) for Y(I) change =1.

49‑56   SYZ     Change in Z(K) for Y(I) change =1.

57‑64   SZX     Change in X(K) for Z(I) change =1.

65‑72   SZY     Change in Y(K) for Z(I) change =1.

73‑80   SZZ     Change in Z(K) for Z(I) change =1.

 

            An atom may be used as Is as often as necessary but cannot appear in another symmetry pair as Ks.  In order to specify symmetry correctly, the coordinate axes used in the minimization must be known. Do an initial calculation with INIT=1.

This procedure is retained from MM2 but has not been tested in MMX.
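The role of the symmetry matrix can be illustrated with a few lines of Python. This is a sketch of the arithmetic implied by the element definitions (each element is the change in one coordinate of the dependent atom per unit change in one coordinate of the independent atom); the function name dependent_shift is hypothetical, and the mirror plane used as the example is an assumption chosen for simplicity.

```python
def dependent_shift(s, d_i):
    """Displacement of the dependent atom K for a displacement d_i of atom I.

    s is ((SXX, SXY, SXZ), (SYX, SYY, SYZ), (SZX, SZY, SZZ)), where for
    example SXY is the change in Y(K) for a unit change in X(I).
    """
    return tuple(sum(d_i[r] * s[r][c] for r in range(3)) for c in range(3))


# Example: atoms related by a mirror plane perpendicular to the X axis.
mirror = ((-1.0, 0.0, 0.0),
          (0.0, 1.0, 0.0),
          (0.0, 0.0, 1.0))
print(dependent_shift(mirror, (0.1, 0.2, 0.3)))
```

For the mirror plane the X displacement of the dependent atom is reversed while Y and Z follow the independent atom unchanged.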

 

Eleventh line(s)          Restricted atoms

 

            If NRSTR =1, the first line, in format (i5), is the number of atoms whose motions are restricted; the remaining lines, in format (16(i2,3i1)), define the restrictions in any or all of the X, Y, and Z directions.  Thus the atom may be kept along an axis, or in a plane, or at a point.

 

1‑2                   AT       Number of the atom whose motion is to be restricted.

 

3                      X         0:         no restriction     1:        no X axis movement.

4                      Y         0:         ditto     1:         no Y axis movement.

5                      Z          0:         ditto     1:         no Z axis movement.

 

            repeat up to 16 times per line.

 

Note that this will fail if the atom number to restrict is greater than 99.
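A short Python sketch of reading one restriction line may make the five-column grouping clearer. The function name parse_restrictions is an illustrative assumption; each entry is taken as two columns for the atom number followed by one column each for the X, Y and Z flags.

```python
def parse_restrictions(line, count):
    """Read `count` entries (at most 16) from a line in format 16(i2,3i1)."""
    entries = []
    for n in range(count):
        field = line[5 * n:5 * n + 5].ljust(5)
        atom = int(field[0:2])        # two columns only, hence the limit of 99
        fixed = tuple(field[i] == "1" for i in (2, 3, 4))   # X, Y, Z flags
        entries.append((atom, fixed))
    return entries


# Atom 1 held at a point; atom 12 prevented from moving along X only.
line = " 1111" + "12100"
print(parse_restrictions(line, 2))
```

Because the atom number occupies only two columns, a three digit atom number would spill into the X flag column, which is the failure mentioned in the note above.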

 

            This option is often considered for transition states, but its use is limited in that connection. It is better to use transition state atoms whose constants can be generated by PCMODEL.

 

 

Twelfth line     if HFORM =2 read a line in format (3f5.2,f10.3)

 

1‑5                   POPI               population increment (Enthalpy contribution due to higher energy conformations)  Default=0

6‑10                 TORI               torsional increment (Enthalpy contribution due to torsional degrees of freedom) Default=0

11‑15   TROTI translation-rotation increment (Enthalpy increment due to translational and rotational degrees of freedom) Default=2.4 kcal/mol.

 

21‑30   Experimental heat of formation (kcal/mole)

31‑40   Experimental error (absolute value)

41‑80   Literature reference (optional)

 

Thirteenth line            Dihedral Driver Information

            if NDRIVE is not equal to  0.

 

                        if NDRIVE < 10000 and NDRIVE .ne. ‑9999 in format(2(4i5,5x,3f5.0))

 

(Note that this line usually occurs at the end of an MM2 input file, but in MMX it must precede an added constants list which is appended to the input file.)

 

            This is used to drive one or two dihedral angle(s) through a range of values and to minimize all other degrees of freedom at each point.

 

1‑5       M1

6‑10     M2

11‑15   M3

16‑20   M4       atom numbers for first dihedral angle.

26‑30   START1          starting angle

31‑35   FIN1                final angle

36‑40   DIFF1  step or increment

41‑45   N1

46‑50   N2

51‑55   N3

56‑60   N4       atom numbers for second dihedral angle.

66‑70   START2          starting angle

71‑75   FIN2                final angle

76‑80   DIFF2  step or increment

 

            Before use, remove any motion restrictions and symmetry constraints from the atoms to be moved.

 

            NDRIVE may be ‑1 for side chain or 1 for angles whose 2nd & 3rd atoms are confined to rings.

 

            The side chain driver is useful if the dihedral is not part of a ring.  It carries out a rigid rotation for each step in dihedral angle.

 

            The endocyclic drive is confined within 0 and 180 degrees or 0 and ‑180. The step cannot be large (5‑10 degrees is normal). The minimization will stop if either 0 or 180 degrees is achieved.  Use the side chain driver if possible.

 

            IF NDRIVE >= 10000 - see description for Third line above-requires multiple sets of cartesians. MMX option only

 

            IF NDRIVE = ‑9999 - Global minimization by simulated annealing. MMX option only

 

            1st line: format 6i5

 

                        krbnds -           number of dihedrals to rotate, 30 MAX

                        kring  - 0:         no ring

                                                1:         ring

                        irange - for rings-angle window for closure usually 20 degrees

                        krand  -            0:         randomize all dihedrals before an energy check (usual)

                                                #:         randomize # dihedrals in the sequence below before an energy check

                        igen   -  generating function type (use 0)

                                    0:         15*temperature*tan(pi*(ran-.5)) 0.<ran<1.

                                    1:         360.*ran

                        ntimes - number of times a structure with energy within eminct is found before exiting (usually 4).

                        icomp  -           normally zero, if

                                                1:         read in three distances in line after dismin etc.

                                                2:         for tricoordinate analysis - not thoroughly tested.

 

            2nd line(s) format(16i5)

 

            mbonds(kk,1),mbonds(kk,2),kk=1,krbnds :  atom #s of the bonds to rotate, in pairs. If ring is chosen above, the first atom pair bond is broken and used for evaluating closure distances and angles; the second atom pair should be a bond adjacent to the first because it will be rotated in order to obtain the best closure angle.

 

            3rd line format(7f11.5)

 

            dismin —          for ring - minimum distance between mbonds(1,1) and mbonds(1,2) usually 1.50.

            dismax —         for ring - maximum distance between     "  "   "    "  "   "  usually 1.58

            resln —            dihedral angle to rotate through if igen = 0- usually 15

 

            tempi —           initial temperature in kcal/mole usually 30.

 

            codec —          decrement of temperature according to tempi*(1/(1.+ codec)) usually 0.5.

            eminct -         energy difference between acceptable structures to count for the minimum if found ntimes in order (usually 0.01).

 

            4th line format(6f11.5) if icomp eq.2

 

            3X (distance, tolerance)  cdis1,tol1,cdis2,tol2,cdis3,tol3

 

Thirteenth line if IDOCK = 9

 

            For simulated annealing in docking, with the main structure and the substructure both rigid. This requires that a substructure be at the end of the atoms list with the starting atom identified in cols 16‑20 on the third line (NCON and other options).

 

            format(6f10.5,i5)

 

            xyzds —           factor for substructure translation-usually 1.0

 

            resln -            constant for substructure rotation, usually 15. degrees

 

            tempi —           initial temperature-usually 3. kcal/mole

 

            codec —          decrement of temperature according to tempi*(1/(1.+ codec)) usually 0.1

            tempf —           lowest temperature before quitting usually 0.01 kcal/mole

 

            eminct —         energy difference between acceptable structures to count for the minimum if found ntimes in order (usually 0.2)

 

            ntimes —          number of times a low energy is obtained to assume it is a minimum usually 5.
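The temperature decrement rule above can be sketched as follows. This is only an illustration under stated assumptions: the rule tempi*(1/(1.+codec)) is taken to be applied once per annealing cycle, the run is taken to stop once the temperature falls below tempf, and the function name cooling_schedule is hypothetical.

```python
def cooling_schedule(tempi, codec, tempf):
    """List the temperatures visited from tempi down to tempf."""
    temps = []
    t = tempi
    while t >= tempf:
        temps.append(t)
        t = t * (1.0 / (1.0 + codec))  # decrement rule quoted in the text
    return temps

# Usual docking values from the list above: tempi = 3., codec = 0.1, tempf = 0.01
steps = cooling_schedule(3.0, 0.1, 0.01)
```

A larger codec cools faster and visits fewer temperatures; codec = 0.5 cuts the temperature by one third at every cycle.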

 

ADDED CONSTANTS  if NCONST .ne.0

 

  This may be the last set of entries in the input file or may be a separate  file which MMX will read if this list is not at the end of the input file. PCMODEL will generate this file only for transition state bonds. MODEL on the VAX will generate this file. The first line is a header listing the numbers and types of constants to be read in format (8i5,3f5.0,5x,f5.0,i5)

 

1‑5       NT  number of torsion parameter sets.

6‑10     NS  number of stretching parameter sets (includes bond dipoles).

11‑15   NV  number of van der Waals' parameters sets.

16‑20   NB  number of bending parameter sets.

21‑25   MUA number of bond dipole line (according to type) ‑ use NS.

26‑30   MUB number of bond dipole lines (according to atom numbers) ‑ use NS.

31‑35   NSB stretch‑bend parameters to be read:

                        0: no

                        1: yes

36‑40   NH  number of bond enthalpy parameter sets; if NH > 0, a heat of formation will be calculated regardless of HFORM (untested in MMX).

41‑45   DLC new dielectric constant for dipole or charge interaction calculations‑if DLC=0, the default, 1.5 is used.

46‑50   CST new cubic stretch term‑default = ‑2.00

51‑55   SF  fraction of the bending constant that is to be used as sextic term in bending energy equations‑if SF=0, a default is used (7E‑8).

61‑65   RDN hydrogen vdw reduction factor‑if RDN=0, the default is used (.915)‑ if RDN=1, no reduction is used.

66‑70   KTR number of transition state bond types.
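The column layout of the header line can be read with a few fixed-width slices. The sketch below is an illustration only: the function name is hypothetical, and it assumes blank fields read as zero, as they do under Fortran fixed-format input.

```python
def parse_added_constants_header(line):
    """Split the header line (8i5,3f5.0,5x,f5.0,i5) into named fields."""
    line = line.ljust(70)  # pad so short lines still slice cleanly

    def as_int(first, last):
        text = line[first - 1:last].strip()
        return int(text) if text else 0

    def as_float(first, last):
        text = line[first - 1:last].strip()
        return float(text) if text else 0.0

    int_names = ["NT", "NS", "NV", "NB", "MUA", "MUB", "NSB", "NH"]
    fields = {name: as_int(5 * i + 1, 5 * i + 5)
              for i, name in enumerate(int_names)}
    fields["DLC"] = as_float(41, 45)
    fields["CST"] = as_float(46, 50)
    fields["SF"] = as_float(51, 55)
    # columns 56-60 are skipped (the 5x in the format)
    fields["RDN"] = as_float(61, 65)
    fields["KTR"] = as_int(66, 70)
    return fields

# Build an example line with made-up values so the columns are exact.
example = (("{:5d}" * 8).format(1, 2, 0, 1, 0, 0, 0, 0)
           + "{:5.1f}{:5.1f}{:5.1f}".format(4.0, -2.0, 0.0)
           + " " * 5 + "{:5.3f}".format(0.915) + "{:5d}".format(0))
header = parse_added_constants_header(example)
```

The values in the example line are arbitrary; only the column positions follow the table above.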

 

next line            if NT > 0  TORSION CONSTANTS

 

            ‑ NT lines in format (4i5,3f10) from:

 

1/2*V1*(1 + cos(omega)) + 1/2*V2*(1 - cos(2*omega)) + 1/2*V3*(1 + cos(3*omega))

 

  1                    JBT      4 for cyclobutane torsion 0 for all others

  2‑5     I1

  6‑10  I2

 11‑15 I3

 16‑20 I4         atom type numbers for dihedral; I2>I3 or if I2=I3, I1<I4

 21‑30 V1       One‑fold barrier; if >0,  0 deg is potential energy maximum, 180 deg is potential energy minimum.

 31‑40 V2       Two‑fold barrier; if >0, 90 deg is potential energy maximum 0 and 180 deg are energy minima.

 41‑50 V3       Three‑fold barrier;  if >0,  0 and 120 deg are energy maxima 60 and 180 deg are energy minima.

 

   Note that ethylene has four torsional interactions making up the ca. 60 kcal barrier, so V2=15 for type 2 ‑ type 2 torsions. Ethane has six torsional interactions, etc.
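The three-term torsional expression and the ethylene arithmetic can be checked with a short sketch. The function name torsion_energy is hypothetical and the code simply evaluates the expression given above for one dihedral; it is not taken from MMX.

```python
import math

def torsion_energy(omega_deg, v1=0.0, v2=0.0, v3=0.0):
    """Torsional energy for one dihedral angle omega (degrees)."""
    w = math.radians(omega_deg)
    return (0.5 * v1 * (1.0 + math.cos(w))
            + 0.5 * v2 * (1.0 - math.cos(2.0 * w))
            + 0.5 * v3 * (1.0 + math.cos(3.0 * w)))

# Four type 2 - type 2 interactions, each with V2 = 15, twisted to 90 degrees
ethylene_barrier = 4 * torsion_energy(90.0, v2=15.0)
```

At 90 degrees each two-fold term contributes its full V2, so the four interactions sum to the barrier of about 60 kcal quoted above.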

 

next line            if NS > 0 STRETCHING CONSTANTS

 

— NS lines in format (2i5,6f10.5) from: 

 

EC = 143.88*1/2*ks*(1+CST*(r-r0))*(r-r0)^2

 

            where CST is cubic stretch term.
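The stretching expression can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, the default CST of -2.00 is the default quoted for the header line, and the numbers in the example are arbitrary rather than taken from the parameter set.

```python
def stretch_energy(r, r0, ks, cst=-2.0):
    """Bond stretching energy with the cubic correction term.

    r and r0 in angstroms, ks in md/angstrom.
    """
    dr = r - r0
    return 143.88 * 0.5 * ks * (1.0 + cst * dr) * dr ** 2

# A bond stretched by 0.1 angstrom with ks = 4.4 (arbitrary example values)
e_stretch = stretch_energy(1.6, 1.5, 4.4)
```

With a negative CST the cubic term softens the potential for stretched bonds and stiffens it for compressed ones.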

 

1‑5       I atom type number

6‑10     K atom type number;  I .le. K

11‑20   S stretching constant (ks above) in md/angstrom for bond I‑K

21‑30   T1 natural, minimum energy bond length (r0 above) in angstroms

31‑40   T2 if I and K have one or fewer hydrogens attached, T2 is the length used; if T2 is zero, T1 is used.

41‑50   BMOM  bond moment for pair I,K; if >0, K is more electronegative atom

51‑60   SLPS  for pi atom pairs: S = S‑SLPS + SLPS * rhoI,K

61‑70   SLPT  for pi atom pairs: T1 = T1+SLPT ‑ SLPT * rhoI,K

Note that the dipole‑dipole interaction potential energy is:

 

      Emu = 14.39418*BMOM(I,K)*BMOM(M,N)*(cos X - 3*cos A*cos B)/(DLC*r^3)

 

where  r is the distance between the midpoints of the two bonds I‑K and M‑N;

                        X is the angle between the two bonds;

                        A is the angle between bond I‑K and line represented by r;

                        B is the angle between bond M‑N and line represented by r;

 

            DLC is the dielectric constant whose default is 1.5.  The dipole moment is calculated from the vector sum of all bond moments. If a pi calculation is performed, the dipole moment from the pi system is added vectorially to the sigma dipole moment.   There is presently no  calculated interaction between the pi and sigma dipoles.
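The dipole-dipole expression can be sketched as a small function. This is an illustration under one stated assumption: both DLC and the cube of r are taken to be in the denominator. The function and argument names are hypothetical.

```python
import math

def dipole_dipole_energy(bmom_ik, bmom_mn, x_deg, a_deg, b_deg, r, dlc=1.5):
    """Interaction energy of two bond dipoles.

    x_deg: angle between the two bonds
    a_deg, b_deg: angles between each bond and the line joining the midpoints
    r: distance between the bond midpoints in angstroms
    """
    cos_x = math.cos(math.radians(x_deg))
    cos_a = math.cos(math.radians(a_deg))
    cos_b = math.cos(math.radians(b_deg))
    return (14.39418 * bmom_ik * bmom_mn
            * (cos_x - 3.0 * cos_a * cos_b) / (dlc * r ** 3))

side_by_side = dipole_dipole_energy(1.0, 1.0, 0.0, 90.0, 90.0, 2.0)
head_to_tail = dipole_dipole_energy(1.0, 1.0, 0.0, 0.0, 0.0, 2.0)
```

Parallel dipoles placed side by side repel (positive energy), while the same dipoles arranged head to tail attract (negative energy).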

 

next line            if MUA > 0 BOND DIPOLE CONSTANTS BY ATOM TYPE

 

              MUA lines in format (2i5,f10.5)

 

4‑5       I                       atom type number

6‑10     K                      atom type number; I .le. K

11‑20   BMOM            bond moment for pair I,K; if >0, K is more electronegative atom

 

It is more consistent to change the BMOM in the added Stretch constants since all six values are read from the data base.

 

next line            if MUB > 0  BOND DIPOLE CONSTANTS BY ATOM NUMBER

 

            MUB lines in format (2i5,f10.5)

 

4‑5       N1       atom number of the less electronegative atom

9‑10     N2       atom number of the more electronegative atom

11‑20   V12     new bond moment for bond N1‑N2. V12 should be negative if  N1 is more electronegative than N2.

 

(untested in MMX)

 

next line            if NV > 0  VAN DER WAALS' CONSTANTS

 

            NV lines in format (i10,2f10.5)

            EV = eps*(290000*exp(-12.5/p) - 2.25*p^6)   if p < 3.311

 

where   p = (RDi+RDk)/r0

                        eps = sqrt(EPi*EPk) "hardness"

                        r0 effective distance (for CH bonds the H is effectively 10% closer to the C than the actual C‑H distance).

 

            EV = eps*336.176*p^2   if p > 3.311

 

9‑10     IT atom type number whose parameters are to be changed

11‑20   EP epsilon (kcal/mole) for type IT atom

21‑30   RD radius in angstroms for type IT atom
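The two-branch van der Waals expression can be sketched as below. This is an illustration only. It assumes the exponential form is a repulsive term minus an attractive 2.25*p^6 term; that reading is supported by the two branches agreeing closely at p = 3.311. The function name is hypothetical, and the 10% shortening of C-H distances is not applied here.

```python
import math

def vdw_energy(r, ep_i, ep_k, rd_i, rd_k):
    """Van der Waals energy for atoms i and k at effective distance r."""
    eps = math.sqrt(ep_i * ep_k)   # "hardness"
    p = (rd_i + rd_k) / r
    if p > 3.311:
        # very short contacts: simple quadratic form
        return eps * 336.176 * p ** 2
    return eps * (290000.0 * math.exp(-12.5 / p) - 2.25 * p ** 6)

# Atoms separated by exactly the sum of their radii (p = 1), eps = 1
e_contact = vdw_energy(3.0, 1.0, 1.0, 1.5, 1.5)
```

At p = 1 the energy is negative (attractive); it grows steeply positive as the atoms are pushed together.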

 

            The user has no control over the parameters used for hydrogen bonding and coordination to a metal. These are included in the vdw parameters because the vdw attractive term for hydrogen bonding, and for the lone pair or pi atom interaction with a metal, is proportional to p^2, not p^6. The attractive coefficients are much larger than 2.25, depend on the donor, the acceptor and the angles, and are matched by a larger repulsive term.

 

next line            if NB > 0  BENDING CONSTANTS

 

            NB lines in format (i1,i4,2i5,2f10.5,i5)

 

            from: 

 

            EB = 0.043828*1/2*kb*(theta0-theta)^2*(1+SF*(theta0-theta)^4)

 

            where SF  is sextic bending constant (.00700E‑5).

1          JBN

                        3 for cyclopropane angle

                        4 for cyclobutane angle

2‑5       I1         atom type number; I1<I3; if I1 is ‑1 read OPB constants where I3 is the central atom which is OP.

9‑10     I2         atom type number

11‑15   I3         atom type number

21‑30   B          bending constant (kb from above)

31‑40   T          natural, minimum potential energy bond angle in degrees (theta0 above)

45        J

                        0: T value applies to all I1‑I2‑I3 angles

                        1: T value applies only if I2 has no attached hydrogens

                        2: T value applies only if I2 has one attached hydrogen

                        3: T value applies only if I2 has two attached hydrogens.
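The bending expression, including the sextic term, can be evaluated as follows. The sketch is illustrative: the function name is hypothetical, the default SF of 7E-8 is the one quoted above, and the angle and constant in the example are arbitrary.

```python
def bend_energy(theta, theta0, kb, sf=7e-8):
    """Angle bending energy; theta and theta0 in degrees."""
    d = theta0 - theta
    return 0.043828 * 0.5 * kb * d ** 2 * (1.0 + sf * d ** 4)

# An angle closed by 10 degrees with kb = 0.45 (arbitrary example values)
e_bend = bend_energy(99.5, 109.5, 0.45)
```

For a 10 degree distortion the sextic term changes the energy by less than 0.1 percent; it only becomes important for large deformations.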

 

  Out of Plane bending (OPB): trigonal atoms involved in out of plane bending have an in‑plane component, which is characterized by the standard values from above, and an out of plane component with a different constant. The atom types which respond to OPB constants are types 2 (SP2C), 3 (carbonyl C), 9 (amide and pyrrole nitrogen), 29 (carbon radical), 30 (carbocation), and 48 (carbanion when it is a pi atom). The default values for these are similar to the value used in MM2 for type 3, namely 0.8.

 

next line            if NSB > 0  STRETCH‑BEND CONSTANTS

            NSB lines in format (4f10.5) from:

 

  ESB = 2.51118 * ksb * theta * (r1,2 + r2,3)

 

            where ksb are constants described below;

            theta is angle 1‑2‑3;

            r1,2 is distance between atoms 1 and 2;

            r2,3 is distance between atoms 2 and 3.

 

1‑10     SB1     constant for angle a‑f‑c where f is a 1st row atom; a and c are any non‑hydrogen atoms.  Both bonds are considered in ESB. Default value is 0.12.

11‑20   SB2     constant for angle a‑s‑c where s is a 2nd row atom; a and c are any non‑hydrogen atoms. Default value is 0.25.

21‑30   SB1H   constant for angle a‑f‑H where f is a 1st row atom; a is any non‑hydrogen atom; only bond a‑f is considered in ESB. Default value is 0.09.

31‑40   SB2H   constant for angle a‑s‑H where s is a 2nd row atom; a is any non‑hydrogen atom; only bond a‑s is considered in ESB. Default value is ‑0.40.

 

next line            if NH > 0

            HEAT OF FORMATION BOND AND STRUCTURAL PARAMETERS

 

            NH lines of format (2i5,2f10.5,i5,5x,5a4)

 

  1‑5     IT         atom type number defining a bond or structural feature number

 6‑10    KT       atom type number defining a bond (zero if structural feature)

11‑20   BE       bond enthalpy increment for IT‑KT bond or structural feature

21‑30   SBE     bond entropy increment for IT‑KT bond (zero if structural)

35        NEW

                        0: Default comment on bond enthalpy (use zero if structural)

                        1: change or add comment on bond enthalpy

41‑60   NTITLE           Comment on bond enthalpy (ignore if structural)

 

            (not tested in MMX)

 

            The user has no control over the parameters used for the heat of formation contribution from the pi system.

 

next line            if KTR > 0 

 

            This line has the fractional bond orders used to generate the transition state constants file in MIO or in PCMOD. MMX does nothing with these bond orders except print them in the output for reference purposes. Thus users need not supply this information from an external file.

            The default values for parameters read by MMX are in a binary file called DATA.STO. An ASCII file of the values used can be obtained by running MMXDATA and requesting a file PARAM.OUT. It is no longer possible to submit a null file to MMX and obtain a file with the parameters.

 

 

15.3 Mopac Files

            Mopac input and output files are written as 'free format' internal coordinate files. Free format means that data may be entered at any position on a line; there is no fixed column in which the data must appear to be read correctly. The internal coordinate format consists of lists of bond lengths, angles and dihedrals, together with the defining atoms, which describe the molecular geometry, as opposed to a list of cartesian coordinates for all the atoms. In internal format, each atom (i) has an interatomic distance in Angstroms from an already defined atom (j), an interatomic angle in degrees between atoms i and j and a previously defined atom (k), where j and k are different, and a dihedral angle in degrees between atoms i, j, k and a previously defined atom (l), where l is different from j and k. Atom 1 has no coordinates since it is defined as the origin, atom 2 is connected to atom 1 by an interatomic distance only, and atom 3 is connected to atom 1 or atom 2 by an interatomic distance and makes an angle (either 3-1-2 or 3-2-1) with no dihedral angle defined.

 

            A mopac file for benzene is given below :

 

am1

 

 

C     0.000000  0      0.000000     0      0.000000  0   0   0   0

C     1.400211  1      0.000000     0      0.000000  0   1   0   0

C     1.399883  1  119.996750    1      0.000000  0   2   1   0

C     1.400132  1  120.005653    1      0.000000  1   3   2   1

C     1.399876  1  119.993233    1      0.000000  1   4   3   2

C     1.400199  1  120.006828    1      0.000000  1   5   4   3

H     1.103109  1  119.998314    1  180.000000  1   1   2   3

H     1.103110  1  120.002739    1  180.000000  1   2   3   1

H     1.103109  1  119.998428    1      0.000000  1   3   2   2

H     1.103109  1  120.002640    1      0.000000  1   4   3   3

H     1.103109  1  119.997231    1      0.000000  1   5   4   4

H     1.103109  1  120.001671    1      0.000000  1   6   5   5

 

 

line 1:   keywords separated by one or more spaces such as :  am1

line 2:   title line up to 60 characters long:           blank

line 3:   second title or comment line up to 60 characters long:        blank

line 4 to end :

Atom Symbol   Dist      Opt      Angle   Opt      Dihed   Opt      NA      NB       NC

 

The atom symbol is the atomic symbol for that atom. The dist is the interatomic distance between the current atom and the atom number given in the NA column. Thus atom 2 is referenced to atom 1, and atom 7 (which is a hydrogen) is also referenced to atom 1. The Opt columns determine whether the preceding parameter (distance, angle or dihedral) will be optimized by Mopac, with a 1 to optimize and a 0 to not optimize. The angle is the angle formed by the current atom and the atoms NA and NB. Thus atom 4 is referenced to atom 3 for the distance and makes the angle of 120.005 with atoms 3 and 2 (angle 4-3-2). The dihedral angle is defined for atoms 4-3-2-1 to be 0.0 degrees.
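Because the geometry lines are free format, each one can be split on whitespace and the ten fields read in order. The sketch below is a minimal illustration: the function name and dictionary keys are hypothetical, and it handles only geometry lines (line 4 onward), not the keyword or title lines.

```python
def parse_zmatrix_line(line):
    """Read one internal-coordinate line into named fields."""
    tok = line.split()
    return {
        "symbol": tok[0],
        "dist": float(tok[1]), "opt_dist": int(tok[2]),
        "angle": float(tok[3]), "opt_angle": int(tok[4]),
        "dihed": float(tok[5]), "opt_dihed": int(tok[6]),
        "na": int(tok[7]), "nb": int(tok[8]), "nc": int(tok[9]),
    }

# Atom 4 of the benzene example above
atom4 = parse_zmatrix_line(
    "C     1.400132  1  120.005653    1      0.000000  1   3   2   1")
```

Here atom4 reports NA = 3, NB = 2 and NC = 1, matching the distance 4-3, the angle 4-3-2 and the dihedral 4-3-2-1 described in the text.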

 

            For a more complete description of Mopac files and options please see the MOPAC documentation which is available from QCPE or from Serena Software.

 

15.4 X-RAY Files

 

            Since there is no universally accepted file format for x‑ray data, we have written a free format file parser that should allow you to input the data in any format. Our routines expect:

 

Line 1:              Title                  character string of up to 80 characters

Line 2:              Cell Parameters            A,B,C,Alpha, Beta and Gamma 

Line 3 to end:   X,Y,Z coordinates and atom symbol. 

 

Note that these routines do not read symmetry matrices or point groups. Thus we assume the coordinates are orthogonal and that the cell parameters are floating point numbers (including the decimal point). If you understand this then you probably have the necessary programs to output this type of data. A sample x‑ray input file is included below. A set of dummy cell parameters to use with a set of Cartesian coordinates would be: 1.  1.  1.   90.  90.  90. (note that you need the decimal point; all the cell parameters are read as real numbers).

 

ui120ab.c

1.0 1.0 1.0 90.0 90.0 90.0

 0.124381   -0.144487   -0.435695    s

-1.526023    0.158658    0.185498    c

 0.818900    1.213381    0.125534    o

 0.638991   -1.256924    0.361156    o

-1.905993    1.082317   -0.230632    h

-2.152138   -0.671798   -0.115534    h

-1.491980    0.209408    1.267034    h

 1.053019    1.088258    1.043743    h
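The three-part layout of the x-ray file can be read with a short routine. This sketch is not the PCMODEL parser: the function name is hypothetical, and it assumes whitespace-separated fields, a non-blank title line, and blank lines that may be skipped.

```python
def parse_xray(text):
    """Return (title, cell parameters, atoms) from x-ray file text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]  # drop blank lines
    title = lines[0].strip()
    cell = [float(v) for v in lines[1].split()]
    if len(cell) != 6:
        raise ValueError("expected A, B, C, Alpha, Beta, Gamma")
    atoms = []
    for ln in lines[2:]:
        x, y, z, symbol = ln.split()
        atoms.append((symbol, float(x), float(y), float(z)))
    return title, cell, atoms

# First two atoms of the sample file above
sample = """ui120ab.c
1.0 1.0 1.0 90.0 90.0 90.0
 0.124381   -0.144487   -0.435695    s
-1.526023    0.158658    0.185498    c
"""
title, cell, atoms = parse_xray(sample)
```

The cell parameters are converted with float, so values written with or without a decimal point are accepted here, unlike the stricter requirement described above.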