Like the semi-empirical methods already discussed, the most common ab initio calculations are LCAO-based. That is, the MOs are written as linear combinations of the atomic basis set orbitals.
- However, ab initio methods offer a variety of basis sets of varying complexity
- All integrals are evaluated; no empirical data are included, nor is the "training set" approach of semi-empirical methods employed
- On the other hand, the exponents used in the mathematical representations of the atomic orbitals in the basis sets are optimized.
In principle, these basis sets should be made up of the Slater-type orbitals described in the discussion of semi-empirical methods. Most modern MO computational packages, however, use basis sets in which the Slater orbitals are replaced by sets of Gaussian functions that are computationally much easier to deal with.
- The difference between Slater orbitals and their Gaussian simulations is explained in the page on semi-empirical methods
- These basis sets have become more or less standard, allowing comparison of results generated with many different programs
- However, the GAMESS program uses some Gaussian exponents that differ from those in GAUSSIAN and SPARTAN; hence GAMESS output is not always comparable
Minimal Basis Sets
The simplest of these basis sets is that designated STO-3G, an acronym for Slater-Type-Orbitals simulated by 3 Gaussians added together.
- The coefficients of the Gaussian functions are adjusted to give as good a fit as possible to the Slater orbitals
- This fitting step is one reason why ab initio methods are not precisely what the name implies, namely calculations from first principles with no concessions to empirically derived parameters
- The STO-3G set is known as a minimal basis set, meaning that it has only as many orbitals as are needed to accommodate the electrons of the neutral atoms and retain spherical symmetry.
- Thus, STO-3G has only one basis function per hydrogen, five per atom from Li to Ne (1s, 2s, 2px, 2py, and 2pz), and nine for the second row elements Na - Ar (1s, 2s, 2px, 2py, 2pz, 3s, 3px, 3py, 3pz).
- Note the inclusion of the "core" orbitals, which are omitted in semi-empirical calculations.
- Only one best fit to a given type of Slater orbital is possible for a given number of Gaussian functions.
- Hence, all STO-3G basis sets for any row of the periodic table are the same, except for the exponents of the Gaussian functions.
- The exponents are expressed as scale factors, the squares of which are used as multipliers of the adjusted exponents in the original best-fit Gaussian functions.
- In this way, the ratios of exponents remain the same while the effective exponent of each orbital can be varied.
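The scaling described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used inside any of the programs: the exponents and coefficients are the commonly tabulated STO-3G fit to a 1s Slater orbital with exponent 1.0, the hydrogen scale factor of 1.24 is the value usually quoted, and the function names and the simple trapezoid integration are our own assumptions.

```python
import math

# Commonly tabulated STO-3G fit to a 1s Slater orbital with exponent zeta = 1.0
# (assumed values, quoted to six figures).
BASE_EXPONENTS = (0.109818, 0.405771, 2.22766)
COEFFICIENTS = (0.444635, 0.535328, 0.154329)


def scaled_exponents(zeta):
    """Multiply each best-fit exponent by the square of the scale factor."""
    return tuple(alpha * zeta ** 2 for alpha in BASE_EXPONENTS)


def slater_1s(r, zeta):
    """Normalized 1s Slater-type orbital."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)


def sto3g_1s(r, zeta):
    """Sum of three normalized s-type Gaussians with scaled exponents."""
    total = 0.0
    for coeff, alpha in zip(COEFFICIENTS, scaled_exponents(zeta)):
        norm = (2.0 * alpha / math.pi) ** 0.75
        total += coeff * norm * math.exp(-alpha * r * r)
    return total


def radial_overlap(f, g, r_max=30.0, steps=30000):
    """Trapezoid-rule estimate of the integral of f(r) g(r) 4 pi r^2 dr."""
    h = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(r) * g(r) * 4.0 * math.pi * r * r
    return total * h


zeta_h = 1.24  # scale factor usually quoted for hydrogen (assumption)


def slater(r):
    return slater_1s(r, zeta_h)


def gauss(r):
    return sto3g_1s(r, zeta_h)


print("scaled exponents:", scaled_exponents(zeta_h))
print("overlap with the Slater orbital:", radial_overlap(slater, gauss))
print("self-overlap of the fit:", radial_overlap(gauss, gauss))
```

Because every exponent is multiplied by the same factor, the overlap between the fit and the Slater orbital does not depend on the scale factor chosen; only the overall size of the orbital changes.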
The STO-3G basis set, and other minimal basis sets, usually do reasonably well at reproducing geometries of simple organic molecules.
- However, they overestimate the π-acceptor characteristics of electropositive elements
- They do not do at all well on energies, especially for small rings
- They fail badly for such things as carbocations and carbanions.
For these reasons, the STO-3G basis set is only rarely used.
Split-Valence Basis Sets
One problem with minimal basis sets is that they do not allow alteration of the basis orbitals in response to changing molecular environment. Consider a p-orbital on oxygen in an ether compared to that same ether when protonated.
- The additional nuclear charge in the protonated ether should result in contraction of the p-orbital to bring the electrons it contains closer to the nucleus than in the unprotonated species.
- The contraction should increase the electron-nuclear attraction, as well as increasing the electron-electron repulsion.
- The inability of a basis set like STO-3G to reflect these changes makes comparison between charged and uncharged species unreliable.
Anisotropic environments are another problem for minimal basis sets. The oxygen orbitals holding unshared electrons in the ether surely are more diffuse than the orbitals holding O-C bonding pairs.
- Minimal basis sets use the same AOs for both.
- Furthermore, describing an oxygen atom, having eight electrons, with the same number of basis functions as lithium (three electrons), is likely to lead to a poorer description of the oxygen atom.
The use of split valence basis sets is one way to respond to these problems.
- In these bases, the AOs are split into two parts: an inner, compact orbital and an outer, more diffuse one.
- The coefficients of these two kinds of orbitals can be varied independently during construction of the MOs.
- Thus the size of the AO can be varied between the limits set by the inner and outer functions (below).
- Split valence basis sets treat only the valence orbitals in this way; basis sets that also split the core orbitals are called double zeta, DZ (implying two different exponents).
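A small numerical sketch may help show how varying the two coefficients changes the size of the resulting orbital. For simplicity it uses two normalized s-type Gaussians rather than p-orbitals, and the exponents (1.0 for the inner function, 0.2 for the outer) are hypothetical values chosen only for illustration; the function names are likewise our own.

```python
import math


def overlap(alpha_a, alpha_b):
    """Overlap of two normalized s-type Gaussians on the same center."""
    return (2.0 * math.sqrt(alpha_a * alpha_b) / (alpha_a + alpha_b)) ** 1.5


def rms_radius(c_inner, c_outer, alpha_inner=1.0, alpha_outer=0.2):
    """Root-mean-square radius of c_inner * g_inner + c_outer * g_outer."""
    s = overlap(alpha_inner, alpha_outer)
    # Normalization of the mixed function.
    norm = c_inner ** 2 + c_outer ** 2 + 2.0 * c_inner * c_outer * s
    # Matrix elements of r^2: 3/(4 alpha) on the diagonal,
    # S * 3/(2 (alpha_a + alpha_b)) off the diagonal.
    r2 = (c_inner ** 2 * 3.0 / (4.0 * alpha_inner)
          + c_outer ** 2 * 3.0 / (4.0 * alpha_outer)
          + 2.0 * c_inner * c_outer * s * 3.0 / (2.0 * (alpha_inner + alpha_outer)))
    return math.sqrt(r2 / norm)


for c_in, c_out in [(1.0, 0.0), (0.8, 0.2), (0.5, 0.5), (0.2, 0.8), (0.0, 1.0)]:
    print(c_in, c_out, round(rms_radius(c_in, c_out), 3))
```

With both coefficients positive, the computed radius always falls between the values for the pure inner and pure outer functions, which is the behavior a split valence basis relies on.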
The simplest split valence basis set provided by SPARTAN is the 3-21G.
- This description (in the Pople notation) means that the core orbitals are represented by three Gaussians, whereas the inner and outer valence orbitals consist of two and one Gaussians, respectively.
- If we were to name bases consistently, of course, this one would be labeled STO-3-21G, but the STO is customarily omitted from all split valence descriptors.
- SPARTAN offers two other split valence bases: the 6-31G and the 6-311G.
- Both have six Gaussian cores
- The 6-311G is a triply split basis, with an inner orbital represented by three Gaussians, and middle and outer orbitals represented as single Gaussians.
- The triple split improves the description of the outer valence region.
- Both GAUSSIAN and GAMESS offer a more extensive choice of split valence basis sets than does SPARTAN.
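The Pople labels above can be decoded mechanically. The sketch below is our own illustration, not a routine from SPARTAN, GAUSSIAN, or GAMESS: the hypothetical helper parse_pople handles only plain labels such as 3-21G, 6-31G, and 6-311G, and the counting functions assume hydrogen (one valence s orbital) or a Li to Ne atom (a 1s core plus 2s and three 2p valence orbitals).

```python
import re


def parse_pople(name):
    """Split a label such as '6-311G' into (core Gaussians, valence Gaussians)."""
    match = re.fullmatch(r"(\d)-(\d+)G", name)
    if match is None:
        raise ValueError("unsupported basis set label: " + name)
    core = int(match.group(1))
    valence = tuple(int(digit) for digit in match.group(2))
    return core, valence


def first_row_counts(name):
    """(basis functions, Gaussian primitives) for one Li-Ne atom."""
    core, valence = parse_pople(name)
    functions = 1 + 4 * len(valence)    # 1s core, then split 2s, 2px, 2py, 2pz
    primitives = core + 4 * sum(valence)
    return functions, primitives


def hydrogen_counts(name):
    """(basis functions, Gaussian primitives) for one hydrogen atom."""
    _, valence = parse_pople(name)
    return len(valence), sum(valence)


for label in ("3-21G", "6-31G", "6-311G"):
    print(label, parse_pople(label), first_row_counts(label), hydrogen_counts(label))
```

For carbon, for example, this gives 9 basis functions for both 3-21G and 6-31G but 13 for 6-311G; the difference between 3-21G and 6-31G lies only in the number of Gaussians behind each function.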
Polarization
Further improvement of basis functions is achieved by adding d-orbitals to all heavy (non-hydrogen) atoms.
- For typical organic compounds these are not used in bond formation, unlike the d-orbitals of transition metals.
- They are used to allow a shift of the center of an orbital away from the position of the nucleus.
- For example, a p-orbital on carbon can be polarized away from the nucleus by mixing into it a d-orbital of lower symmetry (below).
- One obvious place where this can improve results is in the modeling of small rings; compounds of second-row elements also are more accurately described by the inclusion of polarization.
The presence of polarization functions is indicated in the Pople notation by appending an asterisk to the set designator.
- Thus, 3-21G* implies the previously described split valence basis with polarization added.
- Typically, six d-functions (x², y², z², xy, xz, and yz), equivalent to five d-orbitals and one s, are used (for computational convenience)
- Most programs can also use five real d-orbitals.
- An alternative description of this kind of basis is DZP: double zeta, polarization.
- A second asterisk, as in the 6-31G** basis set, implies the addition of a set of p-orbitals to each hydrogen to provide for their polarization.
- Again, an alternative notation exists: DZ2P (double zeta, 2 polarization).
- An asterisk in parentheses signals that polarization functions are added only to second-row elements. This is the standard setup for the 3-21G basis set in SPARTAN.
- Another alternative to the asterisk for specifying polarization functions is (d), placed after the G.
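To see what polarization functions add to the size of a calculation, the following sketch counts basis functions for a small molecule built on the 6-31G basis. It is a bookkeeping illustration only; the function name, its arguments, and the short element table are our own assumptions, and only hydrogen and a few Li to Ne elements are covered.

```python
# Basis functions per atom in plain 6-31G: two s functions on hydrogen,
# one core 1s plus split 2s and 2p (1 + 2 * 4 = 9) on a Li-Ne atom.
BASE_6_31G = {"H": 2, "C": 9, "N": 9, "O": 9, "F": 9}


def count_basis_functions(atoms, heavy_d=False, hydrogen_p=False, cartesian_d=True):
    """Count basis functions for a list of element symbols.

    heavy_d     -- first asterisk: a set of d functions on each non-hydrogen atom
    hydrogen_p  -- second asterisk: a set of three p functions on each hydrogen
    cartesian_d -- six Cartesian d functions if True, five real d-orbitals if False
    """
    d_size = 6 if cartesian_d else 5
    total = 0
    for symbol in atoms:
        total += BASE_6_31G[symbol]
        if symbol == "H":
            if hydrogen_p:
                total += 3
        elif heavy_d:
            total += d_size
    return total


water = ["O", "H", "H"]
print("6-31G  :", count_basis_functions(water))
print("6-31G* :", count_basis_functions(water, heavy_d=True))
print("6-31G**:", count_basis_functions(water, heavy_d=True, hydrogen_p=True))
print("6-31G* with five d-orbitals:",
      count_basis_functions(water, heavy_d=True, cartesian_d=False))
```

For water the count grows from 13 to 19 to 25 functions as the asterisks are added, and the choice between six and five d-functions changes the total by one per heavy atom.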
Diffuse Functions
To provide more accurate descriptions of anions, or neutral molecules with unshared pairs, basis sets may be augmented with so-called diffuse functions.
- These are intended to improve the basis set at large distances from the nuclei, thus better describing the barely bound electrons of anions.
- Processes that involve changes in the number of unshared pairs, such as protonation, are better modeled if diffuse functions are included.
- The augmentation takes the form of a single set of very diffuse (exponents from 0.1 to 0.01) s and p orbitals.
- The presence of diffuse functions is symbolized by the addition of a plus sign, +, to the basis set designator: 6-31+G. (Since these are s and p orbitals, the symbol goes before the G.)
- Again, a second + implies diffuse functions added to hydrogens; however, little improvement in results is noted for this addition unless the system under investigation includes hydride ions.
- All of our ab initio programs offer at least one set of diffuse functions
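The effect of a very small exponent can be made concrete by computing the root-mean-square radius of a single normalized s-type Gaussian, which is the square root of 3/(4α). The sketch below assumes exponents in atomic units (so radii come out in bohr); the exponent 0.5 used for comparison is a hypothetical "ordinary valence" value, not one taken from any particular basis set.

```python
import math


def gaussian_rms_radius(alpha):
    """RMS radius of a normalized s-type Gaussian exp(-alpha r^2)."""
    return math.sqrt(3.0 / (4.0 * alpha))


# 0.5 is a hypothetical valence exponent; 0.1 and 0.01 span the diffuse range.
for alpha in (0.5, 0.1, 0.01):
    print(alpha, round(gaussian_rms_radius(alpha), 2))
```

A function with exponent 0.01 extends several times farther from the nucleus than one with exponent 0.5, which is why such functions help describe loosely bound electrons.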
Still more extensive basis sets exist, and are described by more complicated notation. Consult pp. 151ff in the Jensen book in the references for a more detailed description.
Selection of Basis Sets
The choice of basis set for an ab initio calculation is almost always a compromise. In general, we would like to use the largest available basis set, with the most extended set of polarization and diffuse functions, and the maximum possible consideration of electron correlation (see the page on correlation).
Computer hardware (memory, disk storage, processor speed) and the inherent size of the calculation combine to force compromise. For constant hardware, for example, the time required for a calculation normally increases as the fourth power of the number of basis functions for small molecules, and decreases gradually toward the cube as the molecular size increases.
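A rough feel for this scaling can be had from a few lines of arithmetic. The sketch below is an estimate under stated assumptions, not a timing model for any program: it takes the fourth-power rule at face value and also counts the symmetry-unique two-electron integrals over N basis functions, a count that itself grows roughly as N to the fourth power divided by 8. The function names are our own, and the example sizes (7 and 19 functions) correspond to water in STO-3G and in 6-31G* with six d-functions.

```python
def unique_two_electron_integrals(n_basis):
    """Number of symmetry-unique two-electron integrals over n_basis functions."""
    pairs = n_basis * (n_basis + 1) // 2
    return pairs * (pairs + 1) // 2


def relative_cost(n_small, n_large, power=4.0):
    """Estimated cost ratio if time grows as (number of basis functions) ** power."""
    return (n_large / n_small) ** power


print(unique_two_electron_integrals(7), unique_two_electron_integrals(19))
print(round(relative_cost(7, 19), 1))             # fourth-power estimate
print(round(relative_cost(7, 19, power=3.0), 1))  # cubic limit for large systems
```

Even for a molecule as small as water, moving from a minimal to a polarized split valence basis raises the estimated cost by a factor of roughly fifty under the fourth-power assumption.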
Some other things to keep in mind when choosing a basis set:
- STO-3G calculations are no longer considered acceptable in publication-quality work;
- Correlated methods generally provide bond length improvement over simple Hartree-Fock (HF) methods, except for hypervalent molecules;
- HF and correlated models generally give comparable results for bond angles, regardless of basis set;
- All levels of HF and correlated methods describe well the energetics of isodesmic reactions, even with small basis sets. Errors for semi-empirical methods are much larger (isodesmic processes are reactions in which the numbers and types of bonding and non-bonding electron pairs are held constant);
- Inclusion of correlation is required to obtain acceptable homolytic bond dissociation energies.
These issues and others will be discussed further when we consider the broader question of which computational method (molecular mechanics, semi-empirical MO, or ab initio MO) is best suited for examining a particular kind of problem. It should be apparent, however, that one needs to work on defining one's problem in such a way as to get the best possible results with the resources at hand.
Program Limitations
The workstation version of SPARTAN imposes an arbitrary limit of 200 atoms or 2000 basis functions on ab initio calculations.
GAUSSIAN limitations are more complex:
- A Z-matrix is limited to a maximum of 1000 atoms (1200, including ghost and dummy atoms);
- For basis sets, the maximum number of primitive shells (essentially, the product of the number of Slater-type orbitals and the number of Gaussians used to represent each) is limited to 7500.
- GAUSSIAN limits are high enough that our hardware limitations are reached before program limits.
GAMESS limits include a maximum of 500 atoms and 5000 primitive shells. Again, one is likely to run out of memory or CPU before program limits are reached.
Finally, SPARTAN Help files provide a listing of the available basis sets. Tables of GAUSSIAN and GAMESS basis sets are available below:
This page last modified 1:51 PM on Tuesday May 23rd, 2006.
Webmaster, Department of Chemistry, University of Maine, Orono, ME 04469