Modeling the Structures and Electronic Properties of Uracil and Thymine in Gas Phase and Water

The molecular geometries and electronic properties of Uracil and Thymine are studied at the Restricted Hatree-Fock (RHF) and Density Functional Theory (DFT) levels of theory in gas phase and water. The optimized structures, dipole moments, total energies, charge transfer, quadrupole moments and other electronic properties of the two molecules were computed in gas and water. The two molecules showed higher polarity, stability and band gap in water. The average bond lengths of Thymine are greater than those of Uracil in gas. Thymine has significantly lower bond lengths than Uracil in solution. The average bond lengths of C-C and N-H are respectively the longest and shortest. Thymine molecule has lower total energies compared to Uracil at both levels of theory and in both media. The polarity of both molecules increases in solution. HOMO and LUMO energy values decrease slightly in solution for both molecules at both levels of theory. The Ionization Potential (IP) and Electron Affinity (EA) decreases for both molecules in solution. The chemical potential and electrophilicity of both molecules increase in solution. The chemical hardness of Uracil increases slightly in solution while that of Thymine decreases slightly in solution.


Introduction
The molecule Uracil is a common and naturally occurring pyrimidine derivative and one of the four nucleobases in the nucleic acid of RNA. It binds to adenine through two hydrogen bonds in RNA. The Uracil nucleobase is replaced by Thymine in DNA. Uracil can be generally considered as a demethylated form of Thymine (Faal et al, 2016). Comprehensive understanding of the geometries of the these nucleic acid bases (NABs) of DNA and RNA is fundamental to modeling and predicting their behavior in larger molecules, such as nucleosides,nucleotides and DNA itself. Much effort has been devoted to the spectroscopy of free NABs, where the fundamental molecular properties can be observed without the effects of solvation (Vitaliy et al, 2009). Also, Uracil is considered as an essential structure in drug discovery with vast topics of biological activities and synthetic accessibility. Systematic experimental and theoretical studies of nucleic acid bases and their derivatives are of considerable importance from the biological point of view (Yousra & Abdel-Mottaleb, 2016). Uracil and its derivatives are essential ingredients of soluble ribonucleic acid and are used in anti-carcinogenic drug synthesis against cancer and anti-HIV viruses (Singh, 2014). Antiviral and anti-tumor are the two most widely reported functions of Uracil and its derivatives (Beaula, Madhurambal & Kavitha, 2017;Johnson et al, 1996 ).
Many theoretical and experimental studies have been carried out on these two molecules to better understand their functionality and applicability. Infra Red and Raman spectra of uracil and thymine have been exclusively studied . The optimized molecular geometries, atomic polar tensor (APT) charges and vibrational characteristics have been studied using RHF and DFT methods. The Kekule ring stretching mode is found to be of comparatively higher frequency than the mode of Uracil due to the involvement of hydrogen bonding of methyl group (Singh, 2015). Johnson et al (1996) carried out full geometry optimizations at the MP2 level of theory using the 6-31G(d,p) basis set for Uracil, Thymine, Cytosine, Adenine, Guanine and three related bases. They studied several electric properties using the optimized geometries.
Local density approximation in DFT has been used to study the adsorption of Adenine, Thymine, and their radicals on the surfaces of metallic and semiconducting single-wall carbon nanotubes. The energies and equilibrium distances for several configurations were obtained. It was observed that the electronic structure of both metallic and semiconducting nanotubes is not changed significantly upon adsorption (Yaroslav, Lilia & Galina, 2007). Different electronic features of Uracil have also been studied using several methods. The combined IR and NBO study gave justification for the observed tautomeric preferences. The absolute predominance of the 2H-tautomer forms greatly changed when the substituent group possesses anionic character. Calculations in solution showed changes in the behavior of charged substituents (Faal et al, 2016).
New total scattering data in the halouracils and input from quantum chemical calculations has been used to describe the dipole bound and valence anion states in Thymine, Uracil and other DNA bases. (Scheer et al, 2004). A theoretical analysis of absorption spectra of Uracil, Thymine and Cytosine has been carried out. Structural dynamic models of these molecules in their electronically excited states has also been constructed. (Ten & Baranov, 2004). The molecular geometries, energy properties, H-bonding patterns, and electrostatic potential characters of Thymine and Uracil tetrads and the role of the potassium cation in the formation of the tetrads has been studied extensively at the B3LYP/6-311G(d,p) and HF/6-311G(d,p) levels of theory (Gu & Jerzy, 2001).
Among several experimental works carried out on these molecules, the thermodynamic potential for the abiotic synthesis of Uracil, Thymine and other common nucleobases has been studied. The study reveals that the thermodynamic drive to synthesize Uracil and Thymine varies considerably as a function of temperature (Douglas &Pierre, 2008). Also, different forms of electron binding of Uracil and Thymine with water clusters has been investigated (Schiedt et al, 1998) Though much extensive computational work has been carried out on Thymine and Uracil (Mei et al, 2010;Bipul& Medhi, 2016;Oleg, Leonid & Jerzy, 2000), no single study has carried out a comparative analysis of the basic geometrical and electronic properties of these two nucleobases in gas phase and water. In this study, Uracil and Thymine were optimized at RHF and DFT levels of theory to see the effect of the environment ( gas and water) on their molecular and electronic properties. All calculations in this work were carried out using the Restricted Hartree Fock (RHF) and Density Functional Theory (DFT) with Becke's three-parameter exchange functional along with the Lee-Yang-Parr non-local correlation functional (B3LYP). The standard 6-31G(d,p) basis set was used.

Method
Geometry optimization was done by locating both the minima and transition states on the potential surface of the molecular orbital. The molecular structures can be optimized in Cartesian coordinates that are generated automatically from the input Cartesian coordinates. The optimization process also handles fixed constraints on distances, bond angles and dihedral angles in Cartesian or (where appropriate) internal coordinates. The process is iterative; with repeated calculations of energies, gradients and calculations or estimations of Hessian in every optimization cycle until convergence is attained. One of the most computationally demanding aspects of calculating free energy using electronic structure theory is the calculation of vibrational energy and entropy contributions. The computational expense is incurred by the calculation of the matrix of second energy derivatives (i.e., the Hessian or force constant matrix), which yields harmonic vibrational frequencies upon diagonalization (Geh et al, 2015).
Geometry optimizations usually attempt to locate minima on the potential energy surface, thereby predicting equilibrium structures of molecular systems. At the minima, the first derivative of the energy (gradient) is zero. Since the gradient is the negative of the forces, the forces are also zero at such a point (stationary point). In Gaussian, a geometry optimization begins at the molecular structure specified at the input and steps along the potential energy surface. It computes the energy and the gradient at that point and determines which direction to make the next step. The gradient indicates the direction along the surface in which the energy decreases most rapidly from the current point as well as the steepness of that slope. Atoms in the molecule are numbered according to their order in the molecule specification section of the input (Becke, 1993).
The molecular structures and geometries of Uracil and Thymine are completely optimized using ab initio quantum mechanical calculations. Restricted Hartree Fock Theory (RHF) and Density Functional Theory (DFT) computational methods are used. Geometry optimizations are performed at RHF and B3LYP levels of theory using 6-31G(d,p) basis set. DFT is a cost effective method for inclusion of electron correlations. Atomic charges were calculated using the Mulliken analysis. All calculations were performed using Gaussian G09W, 2009 program package (Frisch et al, 2009).
Gas phase predictions are inadequate in describing the behavior of molecules in solution. The Self-Consistent Reaction Field (SCRF) method called the Onsager reaction field model is used to model the molecules in water. In this method, the solute occupies a fixed spherical cavity within the solvent field. A dipole in the molecule induces a dipole in the medium and the electric field applied by the solvent dipole in turn interacts with the molecular dipole, leading to stabilization (Gurku & Ndikilar, 2012).
The Energies of the Highest Occupied Molecular Orbital (HOMO) and Lowest Unoccupied Molecular Orbital (LUMO) play a major role in governing chemical reactions and electronic behavior of molecules (Galadanci et al, 2015). The HOMO and LUMO energies are identified from the output file by finding the point where the occupied/virtual code letter in the symmetry designation changes from O to V.
Koopman's theorem states that if the single particle energies are not affected by adding or removing a single electron, then the ionization energy(IP) and the electron affinity(EA) are respectively the negative energies of HOMO and LUMO (Muhammad, Taura & Ndikilar, 2015). Thus, (1) Ionization energy is the minimum energy required to remove an electron from the atom in gaseous phase. Electron affinity is the energy released upon attachment of an electron to an atom or molecule resulting in the formation of the negative ion. The energy band gap or global hardness (resistance to charge transfer) E is the energy difference between HOMO and LUMO (2) The chemical potential, , which measures the escaping tendency of an electron from equilibrium is given as (3) where is the molecular electro-negativity (the power of an atom in a molecule to attract electrons to itself). The chemical hardness for insulators and semi conductors is given as (4) The inverse of chemical hardness is softness which indicates the magnitude of electron transfer to/from a molecule when chemical potential changes (Muhammad, Taura & Ndikilar, 2015) The electrophilicity index is a measures of the stabilization in energy when the system acquires an additional electronic charge from the environment. In other words, it can be defined as a measure of energy lowering due to maximal electron flow between donor and acceptor. It is given as (5) These properties were computed for Uracil and Thymine in water and gas phase at both levels of theory.

Optimized Molecular Structures
The optimized structures of the Uracil and Thymine are shown in Fig. 1. Thymine has a methyl group attached to one of the Carbon atoms in the benzene ring ( Fig.1) and thus has respectively one and two Carbon and Hydrogen atoms more than Uracil. The structures clearly show the atom numbers, symbols and types of bonds between the atoms.
--   Table 1b for Uracil show that all the bond angles are in the range 113-126 degrees. It is observed from Table 1a that the bond lengths of C2-C3, C1-N12, C4-H8 decrease slightly in solution at both levels of theory, while the remaining bonds experience a slight increase in their bond lengths. The bonds C2-O10 and C2-O9 experience a slightly significant increase in solution due to the high electro-negativity of the Oxygen atoms involved and the Hydrogen bonding in water. The results at B3LYP level of theory agree significantly with experimental results and those obtained using the basis set 6-31G++G(d,p). The Hydrogen-Nitrogen bond has the smallest bond length and the C-C bonds have the highest values. This agrees with results obtained by Faal et al, 2016 andYousra &Abdel-Mottaleb, 2016. This shows that N-H bonds are the strongest and C-C bonds are the weakest. The bond lengths both in gas phase and water are slightly higher at DFT level of theory than at RHF. This shows that electron correlation depicts that the molecule occupies more space as compared to RHF results (Gurku & Ndikilar, 2012). Table 1a have almost the same values at both levels of theory in both gas and water. However, all bond angles are slightly altered in solution. The values at B3LYP level of theory agree with experimental values (Faal et al, 2016) and those obtained using other basis sets (Yousra &Abdel-Mottaleb, 2016). The N11-C1-N12 and C1-N11-C2 bond angles are the smallest and largest respectively. This indicates that a Carbon atom surrounded by Nitrogen atoms results in smaller bond angles as opposed to a Nitrogen atom surrounded by Carbon atoms. Table 2a and 2b shows the optimized bond lengths and bond angles of Thymine in gas and water respectively. The double bond C1-C4 is shorter and stronger than the single bond C1-C6. The C-O bonds are almost of the same length for both levels of theory.   Table 2a depicts a similar trend to that obtained for Uracil with the N-H and C-C bonds having the least and highest bond lengths respectively. The bond angles behave in a similar manner with some slight changes in solution at both levels of theory.

The bond angles shown in
A comparative analysis of the average bond lengths of the two molecules in gas phase and solution is shown in Table 3.  Table 3 shows that on an average basis, the bond lengths of Thymine are greater than those of Uracil in gas with the exception of the N-H bond. A contrary observation is seen in solution as the C-C, C-N and C-H bonds of Thymine are lower than those of Uracil. This shows that the effect of solvation is higher in Thymine compared to Uracil. In both media and for both molecules, the average bond lengths of C-C and N-H are respectively the longest and shortest. This is in agreement with experimental results (Faal et al, 2016).

Total Energies
The total energies are predicted by single point calculations.  It is observed from Table 4 that total energy decreases as we move from gas to water for both molecules at both levels of theory. Thus, the molecules become more stable in water than in gas phase. Also, the total energy values at RHF/6-31G(d,p) level of theory are larger compared to those at B3LYP/6-31G(d,p) in both media. The inclusion of electron correlation in the computational method therefore gives lower values of energy. The total energy of Uracil in gas phase at B3LYP/6-31G(d,p) agrees with those computed at MP2/6-311++G(d,p), B3LYP/6-311++G(d,p) and B3LYP/Aug-CC-pVDZ (Faal et al, 2016). Thymine molecule has lower energies compared to Uracil at both levels of theory and in both media. Thus, Thymine is more stable than Uracil both in gas and in water.

Total Dipole Moments
The dipole moment is the first derivative of the energy with respect to an applied electric field . It is a measure of the asymmetry in the molecular charge distribution and is given as a vector in three dimensions. The total dipole moments of the molecules at both levels of theory in gas and water is shown in Table 5 Table 5 Table 5 shows that both molecules are polar and the strength of the polarity increases in water. Hydrogen bonding in water accounts for the increase in polarity. RHF values are slightly higher than the B3LYP values and thus electron correlation accounts for the slightly lower values at DFT level of theory (Gurku & Ndikilar, 2012). The total dipole moments in gas phase agree with experimental results which varies between 4.4 to 4.6 Debye (Schiedt et al, 1998). The dipole moments of the two molecules are higher in water and hence they can easily bind to an electron by the dipole field in water compared to gas. This has little effect on the molecular core of the molecules (Liu et al, 2013). Table 6 shows the electronic properties and reactivity indices of the two molecules in gas and water at both levels of theory.  Vol. 11, No.12;  It can be seen from Table 6 that the HOMO and LUMO energy values decrease slightly in solution for both molecules at both levels of theory. This indicates that the Ionization Potential (IP) and Electron Affinity (EA) decreases for both molecules in solution. Thus, it is easier to add or remove an electron from these molecules in gas phase compared to water. However, the predicted values at RHF level of theory are significantly higher compared to the B3LYP values. This indicates that the non-inclusion of electron correlation in the RHF method affects the LUMO and HOMO energies and consequently all other electronic properties. The HOMO, LUMO and energy band gap values agree with those obtained using Spartan 14 software at B3LYP/6-31+G(d) level of theory (Yousra & Abdel-Mottaleb, 2016). At B3LYP/6-31G(d,p) level of theory, the chemical potential and electrophilicity of both molecules increase in solution. The chemical hardness of Uracil increases slightly in solution while that of Thymine decreases slightly in solution.

Quadrupole Moments
Quadrupole moments provide a second order approximation of the total electron distribution, providing at least a crude idea of its shape. One of the components being significantly larger than the others would represent an elongation of the sphere along that axis. If present, the off-axis components represent trans-axial distortion (stretching or compressing of the ellipsoid) (Galadanci et al, 2015). Table 7 shows the Quadrupole moments computed for the studied molecules at both RHF and B3LYP levels of theory in gas and water using the 6-31G(d,p) basis set.  Table 7, we deduce that Uracil is slightly elongated along the XX axis and its least elongation is along the YY axis. However, in solution, the elongation of Uracil along XX increases at both levels of theory. As for Thymine, the molecule is slightly elongated along the YY axis in both media and the elongation increases in solution.

Electrostatic Potential Derived Charges
Tables 8 and 9 respectively show the electrostatic potential derived charges using the CHelpG scheme of Breneman at different atomic positions of Uracil and Thymine at both levels of theory.  It can be deduced that the bulk positive charge in the Uracil molecule resides in three carbon atoms C1, C2, C3 and the four Hydrogen atoms H5, H6, H7, H8. The negative charge resides mainly with Nitrogen and Oxygen atoms. This is due to the high electro-negativity of O and N atoms in the molecule. In solution, the negative charges of the N atoms increases while that of the O atoms decreases. A similar trend is observed for Thymine in Table 9. This agrees with the Mulliken charges of the individual atoms (Yousra & Abdel-Mottaleb, 2016).

Conclusion
The two molecules are predicted to be more stable in water at both levels of theory. The polarity of the molecules equally increases in solution, indicating the formation of hydrogen bonds between the molecules and water molecules. The quadrupole moments predict that Uracil and Thymine molecules are slightly elongated along the XX and YY axes in both gas and water respectively. There is a sharp distinction of the electronic properties of the two molecules between the two levels of theory. RHF/6-31G(d,p) predicts higher values for Ionization Potential, Electron Affinity, band gap, chemical potential, chemical hardness and electrophilicity. This indicates that the neglect of electron correlation in the RHF method affects the predicted electronic properties and reactivity indices. This can be explained by the fact that, molecules and atoms go into chemical and electronic activity via their electron configuration; especially the valence electrons. The bulk positive charges reside mainly with the Carbon atoms in both molecules. On the other hand, the highly electronegative Oxygen and Nitrogen atoms carry the bulk negative charge. The results obtained for the optimized structural properties, dipole moments and some electronic properties agree satisfactorily with those obtained from other computational methods and experimental values where available.