Monomer File Preparation
===========================

Here, we describe how to prepare a reasonable monomer file step by step.

Coordinate
-------------

Given a monomer like the one shown below, many tools in the computational chemistry community, such as Avogadro or GaussView, can build its 3D structure. The structure can then be optimized at some quantum chemical level of theory, such as xTB or B3LYP-D3/def2-SVP. The optimized structure should be saved in XYZ format, like ``monomer.xyz`` shown below.

.. figure:: _static/p2.png
   :align: center

.. code-block:: :caption: monomer.xyz :linenos: 45 monomer C -0.00000000 0.00000000 0.00000000 C -0.99204361 0.67989624 -0.94006954 C -1.62513076 -0.10642738 -1.90138442 C -1.24520505 2.04367646 -0.93242437 C -2.50784698 0.45009463 -2.80643536 H -1.41636178 -1.16636581 -1.94027187 C -2.12863118 2.60280511 -1.84215854 H -0.74872619 2.68691666 -0.22165626 C -2.76733728 1.81056951 -2.77756466 H -2.99072901 -0.17871632 -3.54031416 H -2.31214042 3.66730700 -1.81823226 H -3.45545629 2.24898711 -3.48516956 C 0.67989624 0.99204361 0.94006954 C 2.04367646 1.24520505 0.93242437 C -0.10642738 1.62513076 1.90138442 C 2.60280511 2.12863118 1.84215854 H 2.68691666 0.74872619 0.22165626 C 0.45009463 2.50784698 2.80643536 H -1.16636581 1.41636178 1.94027187 C 1.81056951 2.76733728 2.77756466 H 3.66730700 2.31214042 1.81823226 H -0.17871632 2.99072901 3.54031416 H 2.24898711 3.45545629 3.48516956 C 0.99204361 -0.67989624 -0.94006954 C 1.24520505 -2.04367646 -0.93242437 C 1.62513076 0.10642738 -1.90138442 C 2.12863118 -2.60280511 -1.84215854 H 0.74872619 -2.68691666 -0.22165626 C 2.50784698 -0.45009463 -2.80643536 H 1.41636178 1.16636581 -1.94027187 C 2.76733728 -1.81056951 -2.77756466 H 2.31214042 -3.66730700 -1.81823226 H 2.99072901 0.17871632 -3.54031416 H 3.45545629 -2.24898711 -3.48516956 C -0.67989624 -0.99204361 0.94006954 C 0.10642738 -1.62513076 1.90138442 C -2.04367646 -1.24520505 0.93242437 C -0.45009463 
-2.50784698 2.80643536 H 1.16636581 -1.41636178 1.94027187 C -2.60280511 -2.12863118 1.84215854 H -2.68691666 -0.74872619 0.22165626 C -1.81056951 -2.76733728 2.77756466 H 0.17871632 -2.99072901 3.54031416 H -3.66730700 -2.31214042 1.81823226 H -2.24898711 -3.45545629 3.48516956 Connectivity ------------- When we have obtained coordinates, the easiest way to generate remaining information is to use the tool ``topgen``, which is a part of `ABCluster `_ suite of software. For user's convenience, we have already included it in the distribution of ABGrow. Just run the following command: .. code-block:: bash $ topgen monomer.xyz You will get the following output: .. code-block:: bash Analyze topology for the molecule from "monomer.xyz" ... done. Exporting results to "monomer" + .gjf: GJF file with bonding information. -cycles.txt: Containing cycle texts that can be used for ABCluster:geom calculation. -bonding.xyz: XYZ file with bonding information for ABCluster:geom calculation. -rigid.xyz: XYZ file with bonding information for ABCluster:rigidmol calculation. .psf: PSF file with topology information for ABCluster:geom/NAMD calculations. -abgrow.xyz: XYZ file with much information for ABGrow calculation. In "monomer-rigid.xyz", "monomer-abgrow.xyz", and "monomer.psf": "X" means unknown atomic type. Total charge: -0.3900 ----------------------------------------------------------- | You can adjust charges to meet the target total charge, | | or re-fit RESP charges using, e.g., Multiwfn. | ----------------------------------------------------------- Atoms are typed using graph representation learning. Please cite: Zhang, J. Atom Typing Using Graph Representation Learning: How Do Models Learn Chemistry? J. Chem. Phys. 2022, 156, 204108 .. attention:: If you used ``topgen``, please cite the following paper: - Zhang, J. `Atom Typing Using Graph Representation Learning: How Do Models Learn Chemistry? `_ *J. Chem. Phys.* **2022**, *156*, 204108. 
Although ``topgen`` generates a lot of files, the only one useful to us is ``monomer-abgrow.xyz``: .. code-block:: :caption: monomer-abgrow.xyz :linenos: 45 Generated by ABCluster C -0.00000000 0.00000000 0.00000000 C -0.99204361 0.67989624 -0.94006954 C -1.62513076 -0.10642738 -1.90138442 C -1.24520505 2.04367646 -0.93242437 C -2.50784698 0.45009463 -2.80643536 H -1.41636178 -1.16636581 -1.94027187 C -2.12863118 2.60280511 -1.84215854 H -0.74872619 2.68691666 -0.22165626 C -2.76733728 1.81056951 -2.77756466 H -2.99072901 -0.17871632 -3.54031416 H -2.31214042 3.66730700 -1.81823226 H -3.45545629 2.24898711 -3.48516956 C 0.67989624 0.99204361 0.94006954 C 2.04367646 1.24520505 0.93242437 C -0.10642738 1.62513076 1.90138442 C 2.60280511 2.12863118 1.84215854 H 2.68691666 0.74872619 0.22165626 C 0.45009463 2.50784698 2.80643536 H -1.16636581 1.41636178 1.94027187 C 1.81056951 2.76733728 2.77756466 H 3.66730700 2.31214042 1.81823226 H -0.17871632 2.99072901 3.54031416 H 2.24898711 3.45545629 3.48516956 C 0.99204361 -0.67989624 -0.94006954 C 1.24520505 -2.04367646 -0.93242437 C 1.62513076 0.10642738 -1.90138442 C 2.12863118 -2.60280511 -1.84215854 H 0.74872619 -2.68691666 -0.22165626 C 2.50784698 -0.45009463 -2.80643536 H 1.41636178 1.16636581 -1.94027187 C 2.76733728 -1.81056951 -2.77756466 H 2.31214042 -3.66730700 -1.81823226 H 2.99072901 0.17871632 -3.54031416 H 3.45545629 -2.24898711 -3.48516956 C -0.67989624 -0.99204361 0.94006954 C 0.10642738 -1.62513076 1.90138442 C -2.04367646 -1.24520505 0.93242437 C -0.45009463 -2.50784698 2.80643536 H 1.16636581 -1.41636178 1.94027187 C -2.60280511 -2.12863118 1.84215854 H -2.68691666 -0.74872619 0.22165626 C -1.81056951 -2.76733728 2.77756466 H 0.17871632 -2.99072901 3.54031416 H -3.66730700 -2.31214042 1.81823226 H -2.24898711 -3.45545629 3.48516956 1 2 1.0 13 1.0 24 1.0 35 1.0 2 3 1.0 4 2.0 3 5 2.0 6 1.0 4 7 2.0 8 1.0 5 9 2.0 10 1.0 6 7 9 2.0 11 1.0 8 9 12 1.0 10 11 12 13 14 2.0 15 1.0 14 16 2.0 17 1.0 15 18 2.0 19 1.0 16 
20 2.0 21 1.0 17 18 20 2.0 22 1.0 19 20 23 1.0 21 22 23 24 25 2.0 26 1.0 25 27 2.0 28 1.0 26 29 2.0 30 1.0 27 31 2.0 32 1.0 28 29 31 2.0 33 1.0 30 31 34 1.0 32 33 34 35 36 1.0 37 2.0 36 38 2.0 39 1.0 37 40 2.0 41 1.0 38 42 2.0 43 1.0 39 40 42 2.0 44 1.0 41 42 45 1.0 43 44 45 << Add Reactive Sites by User >> CG2RC7 0.0700 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 HGR61 0.1150 HGR61 0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 HGR61 0.1150 HGR61 0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 HGR61 0.1150 HGR61 0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 HGR61 0.1150 HGR61 0.1150

Lines 49-93 contain the connectivity information in the Gaussian job file (gjf) format, which was explained in :doc:`input`. In most cases, the connectivity is reliable. Bond orders do not matter as long as they are greater than 0. For example, GaussView may generate a file in which the bond between atoms 35 and 37 has an order of 1.5, which differs from the ``topgen`` result: line 83 gives a bond order of ``2.0``. This difference will not affect the following simulation.

Although ``monomer-abgrow.xyz`` already contains much information, we still have to add or check the following information.

Reactive Sites
------------------

Reactive sites are the atoms that react during the formation of amorphous materials. At each reactive site, one atom forms a new bond and one atom leaves, like the one shown below:

.. figure:: _static/p3.png
   :align: center

Each reactive site is defined in the following manner:

.. centered:: ``bonding_atom_index leaving_atom_index 0 0``

where the two ``0`` fields are reserved for future use. Since there are 4 reactive sites (see below)

.. 
figure:: _static/p4.png :align: center we add them and arrive at the following monomer file: .. code-block:: :caption: monomer-abgrow.xyz :linenos: 45 Generated by ABCluster C -0.00000000 0.00000000 0.00000000 C -0.99204361 0.67989624 -0.94006954 C -1.62513076 -0.10642738 -1.90138442 C -1.24520505 2.04367646 -0.93242437 C -2.50784698 0.45009463 -2.80643536 H -1.41636178 -1.16636581 -1.94027187 C -2.12863118 2.60280511 -1.84215854 H -0.74872619 2.68691666 -0.22165626 C -2.76733728 1.81056951 -2.77756466 H -2.99072901 -0.17871632 -3.54031416 H -2.31214042 3.66730700 -1.81823226 H -3.45545629 2.24898711 -3.48516956 C 0.67989624 0.99204361 0.94006954 C 2.04367646 1.24520505 0.93242437 C -0.10642738 1.62513076 1.90138442 C 2.60280511 2.12863118 1.84215854 H 2.68691666 0.74872619 0.22165626 C 0.45009463 2.50784698 2.80643536 H -1.16636581 1.41636178 1.94027187 C 1.81056951 2.76733728 2.77756466 H 3.66730700 2.31214042 1.81823226 H -0.17871632 2.99072901 3.54031416 H 2.24898711 3.45545629 3.48516956 C 0.99204361 -0.67989624 -0.94006954 C 1.24520505 -2.04367646 -0.93242437 C 1.62513076 0.10642738 -1.90138442 C 2.12863118 -2.60280511 -1.84215854 H 0.74872619 -2.68691666 -0.22165626 C 2.50784698 -0.45009463 -2.80643536 H 1.41636178 1.16636581 -1.94027187 C 2.76733728 -1.81056951 -2.77756466 H 2.31214042 -3.66730700 -1.81823226 H 2.99072901 0.17871632 -3.54031416 H 3.45545629 -2.24898711 -3.48516956 C -0.67989624 -0.99204361 0.94006954 C 0.10642738 -1.62513076 1.90138442 C -2.04367646 -1.24520505 0.93242437 C -0.45009463 -2.50784698 2.80643536 H 1.16636581 -1.41636178 1.94027187 C -2.60280511 -2.12863118 1.84215854 H -2.68691666 -0.74872619 0.22165626 C -1.81056951 -2.76733728 2.77756466 H 0.17871632 -2.99072901 3.54031416 H -3.66730700 -2.31214042 1.81823226 H -2.24898711 -3.45545629 3.48516956 1 2 1.0 13 1.0 24 1.0 35 1.0 2 3 1.0 4 2.0 3 5 2.0 6 1.0 4 7 2.0 8 1.0 5 9 2.0 10 1.0 6 7 9 2.0 11 1.0 8 9 12 1.0 10 11 12 13 14 2.0 15 1.0 14 16 2.0 17 1.0 15 18 2.0 19 1.0 16 20 2.0 
21 1.0 17 18 20 2.0 22 1.0 19 20 23 1.0 21 22 23 24 25 2.0 26 1.0 25 27 2.0 28 1.0 26 29 2.0 30 1.0 27 31 2.0 32 1.0 28 29 31 2.0 33 1.0 30 31 34 1.0 32 33 34 35 36 1.0 37 2.0 36 38 2.0 39 1.0 37 40 2.0 41 1.0 38 42 2.0 43 1.0 39 40 42 2.0 44 1.0 41 42 45 1.0 43 44 45 9 12 0 0 20 23 0 0 31 34 0 0 42 45 0 0 CG2RC7 0.0700 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 HGR61 0.1150 HGR61 0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 HGR61 0.1150 HGR61 0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 HGR61 0.1150 HGR61 0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 CG2R61 -0.1150 HGR61 0.1150 HGR61 0.1150 HGR61 0.1150

Force Field Information
--------------------------

The last part of the monomer input file is the force field information. For each atom, its "atom type" and "charge" are given:

.. code-block::

   ...
   CG2R61 -0.1150
   CG2R61 -0.1150
   HGR61 0.1150
   CG2R61 -0.1150
   ...

Unfortunately, it is not possible to get 100% correct force field information automatically, so manual adjustment and calculation are still needed.

Atom Type
++++++++++++++

In the current version of ABGrow, each atom is given an "atom type" as defined in CGenFF, i.e. the following part of ``par_all36_cgenff.prm``:

.. code-block:: :caption: par_all36_cgenff.prm :linenos: ATOMS !hydrogens MASS -1 HGA1 1.00800 ! alphatic proton, CH MASS -1 HGA2 1.00800 ! alphatic proton, CH2 MASS -1 HGA3 1.00800 ! alphatic proton, CH3 MASS -1 HGA4 1.00800 ! alkene proton; RHC= MASS -1 HGA5 1.00800 ! alkene proton; H2C=CR MASS -1 HGA6 1.00800 ! aliphatic H on fluorinated C, monofluoro MASS -1 HGA7 1.00800 ! 
aliphatic H on fluorinated C, difluoro MASS -1 HGAAM0 1.00800 ! aliphatic H, NEUTRAL trimethylamine (#) MASS -1 HGAAM1 1.00800 ! aliphatic H, NEUTRAL dimethylamine (#) MASS -1 HGAAM2 1.00800 ! aliphatic H, NEUTRAL methylamine (#) !(#) EXTREME care is required when doing atom typing on compounds that look like this. Use ONLY !on NEUTRAL METHYLAMINE groups, NOT Schiff Bases, but DO use on 2 out of 3 guanidine nitrogens MASS -1 HGP1 1.00800 ! polar H MASS -1 HGP2 1.00800 ! polar H, +ve charge MASS -1 HGP3 1.00800 ! polar H, thiol MASS -1 HGP4 1.00800 ! polar H, neutral conjugated -NH2 group (NA bases) MASS -1 HGP5 1.00800 ! polar H on quarternary ammonium salt (choline) MASS -1 HGPAM1 1.00800 ! polar H, NEUTRAL dimethylamine (#), terminal alkyne H MASS -1 HGPAM2 1.00800 ! polar H, NEUTRAL methylamine (#) MASS -1 HGPAM3 1.00800 ! polar H, NEUTRAL ammonia (#) !(#) EXTREME care is required when doing atom typing on compounds that look like this. Use ONLY !on NEUTRAL METHYLAMINE groups, NOT Schiff Bases, but DO use on 2 out of 3 guanidine nitrogens MASS -1 HGR51 1.00800 ! nonpolar H, neutral 5-mem planar ring C, LJ based on benzene MASS -1 HGR52 1.00800 ! Aldehyde H, formamide H (RCOH); nonpolar H, neutral 5-mem planar ring C adjacent to heteroatom or + charge MASS -1 HGR53 1.00800 ! nonpolar H, +ve charge HIS he1(+1) MASS -1 HGR61 1.00800 ! aromatic H MASS -1 HGR62 1.00800 ! nonpolar H, neutral 6-mem planar ring C adjacent to heteroatom MASS -1 HGR63 1.00800 ! nonpolar H, NAD+ nicotineamide all ring CH hydrogens MASS -1 HGR71 1.00800 ! nonpolar H, neutral 7-mem arom ring, AZUL, azulene, kevo !carbons MASS -1 CG1T1 12.01100 ! internal alkyne R-C#C MASS -1 CG1T2 12.01100 ! terminal alkyne H-C#C MASS -1 CG1N1 12.01100 ! C for cyano group MASS -1 CG2D1 12.01100 ! alkene; RHC= ; imine C MASS -1 CG2D2 12.01100 ! alkene; H2C= MASS -1 CG2D1O 12.01100 ! double bond C adjacent to heteroatom. In conjugated systems, the atom to which it is double bonded must be CG2DC1. 
MASS -1 CG2D2O 12.01100 ! double bond C adjacent to heteroatom. In conjugated systems, the atom to which it is double bonded must be CG2DC2. MASS -1 CG2DC1 12.01100 ! conjugated alkenes, R2C=CR2 MASS -1 CG2DC2 12.01100 ! conjugated alkenes, R2C=CR2 MASS -1 CG2DC3 12.01100 ! conjugated alkenes, H2C= MASS -1 CG2N1 12.01100 ! conjugated C in guanidine/guanidinium MASS -1 CG2N2 12.01100 ! conjugated C in amidinium cation MASS -1 CG2O1 12.01100 ! carbonyl C: amides MASS -1 CG2O2 12.01100 ! carbonyl C: esters, [neutral] carboxylic acids MASS -1 CG2O3 12.01100 ! carbonyl C: [negative] carboxylates MASS -1 CG2O4 12.01100 ! carbonyl C: aldehydes MASS -1 CG2O5 12.01100 ! carbonyl C: ketones MASS -1 CG2O6 12.01100 ! carbonyl C: urea, carbonate MASS -1 CG2O7 12.01100 ! CO2 carbon MASS -1 CG2R51 12.01100 ! 5-mem ring, his CG, CD2(0), trp MASS -1 CG2R52 12.01100 ! 5-mem ring, double bound to N, PYRZ, pyrazole MASS -1 CG2R53 12.01100 ! 5-mem ring, double bound to N and adjacent to another heteroatom, purine C8, his CE1 (0,+1), 2PDO, kevo MASS -1 CG2R57 12.01100 ! 5-mem ring, bipyrroles MASS -1 CG25C1 12.01100 ! same as CG2DC1 but in 5-membered ring with exocyclic double bond MASS -1 CG25C2 12.01100 ! same as CG2DC2 but in 5-membered ring with exocyclic double bond MASS -1 CG251O 12.01100 ! same as CG2D1O but in 5-membered ring with exocyclic double bond MASS -1 CG252O 12.01100 ! same as CG2D2O but in 5-membered ring with exocyclic double bond MASS -1 CG2R61 12.01100 ! 6-mem aromatic C MASS -1 CG2R62 12.01100 ! 6-mem aromatic C for protonated pyridine (NIC) and rings containing carbonyls (see CG2R63) (NA) MASS -1 CG2R63 12.01100 ! 6-mem aromatic amide carbon (NA) (and other 6-mem aromatic carbonyls?) MASS -1 CG2R64 12.01100 ! 6-mem aromatic amidine and guanidine carbon (between 2 or 3 Ns and double-bound to one of them), NA, PYRM MASS -1 CG2R66 12.01100 ! 6-mem aromatic carbon bound to F MASS -1 CG2R67 12.01100 ! 6-mem aromatic carbon of biphenyl MASS -1 CG2RC0 12.01100 ! 
6/5-mem ring bridging C, guanine C4,C5, trp MASS -1 CG2R71 12.01100 ! 7-mem ring arom C, AZUL, azulene, kevo MASS -1 CG2RC7 12.01100 ! sp2 ring connection with single bond(!), AZUL, azulene, kevo MASS -1 CG301 12.01100 ! aliphatic C, no hydrogens, neopentane MASS -1 CG302 12.01100 ! aliphatic C, no hydrogens, trifluoromethyl MASS -1 CG311 12.01100 ! aliphatic C with 1 H, CH MASS -1 CG312 12.01100 ! aliphatic C with 1 H, difluoromethyl MASS -1 CG314 12.01100 ! aliphatic C with 1 H, adjacent to positive N (PROT NTER) (+) MASS -1 CG321 12.01100 ! aliphatic C for CH2 MASS -1 CG322 12.01100 ! aliphatic C for CH2, monofluoromethyl MASS -1 CG323 12.01100 ! aliphatic C for CH2, thiolate carbon MASS -1 CG324 12.01100 ! aliphatic C for CH2, adjacent to positive N (piperidine) (+) MASS -1 CG331 12.01100 ! aliphatic C for methyl group (-CH3) MASS -1 CG334 12.01100 ! aliphatic C for methyl group (-CH3), adjacent to positive N (PROT NTER) (+) MASS -1 CG3AM0 12.01100 ! aliphatic C for CH3, NEUTRAL trimethylamine methyl carbon (#) MASS -1 CG3AM1 12.01100 ! aliphatic C for CH3, NEUTRAL dimethylamine methyl carbon (#) MASS -1 CG3AM2 12.01100 ! aliphatic C for CH3, NEUTRAL methylamine methyl carbon (#) !(#) EXTREME care is required when doing atom typing on compounds that look like this. Use ONLY !on NEUTRAL METHYLAMINE groups, NOT ETHYL, NOT Schiff Bases, but DO use on 2 out of 3 guanidine nitrogens MASS -1 CG3C31 12.01100 ! cyclopropyl carbon MASS -1 CG3C41 12.01100 ! cyclobutyl carbon MASS -1 CG3C50 12.01100 ! 5-mem ring aliphatic quaternary C (cholesterol, bile acids) MASS -1 CG3C51 12.01100 ! 5-mem ring aliphatic CH (proline CA, furanoses) MASS -1 CG3C52 12.01100 ! 5-mem ring aliphatic CH2 (proline CB/CG/CD, THF, deoxyribose) MASS -1 CG3C53 12.01100 ! 5-mem ring aliphatic CH adjacent to positive N (proline.H+ CA) (+) MASS -1 CG3C54 12.01100 ! 5-mem ring aliphatic CH2 adjacent to positive N (proline.H+ CD) (+) MASS -1 CG3RC1 12.01100 ! 
bridgehead in bicyclic systems containing at least one 5-membered or smaller ring !(+) Includes protonated Shiff base (NG3D5, NG2R52 in 2HPP) but NOT amidinium (NG2R52 in IMIM), guanidinium !nitrogens MASS -1 NG1T1 14.00700 ! N for cyano group !MASS -1 NG1D1 14.00700 ! terminal N in azides, lsk MASS -1 NG2D1 14.00700 ! N for neutral imine/Schiff's base (C=N-R, acyclic amidine, gunaidine) MASS -1 NG2S0 14.00700 ! N,N-disubstituted amide, proline N (CO=NRR') MASS -1 NG2S1 14.00700 ! peptide nitrogen (CO=NHR) MASS -1 NG2S2 14.00700 ! terminal amide nitrogen (CO=NH2) MASS -1 NG2S3 14.00700 ! external amine ring nitrogen (planar/aniline), phosphoramidate !MASS -1 NG2S4 14.00700 ! neutral hydroxamic acid MASS -1 NG2O1 14.00700 ! NITB, nitrobenzene MASS -1 NG2P1 14.00700 ! N for protonated imine/Schiff's base (C=N(+)H-R, acyclic amidinium, guanidinium) MASS -1 NG2R43 14.00700 ! amide in 4-memebered ring (planar), AZDO, lsk MASS -1 NG2R50 14.00700 ! double bound neutral 5-mem planar ring, purine N7 MASS -1 NG2R51 14.00700 ! single bound neutral 5-mem planar (all atom types sp2) ring, his, trp pyrrole (fused) MASS -1 NG2R52 14.00700 ! protonated schiff base, amidinium, guanidinium in 5-membered ring, HIS, 2HPP, kevo MASS -1 NG2R53 14.00700 ! amide in 5-memebered NON-SP2 ring (slightly pyramidized), 2PDO, kevo MASS -1 NG2R57 14.00700 ! 5-mem ring, bipyrroles MASS -1 NG2R60 14.00700 ! double bound neutral 6-mem planar ring, pyr1, pyzn MASS -1 NG2R61 14.00700 ! single bound neutral 6-mem planar ring imino nitrogen; glycosyl linkage MASS -1 NG2R62 14.00700 ! double bound 6-mem planar ring with heteroatoms in o or m, pyrd, pyrm MASS -1 NG2R67 14.00700 ! 6-mem planar ring substituted with 6-mem planar ring (N-phenyl pyridinones etc.) MASS -1 NG2RC0 14.00700 ! 6/5-mem ring bridging N, indolizine, INDZ, kevo MASS -1 NG301 14.00700 ! neutral trimethylamine nitrogen MASS -1 NG311 14.00700 ! neutral dimethylamine nitrogen MASS -1 NG321 14.00700 ! 
neutral methylamine nitrogen MASS -1 NG331 14.00700 ! neutral ammonia nitrogen MASS -1 NG3C51 14.00700 ! secondary sp3 amine in 5-membered ring MASS -1 NG3N1 14.00700 ! N in hydrazine, HDZN MASS -1 NG3P0 14.00700 ! quarternary N+, choline MASS -1 NG3P1 14.00700 ! tertiary NH+ (PIP) MASS -1 NG3P2 14.00700 ! secondary NH2+ (proline) MASS -1 NG3P3 14.00700 ! primary NH3+, phosphatidylethanolamine !oxygens MASS -1 OG2D1 15.99940 ! carbonyl O: amides, esters, [neutral] carboxylic acids, aldehydes, uera MASS -1 OG2D2 15.99940 ! carbonyl O: negative groups: carboxylates, carbonate MASS -1 OG2D3 15.99940 ! carbonyl O: ketones MASS -1 OG2D4 15.99940 ! 6-mem aromatic carbonyl oxygen (nucleic bases) MASS -1 OG2D5 15.99940 ! CO2 oxygen MASS -1 OG2N1 15.99940 ! NITB, nitrobenzene MASS -1 OG2P1 15.99940 ! =O in phosphate or sulfate MASS -1 OG2R50 15.99940 ! FURA, furan MASS -1 OG3R60 15.99940 ! O in 6-mem cyclic enol ether (PY01, PY02) or ester MASS -1 OG301 15.99940 ! ether -O- !SHOULD WE HAVE A SEPARATE ENOL ETHER??? IF YES, SHOULD WE MERGE IT WITH OG3R60??? MASS -1 OG302 15.99940 ! ester -O- MASS -1 OG303 15.99940 ! phosphate/sulfate ester oxygen MASS -1 OG304 15.99940 ! linkage oxygen in pyrophosphate/pyrosulphate MASS -1 OG311 15.99940 ! hydroxyl oxygen MASS -1 OG312 15.99940 ! ionized alcohol oxygen MASS -1 OG3C31 15.99940 ! epoxide oxygen, 1EOX, 1BOX, sc MASS -1 OG3C51 15.99940 ! 5-mem furanose ring oxygen (ether) MASS -1 OG3C61 15.99940 ! DIOX, dioxane, ether in 6-membered ring !SHOULD WE MERGE THIS WITH OG3R60??? !sulphurs MASS -1 SG2D1 32.06000 ! thiocarbonyl S MASS -1 SG2R50 32.06000 ! THIP, thiophene MASS -1 SG311 32.06000 ! sulphur, SH, -S- MASS -1 SG301 32.06000 ! sulfur C-S-S-C type MASS -1 SG302 32.06000 ! thiolate sulfur (-1) MASS -1 SG3O1 32.06000 ! sulfate -1 sulfur MASS -1 SG3O2 32.06000 ! neutral sulfone/sulfonamide sulfur MASS -1 SG3O3 32.06000 ! neutral sulfoxide sulfur !halogens MASS -1 CLGA1 35.45300 ! 
CLET, DCLE, chloroethane, 1,1-dichloroethane MASS -1 CLGA3 35.45300 ! TCLE, 1,1,1-trichloroethane MASS -1 CLGR1 35.45300 ! CHLB, chlorobenzene MASS -1 BRGA1 79.90400 ! BRET, bromoethane MASS -1 BRGA2 79.90400 ! DBRE, 1,1-dibromoethane MASS -1 BRGA3 79.90400 ! TBRE, 1,1,1-dibromoethane MASS -1 BRGR1 79.90400 ! BROB, bromobenzene MASS -1 IGR1 126.90447 ! IODB, iodobenzene MASS -1 FGA1 18.99800 ! aliphatic fluorine, monofluoro MASS -1 FGA2 18.99800 ! aliphatic fluorine, difluoro MASS -1 FGA3 18.99800 ! aliphatic fluorine, trifluoro MASS -1 FGP1 18.99800 ! anionic F, for ALF4 AlF4- MASS -1 FGR1 18.99800 ! aromatic flourine

The third column, like ``HGA1`` or ``CG331``, is the atom type that ABGrow accepts. The text after ``!`` explains that atom type. For example,

.. code-block::

   MASS -1 CG331 12.01100 ! aliphatic C for methyl group (-CH3)

So a carbon atom that is an ``aliphatic C for methyl group (-CH3)`` should be given the type ``CG331``.

In most cases, ``topgen`` predicts the correct type, but sometimes it fails. For example, in Line 100, the first carbon atom is given the type ``CG2RC7``, which is wrong: this atom is a quaternary carbon bonded to 4 benzene rings. Unfortunately, CGenFF does not have such an atom type. In this circumstance, **one should give it the type from CGenFF that is chemically most similar**. We can find this:

.. code-block::

   MASS -1 CG301 12.01100 ! aliphatic C, no hydrogens, neopentane

This quaternary carbon type in neopentane seems to be the most similar one, so this carbon atom is given the type ``CG301``. **Remember** to make this change in ``monomer-abgrow.xyz``.

Charge
++++++++

The automatically generated atomic charges usually do not work, so a quantum chemical calculation is needed. There are 2 steps:

1. Generate a wave function file with a quantum chemical program, like `Qbics `_ (free of charge) or Gaussian (commercial software).
2. Calculate RESP charges with `Multiwfn `_.
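To see why the automatically generated charges need to be replaced, one can add them up. The short sketch below uses only the type counts and default charges listed in the ``topgen``-generated force field block of this example (1 ``CG2RC7``, 24 ``CG2R61``, 20 ``HGR61``); the helper name ``total_charge`` is illustrative and not part of ABGrow or ABCluster.

```python
from collections import Counter

# Atom-type counts taken from the force field block of the example
# monomer-abgrow.xyz: 1 central carbon, 24 ring carbons, 20 ring hydrogens.
type_counts = Counter({"CG2RC7": 1, "CG2R61": 24, "HGR61": 20})

# Default charge that topgen assigned to each type in this example.
default_charge = {"CG2RC7": 0.0700, "CG2R61": -0.1150, "HGR61": 0.1150}


def total_charge(counts, charges):
    """Sum the per-type charges weighted by the number of atoms of each type."""
    return sum(n * charges[atom_type] for atom_type, n in counts.items())


n_atoms = sum(type_counts.values())
total = total_charge(type_counts, default_charge)
print(f"atoms: {n_atoms}, total charge: {total:.4f}")
```

The sum reproduces the ``Total charge: -0.3900`` reported by ``topgen``, whereas the monomer is neutral (``charge 0`` in the inputs that follow). This mismatch is one reason to re-fit the charges.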
Generate Wave Function
^^^^^^^^^^^^^^^^^^^^^^^^^^^

For the purpose of calculating RESP charges, a density functional theory (DFT) method like B3LYP/6-31g(d) is often sufficient. An input file for Qbics, ``monomer.inp``, is given below:

.. code-block::
   :caption: monomer.inp
   :linenos:

   basis
       6-31g(d)
   end

   scf
       charge  0 # The total charge.
       spin2p1 1 # The spin multiplicity.
   end

   mol
       monomer-abgrow.xyz # You can add a path to the file name, or just put all coordinates here.
   end

   task
       energy b3lyp
   end

Then run Qbics to do the calculation:

.. code-block:: bash

   $ qbics-linux-cpu monomer.inp -n 8 > monomer.out &

If you use the Windows version, just change ``qbics-linux-cpu`` to ``qbics-win-cpu``. The option ``-n 8`` means that 8 cores are used; change it to another suitable number if needed. After the calculation, you will get a file called ``monomer.mwfn``. This is the wave function format that Multiwfn supports best.

For more details of Qbics, please refer to http://qibcs.info. A tutorial can be found at http://qibcs.info/tutorial.

If you prefer Gaussian, prepare the following input ``monomer.gjf``:

.. 
code-block:: :caption: monomer.gjf :linenos: %nprocs=8 #B3LYP/6-31g(d) Output(wfn) monomer 0 1 C -0.00000000 0.00000000 0.00000000 C -0.99204361 0.67989624 -0.94006954 C -1.62513076 -0.10642738 -1.90138442 C -1.24520505 2.04367646 -0.93242437 C -2.50784698 0.45009463 -2.80643536 H -1.41636178 -1.16636581 -1.94027187 C -2.12863118 2.60280511 -1.84215854 H -0.74872619 2.68691666 -0.22165626 C -2.76733728 1.81056951 -2.77756466 H -2.99072901 -0.17871632 -3.54031416 H -2.31214042 3.66730700 -1.81823226 H -3.45545629 2.24898711 -3.48516956 C 0.67989624 0.99204361 0.94006954 C 2.04367646 1.24520505 0.93242437 C -0.10642738 1.62513076 1.90138442 C 2.60280511 2.12863118 1.84215854 H 2.68691666 0.74872619 0.22165626 C 0.45009463 2.50784698 2.80643536 H -1.16636581 1.41636178 1.94027187 C 1.81056951 2.76733728 2.77756466 H 3.66730700 2.31214042 1.81823226 H -0.17871632 2.99072901 3.54031416 H 2.24898711 3.45545629 3.48516956 C 0.99204361 -0.67989624 -0.94006954 C 1.24520505 -2.04367646 -0.93242437 C 1.62513076 0.10642738 -1.90138442 C 2.12863118 -2.60280511 -1.84215854 H 0.74872619 -2.68691666 -0.22165626 C 2.50784698 -0.45009463 -2.80643536 H 1.41636178 1.16636581 -1.94027187 C 2.76733728 -1.81056951 -2.77756466 H 2.31214042 -3.66730700 -1.81823226 H 2.99072901 0.17871632 -3.54031416 H 3.45545629 -2.24898711 -3.48516956 C -0.67989624 -0.99204361 0.94006954 C 0.10642738 -1.62513076 1.90138442 C -2.04367646 -1.24520505 0.93242437 C -0.45009463 -2.50784698 2.80643536 H 1.16636581 -1.41636178 1.94027187 C -2.60280511 -2.12863118 1.84215854 H -2.68691666 -0.74872619 0.22165626 C -1.81056951 -2.76733728 2.77756466 H 0.17871632 -2.99072901 3.54031416 H -3.66730700 -2.31214042 1.81823226 H -2.24898711 -3.45545629 3.48516956 monomer.wfn Then run Gaussian to do the calculation: .. code-block:: bash $ g16 < monomer.gjf > monomer.out & After calculation, you can get a file called ``monomer.wfn``. This is the wave function file that can be supported by Multiwfn best. 
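Copying 45 coordinate lines into ``monomer.gjf`` by hand is error-prone, so the Gaussian input above can also be assembled from the XYZ file with a small script. The following is a minimal sketch, not a tool shipped with ABGrow: the function name ``xyz_to_gjf`` is hypothetical, the route line and ``%nprocs`` value are copied from the example above, and the blank-line layout follows the usual Gaussian input convention.

```python
def xyz_to_gjf(xyz_text, wfn_name="monomer.wfn", nprocs=8, charge=0, multiplicity=1):
    """Build a Gaussian input (as a string) from the text of an XYZ file.

    Assumes the standard XYZ layout: atom count, title line, then one
    "element x y z" line per atom. Any extra lines after the coordinates
    (e.g. connectivity in monomer-abgrow.xyz) are ignored.
    """
    lines = xyz_text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])
    title = lines[1].strip() or "monomer"
    atoms = [ln.strip() for ln in lines[2:2 + n_atoms]]
    if len(atoms) != n_atoms:
        raise ValueError(f"expected {n_atoms} atoms, found {len(atoms)}")

    parts = [
        f"%nprocs={nprocs}",
        "#B3LYP/6-31g(d) Output(wfn)",  # route line from the example above
        "",
        title,
        "",
        f"{charge} {multiplicity}",
        *atoms,
        "",
        wfn_name,  # file name requested by Output(wfn)
        "",
    ]
    return "\n".join(parts) + "\n"


# Toy input for demonstration only; use the text of monomer.xyz in practice.
example_xyz = "3\nwater\nO 0.0 0.0 0.0\nH 0.0 0.0 0.96\nH 0.93 0.0 -0.24\n"
gjf_text = xyz_to_gjf(example_xyz, wfn_name="water.wfn")
print(gjf_text)
```

In practice one would read ``monomer.xyz`` from disk, pass its content to the function, and write the returned string to ``monomer.gjf``.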
Calculate RESP Charges
^^^^^^^^^^^^^^^^^^^^^^^^^^^

With ``monomer.mwfn`` or ``monomer.wfn``, restrained electrostatic potential (RESP) charges can be evaluated easily with the powerful and free `Multiwfn `_. You can run the following commands:

.. code-block:: bash

   $ Multiwfn monomer.mwfn
   $ 7 # Choose: Population analysis and calculation of atomic charges
   $ 18 # Choose: Restrained ElectroStatic Potential (RESP) atomic charge
   $ 1 # Choose: Start standard two-stage RESP fitting calculation
   $ y # Save RESP charges to monomer.chg.

.. attention::

   If you used Multiwfn to calculate RESP charges, please cite the following papers:

   - Zhang, J.; Lu, T. `Efficient Evaluation of Electrostatic Potential with Computerized Optimized Code `_ *Phys. Chem. Chem. Phys.* **2021**, *23*, 20323.
   - Zhang, J. `libreta: Computerized Optimization and Code Synthesis for Electron Repulsion Integral Evaluation `_ *J. Chem. Theory Comput.* **2018**, *14*, 572.
   - Lu, T.; Chen, F. `Multiwfn: A Multifunctional Wavefunction Analyzer `_ *J. Comput. Chem.* **2012**, *33*, 580.

After a few minutes, RESP charges will be saved to ``monomer.chg``. Open it and you will see:

.. code-block::
   :caption: monomer.chg
   :linenos:

   C 0.000000 0.000000 0.000000 -1.5653348594
   C -0.992044 0.679896 -0.940070 0.5486338023
   C -1.625131 -0.106427 -1.901384 -0.1807938321
   C -1.245205 2.043676 -0.932424 -0.1757742630
   C -2.507847 0.450095 -2.806435 -0.1528019340
   H -1.416362 -1.166366 -1.940272 0.1381451620
   C -2.128631 2.602805 -1.842159 -0.1687077518
   H -0.748726 2.686917 -0.221656 0.1232876422
   C -2.767337 1.810570 -2.777565 -0.1093811603
   H -2.990729 -0.178716 -3.540314 0.1236800912
   ...

The last column contains the charges. Use them to replace the charges in ``monomer-abgrow.xyz``, and you will get a good monomer file:

.. 
code-block:: :caption: monomer-abgrow.xyz :linenos: 45 RESP at B3LYP/6-31g(d) C -0.00000000 0.00000000 0.00000000 C -0.99204361 0.67989624 -0.94006954 C -1.62513076 -0.10642738 -1.90138442 C -1.24520505 2.04367646 -0.93242437 C -2.50784698 0.45009463 -2.80643536 H -1.41636178 -1.16636581 -1.94027187 C -2.12863118 2.60280511 -1.84215854 H -0.74872619 2.68691666 -0.22165626 C -2.76733728 1.81056951 -2.77756466 H -2.99072901 -0.17871632 -3.54031416 H -2.31214042 3.66730700 -1.81823226 H -3.45545629 2.24898711 -3.48516956 C 0.67989624 0.99204361 0.94006954 C 2.04367646 1.24520505 0.93242437 C -0.10642738 1.62513076 1.90138442 C 2.60280511 2.12863118 1.84215854 H 2.68691666 0.74872619 0.22165626 C 0.45009463 2.50784698 2.80643536 H -1.16636581 1.41636178 1.94027187 C 1.81056951 2.76733728 2.77756466 H 3.66730700 2.31214042 1.81823226 H -0.17871632 2.99072901 3.54031416 H 2.24898711 3.45545629 3.48516956 C 0.99204361 -0.67989624 -0.94006954 C 1.24520505 -2.04367646 -0.93242437 C 1.62513076 0.10642738 -1.90138442 C 2.12863118 -2.60280511 -1.84215854 H 0.74872619 -2.68691666 -0.22165626 C 2.50784698 -0.45009463 -2.80643536 H 1.41636178 1.16636581 -1.94027187 C 2.76733728 -1.81056951 -2.77756466 H 2.31214042 -3.66730700 -1.81823226 H 2.99072901 0.17871632 -3.54031416 H 3.45545629 -2.24898711 -3.48516956 C -0.67989624 -0.99204361 0.94006954 C 0.10642738 -1.62513076 1.90138442 C -2.04367646 -1.24520505 0.93242437 C -0.45009463 -2.50784698 2.80643536 H 1.16636581 -1.41636178 1.94027187 C -2.60280511 -2.12863118 1.84215854 H -2.68691666 -0.74872619 0.22165626 C -1.81056951 -2.76733728 2.77756466 H 0.17871632 -2.99072901 3.54031416 H -3.66730700 -2.31214042 1.81823226 H -2.24898711 -3.45545629 3.48516956 1 2 1.0 13 1.0 24 1.0 35 1.0 2 3 1.0 4 2.0 3 5 2.0 6 1.0 4 7 2.0 8 1.0 5 9 2.0 10 1.0 6 7 9 2.0 11 1.0 8 9 12 1.0 10 11 12 13 14 2.0 15 1.0 14 16 2.0 17 1.0 15 18 2.0 19 1.0 16 20 2.0 21 1.0 17 18 20 2.0 22 1.0 19 20 23 1.0 21 22 23 24 25 2.0 26 1.0 25 27 2.0 28 1.0 26 29 2.0 30 
1.0 27 31 2.0 32 1.0 28 29 31 2.0 33 1.0 30 31 34 1.0 32 33 34 35 36 1.0 37 2.0 36 38 2.0 39 1.0 37 40 2.0 41 1.0 38 42 2.0 43 1.0 39 40 42 2.0 44 1.0 41 42 45 1.0 43 44 45 9 12 0 0 20 23 0 0 31 34 0 0 42 45 0 0 CG301 -1.5653348594 CG2R61 0.5486338023 CG2R61 -0.1807938321 CG2R61 -0.1757742630 CG2R61 -0.1528019340 HGR61 0.1381451620 CG2R61 -0.1687077518 HGR61 0.1232876422 CG2R61 -0.1093811603 HGR61 0.1236800912 HGR61 0.1285771475 HGR61 0.1146450489 CG2R61 0.5537613777 CG2R61 -0.1776415775 CG2R61 -0.1804809560 CG2R61 -0.1693787789 HGR61 0.1236576037 CG2R61 -0.1536901781 HGR61 0.1379626861 CG2R61 -0.1094535255 HGR61 0.1289189567 HGR61 0.1239239689 HGR61 0.1147394315 CG2R61 0.5556504140 CG2R61 -0.1789253746 CG2R61 -0.1816910129 CG2R61 -0.1694062811 HGR61 0.1245726781 CG2R61 -0.1538736680 HGR61 0.1381249369 CG2R61 -0.1091800778 HGR61 0.1289147407 HGR61 0.1241384965 HGR61 0.1146368563 CG2R61 0.5523475596 CG2R61 -0.1836196251 CG2R61 -0.1777055806 CG2R61 -0.1519120037 HGR61 0.1384408466 CG2R61 -0.1680540950 HGR61 0.1240363984 CG2R61 -0.1104475731 HGR61 0.1240069731 HGR61 0.1285444303 HGR61 0.1149068592 At Line 2, we have made the title more informative. This is the final monomer file we need.
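Replacing 45 charges by hand is tedious, so the substitution can be scripted. The sketch below is one possible approach under stated assumptions, not part of ABGrow: the helper name ``replace_charges`` is hypothetical, ``monomer.chg`` is assumed to hold one ``element x y z charge`` line per atom in the same order as the monomer file, and the force field block is assumed to be the last N non-blank lines of ``monomer-abgrow.xyz``, where N is the atom count on its first line.

```python
def replace_charges(abgrow_text, chg_text):
    """Return the monomer file text with charges taken from a .chg file.

    Atom types are kept as they are; only the second column of the
    force field block is replaced.
    """
    # Last column of every non-blank .chg line is the fitted charge.
    charges = [ln.split()[-1] for ln in chg_text.splitlines() if ln.strip()]

    lines = abgrow_text.splitlines()
    n_atoms = int(lines[0].split()[0])
    if len(charges) != n_atoms:
        raise ValueError(f"expected {n_atoms} charges, found {len(charges)}")

    # Assumption: the force field block is the last n_atoms non-blank lines.
    non_blank = [i for i, ln in enumerate(lines) if ln.strip()]
    for i, charge in zip(non_blank[-n_atoms:], charges):
        atom_type = lines[i].split()[0]
        lines[i] = f"{atom_type} {charge}"
    return "\n".join(lines) + "\n"


# Toy two-atom example, for illustration only (not a real monomer).
toy_abgrow = "2\nToy\nC 0 0 0\nH 0 0 1\n\n1 2 1.0\n2\n\nCG331 -0.1\nHGA3 0.1\n"
toy_chg = "C 0.0 0.0 0.0 -0.25\nH 0.0 0.0 1.0 0.25\n"
updated = replace_charges(toy_abgrow, toy_chg)
print(updated)
```

The atom type correction (``CG2RC7`` to ``CG301``) and the title on Line 2 still have to be edited separately, and it is worth checking that the new charges sum to the intended total charge.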