Example: Build Parameter File for (NHCH3)2CO -------------------------------------------- When your interested molecule is absent in ``misc/charmm36``, you will have to build its parameter file by yourself. Here we will take N,N'-dimethylurea, i.e., :math:`\left(\mathrm{NHCH}_3\right)_2\mathrm{CO}`, as an example. New Way: Automatically Building ===================================== .. attention:: When using ``topgen`` for this job, you are actually using some graph representation learning algorithm for automatic atom typing. Thus please **cite** the following paper: - Zhang, J. `Atom Typing Using Graph Representation Learning: How Do Models Learn Chemistry? `_ *J. Chem. Phys.* **2022**, *156*, 204108. From ABCluster 3.1, the parameter file for an arbitrary organic molecule can be build automatically with ``topgen``. Step 1: Build the coordinate file of :math:`\left(\mathrm{NHCH}_3\right)_2\mathrm{CO}` in XYZ format. Let's call it ``nndimethylurea.xyz``. The coordinates can be optimized as a moderate level of theory. We use B3LYPD3/def2-TZVP. .. code-block:: bash :caption: nndimethylurea.xyz 14 dimethylurea C -3.40944156863935 1.48217854479510 0.18460221407169 O -2.83617751600667 2.55594436393475 0.29160616578062 N -4.77159876631838 1.40346427888521 0.01033470214576 H -5.17936611057651 2.32373985124644 -0.02398130498834 N -2.71942098770917 0.27746301048357 0.25258402399382 H -1.74237705636477 0.48553039432599 0.40014513551919 C -5.63042059274161 0.33540415616324 0.49938800446410 H -6.45368224181311 0.76470101295480 1.07232563209061 H -6.05781648140388 -0.26211048052894 -0.31127284718113 H -5.06868507307871 -0.32252499680997 1.16139152177693 C -2.93625253356738 -0.80249737836091 -0.70510480675735 H -2.73548051116983 -0.49573138113413 -1.73834868327471 H -2.27102803030176 -1.62710593941345 -0.45152402911762 H -3.95491690030879 -1.17972228654170 -0.65012968852358 The molecular structure and atom indices are shown below. .. image:: pics/nnurea.png :align: center :scale: 50% Step 2: Use ``topgen`` to build parameter files: .. code-block:: bash $ topgen nndimethylurea.xyz Step 3: You can find the following output: .. code-block:: bash :linenos: :caption: bash In "nndimethylurea-rigid.xyz" and "nndimethylurea.psf": "X" means unknown atomic type. Total charge: 0.5900 So, ``nndimethylurea-rigid.xyz`` is the parameter file we need. But its total charge is ``0.5900``, which is not what we want. You must adjust the charges manually, or **re-fit some physically meaningful ones, like RESP charges with Multiwfn.** See :doc:`here ` for a complete example. Old (and out-of-date) Way: Manually Building ============================================== Step 1: Build the coordinate file of :math:`\left(\mathrm{NHCH}_3\right)_2\mathrm{CO}` in XYZ format. Let's call it ``nndimethylurea.xyz``. The coordinates can be optimized as a moderate level of theory. We use B3LYPD3/def2-TZVP. .. code-block:: bash :caption: nndimethylurea.xyz 14 dimethylurea C -3.40944156863935 1.48217854479510 0.18460221407169 O -2.83617751600667 2.55594436393475 0.29160616578062 N -4.77159876631838 1.40346427888521 0.01033470214576 H -5.17936611057651 2.32373985124644 -0.02398130498834 N -2.71942098770917 0.27746301048357 0.25258402399382 H -1.74237705636477 0.48553039432599 0.40014513551919 C -5.63042059274161 0.33540415616324 0.49938800446410 H -6.45368224181311 0.76470101295480 1.07232563209061 H -6.05781648140388 -0.26211048052894 -0.31127284718113 H -5.06868507307871 -0.32252499680997 1.16139152177693 C -2.93625253356738 -0.80249737836091 -0.70510480675735 H -2.73548051116983 -0.49573138113413 -1.73834868327471 H -2.27102803030176 -1.62710593941345 -0.45152402911762 H -3.95491690030879 -1.17972228654170 -0.65012968852358 The molecular structure and atom indices are shown below. .. image:: pics/nnurea.png :align: center :scale: 50% Step 2: Add a line of comment, which can be anything: .. code-block:: bash :caption: nndimethylurea.xyz 14 dimethylurea C -3.40944156863935 1.48217854479510 0.18460221407169 O -2.83617751600667 2.55594436393475 0.29160616578062 N -4.77159876631838 1.40346427888521 0.01033470214576 H -5.17936611057651 2.32373985124644 -0.02398130498834 N -2.71942098770917 0.27746301048357 0.25258402399382 H -1.74237705636477 0.48553039432599 0.40014513551919 C -5.63042059274161 0.33540415616324 0.49938800446410 H -6.45368224181311 0.76470101295480 1.07232563209061 H -6.05781648140388 -0.26211048052894 -0.31127284718113 H -5.06868507307871 -0.32252499680997 1.16139152177693 C -2.93625253356738 -0.80249737836091 -0.70510480675735 H -2.73548051116983 -0.49573138113413 -1.73834868327471 H -2.27102803030176 -1.62710593941345 -0.45152402911762 H -3.95491690030879 -1.17972228654170 -0.65012968852358 all36_cgenff q epsilon (kJ/mol) sigma (AA) Step 3: The force field parameters have to be derived from official CHARMM documents. They can be downloaded from `here `_. .. tip:: In CHARMM force field, each atom has a **type**, which is determined by its chemical environment. Most useful parameters can be found in ``top_all36_cgenff.rtf`` and ``par_all36_cgenff.prm``. Step 4: The paramters must be given in the cooridnate order. Now we are looking for **the atom types that are similar to the ones this molecule.** Let's first dtermine atomic charges. We can find urea molecule in ``top_all36_cgenff.rtf``: .. code-block:: bash :caption: top_all36_cgenff.rtf RESI UREA 0.00 ! CH4N2O, Urea, adm GROUP ATOM N1 NG2S2 -0.69 ATOM H11 HGP1 0.34 ATOM H12 HGP1 0.34 ATOM C2 CG2O6 0.60 ATOM O2 OG2D1 -0.58 ATOM N3 NG2S2 -0.69 ATOM H31 HGP1 0.34 ATOM H32 HGP1 0.34 So, it is very reasonable to set atomic charges to the same value as they are in urea: .. code-block:: bash :caption: nndimethylurea.xyz 14 dimethylurea C -3.40944156863935 1.48217854479510 0.18460221407169 O -2.83617751600667 2.55594436393475 0.29160616578062 N -4.77159876631838 1.40346427888521 0.01033470214576 H -5.17936611057651 2.32373985124644 -0.02398130498834 N -2.71942098770917 0.27746301048357 0.25258402399382 H -1.74237705636477 0.48553039432599 0.40014513551919 C -5.63042059274161 0.33540415616324 0.49938800446410 H -6.45368224181311 0.76470101295480 1.07232563209061 H -6.05781648140388 -0.26211048052894 -0.31127284718113 H -5.06868507307871 -0.32252499680997 1.16139152177693 C -2.93625253356738 -0.80249737836091 -0.70510480675735 H -2.73548051116983 -0.49573138113413 -1.73834868327471 H -2.27102803030176 -1.62710593941345 -0.45152402911762 H -3.95491690030879 -1.17972228654170 -0.65012968852358 all36_cgenff q epsilon (kJ/mol) sigma (AA) +0.60 -0.58 -0.69 +0.34 -0.69 +0.34 Now we have to determine charges of the two methyl groups. After exmining ``top_all36_cgenff.rtf``, we find that the charge on :math:`\mathrm{H}` from :math:`\mathrm{CH}_3` is almost always ``+0.09``, then the carbon charges can be determined by requiring the total charge being zero. Now, all charges are available: .. code-block:: bash :caption: nndimethylurea.xyz 14 dimethylurea C -3.40944156863935 1.48217854479510 0.18460221407169 O -2.83617751600667 2.55594436393475 0.29160616578062 N -4.77159876631838 1.40346427888521 0.01033470214576 H -5.17936611057651 2.32373985124644 -0.02398130498834 N -2.71942098770917 0.27746301048357 0.25258402399382 H -1.74237705636477 0.48553039432599 0.40014513551919 C -5.63042059274161 0.33540415616324 0.49938800446410 H -6.45368224181311 0.76470101295480 1.07232563209061 H -6.05781648140388 -0.26211048052894 -0.31127284718113 H -5.06868507307871 -0.32252499680997 1.16139152177693 C -2.93625253356738 -0.80249737836091 -0.70510480675735 H -2.73548051116983 -0.49573138113413 -1.73834868327471 H -2.27102803030176 -1.62710593941345 -0.45152402911762 H -3.95491690030879 -1.17972228654170 -0.65012968852358 all36_cgenff q epsilon (kJ/mol) sigma (AA) +0.60 -0.58 -0.69 +0.34 -0.69 +0.34 +0.07 +0.09 +0.09 +0.09 +0.07 +0.09 +0.09 +0.09 Step 5: Let's determine the Lennard-Jones parameters. It is easier: simply search the atomic type. For example, ``1C`` in urea is of type ``CG2O6``, then search in ``par_all36_cgenff.prm`` and you will find this: .. code-block:: bash :caption: par_all36_cgenff.prm CG2O5 0.0 -0.0900 2.00 ! adm, acetone, 11/08 CG2O6 0.0 -0.0700 2.00 ! UREA, CO3 (carbonate) from acetate heat of solvation CG2O7 0.0 -0.0580 1.5630 ! carbon dioxide, JES So, the two parameters are ``-0.0700`` and ``2.00``. However, they must be **transformed** before using in ABCluster. The formulas are: +----------------------------------------------------------------------------------+ | :math:`\epsilon_{\mathrm{ABCluster}} = \epsilon_{\mathrm{CHARMM}}\times(-4.184)` | +----------------------------------------------------------------------------------+ | :math:`\sigma_{\mathrm{ABCluster}} = \sigma_{\mathrm{CHARMM}}\times 2^{5/6}` | +----------------------------------------------------------------------------------+ Therefore: .. math:: \epsilon_{\mathrm{ABCluster}} = -0.07\times(-4.184) = 0.2929 .. math:: \sigma_{\mathrm{ABCluster}} = 2.00\times 2^{5/6} = 3.5636 We can put these parameters to ``nndimethylurea.xyz``: .. code-block:: bash :caption: nndimethylurea.xyz 14 dimethylurea C -3.40944156863935 1.48217854479510 0.18460221407169 O -2.83617751600667 2.55594436393475 0.29160616578062 N -4.77159876631838 1.40346427888521 0.01033470214576 H -5.17936611057651 2.32373985124644 -0.02398130498834 N -2.71942098770917 0.27746301048357 0.25258402399382 H -1.74237705636477 0.48553039432599 0.40014513551919 C -5.63042059274161 0.33540415616324 0.49938800446410 H -6.45368224181311 0.76470101295480 1.07232563209061 H -6.05781648140388 -0.26211048052894 -0.31127284718113 H -5.06868507307871 -0.32252499680997 1.16139152177693 C -2.93625253356738 -0.80249737836091 -0.70510480675735 H -2.73548051116983 -0.49573138113413 -1.73834868327471 H -2.27102803030176 -1.62710593941345 -0.45152402911762 H -3.95491690030879 -1.17972228654170 -0.65012968852358 all36_cgenff q epsilon (kJ/mol) sigma (AA) +0.60 0.2929 3.5636 # CG2O6 -0.58 -0.69 +0.34 -0.69 +0.34 +0.07 +0.09 +0.09 +0.09 +0.07 +0.09 +0.09 +0.09 Note that we can add a comment at the end of parameters. Step 6: In a similar way, we can build all parameters for this molecule: .. code-block:: bash :caption: nndimethylurea.xyz 14 dimethylurea C -3.40944156863935 1.48217854479510 0.18460221407169 O -2.83617751600667 2.55594436393475 0.29160616578062 N -4.77159876631838 1.40346427888521 0.01033470214576 H -5.17936611057651 2.32373985124644 -0.02398130498834 N -2.71942098770917 0.27746301048357 0.25258402399382 H -1.74237705636477 0.48553039432599 0.40014513551919 C -5.63042059274161 0.33540415616324 0.49938800446410 H -6.45368224181311 0.76470101295480 1.07232563209061 H -6.05781648140388 -0.26211048052894 -0.31127284718113 H -5.06868507307871 -0.32252499680997 1.16139152177693 C -2.93625253356738 -0.80249737836091 -0.70510480675735 H -2.73548051116983 -0.49573138113413 -1.73834868327471 H -2.27102803030176 -1.62710593941345 -0.45152402911762 H -3.95491690030879 -1.17972228654170 -0.65012968852358 all36_cgenff q epsilon (kJ/mol) sigma (AA) +0.60 0.2929 3.5636 # CG2O6 -0.58 0.5021 3.0291 # OG2D1 -0.69 0.8368 3.2963 # NG2S2 +0.34 0.1925 0.4000 # HGP1 -0.69 0.8368 3.2963 # NG2S2 +0.34 0.1925 0.4000 # HGP1 +0.07 0.3264 3.6527 # CG331 +0.09 0.1004 2.3876 # HGA3 +0.09 0.1004 2.3876 # HGA3 +0.09 0.1004 2.3876 # HGA3 +0.07 0.3264 3.6527 # CG331 +0.09 0.1004 2.3876 # HGA3 +0.09 0.1004 2.3876 # HGA3 +0.09 0.1004 2.3876 # HGA3 Now we can use this ``nndimethylurea.xyz`` for global optimization! .. tip:: When build your own parameter files, note that the charges can be slightly adjusted according to the total charge or chemical environments, but the **Lennard-Jones parameters should be kept as they are!** Before conclusion, we point out that ``rigidmol`` supports virtual sites. For example, in TIP4P water, a virtual site exists close to the oxygen atom. The symbol of all virtual sites starts with ``q``. For example, in ``misc/charmm36/tip4p.xyz``: .. code-block:: bash :caption: misc/charmm36/tip4p.xyz 4 H2O O 0.00000000 0.00000000 0.21451132 qLP 0.00000000 0.00000000 0.06451132 H 0.00000000 0.75695033 -0.37137096 H 0.00000000 -0.75695033 -0.37137096 TIP4P q epsilon (kJ/mol) sigma (AA) 0.000 0.64852 3.1536 # OT -1.040 0.00000 0.0000 # LP +0.520 0.00000 0.0000 # HT +0.520 0.00000 0.0000 # HT ``qLP`` is the virtual site. Note that in output files, all ``q*`` will be written in ``*.xyz`` but **not** in ``*.gjf``. You can use this feature to design some special force fields.