Code Styles


Code Format

A project's code format usually specifies rules such as “a line must not exceed 80 characters”, “the order in which header files are included”, or “one space is needed before and after +”. In this project, however, code can be written in a fairly free format, because the standard formatting is applied automatically by editors such as Visual Studio Code using the format configuration file src/.clang-format.

Hint

Pressing Shift+Alt+F in Visual Studio Code formats the code.

C++ Version

  • C++17 is used currently.

  • Do not use C++2X features.

File Structures

General Rules

  • All C++ header and source files must be placed into src.

  • Each folder in src is a module, which contains many closely-related classes.

Hint

For example, src/geom contains all classes of pure geometric operations.

  • Each class must have at least two files: header file *.h that contains declaration, and a source file *.cpp that contains implementation.

Hint

Do not put any implementation in header files, like the assignment of a static class member.

  • If the implementation of a class is very large, the source file can be split into several files, each containing closely related functions. These files must be named *_*.cpp. Splitting should be done only when really necessary.

Hint

For example, class MultiplicationTable can have two source files MultiplicationTable_integer.cpp and MultiplicationTable_real.cpp that implement the multiplication of integers and real numbers, respectively.
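The hint above about static class members can be sketched as follows. This is only an illustration: the two files are shown in one listing separated by comments, and the members MaxFactor and Multiply are hypothetical names that do not come from the project.

```cpp
// ---- MultiplicationTable.h (declarations only) ----
class MultiplicationTable {
public:
    int Multiply(int a, int b) const;

    static const int MaxFactor; // Declared here, but NOT assigned here.
};

// ---- MultiplicationTable.cpp (all implementations) ----
// The static member is assigned in the source file.
const int MultiplicationTable::MaxFactor = 9;

int MultiplicationTable::Multiply(int a, int b) const
{
    return a * b;
}
```

Keeping the assignment in the source file means the header can be included from many translation units without repeating any implementation detail.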

Header Files

  • A header file must be protected with an #ifndef/#define/#endif include guard. When tools/develop.py is used, this is done automatically.

  • namespace kiseki is the only one allowed in this project.

  • No additional using namespace is allowed in header files outside namespace kiseki. They can be used inside namespace kiseki.

  • <> and ".h" are used to include standard (including STL, Boost, and Eigen) and non-standard header files, respectively.

MultiplicationTable.h
#ifndef __MULTIPLICATIONTABLE_H__
#define __MULTIPLICATIONTABLE_H__

#include "Addition.h"
#include <vector>
#include <cstdio>

namespace kiseki {

using namespace std; // This "using namespace" cannot be placed outside "namespace kiseki".

class MultiplicationTable {
    ...
};

}  // namespace kiseki

#endif // __MULTIPLICATIONTABLE_H__

Naming

General Rules

  • All names must carry as much information as possible without ambiguity, even if this makes them longer.

  • Below are some recommended words and abbreviations that are used extensively in this project.

Word      Explanation             Examples
num       Number of …             num_electrons, GetNumAtoms
calc      Calculate, calculation  AccurateCalc, CalcDensity
mol       Molecule                mol_charge, BuildMol
geom      Geometry, geometric     geom_center, OptimizeGeom
coord     Coordinate              moved_coord, GetCoords
param     Parameter               num_params
fn        File name               param_fn
fd        File handle             param_fd
inp       Input                   inp_fd
iter      Iterator                iter
str       String                  VectorToStr
set/get                           GetEnergy, SetEnergy

Module

Must be named using a short word in lower case, like algorithm.

Class, Structure, typedef and Enum

Must be named using meaningful nouns and adjectives, each word being capitalized, like MultiplicationTable.

Global and Class Member functions

  • Must be named using capitalized words and should be a verb or verb+noun. For example, if class Molecule has a function to get its number of atoms, this function is named GetNumAtoms.

  • Names must not contain redundant information. For example, a function that rotates the molecule should not be called RotateMol but simply Rotate.
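A minimal sketch of these function-naming rules is given below. The Molecule class here is hypothetical: it assumes that coordinates are stored as a flat x0, y0, z0, x1, ... array and that Rotate means a rotation about the z axis. Neither assumption comes from the project.

```cpp
#include <cmath>
#include <vector>

// Hypothetical class, for illustrating function names only.
class Molecule {
public:
    Molecule(const std::vector<double>& tcoords);

    int  GetNumAtoms(void) const; // Verb + noun, using the abbreviation "Num".
    void Rotate(double angle);    // Not "RotateMol": the class already says what is rotated.

    const std::vector<double>& GetCoords(void) const;

private:
    std::vector<double> coords; // Assumed layout: x0, y0, z0, x1, y1, z1, ...
};

Molecule::Molecule(const std::vector<double>& tcoords) : coords(tcoords)
{
}

int Molecule::GetNumAtoms(void) const
{
    return static_cast<int>(coords.size()) / 3;
}

// Assumption: rotate all atoms about the z axis by "angle" (in radians).
void Molecule::Rotate(double angle)
{
    const double cos_angle = std::cos(angle);
    const double sin_angle = std::sin(angle);
    const int    num_atoms = GetNumAtoms();
    for (int i = 0; i < num_atoms; ++i)
    {
        const double x    = coords[3 * i];
        const double y    = coords[3 * i + 1];
        coords[3 * i]     = cos_angle * x - sin_angle * y;
        coords[3 * i + 1] = sin_angle * x + cos_angle * y;
    }
}

const std::vector<double>& Molecule::GetCoords(void) const
{
    return coords;
}
```

At the call site this reads as mol.Rotate(angle), so repeating "Mol" in the function name would add no information.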

Global and Class/Structure Member Variables

Standard Rules

Hint

Although it is controversial whether long variable names should be used, in this project unambiguity takes priority over brevity; that is, long but clear variable names are always preferred to short but ambiguous ones. For example, the number of alpha-spin electrons is named num_alpha_electrons instead of nAelec.

  • Non-constant variables must be named using lower-cased words connected by _. For example, num_atoms, geom_center.

  • For well-known, professional abbreviations, cases follow the convention. For example, hydrogen_NMR_shift, modified_RESP_charge.

  • Constant variables must be named using capitalized words. Using _ is not recommended. For example, LightSpeedInVacuum, WaterEpsilon.

  • Numbers are not separated by _. For example, C12_mass, N15_mass.

  • Unless necessary, meaningless numbered variable names are not allowed. For example, value1 and value2.

  • For collections of objects such as arrays, vectors, maps, or sets, names should use plural forms.

Examples of naming with plural forms.
vector<double> coords;            // A lot of things called coord.
double         coord = coords[0]; // A single coord.
  • For class/structure member assignment, one can use the prefix t to mark a temporary argument that is to be assigned to the member with the same name.

Examples of naming arguments for class/structure member assignment.
class Point {
public:
    void SetX(double tx); // "tx" is to set to "x".
    void SetY(double ty); // "ty" is to set to "y".

private:
    double x;
    double y;
};

void Point::SetX(double tx)
{
    x = tx;
}

void Point::SetY(double ty)
{
    y = ty;
}
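As an illustrative sketch, the fragment below combines several of the standard rules in one place: a capitalized constant without _, lower-case names joined by _, the num abbreviation, and plural names for containers. The function CalcFrequencies and all variable names are made up for this example and are not part of the project.

```cpp
#include <vector>

// A constant: capitalized words, no "_".
const double LightSpeedInVacuum = 299792458.; // In m/s.

// Hypothetical function, for illustrating names only.
// Containers use plural names; a single element uses the singular form.
void CalcFrequencies(const std::vector<double>& wavelengths, std::vector<double>& frequencies)
{
    const int num_wavelengths = static_cast<int>(wavelengths.size());
    frequencies.resize(num_wavelengths);
    for (int i = 0; i < num_wavelengths; ++i)
    {
        const double wavelength = wavelengths[i]; // In m.
        frequencies[i]          = LightSpeedInVacuum / wavelength;
    }
}
```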

Exceptions

  • For unimportant loop variables, one can use i, j, k, p or other convenient names.

Examples of unimportant loop variables.
for (int i = 0; i < MaxI; ++i)
{
    // ...
}
for (int i = 0; i < num_atoms; ++i)
{
    for (int j = 0; j <= i; ++j)
    {
        // ...
    }
}
  • For variables that are pure mathematical or physical quantities, a convenient name can be used.

Examples of mathematical or physical quantities.
const double PISquared4 = 39.47841760435743; // 4 * pi^2
// This implements Lennard-Jones energy.
double CalcLennardJonesEnergy(double r, double epsilon, double sigma)
{
    // ...
}
  • When third-party code is called, variable names can follow its styles.

Examples of variable names following third-party code styles.
// This is a Win32 API.
HWND   hWnd;
WPARAM wParam;
LPARAM lParam;
SendMessage(hWnd, WM_FONTCHANGE, wParam, lParam);

Class and Structure Format

  • public declarations go before private ones.

  • Self-defined classes, structures, typedef, or enums go first, then member functions, then member variables.

  • Static members go before non-static ones.

  • Constant members go before non-constant ones.

Example of a class declaration.
class StructureGenerator {
public:
    class Placement {
    public:
        enum Way { fix, box };

        Placement(const string& text);
        ~Placement(void);

        static const int Assigned;
        static const int Random;

        string note;
        Way    way;

    private:
        void ParseFix(const vector<string>& words);
    };

    StructureGenerator(const Cluster& clu);
    ~StructureGenerator(void);

    void SampleOne(Cluster& clu, bool do_coarse_optimization) const;
    void RandomOne(Cluster& clu) const;

private:
    void CoarseOptimize(Cluster& clu) const;

    static const double PI2;
    static const double ZeroNorm;
    static const double dS;

    vector<Molecule>  mols;
    vector<int>       num_mols;
};
  • In constructors, an initializer list is always preferred.
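A minimal sketch of the initializer-list rule, reusing the Point class that appears elsewhere in this guide. The two-argument constructor is an assumption added for illustration.

```cpp
class Point {
public:
    Point(double tx, double ty);

    double GetX(void) const;
    double GetY(void) const;

private:
    double x;
    double y;
};

// Preferred: members are initialized directly in the initializer list.
Point::Point(double tx, double ty) : x(tx), y(ty)
{
}

// Not preferred: members are assigned inside the constructor body.
// Point::Point(double tx, double ty)
// {
//     x = tx;
//     y = ty;
// }

double Point::GetX(void) const
{
    return x;
}

double Point::GetY(void) const
{
    return y;
}
```

Besides being shorter, an initializer list is the only way to initialize const members and reference members.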

Flow Control

++ and --

  • Always use ++x or --y.

  • Never use x++ or y--.

if and while Statement

  • Do not use integers to represent true or false.

  • Do not compare bool variables with true or false directly!

  • Explicitly use Boolean expressions!

Examples of Boolean expressions.
while (true) // Do not use "while (1)"!
{
    // ...
}

if (flag != 0) // "flag" is an integer. Do not use "if (flag)"!
{
    // ...
}

bool calc_eigen_vectors = true;
if (!calc_eigen_vectors) // Do not use "if (calc_eigen_vectors == false)"!
{
    // ...
}
  • Characters must be compared with '\0'.

  • Pointers must be compared with nullptr.

Examples of pointer comparison.
if (fd != nullptr) // Do not use if (fd) !
{
    // ...
}
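The character and pointer rules can be combined in one small sketch. CountChars is a hypothetical helper written only for this illustration.

```cpp
// Hypothetical helper: count the characters of a C string.
// A null pointer is treated as an empty string.
int CountChars(const char* str)
{
    if (str == nullptr) // Do not use "if (!str)"!
    {
        return 0;
    }
    int num_chars = 0;
    while (str[num_chars] != '\0') // Do not use "while (str[num_chars])"!
    {
        ++num_chars;
    }
    return num_chars;
}
```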

for, switch, and goto Statement

  • In the condition of a for statement, a left-closed, right-open interval is preferred, like for (int i = 0; i < 5; ++i).

  • In switch, every case and default block must end with break unless fall-through is intended. default must be present!

  • goto should be used with caution, and a goto label must have a meaningful name.

Examples of goto.
if (key == "rhf")
{
    // ...
    goto finish_parsing;
}
if (key == "rmp2")
{
    // ...
    goto finish_parsing;
}
ErrorTermination("Unknown keyword \"%s\".", key.c_str());
finish_parsing:;
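The switch rule can be sketched in the same spirit. The enum Way is borrowed from the class example in this guide, while the function WayToStr is a hypothetical helper that exists only for this illustration.

```cpp
#include <string>

enum Way { fix, box };

// Hypothetical helper: convert a "Way" to a human-readable string.
std::string WayToStr(Way way)
{
    std::string str;
    switch (way)
    {
    case fix:
        str = "fix";
        break; // Every "case" ends with "break".
    case box:
        str = "box";
        break;
    default: // "default" is present even though all enumerators are handled.
        str = "unknown";
        break;
    }
    return str;
}
```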

Usage of const

Hint

Always use const whenever possible! It will save one a large amount of debugging time!

  • Never use #define to define constants.

  • Use const for constants. A constant that depends on another constant should be defined in terms of it.

Examples of constants.
const double PI2 = 6.28318530717958623200;
const double PI4 = PI2 * 2; // This constant is defined in terms of PI2.
  • Use const for variables that will not be changed afterwards, even if they are not true constants!

Examples of variables that will not be changed in the following.
const double r = sqrt(x * x + y * y); // r is never changed in this function.
const double r_rec = 1. / r;

double energy  = 0.;
energy += k * pow(r - r0, 1.5);
energy += r_rec * q;
  • For pointer or reference function arguments, always use const when they are input only!

Examples of pointer or reference function arguments that are input only.
void CopyArray(int num, const double* source_array, double* target_array)
{
    // ...
}

string Reverse(const string& str)
{
    // ...
}
  • For member functions that will not modify member variables, always use const.

Examples of member functions that will not modify member variables
class Point {
public:
    Point(void);
    ~Point(void);

    double GetX(void) const; // It will not change anything, so const is added.
    double GetY(void) const; // It will not change anything, so const is added.

    void SetX(double tx);
    void SetY(double ty);

private:
    double x;
    double y;
};
  • When iterating over an STL container in a read-only way, use const_iterator instead of iterator.
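A minimal sketch of read-only iteration with const_iterator. The function CalcSum is hypothetical. The iterator type is spelled out in full because this guide restricts auto to lambda functions.

```cpp
#include <vector>

// Hypothetical function: sum all elements without modifying the container.
double CalcSum(const std::vector<double>& values)
{
    double sum = 0.;
    for (std::vector<double>::const_iterator iter = values.begin(); iter != values.end(); ++iter)
    {
        sum += *iter; // "*iter = 0.;" would not compile, which is the point of const_iterator.
    }
    return sum;
}
```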

Function Design

Arguments and Return

  • Arguments should be named and ordered meaningfully. Input arguments go first.

  • If there is no argument, use void.

  • Do not use default arguments.

  • For input pointer or reference arguments, add const.

  • For member functions that do not modify the class or structure, add const.

  • For non-native types, always use pass by reference or pointers.

  • Pass by reference is preferred.

  • Pass by pointer is only used when necessary.

  • Functions should not return a complicated class or structure. Use output arguments to do that.

Examples of function arguments and return design.
// "source_array" and "target_array" are meaningful.
// If one writes "CopyArray(int num, const double* array1, double* array2)", it will take a moment
// to distinguish which one is the source or the target.
// Note that there is a "const" before "source_array".
void CopyArray(int num, const double* source_array, double* target_array);

// Note that there is a "const" after the function.
// Do not lose "void".
const string& GetName(void) const;

// Do not use "vector<double> coords" (pass by value)!
// One can also use vector<double>* coords, but reference is always preferred.
void Rotate(vector<double>& coords, double angle);

// Do not use "vector<double> GeneratePoints(int num);"!
void GeneratePoints(int num, vector<double>& points);
  • If there are many arguments, consider designing a class or structure for the input or output arguments.

Examples of a function containing many arguments.
// This is a bad design!
void Parse(const string& mass_fn, const string& position_fn, const string& velocity_fn,
           const string& experimental_condition_fn, vector<double>& masses, vector<double>& positions,
           vector<double>& velocities, double& temperature, double& pressure);

// This is a good design.
struct ParseInputArgs {
    string mass_fn;
    string position_fn;
    string velocity_fn;
    string experimental_condition_fn;
};
struct ParseOutputArgs {
    vector<double> masses;
    vector<double> positions;
    vector<double> velocities;
    double         temperature; // A plain value: the structure itself is passed by reference.
    double         pressure;
};
void Parse(const ParseInputArgs& input_args, ParseOutputArgs& output_args);
  • Function arguments should have the highest compatibility, especially for low-level functions.

Examples of function argument compatibility.
// Bad compatibility. Can only be used for "vector".
void Add(int num, const vector<double>& a, const vector<double>& b, vector<double>& c)
{
    for (int i = 0; i < num; ++i)
    {
        c[i] = a[i] + b[i];
    }
}

// Better compatibility.
void Add(int num, const double* a, const double* b, double* c)
{
    for (int i = 0; i < num; ++i)
    {
        c[i] = a[i] + b[i];
    }
}
const int num = 100;
vector<double> a(num, 1.);
vector<double> b(num, 2.);
vector<double> c(num, 0.);
// For "vector<double>", their addresses can be obtained by ".data()".
Add(num, a.data(), b.data(), c.data());

Inside Function

  • Do not use static local variables unless necessary!

  • Do not use inline functions.

  • Each function should focus on one thing!

Numbers

  • Never use unsigned types like unsigned int, unless third-party functions require them.

  • For assignment of floating-point numbers, always use . or E.

  • For high-precision floating-point constants, 25 decimal places are sufficient.

Examples of floating-point numbers.
double x        = 5.93;
double y        = -1.; // Do not use "y = -1"!
double z        = 5.E-5;
const double PI = 3.1415926535897932384626433; // 25 decimals.
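The guide does not state its motivation for always writing . or E, but one practical benefit is that it avoids accidental integer arithmetic. The sketch below uses a hypothetical function, CalcKineticEnergy, to show the difference.

```cpp
// Hypothetical function: kinetic energy 0.5 * m * v^2.
double CalcKineticEnergy(double mass, double velocity)
{
    // "1 / 2" is an integer division that gives 0, so the result would always be 0.
    // Writing "1. / 2." makes both operands floating-point numbers.
    const double half = 1. / 2.;
    return half * mass * velocity * velocity;
}
```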

Forbidden C++ Features

  • Do not use C++ exceptions.

  • Do not use class friends.

  • auto is only used for lambda functions.
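The sketch below shows one possible way to follow two of these rules. ParseInt reports failure through its return value instead of throwing, and CountNegatives uses auto only to hold a lambda. Both functions are hypothetical, and the bool-return convention is an assumption rather than a documented project pattern.

```cpp
#include <algorithm>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical function: parse "str" as a decimal integer.
// Returns true on success. On failure, returns false and leaves "value" untouched.
bool ParseInt(const std::string& str, int& value)
{
    char* end = nullptr;
    errno     = 0;
    const long parsed      = std::strtol(str.c_str(), &end, 10);
    const bool is_consumed = (end != str.c_str()) && (*end == '\0');
    const bool is_in_range = (errno != ERANGE) && (parsed >= INT_MIN) && (parsed <= INT_MAX);
    if (!is_consumed || !is_in_range)
    {
        return false;
    }
    value = static_cast<int>(parsed);
    return true;
}

// Hypothetical function: count the negative entries of "values".
int CountNegatives(const std::vector<int>& values)
{
    // A lambda has no nameable type, so "auto" is allowed here.
    auto is_negative = [](int value) { return value < 0; };
    return static_cast<int>(std::count_if(values.begin(), values.end(), is_negative));
}
```

The caller checks the returned bool explicitly, for example if (!ParseInt(text, num_atoms)), and decides how to report the error.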

Comments

  • In this project, Doxygen is used for documentation.

  • Comments should be as readable as narrative text. Important information or algorithms that are not easy for others to understand should be commented in detail. If necessary, use LaTeX code in \f$...\f$ or \f[...\f] to give mathematical details.

  • In all files (header and source files), authors must be listed at the beginning in the following Doxygen format:

Format of commenting authors.
/**
 * @file
 * @author
 * - Isaac Newton
 * - Leonhard Euler
 */
  • In header files, all classes, structures, typedef, and enum must be commented with @brief and @details (optional) in the following format:

Format of commenting classes, structures, typedef, and enum.
/**
 * @brief A collection of geometrical operations.
 * @details
 * If not necessary, details can be omitted.
 */
class SpaceTransformation {
public:
    /**
     * @brief A pre-computed data structure of \ref Cluster::RigidCoord.
     */
    class ProcessedRigidCoord {
        // ...
    };
    // ...
};
  • In header files, all functions must be commented with @brief. All parameters must be commented with @param (with proper [in], [out], or [in,out]). Return values must be commented with @return in the following format. If necessary, @details, @note, and @warning can be given.

Format of commenting functions.
/**
 * @brief Calculate space-fixed coordinates.
 * @details
 * Note that each molecule has its own rigid coordinate, so they only have 6*3*number of atoms in the molecule
 * components.
 * @param[out] x The space-fixed coordinates: x0, y0, z0, x1, y1, z1, ...
 * @param[out] dxdrc If it is not \c nullptr, the derivatives of \p x over rigid coordinates will be calculated:
 * dx0/dphiZ1, dx0/dphiY2, dx0/dphiZ3, dx0/dmX, dx0/dmY, dx0/dmZ, dy0/dphiZ1, ...
 * @warning
 * - The size of \p x should be at least 3*\c GetNumAllAtoms.
 * - The size of \p dxdrc should be at least 3*6*\c GetNumAllAtoms if it is not \c nullptr.
 */
void CalcSpaceFixedCoords(double* x, double* dxdrc) const;
  • In header files, all class, structure, and enum members must be commented with ///<:

Format of commenting class, structure, and enum members.
/**
 * @brief Possible ways of placement.
 */
enum Way {
    fix,     ///< See \ref ParseFix for details.
    box,     ///< See \ref ParseBox for details.
};

string note;                ///< Some human-understandable description of how to place a component.
Way    way;                 ///< How to place a component.
int    integer_params[32];  ///< The integer parameters for placing the molecules. Their meanings depend on \ref
                            ///< way.
  • Use @ref, \p, \c, @code ... @endcode whenever possible and necessary.
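As a sketch of how a formula can be embedded with \f[...\f] and \f$...\f$, the comment below documents a Lennard-Jones energy function. The standard 12-6 form is assumed here. The guide declares a function with this signature but does not show its body, so the implementation is only an illustration.

```cpp
#include <cmath>

/**
 * @brief Calculate the Lennard-Jones energy of a pair of particles.
 * @details
 * The 12-6 form is assumed:
 * \f[ E(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] \f]
 * @param[in] r The distance between the two particles. It must be positive.
 * @param[in] epsilon The depth of the potential well, \f$\epsilon\f$.
 * @param[in] sigma The distance at which the energy is zero, \f$\sigma\f$.
 * @return The energy, in the same unit as \p epsilon.
 */
double CalcLennardJonesEnergy(double r, double epsilon, double sigma)
{
    const double sr   = sigma / r;
    const double sr2  = sr * sr;
    const double sr6  = sr2 * sr2 * sr2;
    const double sr12 = sr6 * sr6;
    return 4. * epsilon * (sr12 - sr6);
}
```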

Exceptions

In real coding projects, there may be cases in which deviating from the rules above significantly improves the performance of the program. In such cases, one can consider breaking the rules. But please make sure that this is absolutely necessary!