# Free energy calculation

### A simple sampling of a free energy barrier using WHAM

**The objective of this tutorial** is to measure the free
energy profile across a barrier potential using two methods:
free sampling and umbrella sampling. For the sake of simplicity
and in order to reduce computation time, the barrier potential
will be imposed artificially to the atoms, see the image below.
The procedure is valid for more complex systems.

Left: potentiel \(U\) applied to the atoms to create an
exclusion zone in the middle. Right: force \(F\) derivative of \(U\).

There are two main parts to this tutorial:

- Free sampling - First, the free energy will be calculated using the relatively intuitive free sampling method.
- Umbrella sampling - Second, the more advanced method called umbrella sampling will be used.

If you are new to LAMMPS, I recommend you to follow this simpler tutorial first.

Click here if you are looking for help with your project, if you want to support me (for free or not), or if you have any suggestions for these tutorials.

## Method 1: Free sampling

**Introduction**

One way to calculate the free energy profile is to extract the
partition function from a classic (unbiased) molecular dynamics
simulation, and then to estimate the Gibbs free energy using
\[\Delta G = -RT \ln(p/p_0),\]
where \(\Delta G\) is the free energy difference, \(R\) the
gas constant, \(T\) the temperature, \(p\) the
pressure, and \(p_0\) the reference pressure.

As an illustration, let us apply this method to an extremely
simple configuration that consists in a few particles diffusing
in a box in presence of a position-dependent repealing force that
makes the centre of the box a relatively unfavourable area
to explore.

**Basic LAMMPS parameters**

Create an input script, and copy the following lines:

```
variable sigma equal 3.405 # Angstrom
variable epsilon equal 0.238 # Kcal/mol
variable U0 equal 2*${epsilon} # Kcal/mol
variable dlt equal 0.5 # Angstrom
variable x0 equal 5 # Angstrom
# --------------------- initialise the simulation
units real
atom_style atomic
pair_style lj/cut 3.822 # 2^(1/6) * 3.405 WCA potential
pair_modify shift yes
boundary p p p
```

Here we start by defining variables for the Lennard-Jones interaction (\(\sigma\) and \(\epsilon\))
and for the repulsive potential \(U (x)\) : \(U_0\), \(\delta\), and \(x_0\), see the analytical expression
below and the plot at the top of this page.

If you followed tutorial 1, you
must be familiar with these commands. The system
of unit 'real' (for which energy is
in kcal/mol, distance in Ã…ngstrom, time in femtosecond)
has been chosen for practical reason, as the
WHAM algorithm we are going to use in the second part of the
tutorial automatically assumes the energy to be in kcal/mol.
Atoms will interact through a Lennard-Jones potential with
a cut-off equal to \( \sigma \times 2 ^ {1/6} \) (i.e. a WCA repulsive potential).
The potential is shifted to be equal to 0 at the cut-off using the pair_modify.

**System creation and settings**

Let us define the simulation block and randomly add atoms:

```
# --------------------- define the system
region myreg block -25 25 -20 20 -20 20
create_box 1 myreg
create_atoms 1 random 60 341341 myreg
# --------------------- settings
mass * 39.95
pair_coeff * * ${epsilon} ${sigma}
neigh_modify every 1 delay 4 check yes
```

Argon has been chosen as the gas of interest, which explains the values of the Lennard-Jones parameters \(\sigma\) and \(\epsilon\), as well as the mass \(m = 39.95\) grams/mole. The variables \(U_0\), \(\delta\), and \(x_0\) are used to create the potential. I have chosen it to be of the form: \[U(x) / U_0 = \arctan \left( \dfrac{x+x_0}{\delta} \right)- \arctan \left( \dfrac{x-x_0}{\delta} \right),\] (see the image at the beginning of this page). From the derivative of the potential with respect to \(x\), we obtain the expression for the force that we are going to impose to the particles in the simulation, \[F(x)=U_0/((x-x_0)^2/\delta^2+1)/\delta-U_0/((x+x_0)^2/\delta^2+1)/\delta.\]

**Energy minimization and equilibration**

Let us minimize the energy, and then impose \(F(x)\) to all of the atoms in the simulation using the 'addforce' command:

```
# --------------------- Run
minimize 1e-4 1e-6 100 1000
reset_timestep 0
variable U atom ${U0}*atan((x+${x0})/${dlt})-${U0}*atan((x-${x0})/${dlt})
variable F atom ${U0}/((x-${x0})^2/${dlt}^2+1)/${dlt}-${U0}/((x+${x0})^2/${dlt}^2+1)/${dlt}
fix myadf all addforce v_F 0.0 0.0 energy v_U
```

Finally, let us combine the fix nve with a Langevin thermostat to run a molecular dynamics simulation. With these two commands, the MD simulation is effectively in the NVT ensemble (constant number of atoms \(N\), constant volume \(V\), and constant temperature \(T\)). Let us perform an equilibration step of 2000000 timestep (4 nanoseconds). To make sure that 4 ns is long enough, let us record the evolution of the number of atoms in the central (energetically unfavorable) region called 'mymes':

```
fix mynve all nve
fix mylgv all langevin 119.8 119.8 100 1530917
region mymes block -5 5 INF INF INF INF
variable n_center equal count(all,mymes)
fix myat all ave/time 100 500 50000 v_n_center file density_evolution.dat
timestep 2.0
thermo 100000
run 2000000
```

**Run and data aquisition**

Finally, let us record the density profile of the atoms along the \(x\) axis using the 'ave/chunk' command. A total of ten density profiles will be printed. Timestep is reset to 0 to synchronise with the output times of density/number, and the fix 'myat' is canceled (it has to me canceled before a reset time).

```
unfix myat
reset_timestep 0
compute cc1 all chunk/atom bin/1d x 0.0 1.0
fix myac all ave/chunk 10 100000 1000000 cc1 density/number file density_profile_10run.dat
dump mydmp all atom 100000 dump.lammpstrj
run 10000000
```

The simulation needs a few minutes to complete. You can visualize the dump file using VMD.

**Data analysis**

First, let us make sure that the equilibration duration of 4 ns is long enough by looking at the 'density_evolution.dat' file (left panel):

Left: evolution of the number of atoms in the central region
during equilibration. Right: averaged density profile during the run (\( \rho_0 = 0.0011 \)).

Here we can clearly see that the number of atom in the central region quickly evolves to an equilibrium value, and that the equilibration is long enough. We can then plot the averaged density profile (full line) and error (dashed line) (right panel). I used \(\rho_0 = 0.0011 \) to get the partiion function \(\rho / \rho_0\).

Then, let us plot \(-R T \ln(\rho/\rho_0)\) and compare with the imposed (reference) potential \(U\). The results show a good agreement with the expected energy profile (despite a bit of noise in the central part):

Left: calculated potential after 1 run (2 ns).
Right:calculated potential after 10 runs (20 ns).

**The limits of free sampling**

If we increase the value of \(U_0\), the average number of atoms in the central region will decrease, making it difficult to obtain a good resolution for the free energy profile. For instance, multiplying \(F\) by a factor of 5, one gets an average concentration \( \rho \sim 0\) in the central part, which makes it impossible to estimate \(U\) (unless running the simulation for a much longer time (possibly days)):

Results for larger value of \(F\). Left: averaged density profile 10 run (20 ns).
Right:calculated potential after 10 runs (20 ns).

In that case, it is necessary to use more evolved methods, such as umbrella sampling, to extract free energy profiles. This is what we are going to do next.

## Method 2: Umbrella sampling

**Introduction**

Umbrella sampling is a 'biased molecular dynamics' method, i.e a method in which additional forces are added to the atoms in order to make the 'unfavourable states' more likely to be explored: here, we are going to force a single atom to explore the central region. Starting from the same system as previously, we are going to add a potential \(V\) to one of the particle, and force it to move along the axe \(x\). The chosen path is called the axe of reaction. The final simulation will be analysed using the weighted histogram analysis method (WHAM), which allows to remove the effect of the bias and eventually deduce the unbiased free energy profile.

**LAMMPS input script**

Create a new script and copy the following lines:

```
# --------------------- define a bunch of variables
variable sigma equal 3.405 # Angstrom
variable epsilon equal 0.238 # Kcal/mol
variable U0 equal 2*${epsilon} # Kcal/mol
variable dlt equal 0.5 # Angstrom
variable x0 equal 5.0 # Angstrom
variable k equal 0.0205 # Kcal/mol/Angstrom^2
# --------------------- initialise the simulation
units real
atom_style atomic
pair_style lj/cut 3.822 # 2^(1/6) * 3.405 WCA potential
pair_modify shift yes
boundary p p p
# --------------------- define the system
region myreg block -25 25 -20 20 -20 20
create_box 2 myreg
create_atoms 2 single 0 0 0
create_atoms 1 random 5 341341 myreg
# --------------------- settings
mass * 39.948
pair_coeff * * ${epsilon} ${sigma}
neigh_modify every 1 delay 4 check yes
group topull type 2
# --------------------- run
variable U atom ${U0}*atan((x+${x0})/${dlt})-${U0}*atan((x-${x0})/${dlt})
variable F atom ${U0}/((x-${x0})^2/${dlt}^2+1)/${dlt}-${U0}/((x+${x0})^2/${dlt}^2+1)/${dlt}
fix pot all addforce v_F 0.0 0.0 energy v_U
fix mynve all nve
fix mylgv all langevin 119.8 119.8 100 1530917
timestep 2.0
thermo 100000
run 2000000
reset_timestep 0
dump mydmp all atom 1000000 dump.lammpstrj
```

**Explanations:** This code resembles the one of Method 1, except for the
additional particle of type 2. This particle is identical to the
particles of type 1 (same mass and Lennard-Jones parameters), and will be the only one to feel the
biasing potential.

Let us create a loop with 67 steps, and move
progressively the centre of the bias potential by increment of 0.3 nm:

```
variable a loop 50
label loop
variable xdes equal ${a}-25
variable xave equal xcm(topull,x)
fix mytth topull spring tether ${k} ${xdes} 0 0 0
run 200000
fix myat1 all ave/time 10 10 100 v_xave v_xdes file position.${a}.dat
run 200000
unfix myat1
next a
jump SELF loop
```

**Explanations:** The spring command serves to impose the
additional harmonic potential with spring constant \(k\).
The centre of the harmonic potential \(x_\text{des}\) successively
takes values from -25 to 25. For each value of \(x_\text{des}\),
an equilibration step of 400 ps is performed, followed by a
step of 400 ps during which the position along \(x\) of the particle
is saved in data files (one data file per value of \(x_\text{des}\)).
You can increase the duration of the run for better samplings, but 0.4 ps
returns reasonable results despite being really fast (no more than a few minutes).

Hint: the difficult part is to choose the value of \(k\). You want the biasing potential to be strong enough to force the atom to move along the axis, and you also want the fluctuations of the atom position to be large enough to have some overlap in the density probability of two neighbor positions, like we have here.

Density probability for each run. Note the good overlapping.

**WHAM algoritm**

In order to treat the data, we are going to use the WHAM algorithm. You can download and compile the version of Alan Grossfield. In order to apply the WHAM algorithm to our simulation, we first need to create a metadata file. This file simply contains the paths of the data file, the value of \(x_\text{des}\), and the values of \(k\). To generate the file more easily, you can run this script using Octave or Matlab (assuming that the wham algorithm is located in the same folder as the LAMMPS simulations)

```
file=fopen('metadata.dat','wt');
for a=1:50
X=['./position.',num2str(a),'.dat ',num2str(a-25),' 1.5'];
fprintf(file,X);
fprintf(file,'\n');
end
```

The generated file named metadata.dat looks like that:

```
./position.1.dat -24 1.5
./position.2.dat -23 1.5
./position.3.dat -22 1.5
./position.4.dat -21 1.5
./position.5.dat -20 1.5
(...)
./position.48.dat 23 1.5
./position.49.dat 24 1.5
./position.50.dat 25 1.5
```

Then, simply run the following command

`./wham -25 25 50 1e-8 119.8 0 metadata.dat PMF.dat`

where -25 and 25 are the boundaries, 50 the number of bins, 1e-8 the tolerance, and 119.8 the temperature. A file named PMF.dat has been created, and contains the free energy profile in Kcal/mol.

**Results**

We can compare the PMF with we the imposed potential, and the agreement in again quite good despite the very short calculation time:

Potential as predicted from umbrella sampling for different run durations.
Note that their is a small
difference in width between the calculated and the imposed potential, I am not
sure why, may be this can be improved by using different values of \(k\).

## Going further with exercises

Request the solutions by email, or register here and access all the solutions + additional LAMMPS content.

**Exercise 1: Monte Carlo versus molecular dynamics**

Use a Monte Carlo procedure to equilibrate the system instead of molecular dynamics. Is it more efficient than molecular dynamics?

Hint: pure MC move can be made using the fix gcmc.

**Exercise 2 : Binary fluid simulation**

Create a molecular simulation with two species, and apply a different potential on them using addforce to create the following situation:

**Exercise 3 : Adsorption energy of a molecule at a solid wall**

Apply umbrella sampling to calculate the free energy profile of a molecule of your choice normally to a solid wall.

Hint: A non-equilibrated system contraining a few ethanol molecules can be downloaded: here (the method to create this system using molecules structures downloaded from the ATB is given in the solution to the exercice)

Click here if you are looking for help with your project, if you want to support me (for free or not), or if you have any suggestions for these tutorials.