Free energy calculation

A simple sampling of a free energy barrier using WHAM


Responsive image

The objective of this tutorial is to measure the free energy profile across a barrier potential using two methods: free sampling and umbrella sampling. For the sake of simplicity and in order to reduce computation time, the barrier potential will be imposed artificially to the atoms, see the image below. The procedure is valid for more complex systems.

Responsive image
Left: potentiel \(U\) applied to the atoms to create an exclusion zone in the middle. Right: force \(F\) derivative of \(U\).

There are two main parts to this tutorial:

If you are new to LAMMPS, I recommend you to follow this simpler tutorial first.
Contact me if you have any question about these tutorials (or about your own project).
You can support the creation of content for LAMMPS for 1 euro per month (and access extra content and tips).

Method 1: Free sampling

Introduction

One way to calculate the free energy profile is to extract the partition function from a classic (unbiased) molecular dynamics simulation, and then to estimate the Gibbs free energy using \[\Delta G = -RT \ln(p/p_0),\] where \(\Delta G\) is the free energy difference, \(R\) the gas constant, \(T\) the temperature, \(p\) the pressure, and \(p_0\) the reference pressure.

As an illustration, let us apply this method to an extremely simple configuration that consists in a few particles diffusing in a box in presence of a position-dependent repealing force that makes the centre of the box a relatively unfavourable area to explore.



Basic LAMMPS parameters

Create an input script, and copy the following lines:

variable		sigma equal 3.405 # Angstrom
variable		epsilon equal 0.238 # Kcal/mol
variable		U0 equal 2*${epsilon} # Kcal/mol
variable		dlt equal 0.5 # Angstrom
variable		x0 equal 5  # Angstrom

# --------------------- initialise the simulation
units			real
atom_style		atomic
pair_style		lj/cut 3.822 # 2^(1/6) * 3.405 WCA potential
pair_modify 		shift yes
boundary		p p p

Here we start by defining variables for the Lennard-Jones interaction (\(\sigma\) and \(\epsilon\)) and for the repulsive potential \(U (x)\) : \(U_0\), \(\delta\), and \(x_0\), see the analytical expression below and the plot at the top of this page.

If you followed tutorial 1, you must be familiar with these commands. The system of unit 'real' (for which energy is in kcal/mol, distance in Ã…ngstrom, time in femtosecond) has been chosen for practical reason, as the WHAM algorithm we are going to use in the second part of the tutorial automatically assumes the energy to be in kcal/mol. Atoms will interact through a Lennard-Jones potential with a cut-off equal to \( \sigma \times 2 ^ {1/6} \) (i.e. a WCA repulsive potential). The potential is shifted to be equal to 0 at the cut-off using the pair_modify.



System creation and settings

Let us define the simulation block and randomly add atoms:

# --------------------- define the system
region			myreg block -25 25 -20 20 -20 20
create_box		1 myreg
create_atoms		1 random 60 341341 myreg

# --------------------- settings
mass			* 39.95
pair_coeff		* * ${epsilon} ${sigma}
neigh_modify		every 1 delay 4 check yes

Argon has been chosen as the gas of interest, which explains the values of the Lennard-Jones parameters \(\sigma\) and \(\epsilon\), as well as the mass \(m = 39.95\) grams/mole. The variables \(U_0\), \(\delta\), and \(x_0\) are used to create the potential. I have chosen it to be of the form: \[U(x) / U_0 = \arctan \left( \dfrac{x+x_0}{\delta} \right)- \arctan \left( \dfrac{x-x_0}{\delta} \right),\] (see the image at the beginning of this page). From the derivative of the potential with respect to \(x\), we obtain the expression for the force that we are going to impose to the particles in the simulation, \[F(x)=U_0/((x-x_0)^2/\delta^2+1)/\delta-U_0/((x+x_0)^2/\delta^2+1)/\delta.\]



Energy minimization and equilibration

Let us minimize the energy, and then impose \(F(x)\) to all of the atoms in the simulation using the 'addforce' command:

# --------------------- Run
minimize		1e-4 1e-6 100 1000
reset_timestep		0

variable		U atom ${U0}*atan((x+${x0})/${dlt})-${U0}*atan((x-${x0})/${dlt})
variable		F atom ${U0}/((x-${x0})^2/${dlt}^2+1)/${dlt}-${U0}/((x+${x0})^2/${dlt}^2+1)/${dlt}
fix 			myadf all addforce v_F 0.0 0.0 energy v_U

Finally, let us combine the fix nve with a Langevin thermostat to run a molecular dynamics simulation. With these two commands, the MD simulation is effectively in the NVT ensemble (constant number of atoms \(N\), constant volume \(V\), and constant temperature \(T\)). Let us perform an equilibration step of 2000000 timestep (4 nanoseconds). To make sure that 4 ns is long enough, let us record the evolution of the number of atoms in the central (energetically unfavorable) region called 'mymes':

fix			mynve all nve
fix			mylgv all langevin 119.8 119.8 100 1530917

region			mymes block -5 5 INF INF INF INF 
variable		n_center equal count(all,mymes)
fix			myat all ave/time 100 500 50000 v_n_center file density_evolution.dat

timestep		2.0
thermo			100000
run			2000000



Run and data acquisition

Finally, let us record the density profile of the atoms along the \(x\) axis using the 'ave/chunk' command. A total of ten density profiles will be printed. Timestep is reset to 0 to synchronize with the output times of density/number, and the fix 'myat' is canceled (it has to me canceled before a reset time).

unfix			myat
reset_timestep		0

compute 		cc1 all chunk/atom bin/1d x 0.0 1.0
fix 			myac all ave/chunk 10 100000 1000000 cc1 density/number file density_profile_10run.dat
dump 			mydmp all atom 100000 dump.lammpstrj

run			10000000

The simulation needs a few minutes to complete. You can visualize the dump file using VMD.



Data analysis

First, let us make sure that the equilibration duration of 4 ns is long enough by looking at the 'density_evolution.dat' file (left panel):

Responsive image
Left: evolution of the number of atoms in the central region during equilibration. Right: averaged density profile during the run (\( \rho_0 = 0.0011 \)).

Here we can clearly see that the number of atom in the central region quickly evolves to an equilibrium value, and that the equilibration is long enough. We can then plot the averaged density profile (full line) and error (dashed line) (right panel). I used \(\rho_0 = 0.0011 \) to get the partition function \(\rho / \rho_0\).


Then, let us plot \(-R T \ln(\rho/\rho_0)\) and compare with the imposed (reference) potential \(U\). The results show a good agreement with the expected energy profile (despite a bit of noise in the central part):

Responsive image
Left: calculated potential after 1 run (2 ns). Right:calculated potential after 10 runs (20 ns).



The limits of free sampling

If we increase the value of \(U_0\), the average number of atoms in the central region will decrease, making it difficult to obtain a good resolution for the free energy profile. For instance, multiplying \(F\) by a factor of 5, one gets an average concentration \( \rho \sim 0\) in the central part, which makes it impossible to estimate \(U\) (unless running the simulation for a much longer time (possibly days)):

Responsive image
Results for larger value of \(F\). Left: averaged density profile 10 run (20 ns). Right:calculated potential after 10 runs (20 ns).

In that case, it is necessary to use more evolved methods, such as umbrella sampling, to extract free energy profiles. This is what we are going to do next.

Method 2: Umbrella sampling


Introduction

Umbrella sampling is a 'biased molecular dynamics' method, i.e a method in which additional forces are added to the atoms in order to make the 'unfavourable states' more likely to be explored: here, we are going to force a single atom to explore the central region. Starting from the same system as previously, we are going to add a potential \(V\) to one of the particle, and force it to move along the axe \(x\). The chosen path is called the axe of reaction. The final simulation will be analysed using the weighted histogram analysis method (WHAM), which allows to remove the effect of the bias and eventually deduce the unbiased free energy profile.



LAMMPS input script

Create a new script and copy the following lines:

# --------------------- define a bunch of variables
variable		sigma equal 3.405 # Angstrom
variable		epsilon equal 0.238 # Kcal/mol
variable		U0 equal 2*${epsilon} # Kcal/mol
variable		dlt equal 0.5 # Angstrom
variable		x0 equal 5.0  # Angstrom
variable		k equal 0.0205 # Kcal/mol/Angstrom^2

# --------------------- initialise the simulation
units			real
atom_style		atomic
pair_style		lj/cut 3.822 # 2^(1/6) * 3.405 WCA potential
pair_modify 		shift yes
boundary		p p p

# --------------------- define the system
region			myreg block -25 25 -20 20 -20 20
create_box		2 myreg
create_atoms		2 single 0 0 0
create_atoms		1 random 5 341341 myreg

# --------------------- settings
mass			* 39.948
pair_coeff		* * ${epsilon} ${sigma}
neigh_modify		every 1 delay 4 check yes
group			topull type 2

# --------------------- run
variable		U atom ${U0}*atan((x+${x0})/${dlt})-${U0}*atan((x-${x0})/${dlt})
variable		F atom ${U0}/((x-${x0})^2/${dlt}^2+1)/${dlt}-${U0}/((x+${x0})^2/${dlt}^2+1)/${dlt}
fix 			pot all addforce v_F 0.0 0.0 energy v_U

fix			mynve all nve
fix			mylgv all langevin 119.8 119.8 100 1530917
timestep		2.0
thermo			100000
run			2000000
reset_timestep		0

dump 			mydmp all atom 1000000 dump.lammpstrj

Explanations: This code resembles the one of Method 1, except for the additional particle of type 2. This particle is identical to the particles of type 1 (same mass and Lennard-Jones parameters), and will be the only one to feel the biasing potential.

Let us create a loop with 67 steps, and move progressively the centre of the bias potential by increment of 0.3 nm:

variable 		a loop 50
label 			loop
variable		xdes equal ${a}-25
variable		xave equal xcm(topull,x)
fix			mytth topull spring tether ${k} ${xdes} 0 0 0
run			200000
fix			myat1 all ave/time 10 10 100 v_xave v_xdes file position.${a}.dat
run			200000
unfix			myat1
next 			a
jump 			SELF loop

Explanations: The spring command serves to impose the additional harmonic potential with spring constant \(k\). The centre of the harmonic potential \(x_\text{des}\) successively takes values from -25 to 25. For each value of \(x_\text{des}\), an equilibration step of 400 ps is performed, followed by a step of 400 ps during which the position along \(x\) of the particle is saved in data files (one data file per value of \(x_\text{des}\)). You can increase the duration of the run for better samplings, but 0.4 ps returns reasonable results despite being really fast (no more than a few minutes).

The difficult part is to choose the value of \(k\). You want the biasing potential to be strong enough to force the atom to move along the axis, and you also want the fluctuations of the atom position to be large enough to have some overlap in the density probability of two neighbor positions, like we have here.

Responsive image
Density probability for each run. Note the good overlapping between neighbor distributions.



WHAM algorithm

In order to treat the data, we are going to use the WHAM algorithm. You can download and compile the version of Alan Grossfield. In order to apply the WHAM algorithm to our simulation, we first need to create a metadata file. This file simply contains the paths of the data file, the value of \(x_\text{des}\), and the values of \(k\). To generate the file more easily, you can run this script using Octave or Matlab (assuming that the wham algorithm is located in the same folder as the LAMMPS simulations)

file=fopen('metadata.dat','wt');
for a=1:50
	X=['./position.',num2str(a),'.dat ',num2str(a-25),' 1.5'];
	fprintf(file,X);
	fprintf(file,'\n');
end

The generated file named metadata.dat looks like that:

./position.1.dat -24 1.5
./position.2.dat -23 1.5
./position.3.dat -22 1.5
./position.4.dat -21 1.5
./position.5.dat -20 1.5
(...)
./position.48.dat 23 1.5
./position.49.dat 24 1.5
./position.50.dat 25 1.5

Then, simply run the following command

./wham -25 25 50 1e-8 119.8 0 metadata.dat PMF.dat

where -25 and 25 are the boundaries, 50 the number of bins, 1e-8 the tolerance, and 119.8 the temperature. A file named PMF.dat has been created, and contains the free energy profile in Kcal/mol.



Results

We can compare the PMF with we the imposed potential, and the agreement in again quite good despite the very short calculation time:

Responsive image
Potential as predicted from umbrella sampling for different run durations. Note that their is a small difference in width between the calculated and the imposed potential, I am not sure why, may be this can be improved by using different values of \(k\).

Going further with exercises


Request the solutions by email, or become a patreon for 1 euro per month and access all the solutions + additional LAMMPS content.


Exercise 1: Monte Carlo versus molecular dynamics

Use a Monte Carlo procedure to equilibrate the system instead of molecular dynamics. Is it more efficient than molecular dynamics?

Pure MC move can be made using the fix gcmc.


Exercise 2 : Binary fluid simulation

Create a molecular simulation with two species, and apply a different potential on them using addforce to create the following situation:

Responsive image



Exercise 3 : Adsorption energy of a molecule at a solid wall

Apply umbrella sampling to calculate the free energy profile of a molecule of your choice normally to a solid wall.

A non-equilibrated system containing a few ethanol molecules can be downloaded: here (the method to create this system using molecules structures downloaded from the ATB is given in the solutions to the exercises)
Contact me if you have any question about these tutorials (or about your own project).