Molecular simulation checklist

The objective of this page is to provide a non-exhaustive list of tasks that must be validated to ensure the reliability of a molecular simulation. This page contains 3 categories:

  • Before- What to do when preparing a simulation.
  • During - What to look at when the simulation is running.
  • After - What are the indication that the simulations ran smoothly, or not.

If you are new to LAMMPS, this website is for you, see for example this tutorial for beginners.

Click here if you are looking for help with your project, if you want to support me (for free or not), or if you have any suggestions for these tutorials.

Before starting the simulation

What is your physical question?

It is probably an obvious one, but it is much better to have a clear physical question in mind before starting your simulations, as it will help you designing your system, choose what observable to extract, etc. Of course, sometimes you don't have a clear question from the start, and its OK too. In this case you can just run the molecular dynamics simulation and analyse the trajectories post-mortem to decide what data you want to use.

Do you have the right force field?

The force field is a crutial aspect of a molecular simulation. It governs how the atoms interact with each others, how a molecule deforms, what crystal structure a solid adopts at a given temperature, etc. Reviewers often ask you to justify your choice of force field, and you better have an answer. You will have to read a few publications to identify the best choice for you. Generally, you choose a force field because it reproduces a relevant experimental result. If you can't find a force field calibrated from an experimental result that is relevant for you, try to find one calibrated from quantum mechanics. Alternatively, you can create a new force field, or re-calibrate an existing one, but that's a lot of work.

Example. If you want to simulate the dielectric properties of water near an interface, the four point model TIP4P/\(\epsilon\) is probably a good choice for the water molecules, as it has been specifically calibrated to reproduce the dielectric constant of bulk water. Still, keep in mind that TIP4P/\(\epsilon\) was calibrated for bulk water, and you have no guaranty that it will work at interfaces. Sometimes, polarisable water model are a better choice, but they are also more computationally expensive, and will make it difficult for you to simulate the system you want (see the part about computational power below).

Is your timestep suitable?

The timestep if the time interval for which the simulation progresses. The proper value for the timestep usually comes with the force field. For traditional all-atoms molecular dynamics its usually 1 femto second, sometimes 2 fs. For complex force fields such as reaxff, for which bonds can break and form, the force between interacting atoms is nonlinear and a smaller timestep is needed, typically 0.1 fs. If needed, you can use smaller values, but never use a larger value for your timestep except if you have a good reason for that.

Do you have enough computational power?

Before starting a simulation, it is good to have an idea of the CPU time (duration of the simulation x number of CPU) that is needed. To make the estimate, you can run a test simulation for a few timesteps, and make a 'rule of 3' to estimate the required duration for the full simulation. Then, whether to run the full simulation or not depends on the infrastructure you have. A quick rule-of-thumb: if you only have your PC, it will be difficult to simulate systems larger than a few nanometers for more than 1-10 nanoseconds. If you have access to a super computer, you can probably go up to tens of nanometers and hundred of nanoseconds. If you don't have enough computational power, you better compromise, for example try to use a smaller system, or try a less demanding force field.

What thermodynamic ensemble is the more adapted?

There are plenty of ensembles that can be simulated, from the more classics like NVE (constant number of atoms, volume, and energy), NVT (constant number of atoms, volume, and temperature), and NPT (constant number of atoms, pressure, and temperature) to more exotic like \(\mu\)VT (constant chemical potential, volume, and temperature). To choose the best ensemble, it mostly comes to logic, and should arise from your physical question and systems. For instance, if what you want to measure is the pressure in your system, then NPT is probably not the best choice since with NPT you impose the pressure. If you have doubt, just look for systems similar as yours in the literature, or have a look at the great book by Frenkel and Smit : Understanding Molecular Simulation.

During the run

Does the system look good?

At the start of a new simulation, open your trajectory file with VMD to see if the system resembles what you think you imposed. Verify that atoms are not moving 'too much', that molecules have the expected shape, etc. If your system is periodic (e.g. bulk solid phase), use the periodic image option in representation, it helps spot misplaced atoms.

Does the system behave well?

From the thermodynamic values printed in the terminal, make sure that the temperature, volume, etc. are not behaving erratically, and that they have values oscillating near the desired values. The total energy is also a good indicator if you are doing NVT or NPT simulations: if your system started in a 'non-natural' configuration (for example with the molecules of your fluid disposed on a square lattice), then the energy should decrease at the start, then should reach a certain equilibrium.

What are the warnings?

Your simulation can run fine even if you have some warnings at the beginning of the simulation. If it is the case, try to understand why you have these warnings. Sometimes, warnings are normal and can be ignored, but other times they can be of great help to spot an issue with your code.

After the simulation is over

Was there dangerous builds?

If you see in the terminal or in the log file that there have been dangerous builds during the simulation, it means that some atoms moved too close to each other during the simulation. Either your timestep is too large, or your system is too far from equilibrium, resulting in large velocities of the atoms. You can try to understand the origin of the issue by having a look at the simulation trajectory using VMD. One solution can be to reduce the timestep, at least at the beginning of the simulation. Another solution is to make the recalculations of the neighbour lists more frequent, as in this tutorial.

Did your system reach equilibrium?

If you are performing equilibrium measurements (for example diffusion coefficient measurement from MSD), make sure that your system was indeed at equilibrium before the data acquisition started. You can make sure of that by extracting some data over time (for example the density in a bulk liquid simulation, or the RDF in a bulk solid simulation). Such data usually converge following an exponential-like curve versus the time, from which you can evaluate the duration required for your system to reach the equilibrium. If the data seems to never reach a plateau, may be longer simulations are needed, or a smaller system, or may be there is something really wrong in the dynamics of the atoms, like an unexpected external force, or a thermostat that is not doing what it is supposed to do.

Click here if you are looking for help with your project, if you want to support me (for free or not), or if you have any suggestions for these tutorials.