410-505 5.2 Statistical Physics: Ising Model and Classical Gas Fall 2011

Home   |   Syllabus   |   Tools   |   Topic 1   |   Topic 2   |   Topic 3   |   Topic 4   |   Topic 5

This Page:   Home   >>   Topic 5   >>   Application 2

Statistical Mechanics

Statistical mechanics models the behavior of systems containing a large number of similar components using the mathematics of probability. A detailed dynamical description of a particular system contains too much information to be useful. Statistical mechanics studies properties that have well defined values when averaged over a large number of similar systems called an ensemble.


A microstate is a detailed specification of the observable properties of a particular system. Each system in the ensemble can be in one of a fixed set of microstates when its properties are measured in detail. The most important property of a microstate is its energy , which has value in the -th microstate.


Suppose that detailed measurements are made on the systems in an ensemble. The probabilty of the -th microstate is defined

where is the number of systems observed to be in the -th microstate.

Average Values and Fluctuations

The average value of a property such as the energy is predicted to be

The observed energy values will be distributed around the average with a characteristic variance

The Canonical Ensemble

A canonical ensemble is a set of systems each of which has a fixed number of constituents in a container of fixed volume and which can exchange energy with a large thermal reservoir at fixed absolute temperature .

The probability of a microstate with energy in the canonical ensemble is given by the Boltzmann distribution

where is Boltzmann's constant, and

is the partition function of the ensemble.

The Microanonical Ensemble

A microcanonical ensemble is a set of systems each of which has a fixed number of constituents with fixed total energy in a container of fixed volume .

The probability of a microstate with energy in the microcanonical ensemble is

The Ising Model of Ferromagnetism

This idealized model of a magnetic material was assigned by Wilhelm Lenz to his student Ernst Ising\cite{ising-model} as a PhD thesis problem. Ising solved the one-dimensional model exactly in 1924 and found, to his disappointment, that it was not ferromagnetic.

The 2-dimensional model was investigated by Kramers and Wannier, Onsager, and others in the following years. It was found that the model had two phases, a ferromagnetic phase at low temperatures, and a paramagnetic phase at high temperatures. The two phases were separated by a second-order phase transition at the Curie temperature .

Two-Dimensional Ising Model of Ferromagnetism

Magnetism in matter is caused by charged particles moving in closed orbits or spinning around their axes. Recall that a current loop creates a magnetic field according to Ampere's law. A spinning charged particle has a magnetic moment associated with it.

A simple classical approximation to an atomic or electronic magnetic moment is provided by an Ising spin which can take two values

A two-dimensional magnet can be modeled by a set of spins located on a fixed two-dimensional lattice of sites. For example, we can have a square lattice with spins in the direction and in the direction such that .

The force between two magnets falls off rather rapidly, like . So a reasonable approximation is to assume that any spin interacts only with its 4 nearest neighbors to the north, south, east and west. The interaction energy can be approximated by

If the interaction strength the system is ferromagnetic: the energy is minimized if the spin point in the same direction . If the system is antiferromagnetic. represents an external magnetic field which couples to the magnetization

of the system. The spins prefer to line up with the magnetic field.

Ferromagnetism, Paramagnetism, and Curie Temperature

If , the system can exist in two different phases depending on the temperature . At low temperatures, the system is permanently magnetized. At sufficiently high temperatures, the magnetization of the system is zero. There is a critical value of the temperature called the Curie temperature at which a phase transition between the ferromagnetic (permanently magnetized) and paramagnetic phases occurs.

Monte Carlo Simulation

Monte Carlo simulations involve generating a subset of configurations or samples, chosen using a random algorithm from a configuration space, according to a probability distribution or weight function. Observables are then computed as averages over the samples.

Configuration or Sample

One sample or configuration of the magnet is a particular assignment of spin values, say

in which each spin is set either "up" or "down". According to statistical mechanics, the average value of an observable is got by weighting each configuration with the Boltzmann factor. For example, the average magnetization at some fixed temperature is given by

Configuration Space

The total number of configurations of the system is enormous even for small numbers of spins. For example if , , and

If we tried to enumerate the configurations at a billion per second on a very fast computer, it would take seconds = years to compute the average magnetization exactly!

Probability Distribution or Weight Function

The basic idea of a Monte Carlo calculation is to generate a reasonable number of configurations at random. The Boltzmann factor is an exponential function of energy which can vary enormously. The random configurations are therefore generated with probability determined by this exponential factor:

Monte Carlo Average

The problem now is to generate statistically independent configurations that are distributed according to the Boltzmann factor:

the average magnetization and energy are given by

Metropolis Algorithm for the Ising Model

How does one generate configurations distributed according to the Boltzmann factor? A very ingenious algorithm to do this was discovered by N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953). Applied to this model, the algorithm generates the next configuration in the sequence as follows:

Starting from some initial configuration, a suitable number of Monte Carlo steps are taken to allow the system to thermalize. After this the production phase begins and statistical averages are measured.

Efficient evaluation of Boltzmann factors

Computing the exponential factor at each Metropolis step is expensive. However, it is not hard to see that for a simulation at fixed and there are only 10 distinct values for . Pre-computing these 10 values at the start of the simulation saves the expense of computing the exponential thousands or millions of times during the simulation!

First consider the sum of the 4 neighboring spins:

This sum takes the following possible values:

Since , the product takes the same set of five values:

If the magnetic field , the term which takes two values needs to be taken into account. Thus there are 10 different values for .

These 10 values are pre-computed for fixed and , and are stored in a 2-dimensional array with 5 rows and 2 columns. Since array indices in Python and C++ must be , the first index of the array is computed as

and the second index of the array is computed as

Taking one Metropolis step

One of the spins is chosen at random and the Metropolis step is applied to it. If the spin is at one of the 4 edges of the lattice, then we use periodic boundary conditions to determine its neighbors.

One Monte-Carlo step per spin

A single Metropolis step will yield a configuration which is highly correlated with the current configuration. In Monte Carlo simulations, it is conventional to take at least Metropolis steps to generate the next configuration. This gives each spin the opportunity to change its state.

The program starts by getting input from the user. Thermalization steps are performed to let the system come to thermal equilibrium at the specified temperature. After thermalization, production steps are performed to take data, and accumulate averages.

Ferromagnetism below the Curie temperature

In 1941, H.A. Kramers and G.H. Wannier, Phys. Rev. 60, 252 (1941), used a duality argument to compute the exact value of the Curie temperature for the two dimensional Ising model on a square lattice. They showed that

In 1944, Lars Onsager, Phys. Rev. 65, 117 (1944), solved the model exactly in the thermodynamic limit and zero magnetic field . He showed, for example that the magnetization per spin

Close to the critical temperature

where is a critical exponent. Onsager's formula shows that for the 2-D Ising model.

The data show the magnetization per spin at , which is below the Curie temperature as a function of Monte Carlo step. The system appears to be a permanent magnet with negative magnetization. However, sudden reversals of the magnetization occur from time to time. It can be shown that a finite size system cannot have a permanent non-zero magnetization in zero magnetic field. Note also the fluctuations in the magnetization, which are a measure of the magnetic susceptibility of the system.

The next plot shows the magnetization per spin at which is above the Curie temperature. As expected, the average magnetization is zero, which shows that the system is a paramagnet.

Reducing critical slowing down

The Metropolis Monte Carlo method works very well in simulating the properties of the 2-D Ising model. However, close to the Curie temperature , the simulations suffer from critical slowing down, and more efficient algorithms are needed. These cluster algorithms are useful in many applications involving graphs and lattices, and they are also very interesting to study.

Critical divergences and spin-spin correlations

At the Curie temperature, some observables like the heat capacity per spin and magnetic susceptibility per spin become divergent (infinite) in the thermodynamic limit of an infinite system. These critical divergences are due to long range correlations between spins.

Consider two spins, at the origin of coordinates and at some other lattice site labeled by the index . The correlation between the pair of spins is defined to be

If the two spins are uncorrelated then this average will be zero or very small.

At the spins are all lined up and so

However, this is a somewhat trivial correlation because flipping will hardly affect if it is not a neighbor of .

Near , the situation is very different: the spins are constantly changing, but not independently; there are large domains (droplets) of parallel spins which persist for long periods of time. Thus, spins far apart from one another are strongly correlated.

At high temperatures, the spins fluctuate rapidly but almost independently of other spins.

To describe these properties of spin correlations, it is conventional to define the pair correlation function as

that is, by subtracting out the average values of the spins considered independently. Here we have defined the distance between the two spins

where is the lattice spacing of the 2-D square lattice. For a translationally invariant system, e.g., with the use of periodic boundary conditions on a finite lattice, the choice of is not significant: any other spin would do. The important variable is the distance between the two spins. For a large system, can considered to be a continuous variable.

The pair correlation function can be parametrized as follows for large :

where is the correlation length, is the dimensionality of space, and is a critical exponent ( for the 2-D Ising model).

The correlation length diverges at the critical temperature:

which accounts for long range spin correlations as approaches . In the 2-D Ising model, . At the correlation function decays algebraically:

Critical slowing down

The Ising model does not have dynamics built into it: there is no kinetic energy term associated with the spins . The Metropolis Monte Carlo method generates successive configurations of spins, but this does not represent the real time evolution of a system of spins.

In a real system, the dynamical variables are functions of time. An interesting quantity is the relaxation time, which is the time scale over which the system approaches equilibrium. If is a quantity which relaxes towards its equilibrium value , the the relaxation time can be theoretically as

Near the critical temperature, the relaxation time becomes very large and can be shown to diverge for an infinite system:

Here is the dynamical critical exponent associated with the observable . This phenomenon is called critical slowing down.

Autocorrelation time in Metropolis simulations

In a Metropolis simulation, the successive spin configurations also exhibit a type of critical slowing down near the phase transition temperature of the finite lattice. This is not the same as relaxation in a real system. However, it is useful to measure a relaxation time for the Metropolis ``dynamics'' because it helps to determine how many steps to skip in order to generate statistically independent configurations.

Recall that one Monte Carlo step per spin is taken conventionally to be Metropolis steps. If the correlation time is of the order of a single Monte Carlo step, then every configuration can be used in measuring averages. But if the correlation time is longer, then approximately Monte Carlo steps should be discarded between every data point.

The time autocorrelation function

where labels the Monte Carlo time step. If Monte Carlo steps separated in time by intermediate steps are truly uncorrelated, then should be zero (i.e., of where is the number of steps used in computing the averages ).

If the correlation function decays exponentially

then the exponential correlation time can be computed as the average

If the decay is exponential, then

This suggests another measure of correlation

which is called the integrated correlation time.

In Monte Carlo simulations, the autocorrelation time is often measured as the simulation is running:

The figure shows measurements of the energy and magnetization autocorrelation times for a lattice. Notice that the autocorrelation times become large near the critical temperature of the infinite system.

Codes to measure autocorrelations

Cluster Algorithms to Reduce Critical Slowing Down

Monte Carlo simulations close to a phase transition are affected by critical slowing down. In the 2-D Ising system, the correlation length becomes very large, and the correlation time , which measures the number of steps between independent Monte Carlo configurations behaves like

where the dynamic critical exponent for the Metropolis algorithm.

The maximum possible value for in a finite system of spins is , because cannot be larger than the lattice size! This implies that . This makes simulations difficult because the Metropolis algorithm time scales like , so the time to generate independent Metropolis configurations scales like . If the lattice size , the simulation time increases by a factor of 100.

There is a simple physical argument which helps understand why : The Metropolis algorithm is a local algorithm, i.e., one spin is tested and flipped at a time. Near the system develops large domains of correlated spins which are difficult to break up. So the most likely change in configuration is the movement of a whole domain of spins. But one Metropolis sweep of the lattice can move a domain at most by approximately one lattice spacing in each time step. This motion is stochastic, i.e., like a random walk. The distance traveled in a random walk scales like , so to move a domain a distance of order takes Monte Carlo steps.

This argument suggests that the way to speed up a Monte Carlo simulation near is to use a non-local algorithm.

Swendsen-Wang Cluster Algorithm

The essential idea of this algorithm suggested by R.H. Swendsen and J.-S. Wang, Phys. Rev. Lett. 58, 86 (1987), is to identify clusters of like spins and treat each cluster as a giant spin to be flipped according to a random criterion. It is necessary that the algorithm obey the detailed balance condition. Swendsen and Wang found the following algorithm based on ideas from percolation theory:

Freeze/delete bonds: The 2-D square lattice, periodic boundary conditions, has spins and bonds between spins. Construct a bond lattice as follows:

Note that constructing the bond lattice takes time of because there are bonds.

Cluster Decomposition: After the bond lattice has been set up, the spins are decomposed into clusters. A cluster is simply a domain of spins connected to one another by frozen bonds. The lattice obviously decomposes into clusters in a unique way, and the decomposition is a deterministic problem.

Cluster decomposition is potentially time consuming. A naive algorithm can take time of , so it is essential to use a decomposition algorithm that scales linearly with lattice size like Metropolis!

Spin Update: So far, constructing the bond lattice and identifying clusters has not changed any of the spins. The spins in each cluster are now ``frozen'' and the bonds between different clusters have been deleted. Each cluster is now updated by assigning a random new value to all of the spins simultaneously, i.e., generate a random deviate and flip all spins in that cluster if . Note that does not play a role in this flipping decision.

The spin update step scales like the number of clusters which is .

Swendsen and Wang showed that for this algorithm in the 2-D Ising model. Assuming that each Swendsen-Wang step scales like , the running time for the simulation scales like

which is much better than with Metropolis.

The figure above from their paper shows a plot of the correlation time for the energy-energy correlation function, determined from the exponential behavior of the long-time tail, as a function of the linear lattice size.

The data show that their cluster algorithm becomes more efficient than the standard Metropolis algorithm for lattices sizes larger than .

Efficient Cluster Decomposition Algorithms

A lattice with sites and bonds can be viewed as a graph, and the problem of finding all clusters is the problem of identifying connected components in graph theory.

It is essential to find a cluster labeling algorithm that scales linearly or almost linearly with lattice size . There are two popular algorithms which have this property:

Backtracking algorithm: This is straightforward to program using a recursive function where the argument labels a lattice spin. An array of size is set up which keeps track of whether a spin has been visited by the backtracking algorithm or not. The algorithm works as follows:

A little thought shows that the recursive calls will systematically visit every spin in a cluster connected by frozen bonds. As the lattice is swept, all distinct clusters will be identified.

The recursive function can be given a second argument which can be

This backtracking algorithm is easy to understand and program, but is not the most efficient available because many sites will be tested and found to have been visited before.

Hoshen-Koppelman Algorithm: A more efficient algorithm was found by J. Hoshen and R. Kopelman, Phys. Rev. B 14, 3438 (1976). It is somewhat difficult to understand and program. The essential ideas underlying this algorithm are as follows:

Wolff Single Cluster Algorithm

Two years after Swendsen and Wang published their algorithm, U. Wolff, Phys. Rev. Lett. 62, 361 (1989) published an even more efficient algorithm based on constructing and flipping one single cluster at a time.

Table 1 from Wolff's paper shows results on the 2-D Ising model on the first three rows, the model , and the classical Heisenberg model .

For the Ising model, the simulations are run at the critical temperature , . The correlation time is measured for the magnetic susceptibility which he defines as

This is essentially the same as the conventionally defined isothermal susceptibility per spin

If , then by symmetry for a finite-sized system, so the two definitions agree up to a multiplicative factor .

The Wolff algorithm works as follows:

Note that the decision to add a spin to the cluster follows the same probability criterion to freeze a bond in the Swendsen-Wang algorithm. This implies that Wolff clusters have the same statistical properties as Swendsen-Wang clusters.

A simple argument shows that the Wolff algorithm is potentially more efficient than the Swendsen-Wang algorithm. Imagine the lattice partitioned into Swendsen-Wang clusters. The Wolff algorithm flips a single cluster got by choosing the seed site at random. This random choice obviously favors larger clusters. Flipping larger clusters is more likely to result in uncorrelated configurations!


The codes ising.py and ising.cpp model simulate the 2-D Ising model.

The Hard Disk Gas

A two-dimensional classical gas of hard disks was studied by N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, J. Chem. Phys. 21, 1087 (1953) using the Monte Carlo method.

This pioneering article introduced Metropolis Monte Carlo algorithm, which was selected as one of the Top Ten Algorithms of the 20th Century in the January-February 2000 issue of Computing in Science and Engineering. This was the first use of a digital computer to generate and analyze Monte Carlo data.

Maxwell Boltzmann Kinetic Theory

The system of hard disks was treated as a Maxwell-Boltzmann gas at fixed volume and temperature.

The energy of gas of disks is given by

with a pairwise potential energy function

where is the disk diameter.

Thermal Equilibrium

Consider an ensemble of systems. At absolute temperature the probability that a system has energy is given by the Boltzmann distribution

where is the number of microscopic configurations with energy , is Boltzmann's constant, is the entropy, and is the free energy.

The partition function is given by

The equation of state of the gas relates pressure , volume , and absolute temperature

Monte Carlo Simulation of Hard Disks

Metropolis et al. designed a volume to close-pack disks of diameter as shown in the figures below from their article.

The placed 14 particles per row 16 per column on a triangular lattice. They chose to fix and vary , hence the number of disks .

Periodic boundary conditions were used to approximate an infinite volume and avoid having to apply boundary conditions at the walls of the container.

Radial distribution histogram

The equation of state was deduced from the radial distribution function shown in the figures below.

They showed that

using the Virial Theorem of Clausius.

The Virial theorem states

The derivation is based on the figure below, which defines the variables involved in a collision between two disks.

The result of this derivation is

The Monte Carlo simulation measured the radial distribution function as a histogram:

Codes for Hard Disk

The Molecular Dynamics Method

Molecular dynamics is perhaps conceptually the simplest example of a stochastic method. Consider simulating a very large number of particles or molecules, e.g., one mole ( molecules) of monoatomic argon in thermal equilibrium. One is certainly not interested in the positions and velocities of every one of these atoms as functions of time. One would like to study certain average properties of the sample such as its temperature , pressure , or heat capacity at constant volume . If the system is in thermal equilibrium at a sufficiently high temperature, then the average kinetic energy of an argon atom is related to the temperature by the equipartition theorem

where is Boltzmann's constant. To apply this formula, we need to generate a sample of atoms in thermal equilibrium, measure their velocities and compute .

The fundamental work on this problem was done by A. Rahman, Phys. Rev. 136, A405 (1964). It was extended in many important ways by L. Verlet, Phys. Rev. 159, 98 (1967), who introduced the Verlet algorithm and the use of a neighbor list to speed up the calculation.

The molecular dynamics method to simulate this system:

An ergodic system is one for which thermal averages can be estimated by averages over time

for observables .

Approximations in Molecular Dynamics

One might anticipate serious problems with this algorithm:

The Verlet Integration Algorithms

Since molecular dynamics involves solving Newton's equations of motion, one could use the Euler or more accurate Runge-Kutta algorithms.

However, the most popular algorithm for molecular dynamics simulations is the Verlet algorithm.

If represents the positions of the molecules, their velocities, and their accelerations, the Verlet algorithm to advance the positions and velocities by one time step is

The following Velocity-Verlet algorithm fixes some of these problems:

This algorithm is one order in less accurate in than Verlet, but it is self-starting, and is updated in step with and is determined with comparable accuracy.

The Two-dimensional Lennard-Jones System

Molecular Dynamics methods are relatively straightforward to implement for a sufficiently small system. Consider a small number of ``argon atoms'' in a two-dimensional square box with periodic boundary conditions assuming that the atoms interact via the Lennard-Jones potential

where is the distance between two atomic centers, and for argon.

The figure shows the Lennard-Jones potential and force with and . The potential has a minimum at . The force is strongly repulsive at smaller values of due to the Pauli exclusion force between electrons in closed shells, and weakly attractive at larger values of due to long-range van~der~Waals interactions caused by induced electric dipoles in the neutral atoms.

The energy of the system is

where is the distance between the pair of atoms.

To solve Newton's equations of motion we need the force function derived from the Lennard-Jones potential

If the molecular dynamics simulation succeeds in bringing the system into thermal equilibrium, then the velocities of the molecules should obey the {Maxwell-Boltzmann} distribution appropriate to 2-dimensions:

Codes to simulate the 2D Lennard-Jones System

In the simulation, we choose units such that , and , molecules and a time step .

Periodic Boundary Conditions

Since we are using periodic boundary conditions, the system actually has an infinite number of copies of the particles contained in the volume . Thus there are an infinite number of pairs of particles, all of which interact with one another! The forces between a particular particle and its periodic copies actually cancel, but this is not true of pairs which are not images of one another. Since the Lennard Jones interaction is short ranged, we can safely neglect forces between particles in volumes that are not adjacent to one another. For adjacent volumes, we have to be more careful. It can happen that the separation is larger than the separation where is an image in an adjacent volume of particle . The figure illustrates this in one dimension:

When this occurs we take into account the stronger force between and the image and neglect the weaker force between and .

© 2011   Richard J. Gonsalves

Department of Physics  |  University at Buffalo