CompSciWeek14-15

[Image, 800px: the standard compile-time environment variables (CFLAGS, LDFLAGS, etc.) referenced below]

Installation Woes

  • Python packages (ex: sympy)
  • GNU Autotools + make-based process (ex: FFTW)
  • Cmake: a developer-friendly (not user-friendly?) alternative (ex: cgal)
  • Non-standard makes (ex: NAMD2 and NWChem)
  • Tinkering with open-source
  • The DL on Software Licenses
    • Apache, BSD, GPL, Microsoft, FDL, Creative Commons
    • The open-source that isn't: Canvas

The single most important idea for compiling and installing new software is to remember that the installation works in the developer's environment, and it will work for you, too, if your environment is set up correctly. Oftentimes this is easier said than done.

  1. Package Dependencies (pdftk depends on libgcj)
    • and versions of those packages - this is usually the worst part
  2. Shell variables
    • PATH, CFLAGS, LDFLAGS, LD_LIBRARY_PATH (see the example after this list)
  3. Compiler version
  4. Machine architecture
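For reference, a typical setup of those shell variables before a build might look like the following sketch; the paths are placeholders, not anything specific to this course.

<source lang="bash">
# Hypothetical example -- adjust the paths to your own environment.
export PATH=$HOME/bin:$PATH                         # where to find executables
export CFLAGS="-O2 -I$HOME/include"                 # compiler flags (include search path)
export LDFLAGS="-L$HOME/lib"                        # linker flags (library search path)
export LD_LIBRARY_PATH=$HOME/lib:$LD_LIBRARY_PATH   # runtime library search path
</source>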

Tips:

To find a package owning a file (on linux systems with rpm)

rpm -qf /usr/lib64/libfftw3.so.3

To find all files associated with a package

rpm -ql fftw-3.2.1-3.1.el6.x86_64

Example 1

Installing pdftk (developer Makefile)

mkdir ~/build && cd ~/build
wget https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk-2.02-src.zip

Unpacking the zip file can (and often does) clobber files in your current directory, so it's always safest to unpack it in a new directory:

mkdir pdftk && cd pdftk
unzip ../pdftk-2.02-src.zip
ls

In this case, pdftk was nice enough to put all of its files in a subfolder, so we can get rid of the temporary directory.

cd ..
mv pdftk/pdftk-2.02-src .
rmdir pdftk

If you read the license information, you'll note that pdftk is covered by the GPL. This means you have the freedom to modify and re-distribute the source (but not to change the license). The GPL is sometimes called a 'copyleft', since it is a clever hack on copyright perpetrated by Richard Stallman, the founder of the GNU project.

Make is a small language that lists the shell commands required to 'make' the 'targets'. A make target can be anything, but it is usually the name of an executable file (or a file at an intermediate step) that will be built during the compilation process. Pdftk has installation instructions on its website, which outline a make-based process commonly seen for small projects. pdftk is large-ish because the pdf file format can do a lot, but the build process is small because it is based on Java. Some standard options to make are

make -f Makefile.Redhat -j 8 -C $PWD install

The '-j 8' flag tells make to run in parallel on 8 processors. The '-C' flag tells make what directory to run make from. Here, it's for illustration, since the default is to run in the current directory. The name 'install' is the target to build. It is not a real file, but instead just has a list of commands that copy the compiled program into the filesystem. The great thing about make is that it knows what order to make all the files in.
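If you have never looked inside a Makefile, here is a minimal, made-up sketch (not pdftk's actual Makefile) showing the structure: each rule names a target, the files it depends on, and the shell commands that build it. Targets like install and clean are not real files, so they are declared .PHONY.

<source lang="make">
# Hypothetical Makefile sketch; recipe lines must start with a TAB character.
PREFIX = $(HOME)

hello: hello.o
	cc -o hello hello.o

hello.o: hello.c
	cc -c hello.c

install: hello
	cp hello $(PREFIX)/bin

clean:
	rm -f hello hello.o

.PHONY: install clean
</source>

Running make with no arguments builds the first target (hello here); make install builds hello if needed and then copies it under $(PREFIX).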

In this instance, make will build the executable, pdftk, but the install step will fail, since it tries to copy the program into /usr/local/bin (which you generally can't write to). You should read Makefile.Redhat to find the install target. You'll notice that it's not there, but at the bottom the Makefile includes Makefile.Base. Looking at that file, you will find the install target. It is run with a single line that installs (carefully copies) pdftk from the current directory into /usr/local/bin. make install will work if you change /usr/local/bin to $(HOME)/bin. You may also consider installing the manpage (pdftk.1 from the main pdftk directory):

cd ..
cp pdftk.1 $HOME/share/man/man1
export MANPATH=$MANPATH:$HOME/share/man

Example 2

Installing fftw (autotools)

A quick search turns up the main page for fftw, the Fastest Fourier Transform in the West.

mkdir ~/build && cd ~/build
wget ftp://ftp.fftw.org/pub/fftw/fftw-3.3.4.tar.gz

This is a tar (tape archive) that has been gzipped (usually given the endearing name 'tarball'). You can read the contents with

tar tzf fftw-3.3.4.tar.gz | less
tar xzf fftw-3.3.4.tar.gz

The first command lists the contents. It is really helpful to make sure that all the files in the tarball are in their own directory. Most tarballs should be packed this way. Ones that aren't will put files in your current directory, and may clobber other files you had there before.

cd fftw-3.3.4
less README

The second line is absolutely critical, since almost every package has some installation quirk that you won't know about until it's too late -- or until you read the instructions, either way. You'll notice this file says configure, make, make install. This means the package uses GNU autotools!

Autotools are a standard set of installation scripts that follow the general install commands above. The configure part is the most important, since it gives you a chance to put in details about your own environment. In addition to reading all install instructions, you should also run

./configure --help

This will give you a list of options that tell the compile process what compilers to use and where to find libraries and include files that the package may well depend on. The standard ones (CFLAGS, etc.) are shown in the graphic at the top of this page.

./configure --with-pic --enable-float --prefix=$HOME 
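Besides package-specific options, an autotools configure script also accepts the standard variables as VAR=value arguments, which saves you from exporting them in your shell. A hedged example (the compiler and flags here are placeholders, not recommendations):

<source lang="bash">
# Hypothetical variable choices; substitute your own compiler and paths.
./configure CC=icc CFLAGS="-O3" LDFLAGS="-L$HOME/lib" \
            --with-pic --enable-float --prefix=$HOME
</source>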

The second part of autotools is (sometimes GNU) make, covered above.

make -j4 install

Example 3

Installing gromacs (cmake). The latest versions of gromacs use cmake - a newer build tool created to replace autotools (configure). cmake uses two main files for its work:

CMakeLists.txt
CMakeCache.txt

CMakeLists.txt is written by the package developers to control the build process. It typically introduces a list of required packages, calls some functions to test whether they are on the system and how to get to them (the configure process), and then lists a set of build targets. Unlike a hand-written Makefile, this file is unwise to edit, since it mostly contains information on dependencies (which break the package if you change them).
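For a feel of what goes into it, here is a tiny, made-up CMakeLists.txt (not gromacs's real one): it names the project, tests for a dependency, and declares a build target and an install rule.

cmake_minimum_required(VERSION 2.8)
project(hello C)
find_package(MPI)                       # the "configure" step: look for a dependency
add_executable(hello hello.c)           # a build target
install(TARGETS hello DESTINATION bin)  # copied under CMAKE_INSTALL_PREFIX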

Sadly, cmake abstracts away all the good environment variables, taking away your ability to directly tweak the build process. Instead, you get a perplexing array of cmake variables that can offer more control over the process - at the expense of usually not knowing which variables need to be changed (or how) when you have a problem. Hopefully the developers left some documentation to clear up this point. Gromacs is good at this, and lets you know the major variables to change - one giving the location of the fftw library and the other controlling whether or not to compile with MPI.

To install gromacs, download and unpack the source as before. I'm working with gromacs-5.0-beta2 now, but you may find a different version when you read this. Next, make sure your environment variables reference the compiler you intend to use. We have a module on rcslurm for the intel compiler,

module load compilers/intel/2013sp1_cluster_xe

Then generate CMakeCache.txt with the command

cmake .

This makes the cache file, which is the perplexing list of build variables referenced above. In a large project, running cmake . may produce CMakeCache.txt files in several sub-directories as well.

find . -name CMakeCache.txt
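If you only need to check or flip one of those cache variables, you can grep the cache and re-run cmake with a -D flag to override it; GMX_MPI below is one of the gromacs variables from the build script later on this page.

grep GMX_MPI CMakeCache.txt
cmake -DGMX_MPI=ON .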

If you're unlucky, cmake will end with a version mismatch error, in which case you have to go out and get the latest version of cmake. We have a few on the rcslurm cluster, which you can load with, e.g.

module load apps/cmake/2.8.12.2

There's also an interactive way to look at the most important variables used during cmake's build. It's

ccmake .

This will come up with a list, analogous to ./configure --help with autotools. If all goes well with the cmake process, you can proceed with the usual make process

make -j8
make test
make install

A word of warning is in order. Gromacs is highly optimized, and uses information specific to the machine it is compiled on. Rcslurm has different hardware (AMD vs. Intel processors) on the head node than on the cluster nodes -- leading to an "Illegal instruction" error when you compile on one and run on the other. To get around this, you have to compile on a node:

rm -f CMakeCache.txt
srun --ntasks-per-node 8 -N 1 --pty /bin/bash
cmake .
make -j8
make install

With gromacs, you will want to compile an MPI version for the cluster and a non-MPI version for the head node (to create run files and do simple analysis).
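One way to keep the two builds separate (a sketch; the install prefixes here are made up) is to configure each in its own build directory:

<source lang="bash">
# Hypothetical layout: one MPI build and one serial build, each with its own prefix.
mkdir build-mpi && cd build-mpi
cmake -DGMX_MPI=ON -DCMAKE_INSTALL_PREFIX=$HOME/gromacs-mpi ..
make -j8 && make install
cd ..
mkdir build-serial && cd build-serial
cmake -DGMX_MPI=OFF -DCMAKE_INSTALL_PREFIX=$HOME/gromacs ..
make -j8 && make install
</source>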

Once you figure out the relevant cmake variables, you can save them as a build script. I've started doing this for all my builds, since it helps me keep track of what I've tried. The relevant one is here:

<source lang="bash">
module load compilers/intel/2013sp1_cluster_xe mpi/openmpi/1.6.1 apps/cmake/2.8.12.2

cmake -DCMAKE_INSTALL_PREFIX=/shares/rogers \
      -DFFTW_INCLUDE_DIR=/shares/rogers/include \
      -DFFTW_LIBRARY=/shares/rogers/lib \
      -DGMX_MPI=ON \
      .

make -j8
make install
</source>

Fourier Transforms

Read through the page on Fourier transforms.

The Monte Carlo Method

  • Integrals of the form: <math>\int \frac{f(x)}{g(x)} g(x) dx</math>
  • Computing pi
  • Parallelizing with MPI4Py

Complicated integrals are common in the hard sciences. Most of these have analytical solutions or good approximations (see MathWorld, mathworld.wolfram.com, and the NIST Digital Library of Mathematical Functions, http://dlmf.nist.gov/). Unfortunately, sometimes there is no analytical solution. The next best thing is to use quadrature: summing the function computed at a specific set of points times a specific set of weights. Some background and the relevant scipy functions are at http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.integrate.quad.html. There is also a great page of points and weights at http://people.sc.fsu.edu/~jburkardt/datasets/quadrature_rules/quadrature_rules.html (John Burkardt's page at FSU).
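As a quick illustration of that kind of quadrature call, integrating <math>e^{-x^2}</math> from 0 to 1 with scipy takes one line and returns both the value and an error estimate (a toy example):

<source lang="python">
# Adaptive quadrature with scipy.integrate.quad
from numpy import exp
from scipy.integrate import quad

val, err = quad(lambda x: exp(-x*x), 0.0, 1.0)  # (integral, error estimate)
print val, err  # about 0.7468 and a very small error bound
</source>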

For integrals over high-dimensional spaces, and for those closely connected to probability distributions, the next best thing after that is the Monte Carlo method. This writes the integral

<math>\int \frac{f(x)}{g(x)} g(x) dx</math>

as

<math>\int \frac{f(x)}{g(x)} g(x) dx = \lim_{N\to\infty} \frac{1}{N} \sum_i \frac{f(x_i)}{g(x_i)},</math>

where the points <math>x_i</math> are generated from the probability distribution <math>P(x_i) = g(x_i)</math>.

Common probability distributions are the Gaussian, <math>P(x) = (2\pi \sigma^2)^{-1/2} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),</math>

and the uniform distribution <math>P(x) = 1/L, x \in (0,L)</math> (among others).

With that introduction, we can now approximate the value of <math>\pi</math> as the result of

<math>\pi/4 = \int_0^1\int_0^1 (x^2 + y^2 < 1) \; dx\; dy.</math>

The integrand is one when the point (x, y) is inside the circle and zero otherwise. Physically, we are randomly tossing a point onto the unit square and adding 1 to a total if it also happens to fall within the upper right quadrant of the unit circle.

The code is dead simple:

<source lang="python">
from numpy.random import random

I = 0
for i in xrange(10000):
    x = random()
    y = random()
    I += x*x + y*y < 1.0   # True counts as 1, False as 0

print float(I)*4e-4        # i.e. 4*I/10000
</source>

Numpy's random() function draws samples uniformly from the interval <math>[0,1)</math>. Notice that the program does not give pi exactly; it converges to pi in the limit of a large number of samples.
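A standard result about Monte Carlo averages (quoted here without derivation) is that the statistical error shrinks like

<math>\mathrm{error} \sim \frac{\sigma}{\sqrt{N}},</math>

where <math>\sigma</math> is the standard deviation of the sampled quantity and <math>N</math> is the number of samples - so each extra digit of pi costs roughly 100 times more samples.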

Parallelizing this one is easy, since each processor can run a series of Monte Carlo trials, and the only communication between the processors is averaging the final results. If we ignore that last step for now, we can run the program in parallel without any changes just by saving it to a file, say calc_pi.py and running it with mpirun

mpirun -np 8 calc_pi.py

Running may require that you save that command to a shell script, sandwiched between batch queue commands, and run it with, e.g., sbatch run.sh. You'll notice that all 8 programs independently run all the commands. This is how MPI works - all processors run the same source program. You can get each processor to do something different by using MPI library calls.
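For reference, run.sh might look something like the sketch below; the resource requests and time limit are placeholders that depend on your cluster's queue setup.

<source lang="bash">
#!/bin/bash
# Hypothetical SLURM batch script; adjust the resources for your cluster.
#SBATCH -N 1          # one node
#SBATCH -n 8          # eight tasks
#SBATCH -t 00:10:00   # ten minute time limit

mpirun -np 8 python calc_pi.py
</source>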


<source lang="python">
from mpi4py import MPI

comm = MPI.COMM_WORLD
size = comm.size
rank = comm.rank  # This is a unique number for each process
name = MPI.Get_processor_name()

if rank == 0:
    print "Tap dancing"
elif rank == 1:
    print "Shopping @ store"
elif rank == 2:
    print "Cooking roast beef"
else:
    print "Staying home"

s = comm.allreduce(rank, op=MPI.SUM)
if rank == 0:
    print "Cumulative Node Number Sum = %d"%(s)
</source>

The call to allreduce near the end combines values from all nodes and sends the result to all nodes. In this case the sum of all ranks is computed (which should be size*(size-1)/2). Pairing this script with the previous one (and summing the Monte Carlo counts instead of the ranks) gives an MPI program for estimating the value of pi; a sketch of the combined program follows.
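A minimal sketch of that combined program (the same ingredients as above, just summed with allreduce) could look like:

<source lang="python">
# Sketch: parallel Monte Carlo estimate of pi using mpi4py.
from mpi4py import MPI
from numpy.random import random

comm = MPI.COMM_WORLD
N = 10000  # samples per process

hits = 0
for i in xrange(N):
    x = random()
    y = random()
    hits += x*x + y*y < 1.0

# Sum the hit counts from every process; each process receives the total.
total = comm.allreduce(hits, op=MPI.SUM)
if comm.rank == 0:
    print 4.0*total/(N*comm.size)
</source>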