Difference between revisions of "CompSciWeek14-15"
(→Example 1) |
(→Example 3) |
||
Line 99: | Line 99: | ||
== Example 3 == |
== Example 3 == |
||
− | Installing gromacs (CMAKE) |
||
+ | Installing gromacs (CMAKE). The latest versions of gromacs use cmake - a new build tool created to replace autotools (configure). cmake uses two main files for its work: |
||
+ | |||
+ | CMakeLists.txt |
||
+ | CMakeCache.txt |
||
+ | |||
+ | CMakeLists.txt is written by the package developers to control the build process. It typically introduces a list of required packages, calls some functions to test whether they are on the system and how to get to them (the configure process), and then lists a set of build targets. Unlike the make process, editing this file is unwise, since it mostly contains information on dependencies (which break the package if you change them). |
||
+ | |||
+ | Sadly, cmake abstracts away all the good environment variables, getting rid of your ability to directly tweak the build process. Instead, you have a perplexing array of variables that can offer more control over the process. This comes at the expense of usually not knowing how or what variables need to be changed when you have a problem. Hopefully the developers left some documentation to clear up this point. Gromacs is good at this, and lets you know the major variables to change - one influencing the location of the fftw library and the other influencing whether or not to compile with MPI. |
||
+ | |||
+ | To install gromacs, download and unpack the source as before. I'm working with gromacs-5.0-beta2 now, but you may find a different version when you read this. Next make sure your environment variables reference the compiler you intend to use. We have a module on rcslurm for the intel compiler, |
||
+ | |||
+ | module load compilers/intel/2013sp1_cluster_xe |
||
+ | |||
+ | Then generate CMakeCache.txt with the command |
||
+ | |||
+ | cmake . |
||
+ | |||
+ | This makes the cache file, which is the perplexing list of build variables referenced above. In a large project, running cmake . may produce CMakeCache.txt files in several sub-directories as well. |
||
+ | |||
+ | find . -name CMakeCache.txt |
||
+ | |||
+ | If you're unlucky, cmake will end with a version mismatch error, in which case you have to go out and get the latest version of cmake. We have a few on the rcslurm cluster, which you can load with, e.g. |
||
+ | |||
+ | module load apps/cmake/2.8.12.2 |
||
+ | |||
+ | There's also an interactive way to look at the most important variables used during cmake's build. It's |
||
+ | |||
+ | ccmake . |
||
+ | |||
+ | This will come up with a list, analogous to ./configure --help with autotools. If all goes well with the cmake process, you can proceed with the usual make process |
||
+ | |||
+ | make -j8 |
||
+ | make test |
||
+ | make install |
||
+ | |||
+ | A word of warning is in order. Gromacs is highly optimized, and uses information specific to the machine it is compiled on. Rcslurm has different hardware (amd vs intel processors) on the head node than the cluster nodes -- leading to an "Illegal instruction" error when you compile on one and run on the other. To get around this, you have to compile on a node |
||
+ | |||
+ | rm -f CMakeCache.txt |
||
+ | srun --ntasks-per-node 8 -N 1 --pty /bin/bash |
||
+ | cmake . |
||
+ | make -j8 |
||
+ | make install |
||
+ | |||
+ | With gromacs, you will want to compile an MPI version for the cluster and a non-mpi version for the head node (to create run-files and do simple analysis). |
||
+ | |||
+ | Once you figure out the relevant cmake variables, you can save them as a build script. I've started doing this for all my builds, since it helps me keep track of what I've tried. The relevant one is here: |
||
+ | |||
+ | <source lang="bash"> |
||
+ | module load compilers/intel/2013sp1_cluster_xe mpi/openmpi/1.6.1 apps/cmake/2.8.12.2 |
||
+ | |||
+ | cmake -DCMAKE_INSTALL_PREFIX=/shares/rogers \ |
||
+ | -DFFTW_INCLUDE_DIR=/shares/rogers/include \ |
||
+ | -DFFTW_LIBRARY=/shares/rogers/lib \ |
||
+ | -DGMX_MPI=ON \ |
||
+ | . |
||
+ | |||
+ | make -j8 |
||
+ | make install |
||
+ | </source> |
||
= Fourier Transforms = |
= Fourier Transforms = |
Revision as of 18:37, 3 December 2014
Contents
Installation Woes
- Python packages (ex: sympy)
- GNU Autotools + make-based process (ex: FFTW)
- Cmake: a developer-friendly (not user-friendly?) alternative (ex: cgal)
- Non-standard makes (ex: NAMD2 and NWChem)
- Tinkering with open-source
- The DL on Software Licenses
- Apache, BSD, GPL, Microsoft, FDL, Creative Commons
- The open-source that isn't: Canvas
The single most important idea for compiling and installing new software is to remember that the installation works for the developer's environment, and it will for you, too if your environment is setup correctly. Often times this is easier said than done.
- Package Dependencies (pdftk depends on libgcj)
- and versions of those packages - this is usually the worst part
- Shell variables
- PATH, CFLAGS, LDFLAGS, LD_LIBRARY_PATH
- Compiler version
- Machine architecture
Tips:
To find a package owning a file (on linux systems with rpm)
rpm -qf /usr/lib64/libfftw3.so.3
To find all files associated with a package
rpm -ql fftw-3.2.1-3.1.el6.x86_64
Example 1
Installing pdftk (developer Makefile)
mkdir ~/build && cd ~/build wget https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk-2.02-src.zip
Unpacking the zip file can (and often does) clobber files in your current directory, so it's always safest to unpack it in a new directory:
mkdir pdftk && cd pdftk unzip ../pdftk-2.02-src.zip ls
In this case, pdftk was nice enough to put all their files in a subfolder, so we can get rid of the temp directory.
cd .. mv pdftk/pdftk-2.02-src . rmdir pdftk
If you read the license information, you'll note that pdftk is covered by a GPL copyright. This means you have the freedom to modify and re-distribute the source (but not to change the license). The GPL is sometimes called a 'copyleft', since it is a clever hack on copyright perpetrated by Richard Stallman, the founder of the GNU project.
Make is a small language that lists the shell commands required to 'make' the 'targets'. A make target can be anything, but usually is the name of an executable file (or a file at an intermediate step) that will be built during the compilation process. Pdftk has installation instructions on its website, which outline a make-based process commonly seen for development on small projects. pdftk is large-ish because the pdf file format can do a lot, but the build process is small because it is based on java. Some standard options to make are
make -f Makefile.Redhat -j 8 -C $PWD install
The '-j 8' flag tells make to run in parallel on 8 processors. The '-C' flag tells make what directory to run make from. Here, it's for illustration, since the default is to run in the current directory. The name 'install' is the target to build. It is not a real file, but instead just has a list of commands that copy the compiled program into the filesystem. The great thing about make is that it knows what order to make all the files in.
In this instance, make will build the executable, pdftk, but will not install it - since it tries to copy it into /usr/local/bin. You should read Makefile.Redhat to find the install target. You'll notice that it's not there, but at the bottom the Makefile includes Makefile.Base. Looking at that file, you will find the install target. It is run with a single line that installs (carefully copies) pdftk from the current directory into /usr/local/bin. make install will work if you change /usr/local/bin to $(HOME)/bin. You may also consider installing the manpage (pdftk.1 from the main pdftk directory).
cd .. cp pdftk.1 $HOME/share/man/man1 export MANPATH=$MANPATH:$HOME/share/man
Example 2
Installing fftw (autotools)
A quick search turns up the main page for fftw, the Fastest Fourier Transform in the West.
mkdir ~/build && cd ~/build wget ftp://ftp.fftw.org/pub/fftw/fftw-3.3.4.tar.gz
This is a tar (tape archive) that has been gzipped (usually given the endearing name 'tarball'). You can read the contents with
tar tzf fftw-3.3.4.tar.gz | less tar xzf fftw-3.3.4.tar.gz
The first command lists the contents. It is really helpful to make sure that all the files in the tarball are in their own directory. Most tarballs should be packed this way. Ones that aren't will put files in your current directory, and may clobber other files you had there before.
cd fftw-3.3.4 less README
The second line is absolutely critical, since almost every package has some installation quirk that you won't know about until it's too late -- or until you read the instructions, either way. You'll notice this file says configure, make, make install. This means the package uses gnu autotools!
Autotools are a standard set of installation scripts that follow the general install commands above. The configure part is the most important, since it gives you a chance to put in details about your own environment. In addition to reading all install instructions, you should also run
./configure --help
This will give you a list of options that tell the compile process what compilers to use and where to find libraries and include files that the package may well depend on. The standard ones (CFLAGS, etc.) are shown in the graphic at the top of this page.
./configure --with-pic --enable-float --prefix=$HOME
The second part of autotools is (sometimes GNU) make, covered above.
make -j4 install
Example 3
Installing gromacs (CMAKE). The latest versions of gromacs use cmake - a new build tool created to replace autotools (configure). cmake uses two main files for its work:
CMakeLists.txt CMakeCache.txt
CMakeLists.txt is written by the package developers to control the build process. It typically introduces a list of required packages, calls some functions to test whether they are on the system and how to get to them (the configure process), and then lists a set of build targets. Unlike the make process, editing this file is unwise, since it mostly contains information on dependencies (which break the package if you change them).
Sadly, cmake abstracts away all the good environment variables, getting rid of your ability to directly tweak the build process. Instead, you have a perplexing array of variables that can offer more control over the process. This comes at the expense of usually not knowing how or what variables need to be changed when you have a problem. Hopefully the developers left some documentation to clear up this point. Gromacs is good at this, and lets you know the major variables to change - one influencing the location of the fftw library and the other influencing whether or not to compile with MPI.
To install gromacs, download and unpack the source as before. I'm working with gromacs-5.0-beta2 now, but you may find a different version when you read this. Next make sure your environment variables reference the compiler you intend to use. We have a module on rcslurm for the intel compiler,
module load compilers/intel/2013sp1_cluster_xe
Then generate CMakeCache.txt with the command
cmake .
This makes the cache file, which is the perplexing list of build variables referenced above. In a large project, running cmake . may produce CMakeCache.txt files in several sub-directories as well.
find . -name CMakeCache.txt
If you're unlucky, cmake will end with a version mismatch error, in which case you have to go out and get the latest version of cmake. We have a few on the rcslurm cluster, which you can load with, e.g.
module load apps/cmake/2.8.12.2
There's also an interactive way to look at the most important variables used during cmake's build. It's
ccmake .
This will come up with a list, analogous to ./configure --help with autotools. If all goes well with the cmake process, you can proceed with the usual make process
make -j8 make test make install
A word of warning is in order. Gromacs is highly optimized, and uses information specific to the machine it is compiled on. Rcslurm has different hardware (amd vs intel processors) on the head node than the cluster nodes -- leading to an "Illegal instruction" error when you compile on one and run on the other. To get around this, you have to compile on a node
rm -f CMakeCache.txt srun --ntasks-per-node 8 -N 1 --pty /bin/bash cmake . make -j8 make install
With gromacs, you will want to compile an MPI version for the cluster and a non-mpi version for the head node (to create run-files and do simple analysis).
Once you figure out the relevant cmake variables, you can save them as a build script. I've started doing this for all my builds, since it helps me keep track of what I've tried. The relevant one is here:
<source lang="bash"> module load compilers/intel/2013sp1_cluster_xe mpi/openmpi/1.6.1 apps/cmake/2.8.12.2
cmake -DCMAKE_INSTALL_PREFIX=/shares/rogers \
-DFFTW_INCLUDE_DIR=/shares/rogers/include \ -DFFTW_LIBRARY=/shares/rogers/lib \ -DGMX_MPI=ON \ .
make -j8 make install </source>
Fourier Transforms
Read through the page on Fourier transforms.
The Monte Carlo Method
- Integrals of the form: <math>\int \frac{f(x)}{g(x)} g(x) dx</math>
- Computing pi
- Parallelizing with MPI4Py