Video Transcript

In exercise 5, we learned how to set up local repositories for RPM packages and install them using YUM. RPM packages are great because installing and removing them is straightforward. Dependencies are carefully handled by the installation tool; by default, a package won’t install without its dependencies.

However fantastic they are, nothing is perfect, and RPM packages have some downsides. For instance, only one version of each package can be installed. Updating one package may force an upgrade of its dependencies, which may cause conflicts with other packages. And by default, packages have to be installed on each compute node instead of in a shared location.

An RPM contains all the files and configuration needed to run a piece of software: executable binaries, libraries, headers, documentation, and others. These files are meant to be placed in specific locations, for example /bin, /lib, /include, and /etc.

Projects that use build systems such as Autotools or CMake allow the compiled software to be installed in a custom location. It is common to use the --prefix flag to specify that location (with CMake, the CMAKE_INSTALL_PREFIX variable plays the same role). This creates the installation folders and places every relevant file accordingly. To make software installed in a custom location visible to the OS, we have to set environment variables.
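As a rough illustration of the custom-prefix install just described, here is a minimal sketch. The helper name build_into_prefix and the example path are hypothetical, and the CMake branch assumes a reasonably recent CMake (the -S/-B options need 3.13 or later, and cmake --install needs 3.15 or later).

```shell
# Hypothetical helper: build the project in the current directory and
# install it under the prefix given as the first argument.
build_into_prefix() {
    install_prefix="$1"

    if [ -x ./configure ]; then
        # Autotools-style project: the prefix is a configure flag.
        ./configure --prefix="$install_prefix" && make && make install
    elif [ -f ./CMakeLists.txt ]; then
        # CMake project: the prefix is passed as a cache variable.
        cmake -S . -B build -DCMAKE_INSTALL_PREFIX="$install_prefix" \
            && cmake --build build \
            && cmake --install build
    else
        echo "No configure script or CMakeLists.txt found in $(pwd)" >&2
        return 1
    fi
}

# Example call (illustrative path), run from the unpacked source directory:
#   build_into_prefix "$HOME/apps/mytool/1.0"
```

After a successful run, the prefix would typically contain subfolders such as bin, lib, include, and share, depending on what the project installs.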

In the shell, we can append the compiled binaries’ location to the PATH environment variable. In the same way, we can append the compiled libraries’ folder to LD_LIBRARY_PATH. Development headers can be made visible to the compiler with C_INCLUDE_PATH or CPATH. Finally, the documentation can be made available to man with the MANPATH environment variable.
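The variables mentioned above could be set by hand roughly as follows. This is a sketch under assumptions: the prefix path is made up, the subfolder names follow the usual Autotools layout, and the new entries are placed in front of the existing values (rather than appended) so that the custom build is found before any system copy.

```shell
# Hypothetical prefix that was given to --prefix at build time.
PREFIX="/opt/apps/python/3.6"

# Executables: searched by the shell.
export PATH="$PREFIX/bin${PATH:+:$PATH}"

# Shared libraries: searched by the dynamic loader at run time.
export LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Headers: searched by GCC-compatible compilers.
export CPATH="$PREFIX/include${CPATH:+:$CPATH}"

# Manual pages: if MANPATH was unset, this leaves a trailing colon, which
# on man-db systems keeps the default search path after our entry.
export MANPATH="$PREFIX/share/man:${MANPATH:-}"
```

These settings last only for the current shell session, which is part of why repeating them for every package and version quickly becomes tedious.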

Setting all of these variables for every piece of software in the cluster can be very frustrating, especially if you support different versions of the same software. For instance, Python could be Python 3.5 or Python 3.6, each with different paths and environment variables. This is where Software Modules become important in the cluster environment.

Software Modules make all the necessary changes to the environment variables for you. You can seamlessly switch from one version of the software to another, which makes it possible for multiple versions of the same software to coexist. You can configure the modules to load dependencies as needed. And their usage is very well known within the HPC community.

In this exercise, you will compile software from source code, including Python 2 and 3, and install it in a custom location. Then you will create the corresponding module files so that Lmod sets the environment variables that make this custom installation usable.
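To give an idea of what such a module file might look like, the sketch below writes a small Lmod (Lua) modulefile from the shell. The install path, the module directory, and the version number are all assumptions for illustration; the exercise itself may use different locations and contents.

```shell
# All paths and names here are hypothetical.
APP_ROOT="/opt/apps/python/3.6"               # where --prefix pointed
MODULE_ROOT="${MODULE_ROOT:-$(mktemp -d)}"    # a directory Lmod would search

mkdir -p "$MODULE_ROOT/python"

# Write the modulefile; only $APP_ROOT is expanded by the shell here.
cat > "$MODULE_ROOT/python/3.6.lua" <<EOF
-- Lmod modulefile sketch for a Python build under a custom prefix
help([[Python 3.6 built from source]])
whatis("Name: python")
whatis("Version: 3.6")

local root = "$APP_ROOT"

prepend_path("PATH",            pathJoin(root, "bin"))
prepend_path("LD_LIBRARY_PATH", pathJoin(root, "lib"))
prepend_path("CPATH",           pathJoin(root, "include"))
prepend_path("MANPATH",         pathJoin(root, "share/man"))
EOF

echo "Wrote $MODULE_ROOT/python/3.6.lua"
```

On a system with Lmod installed, running module use "$MODULE_ROOT" followed by module load python/3.6 should apply these path changes, and module unload python/3.6 should revert them.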