Friday 19 July 2013

Technical Post - Installing the HDF5, NetCDF4, and NetCDF4 Fortran libraries

This is a post I've been meaning to write for a while now, ever since I got it all to work, though it may not be of interest to everyone. Ever since I started working on inverse modelling, I've found that I've needed to use Linux more and more, particularly because many of the atmospheric transport models are written in the Fortran programming language. Also, if you ever plan to do anything on a supercomputer, you're more than likely going to be dealing with the Linux operating system. I wouldn't say that I'm the greatest fan of Linux, but I have learnt to find my way around the terminal window, to the point where I feel I've learnt to view the computer in an entirely different way. Not to give away too much about my age, but we did have the DOS operating system on our first computer, so as a kid I could find my way around the computer and perform a few operations (like backing up and restoring games). So the terminal was not too big of a shock for me. My biggest problem with Linux is how difficult it is (for me) to install a new program if you don't already have all the correct libraries. You install the missing libraries it's complaining about, and then everything breaks. I just wish there was a little bit more self-checking going on, so that it would stop you before you did something stupid. Or even better, that it would check first what libraries you have, and then automatically install the correct, compatible ones for you. But anyway ... that's just me.



So what I would like to present today is my solution to installing the important libraries used for accessing very large, generally spatial, datasets: the HDF5 and NetCDF4 libraries, particularly for use in Fortran. As it stands, I can also use NetCDF successfully in Python (with a few extra NetCDF-specific Python packages). The reason I'm posting this is that I routinely break Linux. I've found that it's a skill I possess. One day everything will be working like clockwork, and the next moment there are black screens and hung computers, and it's back to the drawing board again. So I've had to install NetCDF (you also need HDF5 if you want to use NetCDF4) on several occasions, and every single time it's been a painful exploration through the internet forum space, greedily searching for any solution that would result in an easy installation of these libraries. After the last time, I decided that enough was enough, and I documented everything I did, so that when it happens again I can just follow my step-by-step guide, without having to do any searching, and it will just work the first time. I'm sure it's a pipe dream, but here goes anyway. Maybe this will help one person out there, and save them a week's worth of frustration.

This set of instructions is aimed at those planning to use the Intel Fortran compilers on Ubuntu. The important point is that you have to compile NetCDF with the compiler you're going to be using (you could also be using gfortran). If you install NetCDF from the repository, you are going to run into problems here. I had attempted to do this (I always try to install from the repository if I can), but I eventually had to uninstall the NetCDF version I had installed from the repository. That was the version that kept coming up when I used the command nc-config --all, and I think this is why, no matter how I tried to link the libraries, ifort would tell me that NetCDF had not been compiled with ifort. So rather don't have any other NetCDF versions lurking.

The first thing you have to do is make sure that your .bashrc file is set up correctly, so that your compiler and its libraries are working right from the start of the session. You also need to make a few other changes to point your installation in the right direction. This is by far the scariest part of the installation, because quite frankly, I have no idea what most of these lines are actually doing. Some of them may even be redundant. But these are instructions I found on various webpages trying to explain exactly how to install NetCDF for the Intel Fortran compilers (so mainly sourced from the Unidata website and from the Intel website), and for me these changes did appear to work and did no harm. From what I understand, we're basically trying to tell the computer where it needs to look for various libraries and how to treat certain file types.

I mainly followed the advice from the Unidata website and installed everything (zlib, HDF5, NetCDF4, and NetCDF4-Fortran) into one folder (in my case this was /home/alecia/local), which was not the default folder. When I didn't use the --prefix option and let it install in the default location, I couldn't get HDF5 to install.
 
These are the extra lines in my .bashrc file, which get added right at the end of the file [to edit the .bashrc file, type cd to get to your home directory, and then vim .bashrc if you have the vim editor]:

source /opt/intel/composer_xe_2013.3.163/bin/compilervars.sh intel64
PATH=$PATH:/opt/intel/composer_xe_2013.3.163/bin/
export PATH
source /opt/intel/composer_xe_2013.3.163/bin/intel64/idbvars.sh
PATH=$PATH:/opt/intel/composer_xe_2013.3.163/bin/intel64
export PATH
export F90=ifort
export F77=ifort
export F9X=ifort
export PATH=$PATH:/home/alecia/local/bin
export NETCDF=/home/alecia/local
export NETCDF_LIB=/home/alecia/local/lib
export NETCDF_INC=/home/alecia/local/include
export NETCDFHOME=/home/alecia/local
export NETCDF_PREFIX=$NETCDFHOME
 
Save the .bashrc file, and then type

source .bashrc
which will execute the new setup file and make the changes you've added.

The first few lines, up until the second "export PATH", tell the computer where the Intel libraries are. From the "export F90=ifort" line onwards, the settings are specifically for the HDF5 and NetCDF4 installation.
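Once the .bashrc has been sourced, a quick sanity check saves a lot of head-scratching later. This is just a sketch (the variable names match the lines above; ifort will only be found if the Intel paths are right):

```shell
# Confirm the compiler and the NetCDF variables are visible in this session.
command -v ifort >/dev/null 2>&1 && ifort --version | head -n 1 || echo "ifort not on PATH"
echo "NETCDF=${NETCDF:-unset}"
echo "NETCDF_LIB=${NETCDF_LIB:-unset}"
```

If either variable prints as "unset", the .bashrc additions have not taken effect in the current terminal.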

Then obtain the tar.gz installation files for curl, zlib, HDF5, NetCDF4, and the NetCDF4 Fortran libraries from the Unidata website.
 
I actually installed curl from the repository, because the version I downloaded from the website didn't want to install. But this seemed to work just fine.
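For the repository route, something like the following should do it on Ubuntu (the package names are my assumption for 12.04; NetCDF wants the development headers as well as the tool):

```shell
# Install curl and its development headers from the Ubuntu repository.
sudo apt-get update
sudo apt-get install curl libcurl4-openssl-dev
```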
 
Then extract zlib into its own directory:
tar zxvf zlib-1.2.7.tar.gz
cd zlib-1.2.7/
./configure --prefix=/home/alecia/local
make check install

Of course, everywhere that you see "alecia" you would just replace with your own home folder name. And the version numbers would be whatever the latest versions are at the time.
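One way to avoid hard-coding the username at all (a sketch, not what I actually typed) is to set the install prefix once and reuse it in every configure line:

```shell
# Set the install prefix once; every configure line below can then use it.
PREFIX="$HOME/local"
mkdir -p "$PREFIX"
echo "installing under: $PREFIX"
# e.g.  ./configure --prefix="$PREFIX"
#       CPPFLAGS=-I"$PREFIX/include" LDFLAGS=-L"$PREFIX/lib" ./configure --prefix="$PREFIX"
```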
 
Then extract HDF5 in its own folder:
tar zxvf hdf5-1.8.9.tar.gz
cd hdf5-1.8.9/
CFLAGS=-O0 ./configure --prefix=/home/alecia/local --with-zlib=/home/alecia/local --enable-fortran
make check install

Then extract NetCDF4 into its own folder:
tar zxvf netcdf-4.3.0.tar.gz
cd netcdf-4.3.0/
CPPFLAGS=-I/home/alecia/local/include LDFLAGS=-L/home/alecia/local/lib ./configure --prefix=/home/alecia/local 
make check install
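At this point it's worth confirming that the right nc-config is being picked up, and not a stale repository copy (a sketch; the exact output depends on your versions):

```shell
# The freshly installed nc-config should report the compilers it was built
# with; the "which" line should point into the new prefix, not /usr/bin.
/home/alecia/local/bin/nc-config --all
which nc-config
```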
 
Then extract the NetCDF4 Fortran libraries into their own folder:
tar zxvf netcdf-fortran-4.2.tar.gz
cd netcdf-fortran-4.2/
CPPFLAGS=-I/home/alecia/local/include LDFLAGS=-L/home/alecia/local/lib ./configure --prefix=/home/alecia/local
make check install
 

Then to compile with ifort I needed to use the following command (and apparently the order of the libraries is important - -lnetcdff has to come before -lnetcdf - but I haven't tried different combinations):
ifort -o convert.out -I/home/alecia/local/include convert_ccamtolpdm2_go.f90 -L/home/alecia/local/lib -lnetcdff -lnetcdf

Of course here you would replace the f90 file and the output file with your own.
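To check the whole chain end to end, a tiny program that just creates and closes a NetCDF file is enough. This is a sketch of mine, not from the original instructions (the file and program names are made up; nf90_create and nf90_close are part of the netcdf-fortran API):

```shell
# Write a minimal Fortran test program and build it with the same
# include path, library path, and library order as above.
cat > nctest.f90 <<'EOF'
program nctest
  use netcdf
  implicit none
  integer :: status, ncid
  status = nf90_create("test.nc", NF90_CLOBBER, ncid)
  if (status /= nf90_noerr) stop "create failed"
  status = nf90_close(ncid)
  if (status /= nf90_noerr) stop "close failed"
  print *, "NetCDF OK"
end program nctest
EOF
ifort -o nctest.out -I/home/alecia/local/include nctest.f90 \
      -L/home/alecia/local/lib -lnetcdff -lnetcdf
./nctest.out
```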

And to get the executable to find the NetCDF shared libraries when it runs, one option is to bake the library path in at link time with an rpath. Note that -rpath is a linker option, so it goes on the compile line (passed through with -Wl), not as an argument when running the program:
ifort -o convert.out -I/home/alecia/local/include convert_ccamtolpdm2_go.f90 -L/home/alecia/local/lib -lnetcdff -lnetcdf -Wl,-rpath,/home/alecia/local/lib

When I restarted the system the next day and tried to redo this, I ran into some problems, and that's when I discovered how to correctly modify the LD_LIBRARY_PATH variable (yet again - thanks to the wonderful internet invention of the forum). This is once again a modification made in the .bashrc file.


LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/alecia/local/lib
export LD_LIBRARY_PATH


These two lines get added right at the end of the .bashrc file. As I understand it, LD_LIBRARY_PATH is a list of all the places the computer needs to look when trying to find a library. When you install libraries into non-default locations, you need to add those locations to the list. What you don't want to do is erase the list and just add the new library location. That is why, if you read up about LD_LIBRARY_PATH modifications, a lot of people will tell you not to do it at all. So the format I've used here is very important, because this way it definitely appends the new library location, as opposed to wiping out the list. Just to check, before you make this modification, you can type

echo $LD_LIBRARY_PATH
and run it in the terminal, and it will give you the list. You can then make the modification to the .bashrc file, save it, and type

source .bashrc
to apply the added changes. If you rerun the echo $LD_LIBRARY_PATH command, it should give you the same list you previously had, but now with the new library path /home/alecia/local/lib at the end.
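For what it's worth, there is a slightly more defensive version of the same append, which avoids leaving an empty entry at the front of the list when LD_LIBRARY_PATH started out unset (the loader treats an empty entry as the current directory, which is usually unwanted). This is a sketch of the pattern, not from the original forums:

```shell
# Append to LD_LIBRARY_PATH; the ${VAR:+...} expansion only adds the ":"
# separator when the variable already has something in it.
LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/home/alecia/local/lib"
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```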

Now you should be able to execute your new Fortran executables by just typing
./convert.out
(or whatever you've called your executable)


I'm sure there are several different ways of doing this, and probably many of them more elegant and efficient, but for me, on a personal computer with an Intel i7-3770K, running Ubuntu 12.04 in a VMware virtual machine, this worked (and I've gotten it to work twice - yes, I have already broken Linux since finding this solution). I'm still compiling my Fortran code using the same compile command, and I'm still able to open, read and write NetCDF files. How long this will continue I don't know, but if anything goes wrong, this is where I will start when building it up from scratch again.

I hope this helps somebody out there.

Wednesday 17 July 2013

The 9th International Carbon Dioxide Conference - Beijing

Firstly, apologies for my long absence.

Not so long ago I was fortunate enough to attend the 9th International Carbon Dioxide Conference, hosted in Beijing by the Chinese Academy of Sciences. It was truly my first big international conference, and an opportunity to present the first set of results I had obtained. Well of course that’s what I said I was going to do ... It seemed that almost as soon as I submitted my abstract and it was accepted for a poster presentation (I was at first a bit disappointed, until I saw the author list of those presenting orals and realised that perhaps just listening would be a good thing this time round) that things started to delay my ambitions of having all my results wrapped up by the time of the conference. But through many late nights and many more antacids later, I was able to assemble something with which I was rather pleased. This in part played a role in my disappearance from the carbon science blogging scene.

When I finished off my poster, I thought I was being terribly clever when I put the link to the @CapeCarbon twitter account at the end of the poster. I then discovered that Twitter is not accessible in China – which was a bit of a buzz kill. So much for that grand idea.

The conference was an eye opening experience and a carbon science nerd’s dream! I can honestly say that almost my entire PhD reference list was walking about the conference. And if someone wasn't there themselves, then their students were there presenting collaborative work. I very often had to restrain myself from asking for autographs and taking pictures with famous (in my mind) scientists whose work I had pored over for hours and hours, resulting in severely dog-eared papers with plenty of my own scribbles as I’d tried to mould them into something I could understand and grasp.

Two such scientists based at the LSCE in Paris (Laboratoire des Sciences du Climat et de l’Environnement -  http://www.lsce.ipsl.fr/en/) are Philippe Ciais and Philippe Peylin. Their work features heavily in my literature review, and at one stage I had one of them standing on my left and the other on my right while they were having a discussion with my supervisor. I found myself searching the crowd for my South African colleague so that I could telepathically communicate to him to please take a picture!

A picture of Ralph Keeling who features in an earlier post
That was the other great opportunity presented to me by the conference – I got to meet up with my Australian supervisor, Dr. Peter Rayner from the University of Melbourne, and have an entire week to discuss, in person, the technical issues I’d been wrestling with over the past weeks leading up to the conference, as well as to meet in person a fellow collaborator with whom I’d had many an email conversation. For those people starting a PhD, I can only hope for you that you have a supervisor like Dr. Rayner. He has a unique grasp of the science behind atmospheric transport as well as the statistics of inverse modelling, which he is able to elucidate to another person in such a way that you can actually see the light bulb blinking on as a once complicated and hostile battlefield of detail and facts is turned into a tangible, achievable process. Preceding the conference, I had gone through a bit of the “PhD blues”, and had avoided discussing my concerns with any of my supervisors. This was a terrible mistake, and once I’d had my first conversation with Dr. Rayner in the weeks before the conference (a marathon two hour Skype meeting), I felt the glumness lift off of me, and I was given direction once again and able to forge on ahead. A PhD is not for the weak, let me tell you. If you thought you were emotionally unhinged before, just try one of these on for size.

Anyway! What did I take away from the conference? Well, many of the presentations were on trends in carbon dioxide levels – and yes, carbon dioxide concentrations are going up. We are definitely not doing enough yet to prevent a more than two degree average temperature increase in the future. It may level off now and then, but now that we are starting to have the first really long term carbon dioxide time series datasets, it’s clear that levels are rising. And from the kinks in the carbon dioxide trend we also know that we are not doing such a great job yet of predicting the complicated interaction between the climate and the carbon cycle, particularly the component related to the land surface. That means we need more measurements and we need better models. I also learnt that carbon scientists lean a little bit towards the cynical side – you need to in order to survive the carbon science/climate science game.

A rather thought provoking cartoon posted by one of the presenters on the last day of the conference

When I wasn't gawking at famous scientists or straining to take in every bit of information I could sponge from the presentations, the conference organisers were doing a great job of keeping us busy on tours around Beijing. I even managed, along with my South African colleague, to venture the streets of Beijing and discover a bit of the history and culture of the city. 
At The Summer Palace - Residence of The Dragon Lady
The Great Wall of China

The Forbidden City
More to follow soon!

Another article on the breaching of the 400ppm carbon dioxide level in the New York Times: http://www.nytimes.com/2013/05/11/science/earth/carbon-dioxide-level-passes-long-feared-milestone.html?pagewanted=all&_r=0