Modern software-development techniques in electronic structure theory

A few weeks ago, on 23rd May, I received an invitation to visit the Laboratoire de Physique des Lasers, Atomes et Molécules at the Université de Lille and present my recent work. When it came to selecting a topic, my host, Andre Gomes, and I quickly agreed to focus on more modern approaches to scientific software development and why these can be useful for electronic-structure codes. In retrospect I am very happy to have had this opportunity to summarise a few ideas and lessons learned from my previous and ongoing software projects. I also took the liberty of sketching, from my personal point of view, possible challenges for the future directions of scientific software, and of proposing some ideas on how to tackle them.

From this angle I ended up introducing the talk with a review of the fundamental difficulties of electronic structure theory itself, namely the inherent complexity of the problem (large dimensionality and non-linearity of the respective partial differential equations). Added to that, the available high-performance computing (HPC) environments have in recent years become more heterogeneous and more complex as well. Mixed general-purpose GPU / CPU cluster setups are now standard, and making use of both CPU and GPU for computations has become a strict requirement when applying for access to the top-ranked clusters of the world. Projecting into the future, where further "accelerators" such as field-programmable gate arrays (FPGAs) or, in the longer term, even quantum computers might come up, this heterogeneity is only going to increase.

To quickly test the usefulness of such technologies in the context of quantum chemistry, a key aspect is being able to reuse the code that already exists. This means that code needs to be sufficiently high-level to adapt to changing hardware architectures. Let me be clear on this: I am not talking about achieving 100% peak performance on every hardware setup, but rather about finding a good compromise between the effort required to get the code to work on new hardware and the performance achieved. The point is to quickly get an idea of what is theoretically possible and whether further investigation even makes sense.

As I sketch in the talk, changing the hardware does have an impact on questions such as the optimal basis function, the numerical algorithm or even the best-suited physical model. In this context "optimal" and "best-suited" refer to a simulation procedure that agrees with the available computational architecture and at the same time is physically meaningful. A consequence of this observation is that experimentation on all levels (basis functions, algorithms, numerics, linear algebra backend) is required, which often calls for interdisciplinary expertise. In an ideal world experts from different fields can thus work on the same code base, approaching the problem from different angles.

Without a doubt this is a huge task and chances are that this goal will in fact never be reached. Still, I think there are some opportunities to get a little closer than we are at present. For this purpose, the key ingredients are high-level and dynamic programming languages like Julia and Python and a clear, modular code design, where individual aspects of the simulation procedure can be easily modified or swapped. In practice, such an approach helps to investigate different basis functions or swap computational backends with only negligible changes to the code base, as I discuss with reference to the molsturm and the DFTK codes (see below). The ability to approach the physical problem on a high level allows mathematicians and physicists to interact with the code abstractly, while computer scientists still have the freedom to tweak performance on a lower level.
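To make the idea of a swappable basis a bit more concrete, here is a minimal toy sketch in Python. It is not taken from molsturm or DFTK; all names (`Basis`, `SineBasis`, `FiniteDifferenceBasis`, `lowest_eigenvalue`) are hypothetical and only serve as illustration. The model problem is a particle in a one-dimensional box, and the eigensolver is a plain shifted power iteration, chosen for brevity rather than efficiency. The point is merely that the solver only talks to the basis through a single method, so exchanging the discretisation requires no change to the solver.

```python
import math
from typing import List, Protocol

Matrix = List[List[float]]


class Basis(Protocol):
    """Hypothetical interface: discretise -1/2 d^2/dx^2 on [0, 1]
    with zero boundary conditions and return the resulting matrix."""

    def kinetic_matrix(self) -> Matrix: ...


class SineBasis:
    """Spectral basis sqrt(2) sin(k pi x), k = 1..n (matrix is diagonal)."""

    def __init__(self, n: int) -> None:
        self.n = n

    def kinetic_matrix(self) -> Matrix:
        return [[0.5 * (k * math.pi) ** 2 if i == k else 0.0
                 for i in range(1, self.n + 1)]
                for k in range(1, self.n + 1)]


class FiniteDifferenceBasis:
    """Second-order finite differences on n interior grid points."""

    def __init__(self, n: int) -> None:
        self.n = n

    def kinetic_matrix(self) -> Matrix:
        n = self.n
        h = 1.0 / (n + 1)  # grid spacing
        T = [[0.0] * n for _ in range(n)]
        for i in range(n):
            T[i][i] = 1.0 / h ** 2
            if i > 0:
                T[i][i - 1] = -0.5 / h ** 2
            if i < n - 1:
                T[i][i + 1] = -0.5 / h ** 2
        return T


def lowest_eigenvalue(basis: Basis, iterations: int = 5000) -> float:
    """Basis-agnostic solver: power iteration on (shift * I - H)."""
    H = basis.kinetic_matrix()
    n = len(H)
    # Gershgorin bound on the largest eigenvalue, so that the lowest
    # eigenvalue of H becomes the dominant one of (shift * I - H).
    shift = max(sum(abs(v) for v in row) for row in H)
    x = [1.0] * n
    for _ in range(iterations):
        y = [shift * x[i] - sum(H[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    # Rayleigh quotient of the normalised vector x
    Hx = [sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * Hx[i] for i in range(n))


# The same solver runs unchanged on both discretisations.
e_sine = lowest_eigenvalue(SineBasis(10))
e_grid = lowest_eigenvalue(FiniteDifferenceBasis(20))
```

Both results should approximate the exact ground-state energy π²/2 of this toy problem, the grid-based one only up to its discretisation error. In a real code the interface would of course be much richer, but the separation between "what the basis provides" and "what the algorithm needs" is the same.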

I have already discussed the basis-function independent quantum chemistry package molsturm and related work a few times, so I'll skip that in this article. Instead I want to detail some aspects of a more recent project, DFTK.jl, the density-functional toolkit. The idea of this code is to be simple and minimalistic, using existing libraries and codes as much as possible. This makes the code a lot more accessible, which facilitates the construction of reduced models that are amenable to mathematical proof. The hope is that the lessons learned can then be scaled, within the same code base, to larger and more realistic problems and an HPC environment. The programming language chosen for this project is Julia, since it is a high-level and dynamic, but strongly typed language with impressive performance and out-of-the-box compatibility with existing libraries written in C++, Fortran, Python or R. With features such as multiple dispatch and JIT (just-in-time) compilation of code, Julia seems to be a big step forward towards a situation where code is written once and can then be translated many times for specific computational backends or hardware. I already gave an introduction to Julia in a previous talk a few weeks ago.
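The "write once, run on many backends" idea can be hinted at with a small Python sketch. Note the assumptions: Python has neither built-in multiple dispatch nor Julia's type-specialising JIT, so this example relies on plain duck typing and only illustrates the first half of the idea, namely that a single generic implementation can serve several number types. The function `horner` is a hypothetical helper, not part of DFTK or any other package.

```python
from fractions import Fraction
from typing import Sequence, TypeVar

T = TypeVar("T")


def horner(coefficients: Sequence[T], x: T) -> T:
    """Evaluate c[0] + c[1]*x + c[2]*x**2 + ... with Horner's scheme.

    Only `*` and `+` are required, so any number type supporting
    these operations can be used."""
    result = coefficients[-1]
    for c in reversed(coefficients[:-1]):
        result = result * x + c
    return result


# One implementation, three "backends" in the form of number types:
as_float = horner([1.0, 0.0, 0.5], 2.0)               # floating point
as_fraction = horner([Fraction(1), Fraction(0), Fraction(1, 2)],
                     Fraction(1, 3))                  # exact rational arithmetic
as_complex = horner([1, 0, 0.5], 1j)                  # complex arithmetic
```

In Julia the analogous generic function would additionally be compiled to specialised machine code for each combination of argument types on first use, which is what makes the approach attractive performance-wise.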

All in all I am very thankful to Andre for giving me the opportunity to gather some thoughts about this matter and eventually present them in this talk. The audience in Lille was very open to the topic and, to my surprise, many interesting discussions arose. Throughout the afternoon I spent time with PhD students and staff researchers discussing my ideas in their application context, leading to loads of interesting feedback, helpful ideas, questions and comments. The time in Lille truly flew by, with many aspects still undiscussed at the end of the day. Luckily we all agreed to stay in touch, so the discussions will likely continue during other meetups in the future.

The slides and the Jupyter notebooks I used during the talk are attached below.

Modern software-development techniques in electronic structure theory (Slides)
A 5-minute introduction to Julia (Jupyter notebook)