A few weeks ago, on 23rd May, I received an invitation
to visit the
Laboratoire de Physique des Lasers, Atomes et Molécules
at the Université de Lille and present my recent work.
When it came to selecting a topic, my host, Andre Gomes, and I
quickly agreed to focus on
more modern approaches to scientific software development
and why these can be useful for electronic-structure codes.
In retrospect I am very happy about this opportunity
to summarise a few ideas and lessons learned
from my previous and ongoing software projects.
I also took the liberty of sketching, from my personal point of view,
possible challenges for future directions of scientific software,
as well as proposing some ideas for how to tackle them.
From this angle I ended up introducing the talk
with a review of the fundamental difficulties of electronic-structure
theory itself, namely the inherent complexity of the problem
(the high dimensionality and non-linearity of the underlying
partial differential equations). On top of that, in recent years
the available high-performance computing (HPC) environments have become
more heterogeneous and more complex as well. Mixed general-purpose GPU/CPU
cluster setups are now standard, and making use of both CPU and GPU
for computations has become a strict requirement when applying for access
to the world's top-ranked clusters. Projecting into the future,
where further "accelerators" such as field-programmable gate arrays (FPGAs)
or, in the longer term, even quantum computers might come up,
this heterogeneity is only going to increase.
For quickly testing the usefulness of such technologies in the context of
quantum chemistry, a key aspect is being able to reuse the code
which already exists. This means that code needs to be sufficiently high-level
in order to adapt to changing hardware architectures.
Let me be clear on this: I am not talking about achieving 100% peak performance
on every hardware setup, but rather about striking a good compromise
between the effort required to get the code to work on new hardware
and the performance achieved. The point is to quickly get an idea
of what is theoretically possible and whether further investigation is even worthwhile.
As I sketch in the talk,
changing the hardware does have an impact on questions such as the
optimal basis function, the numerical algorithm or even the best-suited
physical model. In this context, "suitable" and "optimal" refer to
a simulation procedure which agrees with the available
computational architecture and at the same time is physically meaningful.
A consequence of this observation is that experimentation on all levels
(basis functions, algorithms, numerics, linear-algebra backend) is
required, which often calls for interdisciplinary expertise.
In an ideal world experts from different fields could thus work
on the same code base, approaching the problem from different angles.
Without a doubt this is a huge task, and chances are that this goal
will in fact never be fully reached. Still, I think
there are some opportunities to get a little closer than we are at present.
For this purpose, key ingredients are
high-level and dynamic programming languages
like Julia and Python and a clear, modular code design,
where individual aspects of the simulation procedure
can easily be modified or swapped.
In practice such an approach helps to investigate
different basis functions or to swap computational backends
with only negligible changes to the code base,
as I discuss with reference to the
molsturm and DFTK codes (see below).
The ability to approach the physical problem on a high level
allows mathematicians and physicists to interact
with the code abstractly, while computer scientists still have the freedom
to tweak performance on a lower level.
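To make this concrete, here is a toy sketch of such a modular design
(deliberately not the actual molsturm or DFTK interface; all type and
function names are hypothetical), where Julia's multiple dispatch keeps
the driver routine oblivious to the discretisation it operates on:

    using LinearAlgebra

    # Hypothetical toy types, not the real molsturm or DFTK interface:
    abstract type Discretisation end

    struct PlaneWaves <: Discretisation
        Ecut::Float64               # kinetic-energy cutoff
    end
    struct Gaussians <: Discretisation
        exponents::Vector{Float64}  # Gaussian exponents
    end

    # Each discretisation supplies its own (here: toy, diagonal) Hamiltonian ...
    hamiltonian(d::PlaneWaves) = Diagonal([g^2 / 2 for g in 0:floor(Int, sqrt(2 * d.Ecut))])
    hamiltonian(d::Gaussians)  = Diagonal(3 .* d.exponents ./ 2)

    # ... while the driver computing a ground-state energy is written only once:
    ground_state(d::Discretisation) = minimum(eigvals(hamiltonian(d)))

    ground_state(PlaneWaves(10.0))            # plane-wave discretisation
    ground_state(Gaussians([0.5, 1.3, 4.2]))  # Gaussian discretisation

Swapping the discretisation then amounts to changing a single constructor
call, while experts on either side remain free to refine the individual
building blocks independently.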
I have already discussed molsturm,
the basis-function-independent quantum chemistry package,
a few times,
so I'll skip it in this article.
Instead I want to detail some aspects of
a more recent project, DFTK.jl,
the density-functional toolkit.
The idea of this code is to be simple and minimalistic,
using existing libraries and codes as much as possible.
This makes the code a lot more accessible, which makes it easier
to construct reduced models which can be treated in mathematical proofs.
The hope is that the lessons learned can then be scaled in the same
code base to larger, more realistic problems and an HPC environment.
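To give a flavour, here is a minimal ground-state calculation for bulk
silicon, roughly along the lines of the examples in the DFTK documentation
(exact function names and signatures may differ between DFTK versions):

    using DFTK

    # Silicon in the diamond structure (lattice vectors in Bohr)
    a = 10.26
    lattice = a / 2 * [[0 1 1.]; [1 0 1.]; [1 1 0.]]
    Si = ElementPsp(:Si; psp=load_psp("hgh/lda/si-q4"))
    atoms = [Si, Si]
    positions = [ones(3) / 8, -ones(3) / 8]

    # Build an LDA model, discretise it in plane waves and run an SCF
    model = model_LDA(lattice, atoms, positions)
    basis = PlaneWaveBasis(model; Ecut=7, kgrid=[4, 4, 4])
    scfres = self_consistent_field(basis)
    scfres.energies.total   # final ground-state energy

Each line exposes a mathematically meaningful object (the model, the
discretisation basis, the SCF procedure), which can be inspected,
modified or replaced independently.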
The choice of programming language for this project is Julia,
since it is a high-level and dynamic, but strongly typed language
with impressive performance and out-of-the-box compatibility with existing
libraries written in C++, Fortran, Python or R.
Using features such as multiple dispatch and JIT (just-in-time)
compilation of code, Julia seems to be a big step towards a situation
where code is written once and can then be translated many times
for specific computational backends or hardware.
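As a small illustration of this write-once idea (a generic sketch of my
own, not code taken from DFTK), consider a plane-wave-style kinetic-energy
sum written without committing to any number or array type:

    # Written once, with no number or array types pinned down:
    kinetic_energy(G, c) = sum(abs2.(G) .* abs2.(c)) / 2

    # Julia's JIT specialises the very same source for each input type:
    G = randn(1000); c = randn(ComplexF64, 1000)
    kinetic_energy(G, c)                            # double precision
    kinetic_energy(Float32.(G), ComplexF32.(c))     # single precision
    kinetic_energy(big.(G), Complex{BigFloat}.(c))  # arbitrary precision

    # With a GPU package such as CUDA.jl loaded, calling the same function
    # on GPU arrays would execute the broadcast and reduction on the GPU.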
I already spoke about Julia
in a previous talk
a few weeks ago.
All in all I am very thankful to Andre for giving me the opportunity
to gather some thoughts about this matter and eventually present them
in this talk.
The audience in Lille was very open to the topic and, to my
big surprise, many interesting discussions arose.
Throughout the afternoon I spent time with PhD students and staff researchers
discussing my ideas in their application context,
leading to loads of interesting feedback, helpful ideas, questions and comments.
The time in Lille truly flew by, with many aspects still
undiscussed at the end of the day. Luckily we all agreed to stay in touch,
so the discussions will likely continue during other
meetups in the future.
The slides and the Jupyter notebooks I used during the talk are attached below.