michael-herbst.com Research and projects

[c¼h] Parallelised numerics in Python: An introduction to Bohrium

Thursday a week ago I gave a brief introductory talk about the Bohrium project at our Heidelberg Chaostreff. Especially after the HPC day at the Niels Bohr Institute during my recent visit to Copenhagen, I became rather enthusiastic about Bohrium and wanted to pass on some of my experiences.

The main idea of Bohrium is to build a fully numpy-compatible framework for high-performance computing, which can automatically parallelise numpy array operations and/or execute them on general-purpose graphics cards. The hope is that this removes the need to rewrite a prototypical Python implementation of a scientific model in a lower-level language like C++ or CUDA before tackling the actual real-world problems one has in mind.

In practice Bohrium achieves this by translating the Python code (via some intermediate steps) into small pieces of C or CUDA code. These are then automatically compiled while the script runs, taking into account the current hardware setup, and executed afterwards. The results of such a just-in-time compiled kernel are again available in numpy-like arrays and can be passed to other scripts for post-processing, e.g. plotting in matplotlib.

It is important to note that the effect of Bohrium is limited to array operations. So, for example, the usual Python for loops are not touched. This is, however, hardly a problem if the practice of so-called array programming is followed. In array programming one avoids plain for loops and similar traditional Python language elements in favour of special syntax which works on whole blocks (numpy arrays) of data at once. Examples of such operations are pretty much the typical numpy workflow:

  • views and slices: array[1:3]
  • broadcasting: array[:, np.newaxis]
  • elementwise operations: array1 * array2
  • reduction: np.sum(array1, axis=1)
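To make the contrast concrete, here is a minimal sketch in plain numpy that writes the same computation once with Python loops and once with the four kinds of array operations listed above. The function names and the test data are made up for illustration and do not come from the talk.

```python
import numpy as np


def row_norms_loop(matrix):
    """Euclidean norm of every row, written with plain Python loops."""
    norms = []
    for row in matrix:
        total = 0.0
        for value in row:
            total += value * value
        norms.append(total ** 0.5)
    return np.array(norms)


def row_norms_array(matrix):
    """The same computation expressed through whole-array operations."""
    squared = matrix * matrix                 # elementwise operation
    return np.sqrt(np.sum(squared, axis=1))   # reduction along each row


def centre_columns(matrix):
    """Subtract each column's mean from that column."""
    col_mean = np.sum(matrix, axis=0) / matrix.shape[0]   # reduction
    return matrix - col_mean[np.newaxis, :]               # broadcasting


data = np.arange(12, dtype=float).reshape(3, 4)
inner = data[1:3]          # view/slice: rows 1 and 2, no data is copied

print(row_norms_loop(inner))
print(row_norms_array(inner))
print(centre_columns(data))
```

Only the array versions consist of operations that a framework working on numpy arrays can pick up; the nested loops in `row_norms_loop` stay ordinary interpreted Python.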

A slightly bigger drawback of Bohrium is that the just-in-time compilation takes time during which no results are produced. In other words, Bohrium only starts to pay off at larger problem sizes or if exactly the same sequence of instructions is executed many times.
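One rough way to find out where this break-even lies on a given machine is to time the same array expression several times at a few problem sizes. The sketch below is an illustration only: the helper `time_repeats` and the chosen sizes are hypothetical. Under plain numpy there is no compile step, so all repetitions should take about equally long; the expectation, following the reasoning above, is that under a just-in-time compiling backend the first repetition would carry the compilation cost.

```python
import time

import numpy as np


def time_repeats(size, repeat=4):
    """Return the wall time of each of `repeat` identical evaluations."""
    timings = []
    for _ in range(repeat):
        a = np.random.rand(size)
        start = time.perf_counter()
        avg = np.sum(a) / a.size
        # float() makes sure the number has actually been computed
        # before the clock is stopped (relevant if a backend is lazy).
        variance = float(np.sum((a - avg) ** 2) / a.size)
        timings.append(time.perf_counter() - start)
    return timings


# Hypothetical problem sizes, chosen only for illustration.
for size in (10**3, 10**5, 10**6):
    formatted = " ".join("{0:8.5f}s".format(t) for t in time_repeats(size))
    print("size {0:>8d}: {1}".format(size, formatted))
```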

In my c¼h I demonstrate Bohrium by means of this example script

#!/usr/bin/env python3

import numpy as np
import sys
import time


def moment(n, a):
    avg = np.sum(a) / a.size
    return np.sum((a - avg)**n) / a.size


def compute(a):
    start = time.time()

    mean = np.sum(a) / a.size
    var  = moment(2, a)
    m3   = moment(3, a)
    m4   = moment(4, a)

    end = time.time()

    fmt = "After {0:8.3f}s: {1:8.3f} {2:8.3f} {3:8.3f} {4:8.3f}"
    print(fmt.format(end - start, float(mean), float(var),
                     float(m3), float(m4)))


def main(size, repeat=6):
    for i in range(repeat):
        compute(np.random.rand(size))


if __name__ == "__main__":
    size = 30
    if len(sys.argv) >= 2:
        size = int(sys.argv[1])
    main(size)

which is also available for download. The script performs a very simple analysis of the (randomly generated) input data: it computes some statistical moments and displays them to the user. For bigger arrays the single-threaded numpy starts to get very slow, whereas the multi-threaded Bohrium version wins even though it needs to compile first. Running the script with Bohrium does not require one to change even a single line of code! Just

python3 -m bohrium ./2017.07.13_moments.py

kicks off the automatic parallelisation.
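For reference, the quantities printed by the example script can be written down explicitly. The symbols $N$ (the array size, `a.size`), $\mu$ and $m_n$ are notation introduced here and do not appear in the script itself:

```latex
\mu = \frac{1}{N} \sum_{i=1}^{N} a_i,
\qquad
m_n = \frac{1}{N} \sum_{i=1}^{N} \left( a_i - \mu \right)^n,
\qquad n = 2, 3, 4 .
```

Here $\mu$ corresponds to `mean`, $m_2$ to `var`, and $m_3$, $m_4$ to `m3` and `m4`. Each of them is built purely from elementwise operations and reductions over the whole array.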

The talk has been recorded and is available on YouTube. Note that the title and the description of the talk are in German, but the talk itself is in English.

Posted on Tue 25 July 2017 in Chaos.

Tags: talk NoName Bohrium parallelisation and HPC


  1. [c¼h] Testen mit Rapidcheck und Catch

    Last Thursday, I gave another short talk at the Heidelberg Chaostreff NoName e.V. This time I talked about writing tests in C++ using the testing libraries rapidcheck and Catch.

    In the talk I presented some ideas on how to incorporate property-based testing into test suites for C++ programs. The idea …

    Posted on Mon 14 March 2016 in Chaos.

    Tags: talk NoName C++ testing and programming and scripting

  2. [c¼h] Einführung in die Elektronenstrukturtheorie

    The week before my bash scripting course I gave another short talk at the weekly meeting of the Heidelberg Chaostreff NoName e.V. Unlike last time, the talk was not concerned with a traditional "Hacker" topic; instead I tried to give a brief introduction into my own research …

    Posted on Wed 30 September 2015 in Chaos.

    Tags: talk NoName electronic structure theory and theoretical chemistry

  3. [c¼h] Härtere Crypto für unsere Services

    Last Thursday I gave a talk (in German) at our local Chaostreff in Heidelberg, the NoName e.V. The main topic was to introduce the various cryptographic algorithms used in modern cryptography and to give practical advice on how to improve the default configuration even further. The talk mainly focuses on …

    Posted on Mon 02 February 2015 in Chaos.

    Tags: talk NoName cryptography System Administration ssh and TLS
