ctx: Key-value datastructure for organised hierarchical storage

As an aside from my plane-wave DFT project at CERMICS, I spent some time in the last few weeks polishing up the ctx context library and publishing it via github. The ctx library provides one approach how to tackle a common challenge in larger scientific simulation codes, namely how to organise simulation data.

Often each substep of a larger simulation code by itself already requires a large number of input data in order to produce its results. To increase the flexibility and modularity of the code base, the data/parameters required for one part of the procedure are best hidden from the respective others. Naturally, complete separation cannot be achieved, since the steps do need to exchange some results or input. The question is therefore where and how to find a good balance between accessing data from other simulation steps and shielding them.

Typically, simulation procedures are inherently hierarchical. E.g. the computation of a particular property of an electronic structure might be achieved by solving linear-response equation, which in turn requires to solve a linear system of equations, e.g. by a contraction-based iterative algorithm, which in turn requires the computation of matrix-vector products. In such an hierarchy of algorithms data only needs to flow up or down, but not sideways. This is to say that, e.g. computing a second property via another linear-response equation, might again be using a similar tree of methods, but does not necessarily need access to all individual details of the first tree. This immediately leads to the approach of a tree-like, hierarchical storage for data and parameters, where a simulation substep may only work on its own subtree, but not on any of its parents or siblings.

This is where the library ctx comes in. It offers a C++ implementation of a tree-like string-to-value mapping, called CtxMap, which allows to store data of arbitrary type. In line with above approach, a key in a CtxMap is a path-like string such as /this/is/a/path/to/a/value. By means of functionality such as views into subtrees or C++ iterators over ranges of keys, navigating and accessing the data from the CtxMap from different parts of the code is greatly facilitated. Since all key objects are stored as std::shared_ptr objects, memory safety as well as good integration into the C++ standard library are assured.

Originally I started working on ctx, when I wanted to connect our molsturm package to the adcman code also developed at the Dreuw group in Heidelberg. In the design of ctx I took strong inspirations from both the PamMap and the GenMap data structures I have been playing with, as well as the libctx library by E. Epifanovsky et. al.. The latter code had similar goals of providing hierarchical storage of data in mind and was used for this purpose inside the Q-Chem quantum chemistry code. From this respect I am very happy that ctx has now become the successor of libctx inside Q-Chem, providing ctx with an application in the context of a larger code base.

For further details and the ctx source code, see the ctx project page on github. ctx can be cited using DOI 10.5281/zenodo.2590706.