Research Fellows
Mladen Banovic
Early Stage Researcher 12 at Universität Paderborn
VKI's in-house CAD tool and optimization software CADO was differentiated by ESR 13, Ismael Sanchez Torreguitart (VKI), using the algorithmic differentiation (AD) software tool ADOL-C. Furthermore, during the secondment of ESR 12, Mladen Banovic (UPB), to VKI, the reverse mode of AD was applied to the LS89 axial turbine test case, and the resulting derivatives were successfully validated against the forward mode of AD. However, because of the large memory consumption of the differentiated sources, ESR 12 is carrying out further work at UPB on structure exploitation, i.e. modifying the reverse-differentiated CADO sources to increase their efficiency.
To create a computational domain for VKI's in-house CFD tool, CADO executes a workflow that can be divided into three main parts: (i) construct the 2-D geometry and mesh blocks of the LS89 airfoil, (ii) perform mesh smoothing, and (iii) extrude the 2-D mesh computed in the previous step to create the 3-D mesh.
To compute derivatives of this workflow using the reverse mode of AD, ADOL-C first generates an internal representation of the code to be differentiated, called a trace. The trace is a binary file that contains the whole computational graph required to evaluate the derivatives.
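The idea of a trace can be illustrated with a minimal sketch in Python. This is not ADOL-C and does not reflect its file format or API; the names `Tape`, `Var`, `sin` and `reverse_sweep` are hypothetical and exist only for this illustration. Every arithmetic operation appends one entry to a list, and the reverse sweep walks that list backwards to accumulate derivatives.

```python
import math

class Tape:
    """Toy stand-in for an AD trace: a flat record of every operation."""
    def __init__(self):
        self.values = []   # value of each recorded variable
        self.ops = []      # (output index, [(input index, local partial), ...])

    def new_var(self, value, parents=()):
        idx = len(self.values)
        self.values.append(value)
        self.ops.append((idx, list(parents)))
        return Var(self, idx)

class Var:
    """Active variable: overloaded operators record themselves on the tape."""
    def __init__(self, tape, idx):
        self.tape, self.idx = tape, idx

    @property
    def value(self):
        return self.tape.values[self.idx]

    def __add__(self, other):
        return self.tape.new_var(self.value + other.value,
                                 [(self.idx, 1.0), (other.idx, 1.0)])

    def __mul__(self, other):
        return self.tape.new_var(self.value * other.value,
                                 [(self.idx, other.value), (other.idx, self.value)])

def sin(x):
    return x.tape.new_var(math.sin(x.value), [(x.idx, math.cos(x.value))])

def reverse_sweep(tape, output):
    """Walk the tape backwards, accumulating adjoints of all variables."""
    adjoint = [0.0] * len(tape.values)
    adjoint[output.idx] = 1.0
    for idx, parents in reversed(tape.ops):
        for parent, partial in parents:
            adjoint[parent] += partial * adjoint[idx]
    return adjoint

tape = Tape()
x = tape.new_var(2.0)
y = tape.new_var(3.0)
f = x * y + sin(x)               # recorded as three operations
adjoint = reverse_sweep(tape, f)
df_dx, df_dy = adjoint[x.idx], adjoint[y.idx]   # y + cos(x) and x
```

Because every executed operation adds an entry, the size of such a record grows with the amount of computation, not with the size of the source code.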
Depending on the code complexity, the trace files can become very large. This was the case with the differentiated CADO sources, where the trace took approximately 36 GB of disk space. Evaluating such a trace with the ADOL-C drivers is very slow, so the aim is to reduce its size until it fits into random-access memory and thereby improve the overall performance.
The process of structure exploitation required a complete understanding of the CADO sources. The cause of the large memory requirements was found in the mesh smoothing part, an iterative process that executes an almost identical code sequence 300 times. For this reason, the CADO source code was modified to decouple the original trace into three smaller traces, as illustrated in the following figure:
Here, the trace T2 records only the first mesh smoothing iteration and is re-evaluated 300 times to compute the required derivatives. All traces were coupled so that the derivative information propagates correctly, and the derivatives were successfully validated against the original differentiated sources.
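The coupling of the three traces can be sketched with a toy problem in Python. Everything here is an assumption made for illustration: the 1-D five-node "mesh", the relaxation weight `W`, and the functions `build`, `smooth` and `objective` merely stand in for the parts covered by the first trace, T2 and the last trace, and the hand-written `*_vjp` functions play the role of reverse evaluations of those traces. The check uses central finite differences, not the forward mode of AD used in the actual validation.

```python
import math

N_ITER = 300   # number of smoothing iterations quoted in the text
W = 0.1        # hypothetical relaxation weight
N_NODES = 5    # hypothetical tiny 1-D "mesh"

def build(p):
    """First-trace stand-in: design parameter -> initial node positions."""
    return [i + p * math.sin(i) for i in range(N_NODES)]

def build_vjp(xbar):
    """Reverse of build: d(x_i)/dp = sin(i)."""
    return sum(xb * math.sin(i) for i, xb in enumerate(xbar))

def smooth(x):
    """T2 stand-in: one nonlinear smoothing sweep with fixed end nodes."""
    y = list(x)
    for i in range(1, len(x) - 1):
        y[i] = x[i] + W * math.tanh(0.5 * (x[i - 1] + x[i + 1]) - x[i])
    return y

def smooth_vjp(x, ybar):
    """Reverse of one sweep, re-evaluated at the given input state x."""
    xbar = [0.0] * len(x)
    xbar[0] += ybar[0]
    xbar[-1] += ybar[-1]
    for i in range(1, len(x) - 1):
        t = math.tanh(0.5 * (x[i - 1] + x[i + 1]) - x[i])
        g = W * (1.0 - t * t) * ybar[i]
        xbar[i] += ybar[i] - g
        xbar[i - 1] += 0.5 * g
        xbar[i + 1] += 0.5 * g
    return xbar

def objective(x):
    """Last-trace stand-in: scalar output computed from the final mesh."""
    return sum(v * v for v in x)

def objective_vjp(x):
    return [2.0 * v for v in x]

def evaluate(p):
    x = build(p)
    for _ in range(N_ITER):
        x = smooth(x)
    return objective(x)

def gradient(p):
    # Forward pass: keep only the input state of each smoothing iteration.
    x = build(p)
    inputs = []
    for _ in range(N_ITER):
        inputs.append(x)
        x = smooth(x)
    # Reverse pass: last part first, then the single step 300 times, then the first part.
    xbar = objective_vjp(x)
    for state in reversed(inputs):
        xbar = smooth_vjp(state, xbar)
    return build_vjp(xbar)

p0 = 0.3
h = 1e-6
reverse_value = gradient(p0)
fd_value = (evaluate(p0 + h) - evaluate(p0 - h)) / (2.0 * h)
```

In this sketch the reverse pass needs one description of the smoothing step plus 300 small state vectors, instead of 300 recorded copies of the same operation sequence.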
With these code improvements, the memory requirements for the derivative computation using the reverse mode of AD have been reduced from approximately 36 GB to 2 GB, i.e. by about 95%. Additionally, a specific algorithmic differentiation technique called checkpointing has been integrated to speed up execution. The memory and run-time overview is shown in the figure below.
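Checkpointing in general stores only selected intermediate states during the forward pass and recomputes the missing ones during the reverse sweep. The sketch below shows the simplest uniform variant on a scalar toy iteration; the functions `step` and `step_derivative` and the spacing `STRIDE` are hypothetical, and the scheme integrated into CADO may differ.

```python
import math

N_ITER = 300   # iteration count taken from the text
STRIDE = 20    # hypothetical checkpoint spacing; must divide N_ITER here

def step(x):
    """Toy nonlinear iteration standing in for one smoothing sweep."""
    return x - 0.1 * math.sin(x)

def step_derivative(x):
    return 1.0 - 0.1 * math.cos(x)

def gradient_store_all(x0):
    """Reverse mode that keeps the input state of every iteration."""
    states, x = [], x0
    for _ in range(N_ITER):
        states.append(x)
        x = step(x)
    xbar = 1.0
    for s in reversed(states):
        xbar *= step_derivative(s)
    return xbar, len(states)

def gradient_checkpointed(x0):
    """Reverse mode that keeps every STRIDE-th state and recomputes the rest."""
    checkpoints, x = [], x0
    for k in range(N_ITER):
        if k % STRIDE == 0:
            checkpoints.append(x)
        x = step(x)
    xbar = 1.0
    peak = len(checkpoints)
    for start in reversed(checkpoints):
        # Recompute the states of this segment from its checkpoint.
        segment, x = [], start
        for _ in range(STRIDE):
            segment.append(x)
            x = step(x)
        peak = max(peak, len(checkpoints) + len(segment))
        # Reverse through the recomputed segment.
        for s in reversed(segment):
            xbar *= step_derivative(s)
    return xbar, peak

grad_full, stored_full = gradient_store_all(1.0)
grad_ckpt, stored_ckpt = gradient_checkpointed(1.0)
```

Both functions return the same derivative; the checkpointed version holds at most 35 states at once in this toy setting instead of 300, at the cost of running each forward step a second time.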