M.Tech Research Thesis {Colloquium}: CDS: “Asynchronous computing and low-precision approaches towards accelerating flow simulations.”

When

18 May 2026
9:30 AM - 10:30 AM

Speaker : Mr. Aswin Kumar A
S.R. Number : 06-18-01-10-22-24-1-24417
Title : “Asynchronous computing and low-precision approaches towards accelerating flow simulations.”
Research Supervisor : Dr. Konduri Aditya
Date & Time : May 18, 2026, 09:30 AM
Venue : # 102 CDS Seminar Hall


ABSTRACT

The increasing computational cost of high-fidelity flow simulations at extreme scales has made communication overheads arising from data movement and synchronization a major bottleneck in modern high-performance computing. At the same time, emerging GPU- and TPU-based architectures provide significantly higher throughput for low-precision arithmetic than for traditional double-precision computations. This thesis investigates asynchronous computing approaches and low-precision numerical frameworks for accelerating compressible and reacting flow simulations on future exascale supercomputers.

The primary focus of this work is the development and evaluation of asynchronous numerical methods that relax communication and synchronization at a mathematical level while preserving the high-order accuracy of the underlying numerical schemes. Previously developed asynchrony-tolerant (AT) schemes are incorporated into the high-order compressible flow solver COMP-SQUARE in a multi-block framework for practically relevant flow problems in complex geometries. Two asynchronous algorithms are considered: one that avoids communication over a few predetermined time steps, and another that initiates communication without enforcing synchronization. The numerical efficacy and scalability of these asynchronous algorithms are assessed for several benchmark problems, including isentropic advection of a vortex, the Taylor-Green vortex, and the highly sensitive case of transitional flow over a NACA0012 airfoil. Scaling experiments performed on up to 18,432 cores show speed-ups of up to four times relative to the baseline synchronous solver while maintaining solution accuracy. These results demonstrate the applicability of AT schemes to established CFD solvers for improving scalability at extreme scales.

This work further extends the asynchronous computing framework to discontinuous Galerkin (DG) methods for compressible reacting flows. Although DG methods are attractive for their high arithmetic intensity and their ability to accurately handle discontinuities such as shocks and detonations, their scalability is also limited by communication bottlenecks arising from synchronization between processing elements (PEs). An asynchronous discontinuous Galerkin (ADG) method is developed for chemically reacting flows with detailed chemistry, and new asynchrony-tolerant weighted essentially non-oscillatory (AT-WENO) limiters are proposed to accurately capture discontinuities in the presence of communication delays near PE boundaries. The numerical properties of the ADG framework are evaluated for spontaneous ignition, premixed flame propagation, and detonation-wave propagation on a one-dimensional domain. The asynchronous solver accurately captures ignition fronts and discontinuities while incurring negligible numerical errors at PE boundaries. Preliminary scaling studies further demonstrate the potential of the ADG method as a basis for highly scalable DG-based solvers for massively parallel combustion simulations.

In addition to asynchronous algorithms, this thesis explores low-precision approaches for reacting-flow simulations, motivated by the hardware characteristics of modern accelerators. A low-precision framework is investigated in which the chemical kinetics evaluations are performed in half precision (FP16), while the nonlinear temperature solve is performed in higher precision. The framework is assessed using lean hydrogen-air autoignition with detailed chemical kinetics. Predictions of ignition delay and the evolution of temperature, heat-release rate, and species mass fractions show excellent agreement with FP64 reference solutions over a range of conditions. These preliminary findings demonstrate the feasibility of low-precision approaches for reacting-flow solvers while also identifying important considerations regarding robustness and generalizability.

Overall, this thesis demonstrates that asynchronous computing methodologies and low-precision numerical approaches provide promising and complementary pathways towards improving the scalability and computational efficiency of next-generation flow solvers for exascale scientific computing.


ALL ARE WELCOME