Ph.D. Thesis Defense: CDS: 05, August 2024 “A scalable asynchronous discontinuous Galerkin method for massively parallel flow simulations”

When

5 Aug 24    
3:00 PM - 4:00 PM

Event Type

DEPARTMENT OF COMPUTATIONAL AND DATA SCIENCES
Ph.D. Thesis Defense


Speaker : Mr. Shubham Kumar Goswami
S.R. Number : 06-18-00-10-12-19-1-17224
Title : “A scalable asynchronous discontinuous Galerkin method for massively parallel flow simulations”
Thesis examiner: Dr. Praveen Chandrashekar, Centre for Applicable Mathematics, Tata Institute of Fundamental Research.
Research Supervisor: Dr. Konduri Aditya
Date & Time : August 05, 2024 (Monday) at 03:00 PM
Venue : # 102 CDS Seminar Hall


ABSTRACT

Accurate simulations of turbulent flows are crucial for understanding numerous complex phenomena in engineered systems and natural processes. Under realistic conditions with high Reynolds numbers and complex geometries, the partial differential equations (PDEs) governing these flows are highly nonlinear and must be solved numerically. Because turbulent flows span multiple length and time scales, these simulations are often computationally expensive, necessitating the use of massively parallel supercomputers. Despite several advances in scalable PDE solvers, they still face scalability challenges at extreme scales due to communication overhead. To address this issue, an asynchronous computing approach that relaxes communication/synchronization at a mathematical level has been developed with finite difference schemes. However, these schemes are not well suited to capturing flows in complex geometries with unstructured meshes. The objective of this thesis is to develop an asynchronous discontinuous Galerkin (ADG) method with the potential to provide high-order accurate solutions for various flow problems on structured and unstructured meshes, and to demonstrate its scalability. The thesis comprises three parts: developing an approach to couple asynchronous schemes with low-storage Runge-Kutta schemes, introducing the ADG method and investigating its properties, and implementing the proposed method in deal.II (an open-source library) for scalability demonstrations.
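As a rough illustration of what relaxing synchronization at a mathematical level can mean, the sketch below advances 1D linear advection with a first-order upwind scheme on a periodic grid split into two hypothetical subdomains. At each subdomain boundary the upwind neighbor value may be read from an older time level, mimicking a message that has not yet arrived. The function name, the random delay model, and all parameters are assumptions made for illustration; this is not the scheme developed in the thesis.

```python
import numpy as np

def advect_with_stale_ghosts(n_points=64, n_steps=200, cfl=0.5, max_delay=3, seed=0):
    """First-order upwind advection (positive speed) on a periodic grid.

    The grid is split into two subdomains. The upwind ghost value of each
    subdomain may come from up to `max_delay` time steps in the past.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points, endpoint=False)
    u = np.sin(2.0 * np.pi * x)
    history = [u.copy()]             # history[-1] is the current time level
    boundaries = (0, n_points // 2)  # first grid point of each subdomain

    for _ in range(n_steps):
        upwind = np.roll(u, 1)       # u[i-1] at the current level (synchronous)
        for b in boundaries:
            # Draw a delay, limited by the number of stored time levels.
            delay = int(rng.integers(0, max_delay + 1))
            delay = min(delay, len(history) - 1)
            upwind[b] = history[-1 - delay][b - 1]  # possibly stale neighbor
        u = (1.0 - cfl) * u + cfl * upwind
        history.append(u.copy())
        history = history[-(max_delay + 1):]  # keep only the levels needed
    return x, u
```

With 0 <= cfl <= 1, every update is a convex combination of earlier values, so the solution stays bounded by the initial maximum regardless of the delays. The sketch says nothing about accuracy, which is the issue the asynchrony-tolerant schemes mentioned in the abstract are designed to address.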

Based on the asynchronous computing approach, several PDE solvers have been developed that use high-order asynchrony-tolerant (AT) finite difference schemes for spatial discretization to simulate reacting and non-reacting turbulent flows, achieving significant improvements in scalability. For time integration, they use either multi-step Adams-Bashforth schemes, which have poor stability, or multi-stage Runge-Kutta (RK) schemes with an over-decomposed domain, which requires larger messages and redundant computations. In this work, we propose a novel method to couple asynchrony-tolerant and low-storage explicit RK (LSERK) schemes to solve time-dependent PDEs with reduced communication effort. We develop new schemes for the ghost (buffer) point updates that are necessary to maintain the desired order of accuracy. The accuracy of this method is investigated both theoretically and numerically using simple one-dimensional linear model equations. Thereafter, we demonstrate its scalability through three-dimensional simulations of decaying Burgers’ turbulence performed using two different asynchronous algorithms: communication-avoiding and synchronization-avoiding. Scalability studies on up to 27,000 cores yielded a speedup of up to 6x compared to a baseline synchronous algorithm.
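For readers unfamiliar with low-storage explicit RK schemes, the following minimal sketch shows the two-register (2N-storage) update pattern using Williamson's classical three-stage, third-order coefficients. The abstract does not state which LSERK scheme the thesis uses, so this particular scheme, the function names, and the scalar test problem are assumptions chosen for illustration only.

```python
import math

# Williamson (1980) three-stage, third-order 2N-storage coefficients.
A = (0.0, -5.0 / 9.0, -153.0 / 128.0)
B = (1.0 / 3.0, 15.0 / 16.0, 8.0 / 15.0)
C = (0.0, 1.0 / 3.0, 3.0 / 4.0)

def lserk3_step(f, t, y, dt):
    """Advance y' = f(t, y) by one step using two registers (y and dy)."""
    dy = 0.0
    for a, b, c in zip(A, B, C):
        dy = a * dy + dt * f(t + c * dt, y)  # overwrite the increment register
        y = y + b * dy                       # overwrite the solution register
    return y

def integrate(f, y0, t_end, n_steps):
    """Integrate from t = 0 to t_end with a fixed number of steps."""
    dt = t_end / n_steps
    y = y0
    for k in range(n_steps):
        y = lserk3_step(f, k * dt, y, dt)
    return y
```

Only the solution and one increment are stored per unknown, independent of the number of stages. Halving the step size for y' = -y should reduce the error by roughly a factor of eight, consistent with third-order accuracy.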

In recent years, the discontinuous Galerkin (DG) method has gained considerable attention in developing PDE solvers, particularly for nonlinear hyperbolic problems, due to its ability to provide high-order accurate solutions in complex geometries, capture discontinuities, and exhibit high arithmetic intensity. However, the scalability of DG-based solvers is hindered by communication bottlenecks that arise at extreme scales. In this work, we introduce the asynchronous DG (ADG) method, which combines the benefits of the DG method with asynchronous computing by relaxing the need for data communication and synchronization at the mathematical level. The proposed ADG method ensures local conservation and effectively addresses challenges arising from asynchrony. To assess its stability, we employ Fourier-mode analysis to examine the dissipation and dispersion behavior of fully discrete DG and ADG schemes with Runge-Kutta (RK) time integration across a wide range of wavenumbers. Furthermore, we present an error analysis demonstrating that the ADG method with standard numerical fluxes achieves at most first-order accuracy. To recover accuracy, we derive asynchrony-tolerant (AT) fluxes that utilize data from multiple time levels. Finally, extensive numerical experiments are conducted to validate the performance and accuracy of the ADG-AT scheme for both linear and nonlinear problems.
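A generic Taylor-series argument hints at why stale interface data can limit accuracy and why combining several time levels can help. Here k is an assumed delay measured in time steps and Δt is the time-step size; the expressions are a textbook illustration and not the AT fluxes derived in the thesis.

```latex
\begin{align}
u^{n-k} &= u^{n} - k\,\Delta t\,\partial_t u^{n} + \mathcal{O}(\Delta t^{2}), \\
(k+1)\,u^{n-k} - k\,u^{n-k-1} &= u^{n} + \mathcal{O}(\Delta t^{2}).
\end{align}
```

The first line shows that substituting a delayed value for the current one introduces a perturbation proportional to kΔt. The second shows that a linear combination of two delayed levels cancels that leading term.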

With the asynchronous discontinuous Galerkin (ADG) method developed, we finally focus on implementing it and evaluating its performance in solving hyperbolic equations with shocks/discontinuities. To achieve this, we chose a highly scalable DG solver for the compressible Euler equations from deal.II, a widely used open-source finite element library. The solver uses low-storage explicit Runge-Kutta schemes for time integration. We implemented the ADG method in deal.II, incorporating the communication-avoiding algorithm (CAA), and performed accuracy validation and scalability benchmarks. The results showcase the accuracy limitations of standard ADG schemes and the effectiveness of the newly developed asynchrony-tolerant (AT) fluxes. Strong scaling results are provided for both synchronous and asynchronous DG solvers, demonstrating a speedup of up to 80% with the ADG method at an extreme scale of 9,216 cores.

This thesis focuses on the development of scalable PDE solvers based on the asynchronous discontinuous Galerkin method for massively parallel flow simulations. Although these advancements are specifically geared towards the DG method, they are also applicable to the finite volume (FV) method and can be easily integrated into commercial FV-based PDE solvers. The overall work highlights the potential benefits of the asynchronous approach for the development of accurate and scalable DG- and FV-based PDE solvers, paving the way for simulations of complex physical systems on massively parallel supercomputers.


ALL ARE WELCOME