45th Forum: Which precisions for HPC?

The registration form is here.

9:15 – 9:30 Introduction

E. Audit, Maison de la simulation

9:30 – 10:00 Representation of numbers in today’s applications

David Defour (University of Perpignan)

Abstract

Scientific programs rely comfortably on floating-point numbers, thanks to the few experts in the domain who continue to address the potential issues arising from their usage. These issues receive little attention for several reasons. The first is that most major issues have been addressed before being blown out of proportion; in other words, incidents like the Pentium bug and the crash of the Ariane 5 rocket have not occurred in a long time, which misleads us into believing that such problems belong to the past. Secondly, developers and software users seem unaware of how number representations can affect their daily use of software, and do not see the need to familiarize themselves with the topic.

For many years, in order to avoid numerical issues, the common rule of thumb among developers of HPC software has been to rely on oversized formats such as double precision. This was acceptable as long as the hardware could sustain adequate performance for such applications and datasets; this was the era of the "free lunch" of numerical resources. It is no longer satisfactory, for two opposite reasons. On the low-end side, some applications (e.g., AI) obtain significant speedups by representing numbers in reduced formats. On the high-end side, the exascale era, just around the corner, will mean software running on larger problems with longer chains of floating-point operations, where double precision may no longer be able to contain the resulting growth of numerical errors.
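
As a minimal illustration of this error growth (a generic sketch, not taken from the talk), consider accumulating 0.1 one million times with a sequential running sum in float32 versus float64:

```python
import numpy as np

n = 1_000_000
x32 = np.full(n, 0.1, dtype=np.float32)
x64 = np.full(n, 0.1, dtype=np.float64)

# np.cumsum performs a sequential running sum, so each partial result
# is rounded to the working precision and the error grows with the
# length of the operation chain.
seq32 = np.cumsum(x32)[-1]
seq64 = np.cumsum(x64)[-1]

exact = n * 0.1  # = 100000.0
print(f"float32 running sum: {seq32:.2f} (error {abs(seq32 - exact):.2f})")
print(f"float64 running sum: {seq64:.6f} (error {abs(seq64 - exact):.2e})")
```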

In this talk, we will give an overview of how numerical quantities are handled in both software and hardware, of the numerical issues and trade-offs that users and developers face, and of the classes of solutions on offer.

Bio

David Defour is an associate professor at the University of Perpignan. He has served for the past 8 years as scientific coordinator of the regional HPC center MESO@LR, and his research interests include computer arithmetic and computer architecture. For the past 20 years, he has been working on solutions for « unconventional » arithmetic targeting multicore architectures, and more specifically GPUs.

10:30 – 11:00 Coffee break

11:00 – 11:30 Accelerating scientific discovery with CUDA mixed precision architecture

François Courteille (NVIDIA)

Abstract

The availability of reduced-precision floating-point arithmetic, which provides advantages in speed, energy, communication costs and memory usage over single and double precision, is changing the landscape of HPC. Initially motivated by deep learning, hardware support for IEEE half precision and bfloat16 arithmetic is opening a new processing path in scientific computing by enabling mixed-precision algorithms that work in single or double precision but carry out part of a computation in reduced precision. In this talk, after briefly touching on tools for analyzing an application's precision sensitivity, we will introduce the NVIDIA hardware (Tensor Core) architecture and the software environment for seamlessly implementing and deploying mixed-precision applications in the HPC and/or machine learning fields; use cases will be presented.
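
A classic instance of such a mixed-precision algorithm is iterative refinement: factorize and solve in low precision, then correct in high precision. The following NumPy sketch is illustrative only (Tensor-Core schemes follow the same pattern, with fp16 inside the solver):

```python
import numpy as np

def mixed_precision_solve(A, b, iters=5):
    """Solve Ax = b: inner solves in float32, residuals and corrections
    accumulated in float64. (A real implementation would compute the
    low-precision factorization once and reuse it.)"""
    A32 = A.astype(np.float32)
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x                                   # residual in float64
        d = np.linalg.solve(A32, r.astype(np.float32))  # cheap low-precision correction
        x += d.astype(np.float64)
    return x

rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix
b = rng.standard_normal(n)
x = mixed_precision_solve(A, b)
print("final residual norm:", np.linalg.norm(b - A @ x))
```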

Bio

François Courteille is a principal solution architect at NVIDIA, working with customers to develop accelerated high-performance computing and machine learning solutions. He focuses in particular on applications from the education and research and energy industry verticals. Prior to joining NVIDIA, François spent three decades as a technical leader at HPC companies (Control Data Corporation, Evans & Sutherland, Convex, NEC Corporation), where he ported and tuned a broad portfolio of HPC application software on large-scale parallel and vector systems. He holds an MS degree in Computer Science from the Institut National des Sciences Appliquées (INSA) de Lyon, France.

12:00 – 13:00 Update on Europe & Genci

Jean-Philippe Nominé & Stéphane Requena

13:00 – 14:00 Lunch

14:00 – 14:30 Precision auto-tuning and control of accuracy in high performance simulations

Fabienne Jézéquel (Panthéon-Assas University)

Abstract

In the context of high performance computing, new architectures, becoming more and more parallel, offer ever higher floating-point computing power. The size of the problems considered (and with it, the number of operations) therefore increases, becoming a possible source of increased uncertainty. As such, estimating the reliability of a result at a reasonable cost is of major importance for numerical software. In this talk we describe the principles of Discrete Stochastic Arithmetic (DSA), which enables one to estimate rounding errors in simulation codes. DSA can be used to control the accuracy of programs in half, single, double and/or quadruple precision via the CADNA library (http://cadna.lip6.fr), and also in arbitrary precision via the SAM library (http://www-pequan.lip6.fr/~jezequel/SAM). Thanks to DSA, accuracy estimation and the detection of numerical instabilities can be performed in parallel codes on CPUs and GPUs. Most numerical simulations are performed in double precision, which can be costly in terms of computing time, memory transfer and energy consumption. We also present the PROMISE tool (PRecision OptiMISE, http://promise.lip6.fr), which aims at replacing double precision variable declarations with single precision ones in numerical programs, while taking into account the requested accuracy of the results.
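
The core idea of DSA can be sketched as follows: run the same computation several times under randomly perturbed rounding, and estimate the number of significant digits from the spread of the results. The toy Python emulation below is purely illustrative and is not the CADNA API (random_round and unstable_expr are made-up names):

```python
import numpy as np

def random_round(x, rng):
    """Crude stand-in for CADNA's random rounding: perturb a result
    by plus or minus one ulp at random (illustrative only)."""
    return x + rng.choice([-1.0, 0.0, 1.0]) * np.spacing(x)

def unstable_expr(rng):
    """A cancellation-prone computation: (1e16 + 1) - 1e16 - 1,
    whose exact value is 0."""
    a = random_round(1e16 + 1.0, rng)
    b = random_round(a - 1e16, rng)
    return random_round(b - 1.0, rng)

rng = np.random.default_rng(42)
samples = np.array([unstable_expr(rng) for _ in range(3)])  # DSA typically uses 3 runs
mean, spread = samples.mean(), samples.std()

# DSA-style estimate of the number of exact significant digits:
# few common digits across the runs signal a numerical instability.
digits = np.log10(abs(mean) / spread) if spread > 0 and mean != 0 else float("inf")
print("samples:", samples, "estimated significant digits:", digits)
```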

Bio

Fabienne Jézéquel is an Associate Professor in Computer Science at Panthéon-Assas University in Paris, France. She leads the PEQUAN (PErformance and QUality of Algorithms for Numerical applications) team in the LIP6 computer science laboratory of Sorbonne University in Paris. She received a PhD in 1996 and an HDR (Habilitation à Diriger des Recherches) in 2005 from Pierre and Marie Curie University in Paris. Her work centers on the design of efficient and reliable numerical algorithms on various parallel architectures. She is particularly interested in optimizing the convergence criteria of iterative algorithms by taking rounding errors into account.

Scalable polarizable molecular dynamics using Tinker-HP: massively parallel implementations on CPUs and GPUs

Jean-Philip Piquemal (Laboratoire de Chimie Théorique, Sorbonne Université)

Abstract

Tinker-HP is a CPU-based, double precision, massively parallel package dedicated to long polarizable molecular dynamics simulations and to polarizable QM/MM. Tinker-HP is an evolution of the popular Tinker package (http://tinker-hp.ip2ct.upmc.fr/) that conserves its simplicity of use but brings new capabilities, allowing very long molecular dynamics simulations to be performed on modern supercomputers that use thousands of cores. Tinker-HP offers a high-performance, scalable computing environment for polarizable force fields, giving access to large systems of up to millions of atoms. I will present the performance and scalability of the software in the context of the AMOEBA force field and show upcoming new features such as the "fully polarizable" QM/MM capabilities. As the present implementation is clearly devoted to petascale applications, the applicability of such an approach to future exascale machines will be examined, and future directions of Tinker-HP will be discussed, including the new GPU-based implementation that uses mixed precision.

Bio

Jean-Philip Piquemal is a Distinguished Professor ("professeur de classe exceptionnelle") of theoretical chemistry at Sorbonne Université and Director of its Laboratoire de Chimie Théorique (LCT, UMR CNRS 7616). He is also a junior member of the IUF. He was recently part of the teams awarded the Atos-Joseph Fourier Prize in high performance computing, as well as of an ERC Synergy grant for the project Extreme-Scale Mathematics for Computational Chemistry.

15:00 – 15:30 Reduced Numerical Precision in Weather and Climate Models

Peter Dueben (ECMWF)

Abstract

In atmosphere models, the values of relevant physical parameters are often uncertain by more than 100%, and weather forecast skill decreases significantly after a couple of days. Still, numerical operations are typically calculated with 15 decimal digits of precision for real numbers. If we reduce numerical precision, we can reduce power consumption and increase computational performance significantly, and the savings can be reinvested to allow simulations at higher resolution.
We aim to reduce numerical precision to the minimal level that can be justified by the information content in the different components of weather and climate models. But how can we identify the optimal precision for a complex model with chaotic dynamics? We found that a comparison between the impact of rounding errors and the influence of sub-grid-scale variability can provide valuable information, and that the influence of rounding errors can actually be beneficial for simulations, since variability is increased. We have performed multiple studies investigating the use of reduced numerical precision for atmospheric applications of different complexity (from Lorenz'95 to global circulation models) and studied the trade-off between numerical precision and performance.
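
A toy version of this kind of experiment (a sketch in the spirit of the Lorenz'95 studies mentioned above, not the speaker's code) runs the same chaotic model in float32 and float64 and compares the trajectories:

```python
import numpy as np

def lorenz96_step(x, dt, F):
    """One forward-Euler step of the Lorenz'96 model
    dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F."""
    dxdt = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F
    return (x + dt * dxdt).astype(x.dtype)

n, steps = 40, 2000
x0 = 8.0 + 0.01 * np.random.default_rng(1).standard_normal(n)

x64, x32 = x0.astype(np.float64), x0.astype(np.float32)
for _ in range(steps):
    x64 = lorenz96_step(x64, 0.005, 8.0)
    x32 = lorenz96_step(x32, 0.005, 8.0)

# Trajectories of a chaotic system diverge; the practical question is
# whether the float32 error stays below the initial-condition and
# parameter uncertainty, which is often larger than 100%.
rms = np.sqrt(np.mean((x64 - x32.astype(np.float64)) ** 2))
print("RMS float32/float64 trajectory difference:", rms)
```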

Bio

Peter is the Coordinator of machine learning and AI activities at ECMWF and holds a University Research Fellowship of the Royal Society, which allows him to pursue his research interests in numerical weather and climate modelling, machine learning, and high-performance computing. Before moving to ECMWF, he wrote his PhD thesis at the Max Planck Institute for Meteorology in Hamburg, Germany, on the development of a finite element dynamical core for Earth system models. During a subsequent postdoctoral position with Professor Tim Palmer at the University of Oxford, he focused on the use of reduced numerical precision to speed up simulations of Earth system models.

16:00 – 16:30 The need for precision levels in medical physics simulations

Julien Bert (CHRU Brest – LaTIM)

Abstract

Monte Carlo simulations (MCS) play a key role in medical applications, both in imaging and in radiotherapy, by accurately modelling the different physical processes and interactions between particles and matter. However, MCS are also associated with long execution times, which is one of the major issues preventing their use in routine clinical practice for both image reconstruction and dosimetry. Within this context we are developing methods to speed up and parallelize MCS, especially on GPUs. Results from MCS need different levels of precision according to the target application. In addition, precision also depends on the algorithms used inside the MCS core engine for the physics calculations and particle propagation. This presentation will discuss the different computing precision needs in medical physics applications, along with the advantages and drawbacks of having multiple levels of precision, or a single one, within the same simulation software.
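
As a purely illustrative sketch of the kind of question involved (not the speaker's code; real medical-physics MCS engines are far richer), one can compare single and double precision in a toy photon free-path Monte Carlo and check how far the scored histograms deviate:

```python
import numpy as np

def mc_depth_histogram(n_photons, mu, depth_bins, dtype, seed=0):
    """Toy Monte Carlo: sample photon interaction depths in a homogeneous
    medium with attenuation coefficient mu (in 1/cm), then score them
    in a depth histogram at the requested floating-point precision."""
    rng = np.random.default_rng(seed)
    u = rng.random(n_photons).astype(dtype)
    depths = -np.log(u) / dtype(mu)      # free-path sampling
    hist, _ = np.histogram(depths, bins=depth_bins)
    return hist

bins = np.linspace(0.0, 20.0, 41)        # 0-20 cm in 0.5 cm bins
h32 = mc_depth_histogram(1_000_000, 0.2, bins, np.float32)
h64 = mc_depth_histogram(1_000_000, 0.2, bins, np.float64)
print("max relative deviation between precisions:",
      np.max(np.abs(h32 - h64) / np.maximum(h64, 1)))
```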

Bio

J. Bert received a Ph.D. in control engineering in 2007 and a Habilitation (HDR) in health technologies in 2018. He holds a permanent research scientist position at Brest Regional University Hospital and is a member of LaTIM – INSERM UMR1101. His main research interest is image-guided therapy, especially in medical physics. This includes medical applications in external beam and intra-operative radiotherapy, as well as in X-ray-guided interventional radiology. Within this context, he leads a research group of 20 people that is involved in several national and European research projects, working across multiple domains: treatment planning systems, Monte Carlo simulation, image processing, robotics, computer vision and virtual reality.

Space-Time Parallel Strategies for the Numerical Simulation of Turbulent Flows

Thibaut Lunet (University of Geneva)

Abstract

Unsteady turbulent flow simulations using the Navier-Stokes equations are complex and computationally demanding problems, especially when Direct Numerical Simulation (DNS) is used for a highly accurate solution. The development of supercomputer architectures over the last decades has made it possible to use massive spatial parallelism to perform DNS of extremely large size (e.g., the DNS of turbulent channel flow by Lee and Moser, 2015, with up to 600 billion degrees of freedom). However, the supercomputer architectures arriving in the next decade will draw their increased computational power from a larger number of cores rather than from significantly higher CPU frequencies (e.g., Summit, the current top supercomputing system, has 2.5 million cores). Hence most current-generation CFD software will face critical efficiency issues if bound to massive spatial parallelization alone (O(10^{7-8}) cores).

For six decades, an alternative to exclusively spatial parallelization has been investigated: adding a parallel decomposition in the time dimension, namely Parallelization in Time (PinT). It has received renewed attention in the last two decades with the invention of the Parareal algorithm (Lions, Maday and Turinici), and the development of other PinT algorithms has shown that they can be an attractive way to enhance efficiency on many-core architectures.
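
To fix ideas, here is a minimal serial sketch of the Parareal iteration (an illustrative toy, not any production PinT code): a cheap coarse propagator G sweeps the time slices sequentially, while the expensive fine propagator F, applied independently on each slice, is the part that can run in parallel.

```python
import numpy as np

def parareal(y0, t0, t1, n_slices, coarse, fine, iters):
    """Minimal serial Parareal: U[k+1] <- G(U[k]) + F(U_old[k]) - G(U_old[k]).
    In practice the fine solves of each iteration run in parallel
    across the time slices."""
    ts = np.linspace(t0, t1, n_slices + 1)
    U = [y0]
    for k in range(n_slices):                  # initial guess: coarse sweep
        U.append(coarse(U[-1], ts[k], ts[k + 1]))
    for _ in range(iters):
        F = [fine(U[k], ts[k], ts[k + 1]) for k in range(n_slices)]    # parallelizable
        G = [coarse(U[k], ts[k], ts[k + 1]) for k in range(n_slices)]
        U_new = [y0]
        for k in range(n_slices):              # sequential coarse correction
            U_new.append(coarse(U_new[-1], ts[k], ts[k + 1]) + F[k] - G[k])
        U = U_new
    return U

# Test problem y' = -y, exact solution exp(-t); explicit Euler propagators.
euler = lambda y, ta, tb, n: y * (1.0 + (ta - tb) / n) ** n  # n Euler steps
coarse = lambda y, ta, tb: euler(y, ta, tb, 1)    # 1 cheap step per slice
fine = lambda y, ta, tb: euler(y, ta, tb, 100)    # 100 accurate steps per slice

U = parareal(1.0, 0.0, 2.0, n_slices=10, coarse=coarse, fine=fine, iters=3)
print("Parareal end value:", U[-1], "exact:", np.exp(-2.0))
```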

In this talk, we introduce the basic ideas of PinT algorithms and present a short state of the art of current methods with their associated results. We then detail the main challenges in applying PinT algorithms to enable space-time parallelization for large-scale DNS of turbulent flows, and illustrate them with some applications. Finally, we conclude with prospects for future developments towards the generalized use of PinT methods within the next generation of CFD software.

Bio

Thibaut Lunet is a postdoctoral researcher at the University of Geneva, in the team of M. Gander. He received a Ph.D. in Applied Mathematics and Computational Fluid Dynamics after a doctorate conducted at ISAE-Supaero and CERFACS (Toulouse), supervised by S. Gratton, J. Bodart and X. Vasseur. The thesis, focusing on the development of space-time parallel strategies for turbulent flow simulation, was awarded the Paul Caseau Prize by EDF and the French Academy of Technology in November 2020.