19 March 2019, CNRS, rue Michel Ange, Paris
HPC has always advanced science by being at the cutting edge of computer technologies. Over the last decade, numerical computing has been growing fast in many directions: higher-fidelity, multi-physics models; a deluge of observational data from sensors; data analysis and post-processing; AI-based models; remote visualization; etc. Combining all these aspects results in highly complex application (software) architectures.
As a consequence, scientific discovery relies more and more on workflows that bring together heterogeneous components. These components deal with experimental and observational data management and processing, as well as with computing and data analytics combined with (meta-)data logistics. The workflows may include dynamicity and closed loops in which rare or unpredictable events trigger a flow of activities, and they span a large ensemble of heterogeneous resources. Uncertainty in data content adds uncertainty to the workflow's reaction and evolution. In this vision, high performance computing has to transition from short-lived to long-running, persistent services; from static to dynamic allocation of resources, data, and tasks; and from a homogeneous to a heterogeneous software stack and services.
The deployment of such complex workflows not only calls into question how the HPC infrastructure (hardware and software) is accessed and used, but also challenges the network infrastructure, and will require new approaches to security.
This 43rd Forum will shed light on workflows for scientific computing.
- Introduction (15 min): Isabelle Perseil
9:15 – 9:35 Workflows, a crucial challenge for the future of HPC (François Bodin, Irisa, Univ Rennes)
The increasing use of data in HPC applications is changing the way numerical models are used in the scientific discovery process. “Traditional” numerical models are more and more part of a long processing and storage chain that includes data analytics and other machine learning techniques. Supercomputers can no longer be viewed in isolation: they are becoming one of the systems used to deploy scientific application workflows. In this talk we review the evolution and impact of new approaches that combine data and computing.
Prof François BODIN (male) is presently Professor at the University of Rennes I. He was the founder and CTO of the former company Caps-Enterprise, which specialized in programming tools for high performance computing (HPC). He is also at the origin of another company, Tocea, which focuses on source-code quality control and refactoring. His research contributions include new approaches for exploiting high performance processors in scientific computing and in embedded applications. He has held various positions, including Chairman of the Irisa computer science laboratory in Rennes. His activity currently focuses on the convergence between HPC and data technologies. He is the holder of the chair “Mobility in a sustainable city” awarded by the Rennes 1 Foundation.
9:35 – 10:15 Scientific Workflows: a Bottom-Up Perspective (Bruno Raffin, INRIA Research Director and leader of the DataMove team)
The term “Scientific Workflow” encompasses a large variety of realities. In this talk we take a bottom-up approach, starting from the classical parallel numerical simulation at the heart of any workflow and analysing different scenarios that require coupling extra software pieces, up to fully advanced workflows. We will start from the simple two-phase I/O concept and move through in situ processing, computational steering, ensemble runs, statistical data assimilation, and deep learning. One challenge is to build digital twins capable of combining data from sensors, simulations and advanced data analysis processes to ease scientific discovery, and to better monitor and manage natural resources, living environments or industrial infrastructures. To conduct my talk I will rely on state-of-the-art work as well as my experience with the development of frameworks like FlowVR, Melissa and Tins, and various associated use cases.
Bruno Raffin is Research Director at INRIA and leader of the DataMove research team. He holds a PhD from the Université d’Orléans on parallel programming language design (1997). After a two-year postdoc at Iowa State University, he refocused his research on high performance interactive computing. He led the development of the data-flow-oriented FlowVR and Melissa middlewares, used for scientific visualization, computational steering, in situ analytics, and sensitivity analysis of large-scale parallel applications. He has also worked on parallel algorithms and cache-efficient parallel data structures (cache-oblivious mesh layouts, parallel adaptive sorting), and on strategies for task-based programming of multi-CPU and multi-GPU machines. Bruno Raffin has more than 60 international publications and has advised 16 PhD students and 3 postdocs. He was co-founder of the Icatis start-up company (2004–2008). He has been INRIA’s lead on more than 15 national and European grants. Bruno Raffin has served on more than 40 program committees of international conferences, and chairs the steering committee of the Eurographics Symposium on Parallel Graphics and Visualisation. He leads the INRIA Project Lab HPC BigData on the convergence between HPC, Big Data and AI.
10:45 – 11:45 Challenges of Managing Scientific Workflows in High-Throughput and High-Performance Computing Environments (Ewa Deelman, USC Information Sciences Institute)
Modern science often requires the processing and analysis of vast amounts of data in search of postulated phenomena, and the validation of core principles through the simulation of complex system behaviors and interactions. In order to support these computational and data needs, new knowledge must be gained on how to deliver the growing high-performance and distributed computing resources to the scientist’s desktop in an accessible, reliable, and scalable way.
In over a decade of working with domain scientists, the Pegasus project has developed tools and techniques that automate the computational processes used in data- and compute-intensive research. Among them is the Pegasus scientific workflow management system, which researchers are using to detect gravitational waves, model seismic wave propagation, discover new celestial objects, study RNA critical to human brain development, and investigate other important research questions.
This talk will examine data-intensive workflow-based applications and their characteristics, the execution environments that scientists use for their work, and the challenges that these applications face. The talk will also discuss the Pegasus Workflow Management System and how it approaches the execution of data-intensive workflows in distributed heterogeneous environments. The latter include HPC systems, HTCondor pools, and clouds.
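At their core, the workflow applications this talk describes are directed acyclic graphs of jobs connected by data dependencies, which the workflow management system executes in dependency order. As a rough illustration of that core idea (a generic sketch only, not Pegasus code or its API; the `Job` struct, the `Run` helper, and the job names are invented for this example), a small DAG of jobs can be executed like this:

```go
package main

import "fmt"

// Job is one node of a workflow DAG: a name plus the names of the
// jobs it depends on. (Illustrative; real systems use richer
// abstract-workflow descriptions.)
type Job struct {
	Name string
	Deps []string
}

// Run executes jobs in dependency order and returns the order in
// which they ran. It repeatedly scans for jobs whose dependencies
// are all satisfied; a cycle stops progress and ends the loop.
func Run(jobs []Job, exec func(name string)) []string {
	done := map[string]bool{}
	var order []string
	for len(order) < len(jobs) {
		progressed := false
		for _, j := range jobs {
			if done[j.Name] {
				continue
			}
			ready := true
			for _, d := range j.Deps {
				if !done[d] {
					ready = false
				}
			}
			if ready {
				exec(j.Name)
				done[j.Name] = true
				order = append(order, j.Name)
				progressed = true
			}
		}
		if !progressed {
			break // unsatisfiable dependencies (cycle)
		}
	}
	return order
}

func main() {
	// A toy three-stage pipeline: stage data in, simulate, analyze.
	wf := []Job{
		{Name: "analyze", Deps: []string{"simulate"}},
		{Name: "simulate", Deps: []string{"stage-in"}},
		{Name: "stage-in"},
	}
	order := Run(wf, func(name string) { fmt.Println("running", name) })
	fmt.Println(order) // [stage-in simulate analyze]
}
```

A real workflow manager adds what this sketch omits: mapping each ready job onto distributed resources (HPC systems, HTCondor pools, clouds), staging its data, and retrying on failure.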
Ewa Deelman is a Research Professor at the University of Southern California’s Computer Science Department and a Research Director at the USC Information Sciences Institute (ISI). Dr. Deelman’s research interests include the design and exploration of collaborative, distributed scientific environments, with particular emphasis on automation of scientific workflow and management of computing resources, as well as the management of scientific data. Her work involves close collaboration with researchers from a wide spectrum of disciplines. At ISI she leads the Science Automation Technologies group that is responsible for the development of the Pegasus Workflow Management software. In 2007, Dr. Deelman edited a book: “Workflows in e-Science: Scientific Workflows for Grids”, published by Springer. She is also the founder of the annual Workshop on Workflows in Support of Large-Scale Science, which is held in conjunction with the SC conference.
In 1997 Dr. Deelman received her PhD in Computer Science from the Rensselaer Polytechnic Institute.
12:45 – 13:45 Building A Wide Area Platform To Support Converged Storage, Networking and Computation (Micah Beck, NSF)
In every form of digital store-and-forward communication, intermediate forwarding nodes are computers, with attendant memory and processing resources. This has inevitably stimulated efforts to create a wide-area infrastructure that goes beyond simple store-and-forward: a wide-area platform for deploying distributed services that makes more general and varied use of the potential of this collection of increasingly powerful nodes. The desire for a converged infrastructure of this kind has only intensified over the last 30 years, as memory, storage, and processing resources have increased in both density and speed while simultaneously decreasing in cost. Drawing on technical analysis, historical examples, and case studies, I will present an argument for the hypothesis that in order to realize a distributed platform with the kind of convergent generality and deployment scalability that might qualify as “future-defining,” we must build it on a small, common set of simple, generic, and limited abstractions of the local, low-level resources (processing, storage and network) of its intermediate nodes.
Micah Beck began his research career in distributed operating systems at Bell Laboratories and received his Ph.D. in Computer Science from Cornell University (1992) in the area of parallelizing compilers. He then joined the faculty of the Computer Science Department at the University of Tennessee, where he is an Associate Professor working in distributed and high performance computing, networking and storage. He is currently serving the U.S. National Science Foundation as a Program Director in the Office of Advanced Cyberinfrastructure.
12:45 – 13:05 Automated processing chains and security: what is at stake? (Ludovic Billard, Idris)
National computing centres are facing a rapid and profound transformation of the application ecosystem and of their users’ practices. In this evolution, the era of batch / command-line job submission has given way to high-level human-machine interfaces and to the automation of computations through new technologies built on complex application stacks: portals, middleware and continuous integration are all challenges that security teams must take up. In this talk I will highlight the security issues raised at IDRIS by the automation and abstraction of data processing in computing centres, through automated job-submission systems.
After ten years as a system and network administrator in a research laboratory, I began a career change one year ago to specialize in the fascinating field of security. Working at the CNRS national computing centre as deputy to the Chief Information Security Officer is a daily challenge that I share with the CISOs of national and international computing centres.
14:15 – 14:30 EuroHPC (Laurent Crouzet, Chargé de Mission Calcul Intensif et Infrastructures Numériques)
14:30 – 14:45 Genci update (Stéphane Requena, Genci)
14:45 – 15:15 WRF-GO: a workflow manager for low-latency meteo predictions and applications (Antonio Parodi, CIMA)
CIMA Research Foundation, a non-profit organization committed to promoting study, research and development in engineering and environmental sciences, has developed a workflow manager for the execution of high-resolution numerical weather prediction (NWP) on a regional scale and for the exploitation of meteorological products in further simulations (hydro, fire, energy production) supporting civil protection and energy trading. The system has to deliver results within a strict time constraint, exploiting HPC resources granted on a best-effort basis as well as cloud resources. Users include regional meteo services, civil protection agencies, SMEs active on the energy trading market, and insurance and re-insurance companies.
The workflow manager, developed in Go and fully controlled through REST APIs, exploits concurrency to minimize the latency of every step of the workflow and to minimize the uncertainties that come with best-effort resources.
It is currently used to manage operational WRF runs (both with and without data assimilation) and their subsequent processing.
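The latency argument above is that independent downstream products (hydro, fire, energy) can be launched concurrently as soon as the NWP output is available, so elapsed time is bounded by the slowest product rather than the sum of all of them. That pattern maps directly onto Go’s concurrency primitives; the sketch below is illustrative only (the `postProcess` function, product names and timings are invented, not WRF-GO’s actual code or API):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// postProcess fans out the independent downstream products of one
// NWP run concurrently and returns when all of them have finished.
// Total latency is roughly max(product durations), not their sum.
func postProcess(products []string) []string {
	var mu sync.Mutex
	var wg sync.WaitGroup
	var finished []string
	for _, p := range products {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			time.Sleep(10 * time.Millisecond) // stands in for the real computation
			mu.Lock()
			finished = append(finished, p)
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return finished
}

func main() {
	done := postProcess([]string{"hydro", "fire", "energy"})
	fmt.Println(len(done), "products ready") // 3 products ready
}
```

In an operational setting each goroutine would typically wrap a REST call or a job submission to a best-effort HPC or cloud resource, with timeouts and retries around it.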
Antonio Parodi (male), PhD, is Program Director at CIMA Research Foundation. He holds a Master’s degree in Environmental Engineering from the University of Genova, Italy (1998) and was a Research Scholar at MIT – EAPS (2002). His research interests are related to the development of simplified models of dry and moist convection and to the study of the main sources of uncertainty in the high-resolution numerical modelling of deep moist convective processes. He is author or co-author of 51 publications in international peer-reviewed journals. He has been project director of the FP7 projects DRIHMS (www.drihms.eu), DRIHM (www.drihm.eu) and DRIHM2US (www.drihm2us.eu), and of the ESA STEAM project (http://www.cimafoundation.org/cima-foundation/projects/steam.html).
15:15 – 15:45 Climate modelling workflow at IPSL: producing (a lot of) CMIP6 simulations and making sense of them (Sébastien Denvil, IPSL)
The sixth phase of the Coupled Model Intercomparison Project (CMIP6) started in 2015. In many respects, especially the breadth and depth of the scientific questions covered, the sixth phase has been the most challenging one. About 40 modelling groups around the globe engaged with CMIP6. They all had to implement a social, a scientific and a technical workflow up to the task of realising the CMIP6 experimental protocol, composed of 240 experiments; the fifth phase comprised 40 experiments. Data produced during CMIP6 must all respect the Open Data and FAIR principles, and this needs to be taken very seriously. This talk will present the climate modelling workflow that was developed in France to fulfil those goals. At the heart of the workflow is the idea that each of the 240 experiments has been designed with different scientific goals in mind. Different scientific objectives mean different requested data that have to be produced. Failing to have our workflow precisely driven by scientific objectives would impair our capacity to produce all the requested data and to make sense of them. A single simulation can produce up to 25 TB a day, and we could have tens of them running simultaneously over several years.
Sébastien Denvil is a research engineer at IPSL (CNRS) with more than 15 years’ experience in climate simulations, High Performance Computing and e-science infrastructure. He is the head of the IPSL Climate Modeling Centre Data platform and one of the technical chairs of the CMIP-like simulations platform. He chairs the CDNOT (CMIP Data Node Operation Team), is a member of the WGCM Infrastructure Panel (WIP), is a member of the ENES (European Network of Earth System modeling) data task force, and is one of the three European representatives on the ESGF Executive Committee. He is or has been PI in several international projects (ESGF, IS-ENES2, ES-DOC, IS-ENES, METAFOR, G8 ExArch).
16:15 – 16:45 A new paradigm for the automatic generation of workflows in multidisciplinary optimization (Anne Gazaix and François Gallard, IRT Saint Exupéry)
Most industrial organizations in aeronautics are structured around separate competence domains, each an expert in its discipline that has developed advanced disciplinary design capabilities and processes. This approach is rapidly proving insufficient when one seeks to integrate breakthrough technologies for which the interaction effects between disciplines or components are important.
This creates the need to interconnect disciplinary models and processes and to generate integrated multidisciplinary workflows that make it possible to find robust optimal solutions.
To this end, IRT Saint Exupéry has developed methods and tools to quickly generate integrated workflows automatically in distributed and heterogeneous environments, such as HPC-based or CATIA/Windows-based ones. The complexity of these multidisciplinary workflows and of the associated data management, combined with the need for easy re-configurability, led to the development of a new paradigm for programming simulation workflows: one based neither on prescribed workflows nor on prescribed data-flows, as currently existing solutions are, but pushing the automation of simulation-process generation a step further.
GEMS, an MDO (Multidisciplinary Design Optimization) library developed at the IRT, implements this new paradigm, enabling the automatic generation of multidisciplinary optimization and trade-off processes in an efficient way, thanks to generic parallelization strategies.
The methodology and solutions are demonstrated on a large-scale, multi-fidelity multidisciplinary optimization of an Airbus engine pylon.
Dr Anne Gazaix is R&T Manager France in the Flight Physics department at Airbus. She received a Ph.D. in Applied Mathematics from the University Pierre & Marie Curie (Paris 6) in 1989. She worked for 18 years at ONERA in the field of aerodynamics and CFD development, and since 2007 she has been working at Airbus on multidisciplinary optimization. She has been seconded to the IRT since 2015, where she is responsible for the Multidisciplinary Design Optimisation Competence Center and is Head of the MDA-MDO project.
Dr François Gallard graduated from the ISAE engineering school in Toulouse and obtained a PhD for his work at Airbus and CERFACS on robust aerodynamic optimization with aero-elastic effects, entitled “Aircraft shape optimization for mission performance”. He has 8 years of experience in discrete-adjoint-based aerodynamic shape optimization and in Multidisciplinary Design Optimization in an industrial context.
16:45 – 17:15 ActiveEon’s Workflows: from HPC to Data Analytics to Machine Learning (Denis Caromel, ActiveEon)
Born of an INRIA technology transfer, ActiveEon’s solution features workflows driven through portals, APIs and a command-line interface (CLI), together with scheduling/meta-scheduling and a resource manager. After an overview of the workflow capability and portals, several use cases will be presented, from both industry and academia (UK Home Office, Legal & General, SAFRAN Motors, ThalesAleniaSpace, CNES, CEA, INRA, etc.).
For many users, it is key that the exact same workflow can execute both on-premises, on traditional laptops/servers/HPC clusters, and in the clouds, in hybrid mode. As our users need to integrate complex data analytics into their workflows, they have described our capability in these terms: “ActiveEon is the only solution capable of scheduling any Big Data analytics, mono-threaded, multi-threaded, multi-core, parallel and distributed.”
Finally, we will present a brand-new extension based on workflows: Machine Learning Open Studio (ML-OS), an interactive graphical interface that enables developers and data scientists to quickly and easily build, train, and deploy AI/ML models at any scale, together with AutoML capabilities. Relying on ready-to-use workflows, it is UI-based, extensible, programmable, and fully open.
Denis has worked for more than 20 years as a researcher in parallel, concurrent and distributed programming, cloud computing, workload automation, and application and infrastructure automation, at INRIA and the University of Nice Sophia Antipolis. He holds a PhD in computer science in distributed and parallel computing and an MBA from HEC Business School. He has written numerous research papers and books and directed PhD students. Denis has been a keynote speaker at several major conferences (MDM, DAPSYS, CGW, CCGrid 2009 in Shanghai, IEEE ICCP’09, ICPADS 2009 in Hong Kong, Devoxx in Paris, the OpenStack Summit in Santa Clara) and an invited conference speaker on clouds at the Universal Expo in Shanghai. At Inria, Denis led a team of 45 people in which ActiveEon’s initial open-source technology was developed. Denis is an entrepreneur and worked for two years in a startup company in California. He co-founded ActiveEon in 2007 together with his team as a spinoff of Inria in Sophia-Antipolis, creating synergies between the Inria 45-person R&D team and the startup. Since 2011 he has been ActiveEon’s CEO and currently manages an international company of 25 people located in France, the UK, Africa, Bulgaria, and the United States. Denis’s vision and strategy is to reinforce ActiveEon’s position as a major player in digital transformation and automation in the cloud. The ActiveEon team works in agile mode, and Denis prefers his team to work in holacracy.