39th Forum: Emerging and Future Architectures for HPC Worldwide – I/O and Storage for HPC

28 March 2017, at CNRS, rue Michel-Ange, Paris

This ORAP Forum explores two topics. Presentations of international (China, Japan) and European projects will be an opportunity to highlight the place that architectures based on ARM-designed processors are taking in the strategies of major HPC players. Trends in I/O and storage for HPC will then be presented by speakers from research and industry, covering technology as well as concepts and software organisation, up to the applications.
These two topics are in fact closely linked: evolving usage in all domains is making the relationship between computing and data ever more intertwined, as the technical, scientific and project-organisation content of the talks will illustrate. The architecture of a system for large-scale computing or data processing must now address, as a whole, the increasingly crucial functions of data movement and storage, which are no longer confined to a simple specialised subsystem.

Sponsors of the 39th Forum

  • 9:15 – 9:30 Introduction: Alain Refloch
  • 9:30 – 10:15 Design and Implementation for Convergence of HPC and Big Data on Tianhe-2, Yutong Lu, NUDT, China

    Abstract

    Nowadays, advanced computing and visualisation tools enable scientists and engineers to perform virtual experiments and analyse large-scale datasets. Computing-driven and Big Data-driven scientific discovery has become a necessary approach in global environment, life science, nano-materials, high-energy physics and other fields. Furthermore, fast-increasing computing requirements from economic and social development also call for the birth of exascale systems. This talk will discuss the convergence of HPC and Big Data, and how to establish a capable application platform on Tianhe-2.

    Bio

    Yutong Lu is the Director of the National Supercomputing Center in Guangzhou, China. She is a professor at both Sun Yat-sen University and the National University of Defense Technology (NUDT). She received her B.S., M.S., and Ph.D. degrees from NUDT. Her extensive research and development experience has spanned several generations of domestic supercomputers in China. Prof. Lu is the deputy chief designer of the Tianhe project. Her continuing research interests include parallel operating systems (OS), high-speed communication, global file systems, and advanced programming environments.

  • 10:15 – 11:00 Towards the Japanese next flagship supercomputer, Yutaka Ishikawa, Project Leader of the Flagship 2020 project, RIKEN, Japan

    Abstract

    The next flagship supercomputer in Japan, the replacement for the K supercomputer and thus called the post-K supercomputer, is being designed to enter operation in the early 2020s. Its node architecture and interconnect are an ARM HPC extension and a 6-D mesh/torus network, respectively. A three-level hierarchical storage system will be installed alongside the compute nodes. The system software being developed for the post-K supercomputer includes a novel operating system for general-purpose manycore architectures, low-level communication and MPI libraries, and file I/O middleware. After an overview of the post-K architecture, we will present the current status of the system software development.

    Bio

    Yutaka Ishikawa is in charge of developing the post-K supercomputer. Ishikawa received the BS, MS, and PhD degrees in electrical engineering from Keio University. From 1987 to 2001, he was a member of AIST (formerly the Electrotechnical Laboratory), METI. From 1993 to 2001, he was the chief of the Parallel and Distributed System Software Laboratory at the Real World Computing Partnership. He led the development of cluster system software called SCore, which was used in several large PC cluster systems around 2004. From 2002 to 2014, he was a professor at the University of Tokyo. He led a project to design a commodity-based supercomputer called the T2K open supercomputer. As a result, three universities (Tsukuba, Tokyo, and Kyoto) each procured a supercomputer based on that specification in 2008. He was also involved in the design of Oakforest-PACS, the successor of the T2K supercomputers at Tsukuba and Tokyo, whose peak performance is 25 PF.

  • 11:30 – 12:00 Mont-Blanc project: a European collaboration towards exascale HPC, Etienne Walter, ATOS

    Abstract

    The third phase of the Mont-Blanc project started in October 2015; it is coordinated by Bull, the Atos brand for technology products and software, and funded by the European Commission under the Horizon 2020 programme (grant agreement n° 671697). This third phase adopts a co-design approach to ensure that hardware and system innovations are readily translated into benefits for HPC applications. It aims at designing a new high-end HPC node able to deliver a new level of performance/energy ratio when executing real applications. The talk will present the initial objectives of the project, the status of the work undertaken, and the perspectives of the Mont-Blanc project.

    Bio

    Etienne Walter is the coordinator of the third phase of the Mont-Blanc European project. He holds a master's degree in electrical engineering from the French engineering school ESE (Supélec). He worked on mainframes, as a software developer and team manager, and in the telecommunications domain, as a team and project manager. Now working within Bull R&D, he has led, as project manager, several collaborative projects (DataScale, ELCI, and now Mont-Blanc 3) addressing HPC and Big Data.

  • 12:00 – 12:15 Focus on HPC in Europe: programmes and initiatives, Dr. Jean-Philippe Nominé, CEA – Direction des Analyses Stratégiques

    Abstract

    We will give a quick overview of the most recent progress, and the outlook, of the initiatives and programmes linked to the European Commission's HPC strategy:
    – within Horizon 2020, with the technology projects and centres of excellence launched in 2015-2016, which are making progress, while the 2018-2020 Work Programme prepares their follow-up or continuation;
    – within the broader visions put forward since April 2016, notably the European Cloud Infrastructure, the European Open Science Cloud, and the IPCEI.

    Bio

    Jean-Philippe Nominé led various projects and teams at CEA/DAM from 1992 to 2008, developing HPC software for the environments of the TERA machines. From 2008 to 2016 he was in charge of CEA's European HPC collaborations, mainly for the development of PRACE and the associated FP7/H2020 projects, then for the creation of the ETP4HPC industrial association and the development of the HPC PPP with the European Commission. In September 2016 he joined CEA's Direction des Analyses Stratégiques to contribute to the organisation's digital strategy. He is a member of the ORAP Scientific Council.

  • 12:15 – 12:45 HPC simulations and data movement: the challenges for Universe sciences, Jean-Pierre Vilotte, IPGP

    Abstract

    Today, large-scale direct numerical simulation, probabilistic inversion and data assimilation, data-intensive statistical analysis and machine learning methods are increasingly used, in complement with large-scale instruments and observation systems, to address fundamental problems in understanding the dynamics and structure of the Earth system and of planets in their environment, and the evolution of the Universe. They also underpin applications of societal concern, in particular climate and environmental change, natural hazard mitigation, and hydrocarbon and new energy resource discovery and production. New discoveries in these domains require efficient use of new exascale technologies and place a great premium on the co-evolution of these technologies and research practices.

    We shall provide a guided overview of barriers and gaps in high-performance data and computing architectures and infrastructures through some typical data-intensive use cases shared across the astronomy and astrophysics, climate, and solid Earth research communities.

    New application-driven challenges are associated with efficient end-to-end, provenance-based data movement between high-performance computing and statistical data analysis stages along the full path of data use, together with the increasing need for in-situ data analysis and for federating high-performance computing and data analysis platforms to support those applications. This raises challenging issues in terms of data representation and storage, data streaming and vertical reuse throughout the end-to-end workflow.

    Bio

    Jean-Pierre Vilotte is the scientific deputy for high-performance computing and information technology at the Institut des Sciences de l'Univers (CNRS). He is a professor at the Institut de Physique du Globe de Paris, where he works on high-performance computing in geophysics and on statistical data analysis. He contributes to the international Big Data and Exascale Computing initiative and to the Data and e-Infrastructure coordinated research action of the Belmont Forum, where he leads the e-infrastructure action theme.

  • 14:00 – 14:30 Next Generation IO @ CEA computing centres, Jacques-Charles Lafoucrière, CEA/DIF

    Abstract

    One of the challenges for next-generation high-performance computing centres will be the I/O and storage capabilities of exascale-class systems. We will look at the limitations of today's solutions (POSIX parallel file systems) and at how the object storage model can respond to our needs.

    Bio

    Jacques-Charles Lafoucrière is head of the CEA computing centres teams at Bruyères-le-Châtel (design, setup and operations of two large facilities, encompassing three multi-petascale supercomputers, with a strong focus on upfront R&D and co-design with key HPC technology suppliers).
    J-Charles received his Engineering degree from Ecole Centrale Paris and joined CEA teams in 1989.
    He initially worked as a storage administrator and a system developer. Since 2005 he has been an active Lustre developer, developing Lustre Pools and contributing to the design and development of Lustre Hierarchical Storage Management. More recently he has been involved in H2020 projects in the area of advanced and future storage, such as SAGE and BigStorage.

  • 14:30 – 15:00 Convergence of HPC and Big Data: a storage-oriented perspective, Gabriel Antoniu, INRIA

    Abstract

    The ever-increasing computing power available on HPC platforms raises major challenges for the underlying storage systems, both in terms of I/O performance requirements and scalability. Metadata management has been identified as a key factor limiting the performance of POSIX file systems. Due to their lower metadata overhead, blobs (Binary Large Objects) have been proposed as an alternative to traditional POSIX file systems for answering the storage needs of data-intensive HPC applications. Yet, the interest in blobs spans far beyond HPC, as they are also used as a low-level primitive for providing higher-level storage abstractions such as key-value stores or relational databases in cloud environments. Therefore, could blobs be an enabling factor for storage convergence between these two worlds? Our work aims to leverage real-world use cases and applications in order to evaluate the key features and requirements for achieving such blob-based storage convergence. (A minimal illustrative sketch of a blob-style interface follows the speaker bio below.)

    Bio

    Gabriel Antoniu is a Senior Research Scientist at Inria, Rennes. He is the Head of the KerData research team, which focuses on storage and I/O management for Big Data processing on scalable infrastructures (clouds, HPC systems). He received his Ph.D. degree in Computer Science in 2001 from ENS Lyon. He leads several international projects in partnership with Microsoft Research, IBM, Argonne National Laboratory, and the University of Illinois at Urbana-Champaign. He served as Program Chair for the IEEE Cluster conference in 2014 and 2017 and regularly serves as a PC member of major conferences in the area of HPC, cloud computing and Big Data (SC, HPDC, CCGRID, Cluster, Big Data, etc.).
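
    To make the contrast in the abstract above concrete, here is a minimal, purely illustrative Python sketch of a blob-style put/get interface next to ordinary POSIX file I/O. The BlobStore class and its methods are hypothetical, not the API of any system discussed in the talk; the point is only that a flat namespace addressed by opaque identifiers avoids the directory and path-resolution metadata that a POSIX file system must maintain on every access.

      # Illustrative sketch only: a toy blob store versus POSIX file I/O.
      # The BlobStore API below is hypothetical, not a real storage system's interface.

      import os
      import uuid

      # POSIX-style: every write involves path resolution and metadata updates
      # (directories, inodes, permissions) in a shared hierarchical namespace.
      def posix_write(path, data):
          os.makedirs(os.path.dirname(path), exist_ok=True)
          with open(path, "wb") as f:
              f.write(data)

      # Blob-style: a flat namespace where data is addressed by an opaque identifier,
      # so there is far less shared metadata to keep consistent at scale.
      class BlobStore:
          def __init__(self):
              self._objects = {}  # blob_id -> bytes (kept in memory, for illustration)

          def put(self, data):
              blob_id = uuid.uuid4().hex  # no directories, no path resolution
              self._objects[blob_id] = data
              return blob_id

          def get(self, blob_id):
              return self._objects[blob_id]

      if __name__ == "__main__":
          posix_write("/tmp/orap_demo/run01/output.dat", b"simulation results")
          store = BlobStore()
          bid = store.put(b"simulation results")
          assert store.get(bid) == b"simulation results"

    Higher-level abstractions such as key-value stores can then be layered on top of the put/get primitive, which is one reason blobs are attractive as a convergence point between HPC and cloud storage stacks.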

  • 15:30 – 16:00 Managing the transition to Storage System Hierarchy, Jean-Thomas Acquaviva, DDN

    Abstract

    Presently, two factors are affecting the design of new storage architectures. First, the race to exascale is pushing storage systems to a scale unseen in current practice. Second, new technologies, such as Storage Class Memories (SCM), are causing disruptions. In the past, systems could be seen as three well-identified components, each with its own order of magnitude: DRAM – nanoseconds, network – microseconds, storage – milliseconds. Now, this hierarchy and the resulting isolation between components are becoming irrelevant, because storage components are filtering down to faster levels (e.g., SCM latency is in nanoseconds, and NVMe storage is in microseconds). (An illustrative comparison of these latency tiers follows the speaker bio below.)

    Revisiting the legacy software stack and rethinking current architectures is a daunting task which needs a considerable effort from our community. Such a long and probably painful journey cannot be envisioned without insights. This talk will present some research results from DDN Storage aiming to ease the transition to new architectures.

    Bio

    Jean-Thomas Acquaviva, Research Engineer, DDN, France, successively worked for Intel, the University of Versailles and the French Atomic Energy Commission (CEA). He participated in the creation of their joint laboratory, the Exascale Research Centre, where he led the Performance Evaluation Team. Today he is actively contributing to the set-up and development of the DataDirect Networks Advanced Technology Centre in France, where he is still busy with performance, working on the Infinite Memory Engine (IME) project. Jean-Thomas is the author of twenty publications, mostly focused on performance optimization. He holds a PhD from the University of Versailles, carried out at CEA/DAM.
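
    As a rough, back-of-the-envelope illustration of the collapsing latency hierarchy described in the abstract above, the short Python sketch below compares approximate access latencies of the different tiers. The figures are order-of-magnitude assumptions for illustration only, not vendor or DDN measurements.

      # Illustrative only: rough order-of-magnitude latencies (assumed, not measured)
      # showing how SCM and NVMe blur the classic DRAM / network / disk separation.

      LATENCY_NS = {
          "DRAM": 100,                                   # ~nanoseconds
          "Storage Class Memory (SCM)": 500,             # still in the nanosecond range
          "HPC network (one hop)": 1_000,                # ~a microsecond
          "NVMe flash": 100_000,                         # ~a hundred microseconds
          "Disk-based parallel file system": 5_000_000,  # ~milliseconds
      }

      def slowdown_vs_dram(tier):
          """How many times slower a tier is than DRAM."""
          return LATENCY_NS[tier] / LATENCY_NS["DRAM"]

      if __name__ == "__main__":
          for tier, ns in LATENCY_NS.items():
              print(f"{tier:32s} ~{ns:>10,} ns  ({slowdown_vs_dram(tier):>8.0f}x DRAM)")

    With SCM assumed to sit within an order of magnitude of DRAM and NVMe well below the millisecond range, the neat separation between memory, network and storage software layers no longer holds, which is what motivates revisiting the storage stack.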

  • 16:00 – 16:30 SAGE percipient storage (EU H2020 project), Malcolm Muggeridge, Sai Narasimhamurthy – Seagate

    Abstract

    The StorAGe for Exascale Data Centric Computing (SAGE) system, researched and built as part of the SAGE project, aims to implement an infrastructure capable of Big Data/Extreme Computing (BDEC) and High Performance Data Analytics (HPDA), suitable for extreme scales, including exascale and beyond. Increasingly, overlaps occur between Big Data analysis and High Performance Computing (HPC), caused by the proliferation of massive data sources, such as large, dispersed scientific instruments, sensors, and social media data, whose data needs to be processed, analysed and integrated into computational simulations to derive scientific and innovative insights. The SAGE storage system will be capable of efficiently storing and retrieving immense volumes of data at extreme scales, with the added functionality of "percipience", i.e. the ability to accept and perform user-defined computations integral to the storage system. The SAGE system will be built around the Mero object storage software platform and its supporting ecosystem of tools and techniques, which will work together to provide the required functionality and the scaling desired by extreme-scale workflows.

    Bios

    Sai Narasimhamurthy, PhD, is currently a Staff Engineer at Seagate Research (formerly Lead Researcher, Emerging Tech, Xyratex), working on research and development for next-generation storage systems (2010-). He has also actively led and contributed to many European-led HPC and cloud research initiatives on behalf of Xyratex (2010-), and is currently coordinating and providing technical leadership for the SAGE H2020 project. Previously (2005-2009), Sai was CTO and co-founder of 4Blox, Inc., a venture-capital-backed storage infrastructure software company in California addressing IP SAN (Storage Area Network) performance issues with a software-only solution. Sai obtained his PhD from Arizona State University in 2005, specialising in IP-based storage.

    Malcolm Muggeridge is Senior Director, Engineering, responsible for collaborative research at Seagate Systems UK. He joined Seagate through its acquisition of Xyratex in 2014 and had been with Xyratex since its creation as a management buyout from IBM in 1994. Malcolm has more than 39 years of experience in the technology, manufacturing, quality and reliability of disk drives and networked data storage systems, and in recent years in HPC data storage, architecting and managing designs and new technologies across many products.
    More recently he has focused on strategic innovation and business development, research and technology. He is a Steering Board member of ETP4HPC, defining research objectives for the future of HPC within Europe, and is active in the Partnership Board of the cPPP on HPC. He is a member of the UK e-Infrastructure board, with a special interest in HPC. Malcolm has a B.Eng. degree in Electronics from Liverpool University.

    This talk provides an overview of the concepts underlying the SAGE project and of the status of the technical developments.

  • 16:30 – 17:00 NEXTGenIO – non-volatile memory for next generation I/O, Michèle Weiland, EPCC

    Abstract

    The NEXTGenIO project is developing a prototype computing platform that uses on-node non-volatile memory, bridging the latency gap between DRAM and disk. In addition to the hardware that will be built as part of the project, NEXTGenIO is developing the software stack (from OS and runtime support to programming models and tools) that goes hand in hand with this new hardware architecture. This talk will focus on the different use cases that can benefit from this new architecture. (An illustrative sketch of how applications might use such on-node non-volatile memory follows the speaker bio below.)

    Bio

    Dr Michèle Weiland is a Project Manager at EPCC, the supercomputing centre at the University of Edinburgh, and the technical lead for the NEXTGenIO project. Michèle's research interests are in the fields of energy efficiency and novel technologies.
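
    As a purely illustrative sketch (not the NEXTGenIO software stack or API), the Python fragment below shows one common way applications can exploit on-node non-volatile memory: exposing it as a DAX-mounted file system and memory-mapping a persistent region, so that ordinary loads and stores bridge the gap between DRAM and disk. The mount point and file name are assumptions for illustration only.

      # Hypothetical example: persist a small buffer on on-node non-volatile memory
      # via a memory-mapped file. /mnt/pmem is an assumed DAX-mounted NVM device.

      import mmap
      import os

      PMEM_PATH = "/mnt/pmem/checkpoint.bin"   # assumed NVM-backed path
      SIZE = 4 * 1024 * 1024                   # 4 MiB persistent region

      # Create (or reuse) a fixed-size backing file on the NVM device.
      fd = os.open(PMEM_PATH, os.O_CREAT | os.O_RDWR)
      os.ftruncate(fd, SIZE)

      # Map it into the address space: reads and writes are plain memory accesses.
      buf = mmap.mmap(fd, SIZE)
      buf[0:13] = b"checkpoint #1"   # ordinary stores act as persistent I/O
      buf.flush()                    # ask the OS to make the update durable
      buf.close()
      os.close(fd)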