Fully addressing HPC I/O optimization requires moving beyond configuration, application, and library tuning to consider the whole workflow, concentrating on the data movement driven by dependencies within it. Our Octopus project will deliver a high-level workflow description, scheduling, and execution framework built around awareness of data and the memory hierarchy. We believe that scheduling applications while reasoning about their data production, data locality, and data consumption, and accounting for the cost of moving data between tiers of the memory hierarchy, will enable efficient execution of coupled applications in complex workflows. Our use cases cover: in-situ analysis and visualization; simulation with multiple consumers; coupled simulations with multiple concurrent analyses, external data sources, and archiving dependencies; and even sample distribution and shuffling in high-throughput machine learning tasks.
Publication Type: Invited talk
Year of Publication: 2018
Authors: Utz-Uwe Haus (Cray)
Publisher: 3rd IMI-ISM-ZIB MODAL Workshop on Challenges in Real World Data Analytics and High-Performance Optimization
Place published: Tokyo, Japan