Hydra: Brokering Cloud and HPC Resources to Support the Execution of Heterogeneous Workloads at Scale

Scientific discovery increasingly depends on middleware that enables the execution of heterogeneous workflows on heterogeneous platforms. One of the main challenges is to design software components that integrate within the existing ecosystem to enable scale and performance across cloud and high-performance computing (HPC) platforms. Researchers face a varied computing landscape: services available on commercial cloud platforms, data and network capabilities designed specifically for scientific discovery on government-sponsored cloud platforms, and scale and performance on HPC platforms. We present Hydra, a brokering system that operates both within and across cloud and HPC platforms, concurrently acquiring resources from commercial and private clouds and from HPC platforms, and managing the execution of heterogeneous workflow applications on those resources. This paper offers four main contributions: (1) the design of brokering capabilities in the presence of task, platform, resource, and middleware heterogeneity; (2) a reference implementation of that design, Hydra; (3) an experimental characterization of Hydra’s overheads and of its strong and weak scaling with heterogeneous workloads and platforms; and (4) the implementation with Hydra of a workflow that models sea-level rise, together with its scaling on cloud and HPC platforms.

Citation

Alsaadi, Aymen, Shantenu Jha, and Matteo Turilli. "Hydra: Brokering Cloud and HPC Resources to Support the Execution of Heterogeneous Workloads at Scale." Proceedings of the 14th Workshop on AI and Scientific Computing at Scale using Flexible Computing Infrastructures. 2024.

Authors from IE Research Datalab