Paper title
Auto-scaling HTCondor pools using Kubernetes compute resources
Paper authors
Abstract
HTCondor has been very successful in managing globally distributed, pleasantly parallel scientific workloads, especially as part of the Open Science Grid. The HTCondor system design makes it ideal for integrating compute resources provisioned from anywhere, but it has very limited native support for autonomously provisioning resources managed by other solutions. This work presents a solution that allows for autonomous, demand-driven provisioning of Kubernetes-managed resources. A high-level overview of the employed architectures is presented, paired with a description of the setups used in both on-prem and cloud deployments in support of several Open Science Grid communities. The experience suggests that the described solution should be generally suitable for contributing Kubernetes-based resources to existing HTCondor pools.