Paper Title
AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning
Paper Authors
Paper Abstract
Deep neural networks have seen great success in recent years; however, training a deep model is often challenging, as its performance heavily depends on the hyper-parameters used. In addition, finding the optimal hyper-parameter configuration, even with state-of-the-art (SOTA) hyper-parameter optimization (HPO) algorithms, can be time-consuming, requiring multiple training runs over the entire dataset for different candidate sets of hyper-parameters. Our central insight is that using an informative subset of the dataset for the model training runs involved in hyper-parameter optimization allows us to find the optimal hyper-parameter configuration significantly faster. In this work, we propose AUTOMATA, a gradient-based subset selection framework for hyper-parameter tuning. We empirically evaluate the effectiveness of AUTOMATA in hyper-parameter tuning through several experiments on real-world datasets in the text, vision, and tabular domains. Our experiments show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times, with speedups of 3$\times$-30$\times$, while finding hyper-parameters whose performance is comparable to those found using the entire dataset.
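To make the workflow the abstract describes concrete, below is a minimal, self-contained NumPy sketch: select a small subset whose averaged per-example gradients approximate the full-data gradient, run the hyper-parameter search on that subset, then retrain on the full data with the winning configuration. All names here (`per_example_grads`, `select_gradient_matching_subset`) and the greedy residual-matching selection rule are illustrative assumptions in the spirit of gradient-matching methods, not AUTOMATA's actual algorithm or API.

```python
# Illustrative sketch only: a gradient-matching subset heuristic plus a
# subset-based hyper-parameter search, NOT the paper's AUTOMATA algorithm.
import numpy as np

rng = np.random.default_rng(0)

def per_example_grads(w, X, y):
    # Logistic-regression loss gradients, one row per training example.
    z = np.clip(X @ w, -30, 30)          # clip to avoid exp overflow warnings
    p = 1.0 / (1.0 + np.exp(-z))
    return (p - y)[:, None] * X

def select_gradient_matching_subset(X, y, w, k):
    # Greedily pick k examples whose mean gradient best matches the
    # full-data mean gradient (simple residual-reduction heuristic).
    G = per_example_grads(w, X, y)
    target = G.mean(axis=0)
    chosen, residual = [], target.copy()
    for _ in range(k):
        scores = G @ residual
        scores[chosen] = -np.inf         # never pick the same example twice
        chosen.append(int(np.argmax(scores)))
        residual = target - G[chosen].mean(axis=0)
    return np.array(chosen)

def train(X, y, lr, steps=200):
    # Plain full-batch gradient descent on logistic loss.
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * per_example_grads(w, X, y).mean(axis=0)
    return w

def accuracy(w, X, y):
    return float((((X @ w) > 0) == y).mean())

# Toy data: a linearly separable-ish binary classification problem.
n, d = 2000, 20
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n) > 0).astype(float)
X_tr, y_tr = X[:1500], y[:1500]
X_val, y_val = X[1500:], y[1500:]

# Select a 10% subset once, using gradients at a short warm-start point.
w0 = train(X_tr, y_tr, lr=0.1, steps=20)
idx = select_gradient_matching_subset(X_tr, y_tr, w0, k=150)

# Hyper-parameter search (here, just the learning rate) on the subset only.
best_lr, best_acc = None, -1.0
for lr in [0.001, 0.01, 0.1, 1.0]:
    w = train(X_tr[idx], y_tr[idx], lr)
    acc = accuracy(w, X_val, y_val)
    if acc > best_acc:
        best_lr, best_acc = lr, acc

# Final model: retrain on the full data with the chosen hyper-parameter.
w_final = train(X_tr, y_tr, best_lr)
print(f"chosen lr={best_lr}; held-out accuracy after full-data retraining: "
      f"{accuracy(w_final, X_val, y_val):.3f}")
```

The speedup comes from every search trial training on 150 examples instead of 1,500; only the final retraining touches the full dataset. A full implementation would likely re-select the subset periodically as the model parameters change during training; this sketch selects once, at a warm-start point, purely to keep the example short.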