Paper Title
System Identification via Meta-Learning in Linear Time-Varying Environments
Paper Authors
Paper Abstract
System identification is a fundamental problem in reinforcement learning, control theory, and signal processing, and the non-asymptotic analysis of the corresponding sample complexity is challenging and elusive, even for linear time-varying (LTV) systems. To tackle this challenge, we develop an episodic block model for the LTV system, in which the model parameters remain constant within each block but change from block to block. Based on the observation that the model parameters across different blocks are related, we treat each episodic block as a learning task and then run meta-learning over many blocks for system identification, using two steps: offline meta-learning and online adaptation. We carry out a comprehensive non-asymptotic analysis of the performance of meta-learning-based system identification. To deal with the technical challenges rooted in the sample correlation and small sample sizes in each block, we devise a new two-scale martingale small-ball approach for offline meta-learning that handles arbitrary model correlation structure across blocks. We then quantify the finite-time error of online adaptation by leveraging recent advances in linear stochastic approximation with correlated samples.
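The episodic block model and the two-step procedure described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' algorithm: the dynamics `x_{t+1} = A_k x_t + w_t`, the shared matrix `A0`, the Gaussian per-block perturbations, the pooled least-squares meta-estimate, and the constant-step update in `online_adapt` are all illustrative choices that do not appear in the abstract.

```python
import numpy as np

def simulate_block(A, T, rng, noise_std=1.0):
    """Roll out x_{t+1} = A x_t + w_t for T steps, starting from x_0 = 0."""
    n = A.shape[0]
    X = np.zeros((T + 1, n))
    for t in range(T):
        X[t + 1] = A @ X[t] + noise_std * rng.standard_normal(n)
    return X

def pooled_least_squares(blocks):
    """Offline step (assumed form): one least-squares fit over all blocks."""
    past = np.vstack([X[:-1] for X in blocks])
    future = np.vstack([X[1:] for X in blocks])
    # Solve past @ A.T ~= future in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(past, future, rcond=None)
    return A_T.T

def online_adapt(A_init, X, step=0.05):
    """Online step (assumed form): constant-step linear stochastic
    approximation along one correlated trajectory."""
    A = A_init.copy()
    for t in range(len(X) - 1):
        err = X[t + 1] - A @ X[t]
        A += step * np.outer(err, X[t])
    return A

rng = np.random.default_rng(0)

# Hypothetical shared structure: each block's A_k is A0 plus a small perturbation.
A0 = np.array([[0.5, 0.1, 0.0],
               [0.0, 0.5, 0.1],
               [0.0, 0.0, 0.5]])
num_blocks, block_len = 200, 10  # many blocks, few samples per block

blocks = []
for _ in range(num_blocks):
    A_k = A0 + 0.05 * rng.standard_normal(A0.shape)
    blocks.append(simulate_block(A_k, block_len, rng))

# Offline meta-learning: estimate the structure shared across blocks.
A_meta = pooled_least_squares(blocks)

# Online adaptation: refine the meta-estimate on a new, short block.
A_new = A0 + 0.05 * rng.standard_normal(A0.shape)
X_new = simulate_block(A_new, block_len, rng)
A_adapted = online_adapt(A_meta, X_new)

print("meta error vs A0:      ", np.linalg.norm(A_meta - A0))
print("adapted error vs A_new:", np.linalg.norm(A_adapted - A_new))
```

Each block alone has too few samples for a reliable fit, which is why the sketch pools data across blocks before adapting. How much the online step helps depends on the assumed step size and perturbation scale.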