Structure Learning in Graphical Models from Indirect Observations

Paper Title

Structure Learning in Graphical Models from Indirect Observations

Authors

Hang Zhang, Afshin Abdi, Faramarz Fekri

Abstract

This paper considers learning the graphical structure of a $p$-dimensional random vector $X \in \mathbb{R}^p$ using both parametric and nonparametric methods. Unlike previous works, which observe $X$ directly, we consider the indirect observation scenario in which samples $Y$ are collected via a sensing matrix $A \in \mathbb{R}^{d \times p}$ and corrupted by additive noise $W$, i.e., $Y = AX + W$. For the parametric method, we assume $X$ to be Gaussian, i.e., $X \sim N(\mu, \Sigma)$ with $\Sigma \in \mathbb{R}^{p \times p}$. For the first time, we show that the graphical structure can be correctly recovered under an underdetermined sensing system ($d < p$) using insufficient samples ($n < p$). In particular, we show that exact recovery requires dimension $d = \Omega(p^{0.8})$ and sample number $n = \Omega(p^{0.8} \log^3 p)$. For the nonparametric method, we assume a nonparanormal distribution for $X$ rather than a Gaussian one. Under mild conditions, we show that our graph-structure estimator recovers the correct structure. We derive the minimum sample number $n$ and dimension $d$ as $n \gtrsim (\mathrm{deg})^4 \log^4 n$ and $d \gtrsim p + (\mathrm{deg} \cdot \log(d-p))^{\beta/4}$, respectively, where $\mathrm{deg}$ is the size of the largest Markov blanket in the graphical model and $\beta > 0$ is a fixed constant. Additionally, we obtain a non-asymptotic uniform bound on the estimation error of the CDF of $X$ from indirect observations with inexact knowledge of the noise distribution. To the best of our knowledge, this bound is derived here for the first time and may be of independent interest. Numerical experiments on both real-world and synthetic data are provided to confirm the theoretical results.
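
To make the observation model $Y = AX + W$ concrete, the following is a minimal simulation sketch, not the authors' estimator. It assumes a chain-structured Gaussian graphical model, an i.i.d. Gaussian sensing matrix, and isotropic Gaussian noise; the function names (`chain_precision`, `simulate_indirect`, `edges`) and all parameter values are hypothetical choices for illustration only.

```python
import numpy as np


def chain_precision(p, rho=0.4):
    """Tridiagonal precision matrix; its graph is the chain 0-1-...-(p-1).

    Assumed example structure: positive definite for |rho| < 0.5.
    """
    theta = np.eye(p)
    idx = np.arange(p - 1)
    theta[idx, idx + 1] = rho
    theta[idx + 1, idx] = rho
    return theta


def simulate_indirect(n, p, d, noise_std=0.1, seed=0):
    """Draw n indirect samples Y = A X + W with X ~ N(0, Sigma)."""
    rng = np.random.default_rng(seed)
    theta = chain_precision(p)
    sigma = np.linalg.inv(theta)
    sigma = (sigma + sigma.T) / 2           # remove round-off asymmetry
    X = rng.multivariate_normal(np.zeros(p), sigma, size=n)   # shape (n, p)
    A = rng.standard_normal((d, p)) / np.sqrt(d)              # assumed Gaussian sensing matrix
    W = noise_std * rng.standard_normal((n, d))               # assumed isotropic noise
    Y = X @ A.T + W                                           # shape (n, d)
    return Y, A, theta, sigma


def edges(theta, tol=1e-8):
    """Edge set of the graph: nonzero off-diagonal entries of the precision matrix."""
    p = theta.shape[0]
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(theta[i, j]) > tol}


if __name__ == "__main__":
    # d < p: the sensing system is underdetermined, as in the abstract.
    Y, A, theta, sigma = simulate_indirect(n=2000, p=8, d=5)
    print("Y shape:", Y.shape)
    print("true edges:", sorted(edges(theta)))
```

Under these assumptions the covariance of $Y$ is $A \Sigma A^\top + \sigma_W^2 I$, so the graph encoded in $\Sigma^{-1}$ is not visible directly in the observations; recovering it from $Y$ is the problem the paper addresses.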
