Paper Title

A Spherical Hidden Markov Model for Semantics-Rich Human Mobility Modeling

Authors

Zhu, Wanzheng; Zhang, Chao; Yao, Shuochao; Gao, Xiaobin; Han, Jiawei

Abstract

We study the problem of modeling human mobility from semantic trace data, in which each GPS record in a trace is associated with a text message that describes the user's activity. Existing methods fall short in unveiling human movement regularities because they either do not model the text data at all or suffer severely from text sparsity. We propose SHMM, a multi-modal spherical hidden Markov model for semantics-rich human mobility modeling. Under the hidden Markov assumption, SHMM models the generation process of a given trace by jointly considering the observed location, time, and text at each step of the trace. The distinguishing characteristic of SHMM is its text modeling component. We use fixed-size vector representations to encode the semantics of the text messages, and we model the generation of the L2-normalized text embeddings on a unit sphere with the von Mises-Fisher (vMF) distribution. Compared with alternatives such as the multivariate Gaussian, the vMF distribution not only requires far fewer parameters but also better leverages the discriminative power of text embeddings in a directional metric space. Parameter inference for the vMF distribution is non-trivial because it involves the functional inversion of ratios of Bessel functions. We theoretically prove that: 1) the classical Expectation-Maximization algorithm can work with vMF distributions; and 2) although closed-form solutions are hard to obtain for the M-step, Newton's method is guaranteed to converge to the optimal solution at a quadratic rate. We have performed extensive experiments on both synthetic and real-life data. The results on synthetic data verify our theoretical analysis. The results on real-life data demonstrate that SHMM learns meaningful semantics-rich mobility models, outperforms state-of-the-art mobility models for next location prediction, and incurs lower training cost.
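
The "inversion of ratios of Bessel functions" mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard maximum-likelihood condition for a vMF concentration parameter, A_d(kappa) = r_bar, where A_d(kappa) = I_{d/2}(kappa) / I_{d/2-1}(kappa) and r_bar is the length of the (weighted) mean of the unit vectors. The function names `bessel_ratio` and `estimate_kappa`, the continued-fraction evaluation, and the starting point (a commonly used closed-form approximation) are illustrative choices made here, not details taken from the paper, whose M-step may be formulated differently.

```python
import math


def bessel_ratio(dim, kappa, extra_terms=100):
    """Return A_d(kappa) = I_{d/2}(kappa) / I_{d/2-1}(kappa).

    Uses the recurrence R_nu = 1 / (2*nu/kappa + R_{nu+1}), evaluated
    backward from a high order where the ratio is negligible.
    """
    nu = dim / 2.0
    n_terms = extra_terms + int(2 * kappa)  # assumed truncation depth
    ratio = 0.0
    for k in range(n_terms, -1, -1):
        ratio = 1.0 / (2.0 * (nu + k) / kappa + ratio)
    return ratio


def estimate_kappa(r_bar, dim, tol=1e-10, max_iter=50):
    """Solve A_d(kappa) = r_bar for kappa with Newton's method."""
    if not 0.0 < r_bar < 1.0:
        raise ValueError("r_bar must lie strictly between 0 and 1")
    # Assumed starting point: a widely used closed-form approximation.
    kappa = r_bar * (dim - r_bar ** 2) / (1.0 - r_bar ** 2)
    for _ in range(max_iter):
        a = bessel_ratio(dim, kappa)
        # Derivative identity: A'(kappa) = 1 - A^2 - (d - 1) / kappa * A
        slope = 1.0 - a * a - (dim - 1.0) / kappa * a
        new_kappa = kappa - (a - r_bar) / slope
        if new_kappa <= 0.0:
            new_kappa = kappa / 2.0  # simple safeguard to stay positive
        if abs(new_kappa - kappa) < tol * max(1.0, kappa):
            return new_kappa
        kappa = new_kappa
    return kappa


if __name__ == "__main__":
    # Round trip: compute r_bar from a known kappa, then recover kappa.
    true_kappa, dim = 20.0, 50
    r_bar = bessel_ratio(dim, true_kappa)
    print(estimate_kappa(r_bar, dim))
```

In an EM setting, r_bar for each hidden state would be computed from the responsibility-weighted sum of the normalized text embeddings assigned to that state; that bookkeeping is omitted here to keep the sketch focused on the Newton update.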
