Paper Title
Learning Neural Acoustic Fields
Paper Authors
Paper Abstract
Our environment is filled with rich and dynamic acoustic information. When we walk into a cathedral, the reverberations as much as appearance inform us of the sanctuary's wide open space. Similarly, as an object moves around us, we expect the sound emitted to also exhibit this movement. While recent advances in learned implicit functions have led to increasingly higher quality representations of the visual world, there have not been commensurate advances in learning spatial auditory representations. To address this gap, we introduce Neural Acoustic Fields (NAFs), an implicit representation that captures how sounds propagate in a physical scene. By modeling acoustic propagation in a scene as a linear time-invariant system, NAFs learn to continuously map all emitter and listener location pairs to a neural impulse response function that can then be applied to arbitrary sounds. We demonstrate that the continuous nature of NAFs enables us to render spatial acoustics for a listener at an arbitrary location, and can predict sound propagation at novel locations. We further show that the representation learned by NAFs can help improve visual learning with sparse views. Finally, we show that a representation informative of scene structure emerges during the learning of NAFs.
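The rendering step the abstract describes, treating the scene as a linear time-invariant system and applying a queried impulse response to an arbitrary dry sound, reduces to a discrete convolution. Below is a minimal sketch of that idea; the `query_impulse_response` stub and the toy signal values are illustrative stand-ins, not the paper's actual network or data:

```python
def convolve(signal, ir):
    """Discrete convolution: apply an impulse response to a dry signal."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out

def query_impulse_response(emitter, listener):
    # Hypothetical stand-in for the learned NAF: in the paper, a neural
    # network maps an (emitter, listener) location pair to an impulse
    # response. Here we return a fixed toy response for illustration:
    # a direct path followed by two decaying echoes.
    return [1.0, 0.5, 0.25]

dry = [1.0, 0.0, 0.0, 0.0]                      # a unit click
ir = query_impulse_response((0.0, 0.0), (1.0, 2.0))
wet = convolve(dry, ir)                         # spatialized output
```

Because the system is assumed linear and time-invariant, the same queried impulse response can be reused for any source sound at that emitter/listener pair; only the query changes as the listener moves.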