[MM 2024] In Situ 3D Scene Synthesis for Ubiquitous Embodied Interfaces

Haiyan Jiang, Leiyu Song, Dongdong Weng, Zhe Sun, Huiying Li, Xiaonuo Dongye, Zhenliang Zhang

October, 2024

Abstract

Virtual reality provides access to immersive virtual environments anytime and anywhere, allowing us to experience and interact with virtual worlds in various fields like entertainment, training, and education. However, users immersed in virtual scenes remain physically connected to their real-world surroundings, which can pose safety and immersion challenges. Although virtual scene synthesis has attracted widespread attention, many popular methods are limited to generating purely virtual scenes independent of physical environments or simply mapping physical objects as obstacles. To this end, we propose a scene agent that synthesizes situated 3D virtual scenes as a kind of ubiquitous embodied interface in VR for users. The scene agent synthesizes scenes by perceiving the user’s physical environment as well as inferring the user’s demands. The synthesized scenes maintain the affordances of the physical environment, enabling immersive users to interact with the physical environment and improving the user’s sense of security. Meanwhile, the synthesized scenes maintain the style described by the user, improving the user’s immersion. The comparison results show that the proposed scene agent can synthesize virtual scenes with better affordance maintenance, scene diversity, style maintenance, and 3D intersection over union compared to baselines. To the best of our knowledge, this is the first work that achieves in situ scene synthesis with virtual-real affordance consistency and user demand.

Type

Conference paper

Publication

In 2024 ACM International Conference on Multimedia (MM)

Symmetrical Reality

Zhenliang Zhang

Research Scientist of AI

My research interests include wearable computing, machine learning, cognitive reasoning, and mixed/virtual reality.