[MM 2024] In Situ 3D Scene Synthesis for Ubiquitous Embodied Interfaces

Abstract

Virtual reality provides access to immersive virtual environments anytime and anywhere, allowing users to experience and interact with virtual worlds in fields such as entertainment, training, and education. However, users immersed in virtual scenes remain physically connected to their real-world surroundings, which can pose safety and immersion challenges. Although virtual scene synthesis has attracted widespread attention, many popular methods are limited to generating purely virtual scenes independent of the physical environment or to simply mapping physical objects as obstacles. To this end, we propose a scene agent that synthesizes situated 3D virtual scenes as a kind of ubiquitous embodied interface in VR. The scene agent synthesizes scenes by perceiving the user’s physical environment and inferring the user’s demands. The synthesized scenes preserve the affordances of the physical environment, enabling immersed users to interact with their physical surroundings and improving their sense of security. Meanwhile, the synthesized scenes maintain the style described by the user, improving immersion. Comparison results show that the proposed scene agent synthesizes virtual scenes with better affordance maintenance, scene diversity, style maintenance, and 3D intersection over union than the baselines. To the best of our knowledge, this is the first work to achieve in situ scene synthesis that maintains virtual-real affordance consistency while meeting user demands.

Publication
In 2024 ACM International Conference on Multimedia (MM)
Zhenliang Zhang
Research Scientist in AI

My research interests include wearable computing, machine learning, cognitive reasoning, and mixed/virtual reality.