Robotic Task Ambiguity Resolution via Natural Language Interaction

Abstract

Language-conditioned robotic policies allow users to specify tasks using natural language. While much research has focused on improving the action prediction of language-conditioned policies, reasoning about task descriptions has been largely overlooked. Ambiguous task descriptions often lead to downstream policy failures due to misinterpretation by the robotic agent. To address this challenge, we introduce AmbResVLM, a novel method that grounds language goals in the observed scene and explicitly reasons about task ambiguity. We extensively evaluate its effectiveness in both simulated and real-world domains, demonstrating superior task ambiguity detection and resolution compared to recent state-of-the-art methods. Finally, real robot experiments show that our model improves the performance of downstream robot policies, increasing the average success rate from 69.6% to 97.1%.

Code

For academic use, a PyTorch-based software implementation of this project is available in our GitHub repository and is released under the GPLv3 license. For commercial use, please contact the authors.