Towards Interactive UAV Navigation: Introducing a Novel Benchmark for Destination-Oriented Flight with Human Assistance

Beihang University
Conference name and year

*Indicates Equal Contribution

Our NEZHA platform, built on AirSim and UE4, offers a robust and flexible simulation environment that supports data collection, algorithm verification, and online rendering.

Abstract

Recent advancements in natural language-guided navigation have combined multimodal large language models with robotic tasks, particularly benefiting autonomous vehicles, robots, and drones. However, research in aerial drone navigation remains underdeveloped, with existing datasets suffering from unrealistic environments and limited scalability.
To address these issues, we introduce Nezha Platform and Nezha Dataset for aerial vision-and-language navigation (VLN) tasks. Developed with Unreal Engine 4 and Microsoft AirSim, Nezha Platform offers enhanced realism, versatile control, and efficient data pipelines. Our dataset features 14 diverse environments and 90 target objects, providing high-precision waypoint trajectories and multimodal sensor data. Our task requires agents to navigate using end-point environment descriptions and orientation.
Our contributions include (1) a scalable platform for efficient data collection, (2) a comprehensive benchmark with extensive evaluation, and (3) a collaborative large-small model approach for accurate and efficient navigation. With 10K trajectories and 96 target objects, our dataset significantly advances aerial VLN research.
