A list of research papers, models, datasets, and other resources on Vision-Language-Action/Navigation (VLA/VLN) models for UAVs.

TheBrainLab/Awesome-VLA-UAVs

Awesome-VLA-UAVs

A list of research papers and other related resources on Vision-Language-Action/Navigation (VLA/VLN) models for UAVs.

Contributions are welcome!

2025

  • [Review] UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility (Information Fusion 2025.3)[paper][code]

  • OpenFly: A Comprehensive Platform for Aerial Vision-Language Navigation (arXiv 2025.7)[paper][code]

  • TypeFly: Low-Latency Drone Planning With Large Language Models (IEEE Transactions on Mobile Computing 2025.9)[paper][code]

  • Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology (OpenUAV) (ICLR 2025)[paper][code]

  • UAV-Flow Colosseo: A Real-World Benchmark for Flying-on-a-Word UAV Imitation Learning (arXiv 2025.5)[paper][code]

  • UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents (ACM MM Dataset Track 2025)[paper][code]

  • AeroDuo: Aerial Duo for UAV-based Vision and Language Navigation (ACM MM 2025)[paper][[code]]

  • CityNav: A Large-Scale Dataset for Real-World Aerial Navigation (ICCV 2025)[paper][code]

  • CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory (ACL 2025)[paper][code]

  • VLM-Nav: Mapless UAV-Navigation Using Monocular Vision Driven by Vision-Language Model (SSRN)[paper][code]

  • Learning Fine-Grained Alignment for Aerial Vision-Dialog Navigation (AAAI 2025)[paper][code]

  • UAV-VLA: Vision-Language-Action System for Large Scale Aerial Mission Generation (Int. Conf. on Human Robot Interaction, HRI 2025)[paper][code]

  • General-Purpose Aerial Intelligent Agents Empowered by Large Language Models (arXiv 2025.5)[paper][[code]]

  • RAVEN: Resilient Aerial Navigation via Open-Set Semantic Memory and Behavior Adaptation (arXiv 2025.9; Best Paper Finalist at the IROS 2025 Active Perception Workshop)[paper][project]

2024

  • [Review] Large Language Models for UAVs: Current State and Pathways to the Future (IEEE Open Journal of Vehicular Technology 2024.8) [paper][[code]]

  • AeroVerse: UAV-Agent Benchmark Suite for Simulating, Pre-training, Finetuning, and Evaluating Aerospace Embodied World Models (arXiv 2024.8)[paper][[code]]

  • TPML: Task Planning for Multi-UAV System with Large Language Models (2024 IEEE 18th International Conference on Control & Automation (ICCA))[paper][code]

  • EAI-SIM: An Open-Source Embodied AI Simulation Framework with Large Language Models (2024 IEEE 18th International Conference on Control & Automation (ICCA))[paper][code]

  • Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning (STMR) (Submitted to ICRA 2025)[paper][[code]]

2023

  • AerialVLN: Vision-and-Language Navigation for UAVs (ICCV 2023)[paper][code]

System 1 + System 2 Thinking

  • Visual Agents as Fast and Slow Thinkers (ICLR 2025)[paper][code]

  • Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces (arXiv 2025)[paper][[code]]

  • Helix: A "System 1, System 2" VLA for Whole Upper Body Control (figure.ai)[link]

  • DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models (Conference on Robot Learning (CoRL) 2024)[paper][project]

  • Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models (Physical Intelligence (π)) (ICML 2025)[paper][blog]

  • HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers (Conference on Robot Learning (CoRL) 2024)[paper][[code]]

  • GR00T N1: An Open Foundation Model for Generalist Humanoid Robots (arXiv 2025.3)[paper][code][tech]

  • GR00T N1.5: An Improved Open Foundation Model for Generalist Humanoid Robots [tech][code][blog]

Related Awesome lists
