Hub-Tian/UAVs_Meet_LLMs


🚁 UAVs Meet LLMs 🚀


Where Unmanned Aerial Vehicles Take Off and Large Language Models Unfold!


🏡About

This repository accompanies the work:
UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility

This is an active repository; watch it to follow the latest advances.
If you find it useful, please star ⭐ this repo and cite the paper.


🔥 News

If you have any questions or suggestions, please feel free to open an issue or contact us via email.


Introduction

This repository accompanies our work on "UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility".
Here, we primarily store the tables referenced in the survey paper. These tables cover:

  • Summarization of typical LLMs, VLMs, and VFMs
  • Awesome works on Foundation Model-based UAV Systems
  • UAV-oriented Datasets across multiple application domains

Note: The goal is to provide a structured, easy-to-navigate resource for researchers interested in the intersection of UAVs and Large Language Models.




Preliminaries for UAVs

Typical configurations of UAV

| Category | Characteristics | Advantages | Disadvantages |
|---|---|---|---|
| Fixed-wing UAV | Fixed wings generate lift with forward motion. | High speed, long endurance, stable flight. | Cannot hover, high demands for takeoff/landing areas. |
| Multirotor UAV | Multiple rotors provide lift and control. | Low cost, easy operation, capable of VTOL and hovering. | Limited flight time, low speed, small payload capacity. |
| Unmanned Helicopter | Single or dual rotors allow vertical take-off and hovering. | High payload capacity, good wind resistance, long endurance, VTOL. | Complex structure, higher maintenance cost, slower than fixed-wing UAVs. |
| Hybrid UAV | Combines fixed-wing and multirotor capabilities. | Flexible missions, long endurance, VTOL. | Complex mechanisms, higher cost. |
| Flapping-wing UAV | Uses clap-and-fling mechanism for flight. | Low noise, high propulsion efficiency, high maneuverability. | Complex analysis and control, limited payload capacity. |
| Unmanned Airship | Aerostat aircraft with gasbag for lift. | Low cost, low noise. | Low speed, low maneuverability, highly affected by wind. |

UAV Swarm Path Planning Method

| Category | Examples | References |
|---|---|---|
| Intelligent Optimization Algorithm | Ant Colony Algorithm | Ref |
| | Genetic Algorithm | Ref |
| | Simulated Annealing Algorithm | Ref |
| Mathematical Programming | Mixed Integer Linear Programming | Ref |
| | Nonlinear Programming | Ref |
| AI-Based Method | Deep Learning | Ref |
| | Reinforcement Learning | Ref |

UAV Swarm Task Allocation

| Category | Examples | References |
|---|---|---|
| Heuristic Algorithm | Particle Swarm Optimization Algorithm | Ref |
| | Genetic Algorithm | Ref |
| | Simulated Annealing Algorithm | Ref |
| AI-Based Algorithm | Reinforcement Learning | Ref |
| | Artificial Neural Network | Ref |
| Mathematical Programming Methods | Mixed Integer Programming | Ref |
| Market Mechanism-Based Method | Auction-Based Algorithm | Ref |
| | Consensus-Based Bundle Algorithm | Ref |
| | Contract Net Protocol | Ref |
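
To make the auction-based entry above concrete, here is a minimal sketch of a sequential single-item auction, one common form of auction-based allocation. It is illustrative only and is not taken from any of the referenced works. The names `auction_allocate`, `uav_positions`, and `task_positions` are hypothetical, and using straight-line distance as the bid is an assumption made for simplicity.

```python
import math


def auction_allocate(uav_positions, task_positions):
    """Sequential single-item auction (illustrative sketch).

    In each round every UAV bids on every unassigned task. A bid is the
    straight-line distance from the UAV's current position (its start, or
    the location of the last task it won) to the task. The lowest bid
    overall wins that task. Requires at least one UAV.

    Returns a dict mapping UAV index -> list of task indices in visit order.
    """
    current = list(uav_positions)  # position of each UAV after its won tasks
    assignment = {u: [] for u in range(len(uav_positions))}
    unassigned = set(range(len(task_positions)))

    while unassigned:
        # Gather all (bid, uav, task) triples and award the cheapest one.
        _, winner, task = min(
            (math.dist(current[u], task_positions[t]), u, t)
            for u in range(len(current))
            for t in unassigned
        )
        assignment[winner].append(task)
        current[winner] = task_positions[task]
        unassigned.remove(task)
    return assignment
```

With two UAVs at `(0, 0)` and `(10, 0)` and tasks at `(1, 0)`, `(9, 0)`, and `(2, 0)`, the first UAV wins the two nearby tasks and the second UAV wins the remaining one. Ties are broken by the lower UAV index, which is a choice of this sketch rather than a property of auction methods in general.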

UAV Swarm Communication Architecture

| Category | References |
|---|---|
| Infrastructure-Based Architectures | Ref |
| Flying Ad-hoc Network (FANET) Architecture | Ref |

UAV Swarm Formation Control Algorithm

| Category | Example | References |
|---|---|---|
| Centralized Control | Virtual Structure | Ref |
| | Leader-Follower Approaches | Ref |
| Decentralized Control | Decentralized Model Prediction Method | Ref |
| Distributed Control | Behavior Method | Ref |
| | Consistency Method | Ref |
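
As a rough illustration of the leader-follower entry above, the sketch below moves each follower toward a slot defined as the leader position plus a fixed offset, using a proportional velocity command. It is not drawn from the referenced works. The 2D point-mass model, the gain `k`, the time step `dt`, and the function names `follower_step` and `simulate` are all assumptions made for this example.

```python
def follower_step(follower, leader, offset, k=1.0, dt=0.1):
    """Move one follower a single time step toward its formation slot.

    The slot is the leader position plus a fixed offset. The velocity
    command is proportional (gain k) to the follower's error from the slot.
    """
    target = (leader[0] + offset[0], leader[1] + offset[1])
    vx = k * (target[0] - follower[0])
    vy = k * (target[1] - follower[1])
    return (follower[0] + vx * dt, follower[1] + vy * dt)


def simulate(leader, followers, offsets, steps=200):
    """Step all followers toward a stationary leader; return final positions."""
    for _ in range(steps):
        followers = [
            follower_step(f, leader, o) for f, o in zip(followers, offsets)
        ]
    return followers
```

With the default `k` and `dt`, each step shrinks a follower's error by a factor of 0.9, so after 200 steps the followers sit essentially on their slots. A moving leader, collision avoidance, and communication delays are deliberately left out.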

Summarization of LLMs, VLMs, and VFMs

LLMs

| Subcategory | Model Name | Institution / Author |
|---|---|---|
| General | GPT-3, GPT-3.5, GPT-4 | OpenAI |
| | Claude 2, Claude 3 | Anthropic |
| | Mistral series | Mistral AI |
| | PaLM series, Gemini series | Google Research |
| | LLaMA, LLaMA2, LLaMA3 | Meta AI |
| | Vicuna | Vicuna Team |
| | Qwen series | Qwen Team, Alibaba Group |
| | InternLM | Shanghai AI Laboratory |
| | BuboGPT | Bytedance |
| | ChatGLM | Zhipu AI |
| | DeepSeek series | DeepSeek |

VLMs

| Subcategory | Model Name | Institution / Author |
|---|---|---|
| General | GPT-4V, GPT-4o, GPT-4o mini, GPT o1-preview | OpenAI |
| | Claude 3 Opus, Claude 3.5 Sonnet | Anthropic |
| | Step-2 | Jieyue Xingchen |
| | LLaVA, LLaVA-1.5, LLaVA-NeXT | Liu et al. |
| | MoE-LLaVA | Lin et al. |
| | LLaVA-CoT | Xu et al. |
| | Flamingo | Alayrac et al. |
| | BLIP | Li et al. |
| | BLIP-2 | Li et al. |
| | InstructBLIP | Dai et al. |
| Video Understanding | LLaMA-VID | Li et al. |
| | IG-VLM | Kim et al. |
| | Video-ChatGPT | Maaz et al. |
| | VideoTree | Wang et al. |
| Visual Reasoning | X-VLM | Zeng et al. |
| | Chameleon | Lu et al. |
| | HYDRA | Ke et al. |
| | VISPROG | PRIOR @ Allen Institute for AI |

VFMs

| Subcategory | Model Name | Institution / Author |
|---|---|---|
| General | CLIP | OpenAI |
| | FILIP | Yao et al. |
| | RegionCLIP | Microsoft Research |
| | EVA-CLIP | Sun et al. |
| Object Detection | GLIP | Microsoft Research |
| | DINO | Zhang et al. |
| | Grounding-DINO | Liu et al. |
| | DINOv2 | Meta AI Research |
| | AM-RADIO | NVIDIA |
| | DINO-WM | Zhou et al. |
| | YOLO-World | Cheng et al. |
| Image Segmentation | CLIPSeg | Lüdecke and Ecker |
| | SAM | Meta AI Research, FAIR |
| | Embodied-SAM | Xu et al. |
| | Point-SAM | Zhou et al. |
| | Open-Vocabulary SAM | Yuan et al. |
| | TAP | Pan et al. |
| | EfficientSAM | Xiong et al. |
| | MobileSAM | Zhang et al. |
| | SAM 2 | Meta AI Research, FAIR |
| | SAMURAI | University of Washington |
| | SegGPT | Wang et al. |
| | Osprey | Yuan et al. |
| | SEEM | Zou et al. |
| | Seal | Liu et al. |
| | LISA | Lai et al. |
| Depth Estimation | ZoeDepth | Bhat et al. |
| | ScaleDepth | Zhu et al. |
| | Depth Anything | Yang et al. |
| | Depth Anything V2 | Yang et al. |
| | Depth Pro | Apple |

General-domain Datasets for UAV

Environmental Perception

Name Year Types Amount
AirFisheye 2024 Fisheye image, Depth image, Point cloud, IMU Over 26,000 fisheye images in total. Data is collected at a rate of 10 frames per second.
SynDrone 2023 Image, Depth image, Point cloud Contains 72,000 annotation samples, providing 28 types of pixel-level and object-level annotations.
WildUAV 2022 Image, Video, Depth image, Metadata Mapping images are provided as 24-bit PNG files, with the resolution of 5280x3956. Video images are provided as JPG files at a resolution of 3840x2160. There are 16 possible class labels detailed.

Event Recognition

Name Year Types Amount
CapERA 2023 Video, Text 2,864 videos, each with 5 descriptions, totaling 14,320 texts. Each video lasts 5 seconds and is captured at 30 frames/second with a resolution of 640 × 640 pixels.
ERA 2020 Video A total of 2,864 videos across 25 categories, including disaster events, traffic accidents, and sports competitions. Each video is 5 seconds long at 24 frames/second.
VIRAT 2016 Video 25 hours of static ground video and 4 hours of dynamic aerial video. There are 23 event types involved.

Object Tracking

Name Year Types Amount
WebUAV-3M 2024 Video, Text, Audio 4,500 videos totaling more than 3.3 million frames with 223 target categories, providing natural language and audio descriptions.
UAVDark135 2022 Video 135 video sequences with over 125,000 manually annotated frames.
DUT-VTUAV 2022 RGB-T Image Nearly 1.7 million well-aligned visible-thermal (RGB-T) image pairs in 500 sequences, covering 13 sub-classes and 15 scenes across 2 cities.
TNL2K 2022 Video, Infrared video, Text 2,000 video sequences, comprising 1,244,340 frames and 663 words.
PRAI-1581 2020 Image 39,461 images of 1581 person identities.
VOT-ST2020/VOT-RT2020 2020 Video 1,000 sequences, each varying in length, with an average length of approximately 100 frames.
VOT-LT2020 2020 Video 50 sequences, each with a length of approximately 40,000 frames.
VOT-RGBT2020 2020 Video, Infrared video 50 sequences, each with a length of approximately 40,000 frames.
VOT-RGBD2020 2020 Video, Depth image 80 sequences with a total of approximately 101,956 frames.
GOT-10K 2019 Image, Video 420 video clips belonging to 84 object categories and 31 motion categories.
DTB70 2017 Video 70 video sequences, each consisting of multiple video frames, with each frame containing an RGB image at a resolution of 1280x720 pixels.
Stanford Drone 2016 Video 19,000+ target tracks, containing 6 types of targets, about 20,000 target interactions, 40,000 target interactions with the environment, covering 100+ scenes in the university campus.
COWC 2016 Image 32,716 unique vehicles and 58,247 non-vehicle targets were labeled. Covering 6 different geographical areas.

Action Recognition

Name Year Types Amount
Aeriform in-action 2023 Video 32 videos, 13 types of action, 55,477 frames, 40,000 callouts.
MEVA 2021 Video, Infrared video, GPS, Point cloud Total 9,300 hours of video, 144 hours of activity notes, 37 activity types, over 2.7 million GPS track points.
UAV-Human 2021 Video, Night-vision video, Fisheye video, Depth video, Infrared video, Skeleton 67,428 videos (155 types of actions, 119 subjects), 22,476 frames of annotated key points (17 key points), 41,290 frames for person re-identification (1,144 identities), 22,263 frames for attribute recognition (such as gender, hat, backpack, etc.).
MOD20 2020 Video 20 types of action, 2,324 videos, 503,086 frames.
NEC-DRONE 2020 Video 5,250 videos containing 256 minutes of action videos involving 19 actors and 16 action categories.
Drone-Action 2019 Video 240 HD videos, 66,919 frames, 13 types of action.
UAV-GESTURE 2019 Video 119 videos, 37,151 frames, 13 types of gestures, 10 actors.

Navigation and Localization

Name Year Types Amount
CityNav 2024 Image, Text 32,000 natural language descriptions and companion tracks.
CNER-UAV 2024 Text 12,000 labeled samples containing 5 types of address labels (e.g., building, unit, floor, room, etc.).
AerialVLN 2023 Simulator path, Text 25 city-level scenes, 8,446 paths, 3 natural language descriptions per path, totaling 25,338 instructions.
DenseUAV 2023 Image Training: 6,768 UAV images, 13,536 satellite images. Test: 2,331 UAV query images, 4,662 satellite images.
map2seq 2022 Image, Text, Map path 29,641 panoramic images, 7,672 navigation instruction texts.
VIGOR 2021 Image 90,618 aerial images, 238,696 street panorama images.
University-1652 2020 Image 1,652 university buildings, 72 universities, 50,218 training images, 37,855 UAV query images, 701 satellite query images, and 21,099 ordinary & 5,580 street view images.

Domain-specific Datasets for UAV

Transportation

Name Year Types Amount
TrafficNight 2024 Image, Infrared Image, Video, Infrared Video, Map The dataset consists of 2,200 pairs of annotated thermal infrared and sRGB image data, and video data from 7 traffic scenes, with a total duration of approximately 240 minutes. Each scene includes a high-precision map, providing a detailed layout and topological information.
VisDrone 2022 Video, Image 263 videos, 179,264 frames. 10,209 still images. More than 2,500,000 object instance annotations. The data covers 14 different cities, covering a wide range of weather and light conditions.
ITCVD 2020 Image A total of 173 aerial images were collected, including 135 in the training set with 23,543 vehicles and 38 in the test set with 5,545 vehicles. There is 60% regional overlap between the images, and there is no overlap between the training set and the test set.
UAVid 2020 Image, Video 30 videos, 300 images, 8 semantic category annotations.
AU-AIR 2020 Video, GPS, Altitude, IMU, Speed 32,823 frames of video, 1920x1080 resolution, 30 FPS, divided into 30,000 training validation samples and 2,823 test samples. The total duration of the 8 videos is about 2 hours, with a total of 132,034 instances, distributed in 8 categories.
iSAID 2020 Image Total images: 2,806. Total number of instances: 655,451. Test set: 935 images (labels not publicly released; used by the evaluation server).
CARPK 2018 Image 1,448 images, approx. 89,777 vehicles, providing box annotations.
highD 2018 Video, Trajectory 16.5 hours, 110,000 vehicles, 5,600 lane changes, 45,000 km, totaling approximately 447 hours of vehicle travel data; 4 predefined driving behavior labels.
UAVDT 2018 Video, Weather, Altitude, Camera angle 100 videos, about 80,000 frames, 30 frames per second, containing 841,500 target boxes, covering 2,700 targets.
CADP 2016 Video A total of 5.24 hours, 1,416 traffic accident clips, 205 videos with full spatio-temporal annotations.
VEDAI 2016 Image 1,210 images (1024 × 1024 and 512 × 512 pixels), 9 types of vehicles, containing about 6,650 targets in total.

Remote Sensing

Name Year Types Amount
RET-3 2024 Image, Text Approximately 13,000 samples. Including RSICD, RSITMD and UCM.
DET-10 2024 Image In the object detection dataset, the number of objects per image ranges from 1 to 70, totaling about 80,000 samples.
SEG-4 2024 Image The segmentation dataset covers different regions and resolutions, totaling about 72,000 samples.
DIOR 2020 Image 23,463 images, containing 192,472 target instances, covering 20 categories, including aircraft, vehicles, ships, bridges, etc., each category contains about 1,200 instances.
TGRS-HRRSD 2019 Image Total images: 21,761. 13 categories, including aircraft, vehicles, bridges, etc. Approximately 53,000 targets in total.
xView 2018 Image More than 1 million objects in 60 categories, including vehicles, buildings, facilities, and boats, grouped into seven parent categories and several sub-categories.
DOTA 2018 Image 2,806 images, 188,282 targets, 15 categories.
RSICD 2018 Image, Text 10,921 images, 54,605 descriptive sentences.
HRSC2016 2017 Image 3,433 instances, totaling 1,061 images, including 70 pure ocean images and 991 images containing mixed land-sea areas. 2,876 marked vessel targets. 610 unlabeled images.
RSOD 2017 Image Contains 4 types of targets (tank, aircraft, overpass, playground) with 12,000 positive samples and 48,000 negative samples.
NWPU-RESISC45 2017 Image A total of 31,500 images, covering 45 scene categories, 700 images per category, resolution 256 × 256 pixels, spatial resolution from 0.2m to 30m.
NWPU VHR-10 2014 Image 800 high-resolution images, of which 650 contain targets and 150 are background images, covering 10 categories (such as aircraft, ships, bridges, etc.), totaling more than 3,000 targets.

Agriculture

Name Year Types Amount
WEED-2C 2024 Image Contains 4,129 labeled samples covering 2 weed species.
CoFly-WeedDB 2023 Image, Health data 201 aerial images showing 3 weed types that disturb a row crop (cotton), with corresponding annotated images.
Avo-AirDB 2022 Image 984 high-resolution RGB images (5472 × 3648 pixels), 93 of which have detailed polygonal annotations, divided into 3 to 4 categories (small, medium, large, and background).

Industry

Name Year Types Amount
UAPD 2021 Image There are 2,401 crack images in the original data and 4,479 crack images after data augmentation.
InsPLAD 2023 Image 10,607 UAV images containing 17 classes of power assets with a total of 28,933 labeled instances, and defect labels for 5 assets with a total of 402 defect samples classified into 6 defect types.

Emergency Response

Name Year Types Amount
AFID 2023 Image A total of 816 images with resolutions of 2720 × 1536 and 2560 × 1440. Contains 8 semantic segmentation categories.
FloodNet 2021 Image, Text The whole dataset has 2,343 images, divided into training (~60%), validation (~20%), and test (~20%) sets. The semantic segmentation labels include: Background, Building Flooded, Building Non-Flooded, Road Flooded, Road Non-Flooded, Water, Tree, Vehicle, Pool, Grass.
Aerial SAR 2020 Image 2,000 images with 30,000 action instances covering multiple human behaviors.

Military

Name Year Types Amount
MOCO 2024 Image, Text 7,449 images, 37,245 captions.

Wildlife

Name Year Types Amount
WAID 2023 Image 14,375 UAV images covering 6 species of wildlife and multiple environment types.

Drone Detection

Name Year Types Amount
DroneRFa 2024 RF signal Includes 24 types of UAV signals (9 collected outdoors and 15 collected indoors) and 1 type of background signal, covering 3 ISM frequency bands.
IDTDSAT 2019 Infrared image, Trajectory Infrared image sequence of 22 segments, total number of frames 16,177, total number of targets 16,944, 30 tracks; image resolution 256 × 256 pixels.
DTDAOTRES 2019 Radar 15 segments of 8.76 GB.
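
Several "Amount" entries in the dataset tables above state both a total and the parts it is made of. As a quick sanity check, the sketch below recomputes a few of those totals from the figures quoted in the tables. The names `CHECKS` and `failed_checks` are hypothetical and introduced only for this example; the numbers themselves are copied from the tables.

```python
# Each entry: (dataset, what is being recomputed, computed value, stated value)
CHECKS = [
    ("CapERA", "2,864 videos x 5 descriptions", 2864 * 5, 14320),
    ("AerialVLN", "8,446 paths x 3 descriptions", 8446 * 3, 25338),
    ("ITCVD", "135 training + 38 test images", 135 + 38, 173),
    ("AU-AIR", "30,000 train/val + 2,823 test samples", 30000 + 2823, 32823),
    ("NWPU-RESISC45", "45 categories x 700 images", 45 * 700, 31500),
    ("NWPU VHR-10", "650 target + 150 background images", 650 + 150, 800),
    ("HRSC2016", "70 ocean + 991 land-sea images", 70 + 991, 1061),
]


def failed_checks(checks):
    """Return the names of datasets whose recomputed total differs from the stated one."""
    return [name for name, _, computed, stated in checks if computed != stated]
```

Calling `failed_checks(CHECKS)` returns an empty list when every quoted total matches its parts, which is the case for the figures listed here.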

Open Platforms for UAVs

Name Publication
AirSim AirSim: High-fidelity visual and physical simulation for autonomous vehicles
Carla CARLA: An open urban driving simulator
NVIDIA Isaac Sim
AerialVLN Simulator AerialVLN: Vision-and-language navigation for UAVs
Embodied City EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment

Advances of FM-based UAV Systems in Various Tasks

Visual Perception

Title Type Publication Code
Li et al. (A Benchmark for UAV-View Natural Language-Guided Tracking) VFM MDPI GitHub
Ma et al. (Applying Unsupervised Semantic Segmentation to High-Resolution UAV Imagery for Enhanced Road Scene Parsing) VFM Arxiv -
Limberg et al. (Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery) VFM+VLM Arxiv -
Kim et al. (Weather-Aware Drone-View Object Detection Via Environmental Context Understanding) VLM+VFM ICIP 2024 -
LGNet (Shooting condition insensitive unmanned aerial vehicle object detection) VFM Expert Systems with Applications -
Sakaino et al. (Dynamic Texts From UAV Perspective Natural Images) VLM+VFM ICCV 2023 -
COMRP (Unsupervised semantic segmentation of high-resolution UAV imagery for road scene parsing) VFM Arxiv GitHub
CrossEarth (CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation) VFM Arxiv GitHub
TanDepth (TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs) VFM Arxiv GitHub
DroneGPT (DroneGPT: Zero-shot Video Question Answering For Drones) VLM+LLM+VFM CVDL 2024 -
de Zarzà et al. (Socratic video understanding on unmanned aerial vehicles) LLM Procedia Computer Science -
AeroAgent (Agent as Cerebrum, Controller as Cerebellum: Implementing an Embodied LMM-based Agent on Drones) VLM Arxiv -
RS-LLaVA (RS-LLaVA: A large vision-language model for joint captioning and question answering in remote sensing imagery) VLM MDPI -
GeoRSCLIP (RS5M and GeoRSCLIP: A large scale vision-language dataset and a large vision-language model for remote sensing) VFM IEEE Transactions on Geoscience and Remote Sensing GitHub
SkyEyeGPT (SkyEyeGPT: Unifying remote sensing vision-language tasks via instruction tuning with large language model) VFM+LLM Arxiv GitHub

VLN

Title Type Publication Code
NaVid (NaVid: Video-based VLM plans the next step for vision-and-language navigation) VFM+LLM Arxiv -
VLN-MP (Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts) VFM Arxiv GitHub
Gao et al. (Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning) VFM+LLM Arxiv -
MGP (CityNav: Language-Goal Aerial Navigation Dataset with Geographic Information) LLM+VFM Arxiv GitHub
UAV Navigation LLM (Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology) LLM+VFM Arxiv GitHub
GOMAA-Geo (GOMAA-Geo: GOal Modality Agnostic Active Geo-localization) LLM+VFM Arxiv GitHub
NavAgent (NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation) LLM+VFM+VLM Arxiv -
ASMA (ASMA: An Adaptive Safety Margin Algorithm for Vision-Language Drone Navigation via Scene-Aware Control Barrier Functions) LLM+VFM Arxiv -
Zhang et al. (Demo Abstract: Embodied Aerial Agent for City-level Visual Language Navigation Using Large Language Model) VFM+LLM IPSN 2024 -
Chen et al. (Vision-Language Navigation for Quadcopters with Conditional Transformer and Prompt-based Text Rephraser) LLM MMAsia 2023 -
CloudTrack (CloudTrack: Scalable UAV Tracking with Cloud Semantics) VFM+VLM Arxiv -
NEUSIS (NEUSIS: A Compositional Neuro-Symbolic Framework for Autonomous Perception, Reasoning, and Planning in Complex UAV Search Missions) VFM+VLM Arxiv -
Say-REAPEx (Say-REAPEx: An LLM-Modulo UAV Online Planning Framework for Search and Rescue) LLM OpenReview -

Planning

Title Type Publication Code
TypeFly (TypeFly: Flying drones with large language model) LLM Arxiv -
SPINE (SPINE: Online Semantic Planning for Missions with Incomplete Natural Language Specifications in Unstructured Environments) LLM+VFM+VLM Arxiv -
LEVIOSA (LEVIOSA: Natural Language-Based Uncrewed Aerial Vehicle Trajectory Generation) LLM MDPI GitHub
TPML (TPML: Task Planning for Multi-UAV System with Large Language Models) LLM ICCA 2023 -
REAL (REAL: Resilience and adaptation using large language models on autonomous aerial robots) LLM Arxiv -
Liu et al. (Multi-Agent Formation Control Using Large Language Models) LLM TechRxiv -

Flight Control

Title Type Publication Code
PromptCraft (ChatGPT for robotics: Design principles and model abilities) LLM IEEE Access GitHub
Zhong et al. (A safer vision-based autonomous planning system for quadrotor UAVs with dynamic obstacle trajectory prediction and its application with LLMs) LLM WACV 2024 -
Tazir et al. (From words to flight: Integrating OpenAI ChatGPT with PX4/Gazebo for natural language-based drone control) LLM WCSE 2023 -
Phadke et al. (Integrating Large Language Models for UAV Control in Simulated Environments: A Modular Interaction Approach) LLM Arxiv -
EAI-SIM (EAI-SIM: An Open-Source Embodied AI Simulation Framework with Large Language Models) LLM ICCA 2024 GitHub
TAIiST (TAIiST CPS-UAV at the SBFT Tool Competition 2024) LLM SBFT 2024 GitHub
Swarm-GPT (Swarm-GPT: Combining large language models with safe motion planning for robot choreography design) LLM Arxiv -
FlockGPT (FlockGPT: Guiding UAV Flocking with Linguistic Orchestration) LLM Arxiv -
CLIPSwarm (CLIPSwarm: Generating Drone Shows from Text Prompts with Vision-Language Models) VFM Arxiv -

Infrastructures

Title Type Publication Code
DTLLM-VLT (DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM) VFM+LLM CVPR 2024 -
Yao et al. (Can LLM substitute human labeling? A case study of fine-grained Chinese address entity recognition dataset for UAV delivery) LLM Companion Proceedings of the ACM Web Conference 2024 GitHub
GPG2A (Cross-View Meets Diffusion: Aerial Image Synthesis with Geometry and Text Guidance) LLM Arxiv GitLab
AeroVerse (AeroVerse: UAV-Agent Benchmark Suite for Simulating, Pre-training, Finetuning, and Evaluating Aerospace Embodied World Models) VLM+LLM Arxiv -
Tang et al. (Defining and Evaluating Physical Safety for Large Language Models) LLM Arxiv Hugging Face
Xu et al. (Emergency Networking Using UAVs: A Reinforcement Learning Approach with Large Language Model) LLM IPSN 2024 -
LLM-RS (Real-time Integration of Fine-tuned Large Language Model for Improved Decision-Making in Reinforcement Learning) LLM IJCNN 2024 -
Pineli et al. (Evaluating Voice Command Pipelines for Drone Control: From STT and LLM to Direct Classification and Siamese Networks) LLM Arxiv -

Contributors

We want to thank the following contributors for creating, maintaining, and curating the tables in this repository:

  • Yonglin Tian
  • Fei Lin
  • Yiduo Li
  • Tengchao Zhang
  • Xuan Fu

If you have any questions about this repository, feel free to get in touch with Yonglin Tian 📧 or Fei Lin 📧.

(If you would like to contribute to this repo, please open an Issue or Pull Request.)



Citation

If you find this repository useful, please consider citing this paper:

@misc{tian2025uavsmeetllmsoverviews,
      title={UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility}, 
      author={Yonglin Tian and Fei Lin and Yiduo Li and Tengchao Zhang and Qiyao Zhang and Xuan Fu and Jun Huang and Xingyuan Dai and Yutong Wang and Chunwei Tian and Bai Li and Yisheng Lv and Levente Kovács and Fei-Yue Wang},
      year={2025},
      eprint={2501.02341},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2501.02341}, 
}

License

This project is licensed under the MIT License.
