
nav2-vlm

🤖 ROS2 Navigation2 with Vision-Language Models

A Vision-Language Model (VLM) powered system for waypoint generation and intelligent navigation using ROS2, Nav2, and TurtleBot3.

This project is in active development.


🚀 Project Overview

This project integrates Vision-Language Models (VLMs) with ROS2 Nav2 to enhance TurtleBot3’s navigation in complex environments.

  • Analyzes cost maps and occupancy grids.
  • Generates waypoints using GPT-4V (GPT-4 with Vision).
  • Converts pixel coordinates to real-world waypoints for TurtleBot3 (see the sketch below).
  • Executes autonomous navigation while dynamically avoiding obstacles.
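
The pixel-to-world conversion follows the standard ROS map_server conventions. Below is a minimal sketch (the function name and signature are illustrative, not the project's actual API), assuming the map image has row 0 at the top and that the resolution and origin come from the map's .yaml metadata:

```python
# Minimal sketch of pixel-to-world conversion, assuming standard
# map_server conventions: image row 0 is the top of the map, and the
# map origin (from the .yaml metadata) is the lower-left corner.

def pixel_to_world(px: int, py: int, image_height: int,
                   resolution: float,
                   origin_x: float, origin_y: float) -> tuple[float, float]:
    """Map an image pixel (px, py) to world coordinates (x, y) in meters."""
    world_x = origin_x + (px + 0.5) * resolution
    # Flip the y axis: image rows grow downward, world y grows upward.
    world_y = origin_y + (image_height - py - 0.5) * resolution
    return world_x, world_y

# Example: a 384x384 map at 0.05 m/cell with origin (-10.0, -10.0)
print(pixel_to_world(200, 150, 384, 0.05, -10.0, -10.0))  # (0.025, 1.675)
```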

🎯 Features

AI-Powered Path Planning – Generates waypoints intelligently.
Cost Map & Occupancy Grid Analysis – Extracts spatial insights.
Pixel-to-World Coordinate Conversion – Ensures real-world accuracy.
ROS2 Nav2 Integration – Executes AI-generated waypoints.
Gazebo Simulation & RViz Visualization – Test before real-world deployment.



🔧 Installation & Setup

1️⃣ Prerequisites

  • ROS2 Humble
  • Nav2 (Navigation Stack)
  • Gazebo (for simulation)
  • OpenAI API Key (for GPT-4 Vision)
  • Python libraries: numpy, opencv-python, Pillow (PIL), requests
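
On Ubuntu 22.04, a typical setup might look like the following (the apt package names are the standard ROS2 Humble ones; adjust for your environment):

```bash
# Nav2 and TurtleBot3 simulation packages for ROS 2 Humble
sudo apt install ros-humble-navigation2 ros-humble-nav2-bringup \
                 ros-humble-turtlebot3-gazebo

# Python dependencies
pip install numpy opencv-python pillow requests
```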

2️⃣ Clone the Repository

```bash
git clone https://github.com/atharvahude/nav2-vlm.git
cd nav2-vlm
```

3️⃣ Set Up the Simulation

  • Use the world files provided in this repository or supply your own (launch example below).
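
For example, a stock TurtleBot3 world can be brought up with the standard turtlebot3_gazebo launch files (swap in your own world file as needed):

```bash
export TURTLEBOT3_MODEL=burger
ros2 launch turtlebot3_gazebo turtlebot3_world.launch.py
```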

4️⃣ Capture the Map with Lidar SLAM using Cartographer or SLAM Toolbox
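
A typical Cartographer mapping session with the standard TurtleBot3 packages (teleoperate the robot until the map is complete, then save it with nav2_map_server):

```bash
# Terminal 1: run Cartographer SLAM against the simulated robot
ros2 launch turtlebot3_cartographer cartographer.launch.py use_sim_time:=True

# Terminal 2: once the map looks complete, save it (writes map.pgm + map.yaml)
ros2 run nav2_map_server map_saver_cli -f ~/map
```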

5️⃣ Convert the Map Image from .pgm to .png and Rename It input.png
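
One way to do the conversion with Pillow (the file names assume the map was saved as map.pgm in the previous step):

```python
from PIL import Image

# Convert the SLAM-generated map to PNG for the VLM pipeline
Image.open("map.pgm").save("input.png")
```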

6️⃣ Start Nav2 and RViz to Visualize the Global and Local Planners
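
With the standard TurtleBot3 Nav2 bringup, this opens RViz showing the global and local costmaps (point map at the .yaml saved in step 4):

```bash
ros2 launch turtlebot3_navigation2 navigation2.launch.py \
    use_sim_time:=True map:=$HOME/map.yaml
```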

7️⃣ Run nav2_test.py
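
With the simulation and Nav2 running, start the waypoint generator. The environment-variable name below assumes the script reads the OpenAI key from OPENAI_API_KEY; check nav2_test.py for the actual mechanism:

```bash
export OPENAI_API_KEY="your-key-here"  # assumption: key is read from the environment
python3 nav2_test.py
```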
