
Visualizing the Unseen: Enhancing Image Accessibility through Image Captioning

Welcome to the "Visualizing the Unseen" project! 📸🔍

This repository houses the code and resources for our innovative project aimed at enhancing image accessibility through advanced image captioning techniques. By providing descriptive captions for images, we empower visually impaired individuals to perceive and engage with visual content in a more meaningful way.

Features

  • Image Captioning Model: Utilizes state-of-the-art deep learning models to generate descriptive captions for images.
  • User-Friendly Interface: Intuitive user interface for uploading images and receiving generated captions.
  • Customization: Fine-tune the captioning model for specific domains or languages.
  • Multi-Modal Interaction: Supports both image upload and URL input for enhanced flexibility (a minimal loading sketch follows this list).
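To make the multi-modal input concrete, here is a minimal sketch of loading an image from either a local path or a URL. It assumes Pillow and requests are installed; load_image is an illustrative helper, not necessarily the repository's actual API.

```python
# Minimal path-or-URL image loading sketch. Assumes Pillow and requests;
# load_image is illustrative, not the repository's actual API.
from io import BytesIO

import requests
from PIL import Image

def load_image(source: str) -> Image.Image:
    """Load an image from a local file path or an http(s) URL."""
    if source.startswith(("http://", "https://")):
        response = requests.get(source, timeout=10)
        response.raise_for_status()  # fail fast on HTTP errors
        return Image.open(BytesIO(response.content)).convert("RGB")
    return Image.open(source).convert("RGB")
```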

Getting Started

Follow these steps to get started with the "Visualizing the Unseen" project:

  1. Clone the Repository: Clone this repository to your local machine using git clone https://github.com/zabi-32/Image_Captioning.git.

  2. Set Up the Environment: Create a Python environment and install the required packages with pip install -r requirements.txt.

  3. Download Pre-trained Model: Download the pre-trained image captioning model weights from here and place them in the models/ directory.

  4. Run the Application: Launch the image captioning application by running python app.py, which starts a local web server (a minimal sketch of app.py follows this list).

  5. Access the Interface: Open your web browser and navigate to http://localhost:5000 to access the image captioning interface.
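For orientation, the sketch below shows what a minimal app.py could look like: a single Flask route that accepts an uploaded image and renders the generated caption. Flask and the generate_caption helper are assumptions inferred from the project layout in the next section, not the repository's verbatim code.

```python
# A minimal sketch of app.py, assuming Flask and a generate_caption()
# helper in image_captioning.py; names here are illustrative.
from flask import Flask, render_template, request

from image_captioning import generate_caption  # hypothetical helper

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def index():
    caption = None
    if request.method == "POST":
        uploaded = request.files.get("image")
        if uploaded:
            caption = generate_caption(uploaded.stream)
    return render_template("index.html", caption=caption)

if __name__ == "__main__":
    # Serves the interface at http://localhost:5000, as in step 5 above.
    app.run(port=5000, debug=True)
```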

Project Structure

The "Visualizing the Unseen" project is organized as follows:

  • app.py: The main script that sets up the web application and handles user interactions.
  • image_captioning.py: Contains the code for image caption generation using the pre-trained model (sketched after this list).
  • models/: Directory for storing pre-trained model weights.
  • templates/ and static/: Directories containing HTML templates and static assets for the web interface.
  • data/: Placeholder for additional data or resources needed for the project.
  • docs/: Documentation files, including user guides and references.
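As one way image_captioning.py could be implemented, the sketch below wraps a BLIP captioning model from Hugging Face transformers. The specific checkpoint is an assumption for illustration; the repository's actual pre-trained weights (see step 3 above) may differ.

```python
# Illustrative generate_caption() helper using a BLIP model from
# Hugging Face transformers; the checkpoint is an assumption, not
# necessarily the weights this project ships in models/.
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

MODEL_NAME = "Salesforce/blip-image-captioning-base"
processor = BlipProcessor.from_pretrained(MODEL_NAME)
model = BlipForConditionalGeneration.from_pretrained(MODEL_NAME)

def generate_caption(image_source) -> str:
    """Return a descriptive caption for a path or file-like image object."""
    image = Image.open(image_source).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```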

Contributing

We welcome contributions to the "Visualizing the Unseen" project! If you have ideas, bug fixes, or improvements, please submit a pull request.

Collaborators

  • Faiziab Khan
  • Souvik Ghosh
  • Zabi Ulla Ismail

Contact

For questions or feedback, please contact Zabi Ulla Ismail.

Let's work together to create a more inclusive and accessible visual experience! 🌟📸🔍


Disclaimer: "Visualizing the Unseen" is a project created for the purpose of enhancing image accessibility and inclusivity. It is not intended to replace professional assistive technologies for visually impaired individuals.
