What problem did the project solve?
This project aimed to support visually impaired individuals by enabling them to understand the
content of images through an AI system that generates descriptions (image captioning) and reads
them aloud (text-to-speech). The goal was to offer real-time assistance, allowing users to
comprehend visual content independently and intuitively.
What algorithms were used?
Convolutional Neural Networks (CNNs), such as VGG or Inception, were used to extract feature
vectors from images.
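To illustrate the core operation a CNN layer performs, here is a minimal NumPy sketch of a 2-D convolution that slides a small kernel over an image to produce a feature map. This is a toy stand-in for what VGG or Inception do at scale; the kernel and image values are made up for demonstration.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D convolution: the basic operation a CNN layer applies
    to build feature maps that respond to edges, textures, and object parts."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

edge_kernel = np.array([[1.0, -1.0]])   # responds to horizontal intensity changes
image = np.tile([0.0, 1.0], (4, 2))     # toy 4x4 image with alternating columns
fmap = conv2d(image, edge_kernel)
print(fmap.shape)  # (4, 3)
```

In a real pipeline the network stacks many such learned kernels, and the final pooled activations serve as the image feature vector passed to the caption decoder.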
Long Short-Term Memory (LSTM) networks were used to generate meaningful textual
descriptions from the extracted image features.
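The gating mechanism that lets an LSTM decoder keep track of the words emitted so far can be sketched in a few lines of NumPy. This is a single LSTM time step with randomly initialized toy weights, not the project's trained decoder; the dimensions and inputs are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step: the forget, input, and output gates decide what
    state to discard, what to write, and what to expose at each word."""
    n = h.shape[0]
    z = W @ x + U @ h + b            # all four gate pre-activations at once
    f = sigmoid(z[0:n])              # forget gate
    i = sigmoid(z[n:2 * n])          # input gate
    o = sigmoid(z[2 * n:3 * n])      # output gate
    g = np.tanh(z[3 * n:4 * n])      # candidate cell update
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
n_in, n_hid = 8, 4                   # toy sizes: 8-dim input, 4-dim hidden state
W = rng.normal(size=(4 * n_hid, n_in)) * 0.1
U = rng.normal(size=(4 * n_hid, n_hid)) * 0.1
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
x = rng.normal(size=n_in)            # e.g. an embedded word or image feature
h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,)
```

In caption generation this step runs once per output word, with the CNN features conditioning the initial state or the first input.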
gTTS (Google Text-to-Speech) was used to convert the generated captions into audible
speech.
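The three stages above can be tied together as a pipeline. The sketch below uses dummy stand-in functions with hypothetical names (`extract_features`, `generate_caption`, `speak`); the real system would call the trained CNN and LSTM here, and `speak` would invoke the actual gTTS API, e.g. `gTTS(text=caption, lang="en").save("caption.mp3")`.

```python
import numpy as np

def extract_features(image):
    """Stand-in for the CNN encoder (e.g. VGG/Inception): image -> feature vector."""
    return image.mean(axis=(0, 1))  # dummy pooling over spatial dimensions

def generate_caption(features):
    """Stand-in for the LSTM decoder: feature vector -> text description."""
    return "a person riding a bicycle"  # dummy caption for illustration

def speak(text):
    """Stand-in for gTTS; the real call saves an MP3 played back to the user."""
    return f"[audio] {text}"

image = np.zeros((224, 224, 3))       # 224x224 RGB, a typical VGG input size
caption = generate_caption(extract_features(image))
print(speak(caption))  # [audio] a person riding a bicycle
```

This structure is what enables the real-time flow described above: capture an image, caption it, and read the caption aloud without user intervention.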