himansh005/Multilingual-OCR

Optical Character Recognition

Project on Multilingual Optical Character Recognition

Tasks

  • Process image - Focus on the text area, cropping away the corners and other extraneous features
  • Extract text - Using Tesseract, extract text from the image
  • Translate - Translate English text to Spanish
  • Server - Flask server to integrate Python modules with the frontend
  • Website - Basic website to upload image and display OCR text and translated text
  • Document server code
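
The "Process image" task can be illustrated with a small sketch. This is not the project's code: it assumes the image is already a grayscale grid (a list of rows, 0 = black, 255 = white) and uses a plain bounding box around the dark pixels. The names `text_bounding_box` and `crop_to_text`, the threshold, and the margin are all hypothetical; a real pipeline would more likely use an imaging library before handing the result to Tesseract.

```python
from typing import List, Tuple

# Assumption: a grayscale image as rows of ints, 0 = black, 255 = white.
Image = List[List[int]]


def text_bounding_box(image: Image, threshold: int = 128) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) around all pixels darker than threshold.

    bottom and right are exclusive, so they can be used directly in slices.
    """
    if not image or not image[0]:
        raise ValueError("image is empty")
    # Rows and columns that contain at least one dark ("ink") pixel.
    rows = [y for y, row in enumerate(image) if any(p < threshold for p in row)]
    cols = [x for x in range(len(image[0])) if any(row[x] < threshold for row in image)]
    if not rows or not cols:
        raise ValueError("no dark pixels found")
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop_to_text(image: Image, threshold: int = 128, margin: int = 0) -> Image:
    """Crop the image to the dark-pixel bounding box, padded by margin pixels."""
    top, left, bottom, right = text_bounding_box(image, threshold)
    # Clamp the padded box to the image borders.
    top = max(top - margin, 0)
    left = max(left - margin, 0)
    bottom = min(bottom + margin, len(image))
    right = min(right + margin, len(image[0]))
    return [row[left:right] for row in image[top:bottom]]
```

A bounding box only trims empty borders. It does not remove noise or stray marks inside the box, so it covers just part of what the task describes.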

APIs

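   import re
   import requests

   # URL of the Yandex Translate endpoint
   URL = "https://translate.yandex.net/api/v1.5/tr/translate"
   
   # Parameters
   # key: API key
   # text: text to translate
   # lang: translation direction, "from-to" (here English to Spanish)
   # api_key and text must be defined by the caller
   PARAMS = {
       'key': api_key,
       'text': text,
       'lang': "en-es"
   }
   
   # Request
   r = requests.get(url=URL, params=PARAMS)
   
   # Parse response: strip the markup tags, then drop the first remaining character
   data = r.text
   data = re.sub('<[^>]+>', '', data)[1:]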
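
As an alternative to stripping tags with a regular expression, the response could be parsed with the standard library's XML parser. The sketch below is not the project's code. It assumes the response body is XML with a root element `Translation` carrying a `code` attribute and one or more `text` children; that layout, like the helper name `extract_translation`, is an assumption and should be checked against a real response.

```python
import xml.etree.ElementTree as ET


def extract_translation(xml_body: str) -> str:
    """Return the translated text from an XML response body.

    Assumed layout (not confirmed by this README):
    <Translation code="200" lang="en-es"><text>...</text></Translation>
    """
    root = ET.fromstring(xml_body)
    # Treat anything other than a successful Translation element as an error.
    if root.tag != "Translation" or root.get("code") != "200":
        raise ValueError(f"unexpected response: <{root.tag}> {root.attrib}")
    # Join all <text> children; an empty element has text None.
    return "".join(node.text or "" for node in root.findall("text"))
```

Used in place of the last two lines above, this would read `data = extract_translation(r.text)`. Unlike the regular expression, the parser also decodes entities such as `&amp;` and raises on an error response instead of returning its text.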

Run

  • Install requirements
       pip install -r requirements.txt
  • Start the server
       python server.py
  • Run the website
  • Upload image

Contributors

😍 Nishant Rodrigues
