Optical character recognition (OCR), sometimes called text recognition, extracts and repurposes data from scanned documents, camera images and image-only PDFs. OCR software singles out the letters in an image, assembles them into words and the words into sentences, which makes the original content accessible and editable. It also eliminates the need for manual data entry.
This project requires OpenCV and Tesseract.
To install OpenCV, run:
pip install opencv-python
To download Tesseract, click
This is the input image that we pass to the Python script to perform OCR.
This is the output image that we get while the input image is being processed.
This is the final output of the program: the text of the image as a string.
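The image-to-string step can be sketched roughly as follows. This is a minimal sketch, not the project's actual script: it assumes the pytesseract wrapper (`pip install pytesseract`) with the Tesseract binary on the PATH, which the text above does not confirm. The names `extract_text` and `engine` are hypothetical and exist only for illustration.

```python
def extract_text(image, engine=None):
    """Return the text found in `image` as a plain string.

    `image` may be a file path, a PIL image or a NumPy array.
    `engine` is a hypothetical hook: any callable that maps an image to a
    string, which lets the function be tried without Tesseract installed.
    """
    if engine is None:
        # Assumption: pytesseract is the Tesseract wrapper in use.
        import pytesseract
        engine = pytesseract.image_to_string
    # OCR output usually carries trailing whitespace, so trim it.
    return engine(image).strip()
```

Calling `extract_text("sample.png")` (a hypothetical file name) would then return the recognized text, provided Tesseract is installed.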
Saving a Text File in Python
After learning how to open a file in Python, let's look at the ways to save one. Opening a new file in write mode creates the file, and closing it saves it automatically. We can also write text to the file before closing it; Python provides two methods for this, write() and writelines().
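As a small sketch of the two file methods, the snippet below saves text with `write()` and a list of lines with `writelines()`. The helper names `save_text` and `save_lines`, the file names and the sample strings are assumptions made for this example, and a temporary directory is used so nothing is left behind in the project folder.

```python
import os
import tempfile


def save_text(path, text):
    """Save one string with write(); the file is saved when the block exits."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def save_lines(path, lines):
    """Save several lines with writelines(), which adds no newlines itself."""
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in lines)


# Hypothetical output locations inside a temporary directory.
out_dir = tempfile.mkdtemp()
text_path = os.path.join(out_dir, "ocr_output.txt")
lines_path = os.path.join(out_dir, "ocr_lines.txt")

save_text(text_path, "Hello OCR\n")
save_lines(lines_path, ["first line", "second line"])
```

Using `with` closes the file even if an error occurs, so the explicit `close()` call described above is not needed.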
This is the Image to Audio feature in use:
saved_audio.mp4
- Black-White Image
- Rotate angle of the Image
- Inverted Image
- Remove Noise from Image
- Remove Border from Image
- Insert custom White Border to Image
- Shrink Font Weight (from thick to thin)
- Swell or Increase Font Weight (from thin to thick)
- Boxing object in Image
- Image to Text
- Image to Audio
I'm a Data Analyst developing my skills towards Data Science and Machine Learning. I started with the basics of Python in my free time, wrestled with errors and bugs, and got help from the internet and sometimes from my friends. I then cracked some competitive stuff, started developing my very own Optical Character Recognition (OCR) model with Python and Aha! I made one.
If you have any feedback, please reach out to us at [email protected]