Retrieve meaningful information from a PAN Card image using tesseract-ocr 😎


keshav1998/PAN-Card-OCR

 
 

✨ This document is for OCR ✨

PAN Card to JSON


Problem:


Extract information from an image of a Permanent Account Number (PAN) card
by OCR, in the proper format [the standard set by the Indian Govt.].
	Information such as:
				Name, Father's Name, Date of Birth, PAN


Solution:


Steps:
	-> Take image
	-> crop to the box that contains the text
	-> convert to grayscale (monochrome)
	-> pass to tesseract
	-> text (output of tesseract)
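
The image steps above can be sketched roughly as follows. This is only an illustration, not the code in src: the function name `ocr_pan_card` and the `box` parameter (a crop rectangle given as x, y, width, height) are assumptions made for the example, and the crop box is supplied by the caller rather than detected automatically.

```python
def ocr_pan_card(image_path, box=None):
    """Return the raw Tesseract text for a PAN card image.

    box is a hypothetical (x, y, width, height) crop rectangle;
    None keeps the full image.
    """
    # Imported lazily so this sketch can be loaded without the OCR stack installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)  # BGR array, or None if the file is unreadable
    if image is None:
        raise FileNotFoundError(f"could not read image: {image_path}")

    if box is not None:
        x, y, w, h = box
        image = image[y:y + h, x:x + w]  # crop to the box that contains the text

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # convert to grayscale
    return pytesseract.image_to_string(gray)  # text output of tesseract
```
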
Next, process this text to extract the meaningful information from it:
	-> find the name using the name database
	-> find the father's name (assuming the second name found is the father's name)
	-> find the year of birth
	-> find the PAN
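
A minimal sketch of these text-processing steps, using only the standard library. The function names, the `known_names` list (standing in for the name database), the 0.8 similarity cutoff, and the DD/MM/YYYY date layout are all assumptions made for illustration; the code in src may work differently.

```python
import difflib
import re

# A PAN is five uppercase letters, four digits, then one uppercase letter.
PAN_PATTERN = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
# Assumed layout of the printed date of birth: DD/MM/YYYY.
DOB_PATTERN = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")


def find_pan(text):
    """Return the first PAN-shaped token in the text, or None."""
    match = PAN_PATTERN.search(text)
    return match.group(0) if match else None


def find_year_of_birth(text):
    """Return the year of the first DD/MM/YYYY date in the text, or None."""
    match = DOB_PATTERN.search(text)
    return int(match.group(3)) if match else None


def find_names(text, known_names, cutoff=0.8):
    """Return the lines that contain a word close to an entry in known_names.

    Lines are returned in order, so the first can be taken as the name and
    the second as the father's name.
    """
    names = []
    for line in text.splitlines():
        words = line.strip().upper().split()
        # Skip empty lines and lines with digits (dates, PAN).
        if not words or any(ch.isdigit() for ch in line):
            continue
        if any(difflib.get_close_matches(w, known_names, n=1, cutoff=cutoff)
               for w in words):
            names.append(" ".join(words))
    return names
```

Fuzzy matching with difflib tolerates small OCR mistakes in a name, at the cost of occasionally accepting a header line that happens to resemble a known name.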


Dependent packages


-python
-opencv
-numpy
-pytesseract
-JSON
-difflib
-csv
-PIL
-SciPy
-dataparser


Structure and Usage


Directories:
	src-
		contains the code files
	testcases-
		contains the test images
	result-
		contains the JSON object
		
Usage:
	python file_name.py [input image]
	The output will be the JSON object's name
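
As a rough illustration of this usage, the sketch below takes an input image path, writes a JSON object into the result directory, and prints the name of the file it wrote. The key names, the output file naming (image name plus `.json`), and the functions `build_record`, `write_record`, and `main` are hypothetical; the extracted values are left as placeholders.

```python
import json
from pathlib import Path


def build_record(name, fathers_name, date_of_birth, pan):
    """Bundle the extracted fields into one dictionary (key names are assumed)."""
    return {
        "name": name,
        "fathers_name": fathers_name,
        "date_of_birth": date_of_birth,
        "pan": pan,
    }


def write_record(record, image_path, result_dir="result"):
    """Write the record as JSON into result_dir and return the output path."""
    out_dir = Path(result_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / (Path(image_path).stem + ".json")
    with out_path.open("w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2)
    return out_path


def main(argv):
    """Entry point sketch: argv is expected to look like sys.argv."""
    if len(argv) != 2:
        print("usage: python file_name.py [input image]")
        return 1
    # Placeholder values: real code would fill these from the OCR text.
    record = build_record(None, None, None, None)
    print(write_record(record, argv[1]))
    return 0
```

A script would typically call `main(sys.argv)` under an `if __name__ == "__main__":` guard.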

💯
