Data Extraction from Handwritten Documents
Deep Learning Model: ResNet18
In this project, I extracted text data from handwritten documents using an image classification model.
Developed using: Python, PyTorch, NumPy, Matplotlib, and scikit-image
Objective
In the aerospace and defense manufacturing industry, every process is documented, and most of the data is recorded in handwritten form. At the time of writing this post, I work at an aerospace manufacturing company called TATA Boeing Aerospace Limited (TBAL). Although we document everything in handwritten forms, we cannot easily analyze or process the data unless it is in digital format. So, to make the data usable, TBAL has assigned a few people (around 24+ man-hours per day) to digitize a few important documents.
The main objective of this project is to extract the data from the handwritten documents without human intervention, so that those people can be reassigned to other work.
Solution planning
For this problem, I opted for an image classification model. You may wonder: why not Optical Character Recognition (OCR)? The words used in the handwritten forms are predefined, and the writers only ever use those predefined words, so the problem can be solved with a simple image classification model.
Input image of Document
INFO
This is a sample image prepared by me for reference.

At first, we need to find the bounding box of each observation field. I found the bounding boxes using image processing, warping the input image to the reference image.
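Once the form is warped onto the reference template, every field sits at a fixed position, so cropping reduces to array slicing. Here is a minimal sketch; the field names and box coordinates are hypothetical placeholders, since the real ones would be measured once on the reference template:

```python
import numpy as np

# Hypothetical bounding boxes (y0, y1, x0, x1) for each observation field,
# measured once on the reference template that the form is warped to.
FIELD_BOXES = {
    "observation_1": (100, 160, 40, 300),
    "observation_2": (180, 240, 40, 300),
}

def crop_fields(aligned_form: np.ndarray) -> dict:
    """Crop each predefined observation field from the aligned form image."""
    return {name: aligned_form[y0:y1, x0:x1]
            for name, (y0, y1, x0, x1) in FIELD_BOXES.items()}

# A dummy grayscale "form" stands in for the warped document image.
form = np.zeros((600, 400), dtype=np.uint8)
crops = crop_fields(form)
print({name: crop.shape for name, crop in crops.items()})
```

Because the boxes are fixed in template coordinates, this step needs no detection model at all once the warp is done.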
And the split input looks like this,

And the expected output is like,
NM
Defining the Model
After a little googling, I found that ResNet18 would do the job. So, I defined a ResNet18 model in PyTorch (Python) in the Google Colab environment.
Data Collection
I used data from the old documents for training, plus a few new samples collected from a few people for testing the model. I also applied image augmentation to the training dataset.
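For handwriting crops, cheap augmentations like small positional shifts and light noise mimic variation in pen placement and scan quality (horizontal flips would be wrong for text). A minimal NumPy sketch; the shift range and noise level are illustrative assumptions, not the project's actual settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img: np.ndarray) -> np.ndarray:
    """Random small shift plus Gaussian noise on a grayscale field crop."""
    dy, dx = rng.integers(-3, 4, size=2)           # shift by up to 3 px
    shifted = np.roll(img, (dy, dx), axis=(0, 1))
    noise = rng.normal(0, 5, img.shape)            # light sensor-style noise
    return np.clip(shifted.astype(float) + noise, 0, 255).astype(np.uint8)

# Fake 64x128 field crop standing in for a real handwritten-word image.
sample = rng.integers(0, 256, (64, 128), dtype=np.uint8)
augmented = [augment(sample) for _ in range(4)]     # 4 extra variants
print(len(augmented), augmented[0].shape)
```

The same idea is usually expressed with `torchvision.transforms` in a PyTorch pipeline; this standalone version just makes the operations explicit.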
Training ResNet18
The training ran in the Colab environment on a CUDA-enabled GPU. I trained the model for 100 epochs:
Epoch 99/99
----------
train Loss: 0.8627 Acc: 0.6667
val Loss: 0.2408 Acc: 0.9333
Training complete in 38m 21s
Best val Acc: 1.000000
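A log like the one above comes from the standard PyTorch training loop. Here is a minimal sketch of that loop with a tiny stand-in model and synthetic data (the real project runs ResNet18 on the cropped field images for 100 epochs):

```python
import torch
import torch.nn as nn

# Tiny stand-in classifier and synthetic batch; only the loop structure
# matters here, not the model.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

images = torch.randn(8, 1, 28, 28, device=device)
labels = torch.randint(0, 10, (8,), device=device)

for epoch in range(3):                 # the post trains for 100 epochs
    optimizer.zero_grad()
    outputs = model(images)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    acc = (outputs.argmax(1) == labels).float().mean().item()
    print(f"Epoch {epoch}: loss={loss.item():.4f} acc={acc:.4f}")
```

In the full version, the loop would iterate over `DataLoader` batches and run a separate no-grad pass over the validation split to produce the `val Loss`/`val Acc` lines.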
Saving the predictions in Google sheets
Overall process
Take the images from one folder -> run prediction -> edit the Google Sheet -> move the image to another folder
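The folder-to-folder flow above can be sketched as a small driver function. All names here are hypothetical stand-ins: `predict` would be the ResNet18 classifier and `record` would write a row to the Google Sheet; the demo below swaps both for trivial lambdas so it runs anywhere:

```python
import shutil
import tempfile
from pathlib import Path

def process_pending(pending_dir: Path, done_dir: Path, predict, record):
    """Predict every image in pending_dir, record the result, then move
    the file to done_dir so it is never processed twice."""
    done_dir.mkdir(parents=True, exist_ok=True)
    for img_path in sorted(pending_dir.glob("*.png")):
        label = predict(img_path)       # e.g. the ResNet18 classifier
        record(img_path.name, label)    # e.g. append a row to the Sheet
        shutil.move(str(img_path), str(done_dir / img_path.name))

# Demo in a temp directory with stand-in predict/record functions.
root = Path(tempfile.mkdtemp())
pending, done = root / "pending", root / "done"
pending.mkdir()
(pending / "form_001.png").write_bytes(b"")   # fake image file
results = []
process_pending(pending, done,
                predict=lambda path: "NM",
                record=lambda name, label: results.append((name, label)))
print(results)
```

Moving each file only after its prediction is recorded means a crashed run can simply be restarted: anything still in the pending folder has not been written to the sheet yet.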
In Google Colab, we can easily mount Google Drive and edit Google Sheets.
```py
from google.colab import drive

# Mount Google Drive so the document folders are accessible as files.
drive.mount('/drive')
```
I edited the Google Sheet using a Python library called gspread, like below:
```py
from google.colab import auth

# Authenticate as the Colab user, then hand those credentials to gspread.
auth.authenticate_user()

import gspread
from google.auth import default

creds, _ = default()
gc = gspread.authorize(creds)
```
The project worked well and helped TBAL reduce the manpower involved in data entry work.