Binarization of Ancient Malayalam Documents - A Novel Weight-based Denoising Approach
Although several studies address the denoising of degraded documents, it remains a tedious task in document image processing, because an ancient document may contain several kinds of degradation that act as a barrier for the reader. Here we use old Malayalam Grantha scripts that contain useful information, such as the poem titled ‘Njana Stuthi’ and other ancient literature. These historical documents are losing content due to heavy degradations such as ink bleed, fungi, brittleness, and show-through. To remove these kinds of degradation, this study proposes a novel binarization algorithm that removes noise from Grantha scripts as well as notebook images and makes the document readable. We use 500 Grantha script images for experimentation. In the proposed method, binarization is done through a channel-based approach: the image is converted to RGB, weights are added to make the image darker or brighter, and a morphological opening operation is applied. The result is then passed through the RGB and HSV channels for more clarity and a clear separation of black text from white background, and the remaining noise is removed using an adaptive thresholding technique. The proposed method performs well, with good accuracy.
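The steps named above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the weight values (`alpha`, `beta`), the 3x3 structuring element, the window radius, and the constant `c` are all assumptions chosen for illustration, and the use of the HSV value channel as the grayscale input is one possible reading of the channel step.

```python
# Illustrative sketch only: all names and parameter values are assumptions.
# An image is a list of rows; each pixel is an (R, G, B) tuple in 0..255.

def adjust(img, alpha=1.2, beta=10):
    """Weight each channel value: alpha scales contrast, beta shifts brightness."""
    def clip(v):
        return max(0, min(255, int(round(alpha * v + beta))))
    return [[tuple(clip(c) for c in px) for px in row] for row in img]

def value_channel(img):
    """HSV value channel of an RGB image: V = max(R, G, B) per pixel."""
    return [[max(px) for px in row] for row in img]

def _window(gray, y, x, r):
    """Values in the (2r+1) x (2r+1) neighbourhood, cropped at the borders."""
    h, w = len(gray), len(gray[0])
    return [gray[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]

def _filter(gray, r, fn):
    """Apply fn to the neighbourhood of every pixel."""
    return [[fn(_window(gray, y, x, r)) for x in range(len(gray[0]))]
            for y in range(len(gray))]

def opening(gray, r=1):
    """Grayscale opening: erosion (local min) followed by dilation (local max)."""
    return _filter(_filter(gray, r, min), r, max)

def adaptive_threshold(gray, r=2, c=5):
    """Mean adaptive threshold: white if brighter than (local mean - c), else black."""
    means = _filter(gray, r, lambda w: sum(w) / len(w))
    return [[255 if g > m - c else 0 for g, m in zip(grow, mrow)]
            for grow, mrow in zip(gray, means)]

def binarize(img, alpha=1.2, beta=10):
    """Weighting -> value channel -> opening -> adaptive threshold."""
    weighted = adjust(img, alpha, beta)
    v = value_channel(weighted)
    return adaptive_threshold(opening(v))

# Usage: a 9x9 light page with a dark vertical stroke in columns 3 to 5.
page = [[(30, 30, 30) if 3 <= x <= 5 else (200, 190, 180) for x in range(9)]
        for y in range(9)]
print(binarize(page)[0])  # [255, 255, 255, 0, 0, 0, 255, 255, 255]
```

In this sketch the opening suppresses bright specks smaller than the structuring element, while the local mean lets the threshold adapt to uneven background staining. In practice an image library would replace the nested-list loops, and the parameters would need tuning on the actual manuscripts.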