Deep Extreme Learning Machine-Based Optical Character Recognition System for Nastalique Urdu-Like Script Languages

The Computer Journal ◽

10.1093/comjnl/bxaa042 ◽

2020 ◽

Author(s):

Syed Saqib Raza Rizvi ◽

Muhammad Adnan Khan ◽

Sagheer Abbas ◽

Muhammad Asadullah ◽

Nida Anwer ◽

...

Keyword(s):

Extreme Learning Machine ◽

Character Recognition ◽

Optical Character Recognition ◽

Recognition Rate ◽

Recognition System ◽

Software Systems ◽

Script Language ◽

Optical Character ◽

Handwritten Text ◽

Learning Machine

Abstract: Optical character recognition systems convert printed or handwritten scripts into digital text formats such as ASCII or Unicode. Urdu-like script languages, such as Urdu, Punjabi and Sindhi, are widely spoken, especially in Asia. An enormous amount of printed and handwritten text exists in these languages and needs to be converted into computer-understandable formats for knowledge extraction. In this study, an optical character recognition (OCR) system based on the deep extreme learning machine (DELM), the most recently proposed variant of the extreme learning machine (ELM), is proposed to improve the character recognition rate for Urdu-like script languages. The proposed DELM-based character recognition model optimizes the OCR process by reducing the overhead of the pre-processing, segmentation and feature extraction layers. In evaluation, the proposed system achieved 98.75% training accuracy with an RMSE of 1.492 × 10⁻³ and 98.12% testing accuracy with an RMSE of 1.587 × 10⁻³, using six DELM hidden layers. The results show that the proposed system attains a higher recognition rate than any previously proposed OCR system for Urdu-like script languages. The technique is applicable to machine-printed text and is partially useful for handwritten text as well. This study will aid the development of more accurate OCR software systems for Urdu-like scripts in the future.
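
For readers unfamiliar with the extreme learning machine that DELM builds on, the following is a minimal single-hidden-layer sketch in pure Python. It is not the authors' six-layer DELM, whose details the abstract does not give. It assumes the textbook ELM recipe: hidden weights are drawn at random and left fixed, and only the output weights are fitted, here by ridge-regularised least squares. The function names (`solve`, `hidden`, `train_elm`, `predict_elm`), the toy XOR data, the hidden-layer size and the ridge term are all illustrative assumptions.

```python
import math
import random


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hidden(x, weights, biases):
    """Random-feature layer: tanh(w . x + b) for every hidden unit."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(weights, biases)]


def train_elm(xs, ts, n_hidden=16, ridge=1e-6, seed=0):
    """Fit output weights only; hidden weights stay at their random values."""
    rng = random.Random(seed)
    n_in = len(xs[0])
    weights = [[rng.uniform(-2, 2) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.uniform(-2, 2) for _ in range(n_hidden)]
    h = [hidden(x, weights, biases) for x in xs]
    # Dual ridge solution: beta = H^T (H H^T + ridge * I)^-1 t
    n = len(xs)
    gram = [[sum(h[i][k] * h[j][k] for k in range(n_hidden))
             + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve(gram, ts)
    beta = [sum(alpha[i] * h[i][k] for i in range(n)) for k in range(n_hidden)]
    return weights, biases, beta


def predict_elm(model, x):
    """Real-valued network output for one input vector."""
    weights, biases, beta = model
    return sum(bk * hk for bk, hk in zip(beta, hidden(x, weights, biases)))


# Toy data (assumption): XOR, which a purely linear model cannot fit.
xs = [[0, 0], [0, 1], [1, 0], [1, 1]]
ts = [0.0, 1.0, 1.0, 0.0]
model = train_elm(xs, ts)
outputs = [predict_elm(model, x) for x in xs]
labels = [int(o > 0.5) for o in outputs]
rmse = math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, ts)) / len(xs))
```

The last lines compute the two kinds of figure the abstract reports, classification accuracy (via `labels`) and RMSE, on the toy data only. A deep variant would stack several such layers; how DELM does so is not described in the abstract and is not attempted here.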


APPLICATION OF ZONAL AND CURVATURE FEATURES TO NUMERALS RECOGNITION

International Journal of Students Research in Technology & Management ◽

10.18510/ijsrtm.2021.922 ◽

2021 ◽

Vol 9 (2) ◽

pp. 7-12

Author(s):

Binod Kumar Prasad

Keyword(s):

Language Processing ◽

Character Recognition ◽

Optical Character Recognition ◽

Recognition Rate ◽

Recognition System ◽

Signature Verification ◽

Optical Character ◽

KNN Classifier ◽

Average Recognition Rate ◽

Distance Coding

Purpose of the study: This work presents an offline Optical Character Recognition system that recognises handwritten English numerals to help automate document reading. It avoids the tedious and time-consuming manual typing otherwise needed to key important information into a computer system for long-term preservation. Methodology: This work applies Curvature Features of English numeral images by encoding them in terms of distance and slope. The finer local details of the images are extracted using Zonal features. The feature vectors obtained by combining these features are fed to a KNN classifier. The whole work was carried out using the MATLAB Image Processing Toolbox. Main Findings: The system produces an average recognition rate of 96.67% with K=1, rising to 97% with K=3, with corresponding errors of 3.33% and 3% respectively. Of the ten numerals, some, such as ‘3’ and ‘8’, show relatively lower recognition rates because of the similarity between their structures. Applications of this study: The proposed work concerns the recognition of English numerals. The model can be used more widely for pattern recognition tasks such as signature verification, face recognition, and character or word recognition in other languages under Natural Language Processing. Novelty/Originality of this study: The novelty of the work lies in the feature extraction process. Curves present in the structure of a numeral sample are encoded based on distance and slope, yielding Distance features and Slope features. Vertical Delta Distance Coding (VDDC) and Horizontal Delta Distance Coding (HDDC) encode a curve from the vertical and horizontal directions to reveal concavity and convexity from different angles.
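
The zonal-feature and KNN stages described in this abstract can be illustrated with a small Python sketch. It assumes one common form of zoning (the fraction of foreground pixels in each cell of a grid) and a plain Euclidean-distance KNN with majority voting. It does not reproduce the paper's curvature, VDDC or HDDC coding, which the abstract does not specify, and it is not the authors' MATLAB implementation. The function names, the 4×4 toy glyphs and the 2×2 grid are illustrative assumptions.

```python
from collections import Counter


def zonal_features(image, rows=2, cols=2):
    """Fraction of foreground (1) pixels in each zone of a rows x cols grid."""
    h, w = len(image), len(image[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            r0, r1 = r * h // rows, (r + 1) * h // rows
            c0, c1 = c * w // cols, (c + 1) * w // cols
            zone = [image[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            feats.append(sum(zone) / len(zone))
    return feats


def knn_predict(train, query, k=1):
    """train is a list of (feature_vector, label); majority vote of k nearest."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = sorted(train, key=lambda item: dist2(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy binary glyphs (assumption): a vertical stroke for "1" and a rough "7".
ONE = [[0, 0, 1, 0],
       [0, 0, 1, 0],
       [0, 0, 1, 0],
       [0, 0, 1, 0]]
SEVEN = [[1, 1, 1, 1],
         [0, 0, 0, 1],
         [0, 0, 1, 0],
         [0, 0, 1, 0]]
NOISY_ONE = [[0, 0, 1, 0],
             [0, 0, 1, 0],
             [0, 0, 1, 0],
             [0, 1, 1, 0]]  # one extra ink pixel in the bottom-left zone

train = [(zonal_features(ONE), "1"), (zonal_features(SEVEN), "7")]
prediction = knn_predict(train, zonal_features(NOISY_ONE), k=1)
```

With these toy glyphs the noisy stroke lies closest to the "1" template in feature space, so `prediction` is "1". Real use would need many labelled samples per numeral and a finer grid.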


Improve OCR Accuracy with Advanced Image Preprocessing using Machine Learning with Python

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.g5745.059720 ◽

2020 ◽

Vol 9 (7) ◽

pp. 1026-1030

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Neural Networks ◽

Character Recognition ◽

Optical Character Recognition ◽

Image Preprocessing ◽

Optical Character ◽

Handwritten Text ◽

Printed Text ◽

Learning Machine

Optical Character Recognition, or Optical Character Reader (OCR), is a pattern-recognition technique that electronically converts images of handwritten or printed text into machine-encoded text. The equipment used for this purpose includes cameras and flatbed scanners. Handwritten text is scanned using a scanner, and the image of the scanned document is then processed by software. Recognizing handwritten text is difficult compared with other Western-language texts. In the proposed work we take up the challenge of recognizing handwritten letters. Image preprocessing techniques can effectively improve the accuracy of an OCR engine. The goal is to design and implement a machine-learning system in Python that is more accurate than pre-built OCR engines, drawing on technologies such as MATLAB, artificial intelligence and neural networks.


A Design of a Hybrid Algorithm for Optical Character Recognition of Online Hand-Written Arabic Alphabets

Iraqi Journal of Science ◽

10.24996/ijs.2019.60.9.22 ◽

2019 ◽

pp. 2067-2079

Author(s):

Waleed Noori Hussein ◽

Haider N. Hussain

Keyword(s):

Decision Tree ◽

Character Recognition ◽

Optical Character Recognition ◽

Recognition Rate ◽

Recognition System ◽

Optical Character ◽

Recognition Systems ◽

The Difference ◽

Artificial Neural Network (ANN) ◽

Handwritten Recognition

The growing relevance of printed and digitized handwritten characters has created a need for improved automatic character recognition in Optical Character Recognition (OCR). Among handwritten scripts, Arabic has received special attention because of its distinctive nature and the inherent challenges it poses for recognition systems. This distinctiveness, together with differences in personal writing style and proficiency, reduces the effectiveness of online Arabic handwriting recognition systems. Building on the limitations and scope of previous related studies, this research examines the recognition of isolated Arabic characters through the identification of their features and dots, with a view to producing an efficient online recognition system for isolated handwritten Arabic characters. It proposes a hybrid of a decision tree and an Artificial Neural Network (ANN), rather than combining either with other algorithms as in previous studies. The proposed recognition process has four main steps with associated sub-steps. The results showed that the proposed method achieved the highest performance at 96.7%, whereas the benchmark methods, EDMS and Naeimizaghiani, achieved 68.88% and 78.5% respectively. Based on this, ANN has the best recognition rate at 98.8%, while the best rate for the decision tree was 97.2%.


A Method for Arabic Handwritten Diacritics Characters

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1034.0986s319 ◽

2019 ◽

Vol 8 (6S3) ◽

pp. 209-212

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Image Representation ◽

Recognition Rate ◽

Nearest Neighbors ◽

Text Line ◽

K Nearest Neighbors ◽

New Model ◽

Optical Character ◽

Handwritten Text

Optical Character Recognition (OCR) is the process of converting an image representation of a document into an editable format. People can recognize characters without difficulty when reading papers or books. However, developing an OCR system that can read and recognize Arabic diacritic characters as well as a human does remains a problem. More specifically, the poor recognition rate of most optical diacritic character recognition systems is mainly attributed to their failure to segment handwritten text correctly. To overcome this problem, we develop a method based on seven operations. It starts by finding the text-line height, then reads words from the line and identifies the diacritic regions. Segmentation is also applied during this operation by converting the text line into a grayscale and then a binary image. Moreover, we introduce a new model based on the k-nearest neighbors (KNN) algorithm to identify diacritics and segment characters. KNN is trained to predict the diacritic directly from the text line. Finally, we offer an evaluation and discussion of optical diacritic character recognition.
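
The first steps named in this abstract, binarising a grayscale image and finding text-line heights, can be sketched in Python as follows. The abstract does not say how either step is performed, so this sketch assumes two common techniques: a fixed global threshold and a horizontal projection profile (rows containing any ink are grouped into lines). The function names, the threshold value of 128 and the toy image are illustrative assumptions, not the authors' method.

```python
def binarize(gray, threshold=128):
    """Dark pixels (below threshold) become ink = 1, everything else 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def text_lines(binary):
    """Return inclusive (top, bottom) row indices for each run of inked rows."""
    lines, start = [], None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y - 1))   # the current line just ended
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(binary) - 1))
    return lines


# Toy grayscale image (assumption): 0 is black ink, 255 is white paper.
gray = [
    [255, 255, 255],
    [255,   0, 255],
    [ 30,   0, 255],
    [255, 255, 255],
    [255, 255,  10],
    [255, 255, 255],
]
binary = binarize(gray)
lines = text_lines(binary)
heights = [bottom - top + 1 for top, bottom in lines]
```

On the toy image this finds two lines, rows 1 to 2 and row 4, with heights 2 and 1. Diacritic regions could then be searched for above and below each detected line, though that step is not attempted here.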


Recognition of handwritten Arabic (Indian) numerals using skeleton matching

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v19.i3.pp1461-1468 ◽

2020 ◽

Vol 19 (3) ◽

pp. 1461-1468

Author(s):

Bassam Alqaralleh ◽

Malek Zakarya Alksasbeh ◽

Tamer Abukhalil ◽

Harbi Almahafzah ◽

Tawfiq Al Rawashdeh

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Recognition Rate ◽

Numerical Data ◽

Recognition System ◽

Worst Case ◽

Limited Success ◽

Optical Character ◽

The Right ◽

Handwritten Arabic

This paper discusses the problem of recognizing Arabic numbers using a monocular camera as the only sensor. When a digital image is presented in this application, optical character recognition (OCR) can be exploited to interpret numerical data. However, OCR has had limited success when applied to handwritten Arabic (Indian) numbers. This paper aims to overcome this limitation and introduces an optical character recognition system based on skeleton matching. The proposed approach is used for handwritten Arabic numbers only. The experimental results indicate the effectiveness of the proposed system even for numbers written in the worst case. The proposed system achieves a recognition rate of 99.3%.


10 mW CMOS retina and classifier for handheld, 1000 images/s optical character recognition system

1999 IEEE International Solid-State Circuits Conference. Digest of Technical Papers. ISSCC. First Edition (Cat. No.99CH36278) ◽

10.1109/isscc.1999.759194 ◽

2003 ◽

Author(s):

P. Masa ◽

P. Heim ◽

E. Franzi ◽

X. Arreguit ◽

F. Heitger ◽

...

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Recognition System ◽

Optical Character


Real‐time optical character recognition on field programmable gate array for automatic number plate recognition system

IET Circuits Devices & Systems ◽

10.1049/iet-cds.2012.0339 ◽

2013 ◽

Vol 7 (6) ◽

pp. 337-344 ◽

Cited By ~ 20

Author(s):

Xiaojun Zhai ◽

Faycal Bensaali ◽

Reza Sotudeh

Keyword(s):

Real Time ◽

Field Programmable Gate Array ◽

Character Recognition ◽

Optical Character Recognition ◽

Recognition System ◽

Optical Character ◽

Field Programmable ◽

Gate Array


Optical character recognition system based on a novel fuzzy descriptive features

Proceedings 7th International Conference on Signal Processing, 2004. Proceedings. ICSP '04. 2004. ◽

10.1109/icosp.2004.1441471 ◽

2005 ◽

Author(s):

Y. Alginahi ◽

I. El-Feghi ◽

M. Ahmadi ◽

M.A. Sid-Ahmed

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Recognition System ◽

Optical Character


Developing Automated Optical Character Recognition System Using Machine Learning Algorithm to Solve Payment Verification Issues

10.1109/icoris52787.2021.9649514 ◽

2021 ◽

Author(s):

Michael Siek ◽

Rafi Soeharto

Keyword(s):

Machine Learning ◽

Character Recognition ◽

Optical Character Recognition ◽

Learning Algorithm ◽

Recognition System ◽

Machine Learning Algorithm ◽

Optical Character


Optical Character Recognition System for Urdu Words in Nastaliq Font

International Journal of Advanced Computer Science and Applications ◽

10.14569/ijacsa.2016.070575 ◽

2016 ◽

Vol 7 (5) ◽

Cited By ~ 5

Author(s):

Safia Shabbir ◽

Imran Siddiqi

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Recognition System ◽

Optical Character
