Detection of Text in Videos Using Discrete Wavelet Transform and Gradient Difference

Author(s):  
Manoj Kumar Dixit

Text detection in video frames provides highly condensed information about the content of a video and is useful for seeking, browsing, retrieval, and understanding of video text in large video databases. In this paper, we propose a hybrid method that automatically detects, segments, and recognizes the text present in a video. Detection is performed using a Laplacian method based on wavelet and color features. Segmentation of the detected text is divided into two modules: line segmentation and character segmentation. Line segmentation uses a statistical method based on projection profile analysis; the multiple lines of text in a video frame obtained from text detection are segmented into single lines. Character segmentation is performed using Connected Component Analysis (CCA) and vertical projection profile analysis. The input for character segmentation is the line of text obtained from line segmentation, in which all characters in the line are segmented separately for recognition. Optical character recognition is performed using template matching and a correlation technique. Template matching compares an input character with a set of templates; each comparison yields a similarity measure between the input character and a template. After all templates have been compared with the observed character image, the character is assigned the identity of the most similar template based on correlation. Eventually, the text in the video frame is detected, segmented, and passed to OCR for recognition.
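The correlation-based template matching described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `ncc` helper, the toy 3×3 templates, and the label set are all assumptions.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-size grayscale images."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def recognize(char_img, templates):
    """Assign the label of the template most correlated with the input character."""
    return max(templates, key=lambda label: ncc(char_img, templates[label]))

# Toy 3x3 "templates" and an input character (hypothetical data)
templates = {
    "I": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "-": np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]]),
}
print(recognize(np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]), templates))  # I
```

In practice the templates would be full-size character bitmaps and the input would come from the character segmentation stage.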

Author(s):  
Ipsita Pattnaik ◽  
Tushar Patnaik

Optical Character Recognition (OCR) is a field that converts printed text into a computer-understandable format that is editable in nature. Odia is a regional language used in Odisha, West Bengal, and Jharkhand. It is used by over forty million people and still counting. Such wide dependency on a language makes it important to preserve its script and obtain a digital, editable version of the Odia script. We propose a framework that takes a computer-printed Odia script image as input and gives a computer-readable and user-editable format of the same, which eventually recognizes the characters printed in the input image. The system uses various techniques to improve the image and performs line segmentation followed by word segmentation and finally character segmentation using horizontal and vertical projection profiles.
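The projection-profile segmentation this framework relies on can be sketched as below. The `segment_runs` helper and the toy binary page are illustrative assumptions; the actual system operates on full Odia script images.

```python
import numpy as np

def segment_runs(profile, min_val=0):
    """Return (start, end) index ranges where the profile exceeds min_val."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v > min_val and start is None:
            start = i
        elif v <= min_val and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

# Binary page: 1 = ink. The horizontal profile (row sums) finds text lines;
# the vertical profile (column sums) within a line finds characters.
page = np.array([
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
])
lines = segment_runs(page.sum(axis=1))       # two line bands: rows 0-1 and row 3
chars = segment_runs(page[0:2].sum(axis=0))  # two character bands in the first line
print(lines, chars)
```

The same run-finding routine serves line, word, and character segmentation; only the axis and the region being projected change.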


Author(s):  
Hendy Gunawan ◽  
Janson Hendryli ◽  
Dyah Erny Herwindiati

The image conversion program from music notation to numeric notation is a character recognition system that accepts as input a music notation image and produces as output a DOCX file containing the numeric notation for the input image. Music notation carries a notation value and a rhythmic value and is written on a musical stave. The system consists of four main processes: preprocessing (grayscaling and thresholding), notation line segmentation, notation character segmentation, and template matching. Template matching is used to recognize the music notation obtained after segmentation; recognition is performed by comparing each image with template images previously stored in the database. This system has a 100% success rate on character segmentation and a 38.4843% success rate on character recognition with template matching.
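The grayscale-and-thresholding preprocessing step can be sketched as follows. The BT.601 luminance weights and the global threshold of 128 are assumptions for illustration; the paper does not state which conversion or threshold it uses.

```python
import numpy as np

def to_grayscale(rgb):
    """Luminance-weighted grayscale conversion (ITU-R BT.601 weights)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def threshold(gray, t=128):
    """Global binarization: ink (dark pixels) -> 1, background -> 0."""
    return (gray < t).astype(np.uint8)

img = np.zeros((2, 2, 3))
img[0, 0] = [255, 255, 255]                 # one white (background) pixel
binary = threshold(to_grayscale(img))
print(binary)                               # [[0 1], [1 1]]
```

Adaptive thresholding (e.g., Otsu's method) is a common alternative when stave images have uneven lighting.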


2011 ◽  
Vol 403-408 ◽  
pp. 900-907
Author(s):  
Anubhav Kumar ◽  
Awanish Kr Kaushik ◽  
R.L. Yadava ◽  
Divya Saxena

In this paper, a new framework to detect and extract text from images and video frames is presented. Various methods for the detection and localization of text in images and video frames have been proposed in the past. Here, a comparison is made between several existing text detection methods and the proposed method. The proposed method is carried out by edge detection, and the projection profile method is used to better localize the text region. Various experiments have been carried out to evaluate and compare the performance of the proposed algorithm. Experimental results on a large dataset demonstrate that the proposed method is effective and practical. Parameters such as average time and precision and recall rates are analyzed for both the existing and proposed methods to determine the success and limitations of our method.
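The edge-detection-plus-projection localization pipeline can be sketched as below. The simple horizontal-gradient edge map and the thresholds are illustrative assumptions, not the paper's specific edge operator.

```python
import numpy as np

def localize_text_rows(gray, edge_thresh=30):
    """Localize candidate text rows by projecting horizontal-gradient edges.
    Text regions contain dense vertical strokes, hence strong horizontal gradients."""
    grad = np.abs(np.diff(gray.astype(int), axis=1))  # horizontal gradient
    edges = grad > edge_thresh
    profile = edges.sum(axis=1)                       # edge count per row
    return np.where(profile > 0)[0]                   # rows with edge evidence

frame = np.zeros((4, 6), dtype=np.uint8)
frame[1, ::2] = 200                                   # a stroke-like, high-contrast row
print(localize_text_rows(frame))                      # [1]
```

A real detector would also project vertically within each candidate band to bound the text region on the left and right.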


2022 ◽  
Vol 12 (2) ◽  
pp. 853
Author(s):  
Cheng-Jian Lin ◽  
Yu-Cheng Liu ◽  
Chin-Ling Lee

In this study, an automatic receipt recognition system (ARRS) is developed. First, a receipt is scanned for conversion into a high-resolution image. Receipt characters are automatically placed into two categories according to the receipt characteristics: printed and handwritten characters. Images of receipts with these characters are preprocessed separately. For handwritten characters, template matching and the fixed features of the receipts are used for text positioning, and projection is applied for character segmentation; a convolutional neural network is then used for character recognition. For printed characters, a modified You Only Look Once (version 4) model (YOLOv4-s) executes precise text positioning and character recognition. The proposed YOLOv4-s model reduces downsampling, thereby enhancing small-object recognition. Finally, the system produces recognition results in a tax declaration format, which can be uploaded to a tax declaration system. Experimental results revealed that the recognition accuracy of the proposed system was 80.93% for handwritten characters. Moreover, the YOLOv4-s model had a 99.39% accuracy rate for printed characters; only 33 characters were misjudged. The recognition accuracy of the YOLOv4-s model was higher than that of the traditional YOLOv4 model by 20.57%. Therefore, the proposed ARRS can considerably improve the efficiency of tax declaration, reduce labor costs, and simplify operating procedures.
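As a back-of-the-envelope check on the reported printed-character figures (this calculation is illustrative; the paper does not state the test set size): 33 misjudged characters at 99.39% accuracy imply roughly 5,400 printed characters in the test set.

```python
# If 33 misjudged characters correspond to a 99.39% accuracy rate,
# the error rate is 0.61%, and the implied test set size is:
misjudged = 33
error_rate = 1 - 0.9939        # 0.0061
total = misjudged / error_rate
print(round(total))            # ~5410 characters
```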


Author(s):  
Kazuhiko Kawamoto ◽  
Naoya Ohnishi ◽  
Atsushi Imiya ◽  
Reinhard Klette ◽  
...  

A matching algorithm that evaluates the difference between model and calculated flows for obstacle detection in video sequences is presented. A stabilization method for obstacle detection by median filtering, which overcomes instability in the computation of optical flow, is also presented. Since optical flow is a scene-independent measurement, the proposed algorithm can be applied to various situations, whereas most existing color- and texture-based algorithms depend on specific scenes, such as roadway and indoor scenes. An experiment is conducted with three real image sequences, in which a static box or a moving toy car appears, to evaluate the performance in terms of accuracy under varying thresholds using a receiver operating characteristic (ROC) curve. For the three image sequences, the ROC curves show, in the best case, that the false positive fraction and the true positive fraction are 19.0% and 79.6%, 11.4% and 84.5%, and 19.0% and 85.4%, respectively. The processing time per frame is 19.38 ms on a 2.0 GHz Pentium 4, which is faster than the video frame rate.
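The median-filtering stabilization step can be sketched as below. The 3×3 window and edge padding are assumptions; the paper does not specify the filter layout, and a real system would use a vectorized or hardware-accelerated filter.

```python
import numpy as np

def median_filter_flow(flow, k=3):
    """Median-filter each optical-flow component over a k x k neighborhood
    to suppress outlier vectors (edge-padded at the borders)."""
    h, w, _ = flow.shape
    p = k // 2
    padded = np.pad(flow, ((p, p), (p, p), (0, 0)), mode="edge")
    out = np.empty_like(flow)
    for y in range(h):
        for x in range(w):
            window = padded[y:y + k, x:x + k].reshape(-1, 2)
            out[y, x] = np.median(window, axis=0)
    return out

flow = np.zeros((5, 5, 2))
flow[2, 2] = [10.0, -10.0]                  # a single spurious flow vector
print(median_filter_flow(flow)[2, 2])       # outlier suppressed -> [0. 0.]
```

Filtering each component independently is the usual choice; a vector median (minimizing the sum of distances to neighbors) is a more faithful but costlier alternative.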


2016 ◽  
Vol 25 (5) ◽  
pp. 053005 ◽  
Author(s):  
Jiangmin Tian ◽  
Guoyou Wang ◽  
Jianguo Liu ◽  
Yuanchun Xia

2014 ◽  
Vol 519-520 ◽  
pp. 572-576
Author(s):  
Yuan Chun Hu ◽  
Jian Sun ◽  
Wei Liu

Traditionally, image segmentation is conducted with simple image processing techniques that cannot be operated automatically. In this paper, we present a classification method that finds the boundary area for segmenting character images. Referring to sample points and sample areas, the essential segmentation information is extracted. By merging different image transformations, including rotation, erosion, and dilation, more features are used to train and test the segmentation model. Parameter tuning is also applied to optimize the model. By means of cross validation, the basic training model and parameter tuning are integrated iteratively. The comparison results show that the method reaches up to 97.84% precision and 94.09% recall.
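The rotation/erosion/dilation augmentation step can be sketched as below. The 3×3 structuring element and the `augment` helper are illustrative assumptions; the paper does not specify its morphological kernels.

```python
import numpy as np

def binary_dilate(img):
    """3x3 binary dilation: a pixel becomes 1 if any 8-neighbour (or itself) is 1."""
    padded = np.pad(img, 1)
    out = np.zeros_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= padded[1 + dy:1 + dy + img.shape[0], 1 + dx:1 + dx + img.shape[1]]
    return out

def binary_erode(img):
    """3x3 binary erosion via duality: erode(img) = complement(dilate(complement))."""
    return 1 - binary_dilate(1 - img)

def augment(img):
    """Generate training variants: four rotations plus eroded and dilated copies."""
    return [np.rot90(img, k) for k in range(4)] + [binary_erode(img), binary_dilate(img)]

glyph = np.zeros((5, 5), dtype=np.uint8)
glyph[1:4, 1:4] = 1                         # a 3x3 "character" blob
variants = augment(glyph)
print(len(variants))                        # 6
```

Training on such transformed copies exposes the segmentation model to stroke-thickness and orientation variation it would otherwise never see.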

