Research Communication | Open Access
Volume 2020 | Communication ID 129

EXTRACTING TEXT FROM IMAGES ACCORDING TO A DEFINED REGULAR EXPRESSION

Boulchahoub Hassan, Labriji Amine, Hicham Gourram, Zineb Rachik, El Houssine Labriji
Academic Editor: Youssef EL FOUTAYENI
Received: 27 January 2020
Accepted: 11 February 2020
Published: 10 March 2020

Abstract: The recognition of text in images has been a subject of research since the early 1920s. Optical Character Recognition (OCR), initially proposed by Gustav Tauschek, is the most widely used technique today thanks to the high accuracy of the text it extracts. Recently, the OCR process has been improved considerably by the integration of convolutional neural networks, and we now observe remarkable efficiency and a margin of error tending to zero on the processed images. Despite the effort made to extract the entire text from an image, most text detectors use only a small part of the extracted text: they are interested in, for instance, plate numbers, e-mail addresses, phone numbers, or product codes. In this paper, we propose to use OCR to extract only the desired part of the text, namely the part that matches a pattern or regular expression defined in advance.
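To make the idea concrete, the following is a minimal sketch, not the implementation described in this paper: it simply filters the output of an OCR engine with a regular expression. The names `extract_matching`, `extract_from_image`, `fake_ocr`, and `EMAIL_PATTERN` are hypothetical and introduced only for illustration. The OCR step is passed in as a callable so that any engine can be plugged in, and `fake_ocr` stands in for a real engine so the sketch runs on its own.

```python
import re
from typing import Callable, List

# Hypothetical e-mail pattern for illustration; not taken from the paper.
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"


def extract_matching(ocr_text: str, pattern: str) -> List[str]:
    """Return every substring of the OCR output that matches the pattern."""
    return [m.group(0) for m in re.finditer(pattern, ocr_text)]


def extract_from_image(
    image_path: str, pattern: str, ocr: Callable[[str], str]
) -> List[str]:
    """Run OCR on the image, then keep only the text matching the pattern.

    `ocr` is assumed to be any callable that maps an image path to the
    recognized text; a real OCR engine would be wrapped to fit this shape.
    """
    return extract_matching(ocr(image_path), pattern)


def fake_ocr(image_path: str) -> str:
    # Stand-in for a real OCR engine so the sketch runs without one.
    return "Contact: jane.doe@example.com\nTel: 0612345678\nRef AB-1234"


if __name__ == "__main__":
    print(extract_from_image("card.png", EMAIL_PATTERN, fake_ocr))
```

Changing the pattern changes which part of the recognized text is kept (for example, a plate-number or product-code pattern), while the OCR step itself stays untouched.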