EXTRACTING TEXT FROM IMAGES ACCORDING TO A DEFINED REGULAR EXPRESSION
Boulchahoub Hassan, Labriji Amine, Hicham Gourram, Zineb Rachik, El Houssine Labriji
Academic Editor: Youssef EL FOUTAYENI
Received: 27 January 2020 | Accepted: 11 February 2020 | Published: 10 March 2020
Abstract: The recognition of text in images has been the subject of research since the early 1920s. Optical Character Recognition (OCR), initially proposed by Gustav Tauschek, is the most widely used technique today thanks to the high accuracy of the text it extracts. Recently, the OCR process has been further improved by the integration of convolutional neural networks, yielding remarkable efficiency and an error rate that tends toward zero on the processed images. Despite the effort made to extract the entire text from an image, most text detectors use only a small part of it: for instance, they are interested in plate numbers, e-mail addresses, phone numbers, product codes, etc. In this paper, we propose to use OCR to extract only the desired part of the text, namely the part that matches a predefined pattern or regular expression.
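As a rough illustration of the idea stated in the abstract, the sketch below filters text that an OCR engine has already produced and keeps only the substrings matching a predefined regular expression. It is not the method of this paper: the pattern table `PATTERNS`, the helper `extract_matches`, and the sample string are hypothetical and introduced only for illustration, and the OCR step itself is assumed to have been run beforehand.

```python
import re

# Hypothetical patterns for illustration only; they are not taken from the paper.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d ().-]{7,}\d"),
}


def extract_matches(ocr_text, pattern):
    """Return every non-overlapping substring of ocr_text that matches pattern."""
    return [m.group(0) for m in pattern.finditer(ocr_text)]


# Stand-in for the raw text an OCR engine would return for an image (assumption).
sample_ocr_text = (
    "ACME Corp\n"
    "Contact: sales@example.com\n"
    "Tel: +212 522 123 456\n"
    "Ref 42"
)

emails = extract_matches(sample_ocr_text, PATTERNS["email"])
phones = extract_matches(sample_ocr_text, PATTERNS["phone"])
print(emails, phones)
```

Keeping the patterns in a dictionary lets the caller choose which kind of text to retain without changing the OCR step, which mirrors the separation between full-text extraction and pattern-based selection described above.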