Continuous Offline Handwriting Recognition using Deep Learning Models
Abstract
Handwritten text recognition is an open problem of great interest in automatic document image analysis. Transcribing the handwritten content of digitized documents is important for analyzing historical archives and for digitizing information from handwritten documents, forms, and communications. The problem has attracted attention almost since the first machine learning algorithms were developed, and over the last ten years the application of deep learning techniques has brought great advances. This thesis addresses the offline continuous handwritten text recognition (HTR) problem, which consists of developing algorithms and models capable of transcribing the text present in an image without requiring the text to be segmented into characters. For this purpose, we propose a new recognition model that integrates two types of deep learning architectures: convolutional neural networks (CNN) and sequence-to-sequence (seq2seq) models. The convolutional component identifies relevant features present in the characters, and the seq2seq component builds the transcription by modeling the sequential nature of the text. To design this model, we carried out an extensive analysis of the capabilities of different convolutional architectures on the simplified problem of isolated character recognition, in order to identify those most suitable for integration into the continuous model. We also experimented extensively with the proposed model on the continuous problem to determine its robustness to changes in parameterization. The generalization capacity of the model was validated by evaluating it on three handwritten text databases in different languages: IAM in English, RIMES in French, and Osborne in Spanish.
The proposed model achieves results competitive with those of other well-established methodologies and opens the door to new lines of research focused on applying seq2seq models to the continuous HTR problem.
Description
Doctoral thesis defended at the Universidad Rey Juan Carlos de Madrid in 2021. Thesis supervisors: Ángel Sánchez Calle and José Vélez Serrano
Collections
- Doctoral Theses [1552]