Siberian Federal University: Russian Scientists Have Taught a Neural Network to Read Handwritten Letters of the Russian Alphabet

As IT develops, fast and high-quality conversion of handwritten text into digital form is becoming increasingly important, because digital text is easier to copy, edit and extract data from. The natural first step in this process is recognizing handwritten letters of the Russian alphabet. SibFU scientists have developed a new convolutional neural network (CNN) capable of recognizing images of handwritten letters with high accuracy. The resulting algorithm transforms the image and recognizes the letter depicted in it. According to the scientists, the algorithm’s accuracy is 99 %.

Today, 2.4 % of the world’s population speaks Russian. Handwritten Cyrillic text is quite difficult to recognize, especially for people who are not familiar with the alphabet. There are widespread services on the Internet that can recognize and convert any type of text, both digital and handwritten. However, using such services carries a risk of information leaks and is unreliable in terms of user privacy and security. An application that recognizes Cyrillic text quickly and easily, works on the client side and does not require an Internet connection could be in great demand among both individual users and organizations.

“Perhaps the most interesting feature of handwritten Russian text is the individual style of writing letters, which is what we call handwriting. Writing styles tend to change over time: it is enough to compare the calligraphic lines in the notebooks of the generation of the 70s and 80s with the way modern schoolchildren write. Even a single person’s handwriting changes over a lifetime. The purpose of our study was the recognition of handwritten text in Russian by a neural network using deep learning (DL) models. As far as we know, this is the first work of its kind in the world,” said Andrey Levkov, co-author of the study and a student of the School of Information and Space Technology, SibFU.

To achieve this goal, the scientists took a number of steps. They built a new dataset of labelled images with a resolution of 32×32 pixels for the 33 letters of the Russian alphabet, developed a new CNN architecture for recognizing handwritten letters of the Russian alphabet, and compared it with existing powerful CNN models. In addition, the Krasnoyarsk and St. Petersburg experts provided a full description of the convolutional neural network and the source code used, so that other researchers could reproduce the results. Python and the Jupyter interactive development environment were chosen for programming.

The neural network was trained using preprocessed data from the CoMNIST repository, a well-known database containing samples of handwritten letters in the Latin and Cyrillic alphabets. The data set in the database consists of four-channel images with a resolution of 278×278 pixels in .png format.
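The article does not describe the preprocessing itself, so the following is only a minimal sketch of one plausible approach, assuming the four channels are RGBA and the goal is a 32×32 grayscale input. The function names `composite_to_gray` and `downscale`, the white background and the synthetic test image are all hypothetical and are not taken from the researchers' code.

```python
# Sketch only: the actual SibFU preprocessing is not described in the article.
# Assumption: each source image is 278x278 RGBA, stored as rows of (r, g, b, a) tuples.

def composite_to_gray(rgba_rows, background=255):
    """Flatten RGBA pixels onto a uniform background and return 8-bit gray rows."""
    gray_rows = []
    for row in rgba_rows:
        gray_row = []
        for r, g, b, a in row:
            alpha = a / 255.0
            # ITU-R BT.601 luma weights convert RGB to a single gray value.
            luma = 0.299 * r + 0.587 * g + 0.114 * b
            gray_row.append(round(alpha * luma + (1.0 - alpha) * background))
        gray_rows.append(gray_row)
    return gray_rows


def downscale(gray_rows, size=32):
    """Box-average a square gray image down to size x size, scaled to [0, 1].

    Assumes the source side length is at least `size`, so no block is empty.
    """
    n = len(gray_rows)
    edges = [i * n // size for i in range(size + 1)]  # integer block borders
    out = []
    for i in range(size):
        out_row = []
        for j in range(size):
            block = [gray_rows[y][x]
                     for y in range(edges[i], edges[i + 1])
                     for x in range(edges[j], edges[j + 1])]
            out_row.append(sum(block) / (255.0 * len(block)))
        out.append(out_row)
    return out


# Synthetic stand-in for one database image: a transparent canvas with an
# opaque black vertical stroke in the middle (hypothetical data).
SIDE = 278
TRANSPARENT = (0, 0, 0, 0)
INK = (0, 0, 0, 255)
sample = [[INK if 130 <= x < 148 and 40 <= y < 238 else TRANSPARENT
           for x in range(SIDE)]
          for y in range(SIDE)]

small = downscale(composite_to_gray(sample))
print(len(small), len(small[0]), small[0][0], small[16][16])
```

In practice an imaging library would do the same job in a few calls; the plain-Python version is used here only to keep each step visible.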

“The dataset contains 13,299 photos, each of which is in a separate folder. The folders, in turn, belong to a certain class. There are 33 such classes in the set, and each corresponds to a letter of the Russian alphabet. There are from 300 to 500 images for each class. These images contain uppercase, printed and italicized letters. Approximately 85% of these images were shown to the neural network (CNN) as it learned to recognize the letters of the Russian alphabet, and the remaining 15% were used to check the acquired knowledge,” shared Anastasia Safonova, head of the research team, assistant professor at the Department of Artificial Intelligence Systems, SibFU.
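An 85/15 division like the one described in the quote can be sketched as follows. The per-class (stratified) split, the function name `stratified_split`, the fixed seed and the example file paths are assumptions made for illustration; the article does not say how the researchers divided the images.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction=0.85, seed=0):
    """Split (path, label) pairs so each class keeps roughly the same ratio.

    Splitting per class is an assumption; the article only gives the
    overall 85% / 15% proportions.
    """
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append((path, label))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)
        cut = round(len(items) * train_fraction)
        train.extend(items[:cut])
        test.extend(items[cut:])
    return train, test


# Hypothetical folder-per-class listing: three letters with 400 images each.
samples = [(f"{letter}/{i}.png", letter)
           for letter in ("А", "Б", "В")
           for i in range(400)]

train, test = stratified_split(samples)
print(len(train), len(test))
```

Splitting within each class keeps rare letters from being under-represented in the verification part, which matters when class sizes range from 300 to 500 images.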

The scientists also created a new, unique set of images in order to verify the developed model independently. Each photo showed only one letter, in printed or handwritten form. The set contains from 5 to 10 images for each class. To enlarge the data set and increase its variability, the scientists applied different image transformations: they rotated images to the right and left, applied a Gaussian distribution, and so on. As a result, the experts obtained 79,794 images; the neural network could learn on 67,825 of them, and 13,084 served for verification.
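The kind of transformation mentioned above can be illustrated with a short sketch. It assumes 32×32 grayscale images with values in [0, 1], reads "Gaussian distribution" as additive Gaussian noise, and uses hypothetical angles, noise level and function names; the researchers' actual transformations and parameters are not given in the article.

```python
import math
import random


def rotate(image, degrees, fill=1.0):
    """Rotate a square gray image about its centre (nearest-neighbour sampling)."""
    n = len(image)
    c = (n - 1) / 2.0
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for y in range(n):
        row = []
        for x in range(n):
            # Map each output pixel back to the source pixel it comes from.
            sx = cos_t * (x - c) + sin_t * (y - c) + c
            sy = -sin_t * (x - c) + cos_t * (y - c) + c
            ix, iy = round(sx), round(sy)
            inside = 0 <= ix < n and 0 <= iy < n
            row.append(image[iy][ix] if inside else fill)
        out.append(row)
    return out


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel and clamp to [0, 1]."""
    return [[min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
            for row in image]


def augment(image, angles=(-10.0, 10.0), sigma=0.05, seed=0):
    """Return the original plus rotated and noisy variants (assumed parameters)."""
    rng = random.Random(seed)
    variants = [image]
    variants += [rotate(image, angle) for angle in angles]
    variants.append(add_gaussian_noise(image, sigma, rng))
    return variants


# Hypothetical 32x32 sample: white background with a dark vertical bar.
letter = [[0.0 if 14 <= x < 18 and 4 <= y < 28 else 1.0 for x in range(32)]
          for y in range(32)]

variants = augment(letter)
print(len(variants))
```

Opposite signs of the angle turn the image in opposite directions, which corresponds to the rotations "to the right and left" described in the text.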

“We compared the model developed by our team with the most powerful CNN models, for example, with VGG-16, VGG-19, Xception, ResNet-101, MobileNet-V2 and others. It turned out that the accuracy of our model during training was up to 99 %, and the entire training took 3 hours. The prediction accuracy of the model was up to 95.83 %. In general, our model was inferior to only one alternative, VGG-16, which demonstrated up to 99 % accuracy; the lowest accuracy was demonstrated by the Xception and Inception-V3 models,” mentioned Anastasia Safonova.

The scientists report that their neural network model is not final and can be improved in the future: its architecture will probably change to increase the accuracy. The experts also plan to train their model to recognize Russian handwritten text on a new data set and to introduce it to various writing styles. On the basis of their work, a unique computer program was registered, the copyright holder of which is Siberian Federal University.