Tomsk State University: TSU historians will research medieval texts using AI
TSU historians plan to train an artificial intelligence system to analyze medieval documents in German by 2022. It will help them find examples of how the meanings of terms change across large bodies of text. Such a digital archivist is expected to find the necessary information in hours, whereas doing the same work manually can take a lifetime.
– Many documents from 14th- and 15th-century France, Germany, and England, letters in particular, have already been digitized. AI will be able to analyze the formation of certain social practices based on big data. For example, we can see from the surviving texts how people begin to use words with a strictly defined purpose and when an abstract, collective meaning appears, – said Anton Kotov, associate professor of the Department of the History of the Ancient World, the Middle Ages and the Methodology of History.
As the scientists explain, the planned tool has no close analogs yet. Existing text analysis programs all work with standardized, living languages. Medieval languages differ from these considerably, for example in grammar, in spelling, and in the absence of writing standards. Neural networks fail to recognize some German texts of that period as German at all. Moreover, existing programs usually analyze texts in which a concept already has an assigned meaning, and everything is limited by that definition. Here the idea is to see how the sender and the addressee influence the utterance. The documents in question are most often letters and messages, which makes it possible to examine how the use of a term varies.
Artificial intelligence also makes it possible to analyze semantics: to find words that fit a given meaning, to determine when a word was first used in a different sense, and to identify who began using it that way.
– For example, one of the meanings of the German word ‘Bund’ is ‘union’. Today it is part of the word that we translate as ‘federation’, but initially this word was associated with seals: at that time a seal was not stamped on the paper, as it is now; instead, a wax impression of the seal was attached to the document on threads and tied to it. Thus ‘Bund’, as it were, meant agreement with that document. Among other things, we must teach the neural network to track the development of this concept, – explains Anton Kotov.
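The kind of tracking described here, finding when a word is first used in a new sense and by whom, can be illustrated with a deliberately simple sketch. It is not the TSU team's method, which the article does not describe: a neural network would learn senses from context, whereas this example uses hand-picked cue words. The `Letter` type, the `SENSE_CUES` table, the scribe names, and the three sample texts are all invented placeholders, not real medieval sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Letter:
    year: int
    author: str
    text: str


# Hypothetical cue words for two senses of "bund" (assumption, for illustration only).
SENSE_CUES = {
    "seal_binding": {"siegel", "brief", "faden"},
    "union": {"stete", "herren", "eid"},
}

# Invented toy corpus; these are NOT quotations from real documents.
LETTERS = [
    Letter(1450, "scribe_c", "ein bund der herren mit eid"),
    Letter(1380, "scribe_a", "der brief mit siegel und bund am faden"),
    Letter(1420, "scribe_b", "der bund der stete und herren"),
]


def classify_sense(tokens, index, window=4):
    """Pick the sense whose cue words overlap most with the nearby context."""
    before = tokens[max(0, index - window):index]
    after = tokens[index + 1:index + 1 + window]
    context = set(before + after)
    best_sense, best_score = None, 0
    for sense, cues in SENSE_CUES.items():
        score = len(context & cues)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense  # None when no cue word appears nearby


def first_attestations(letters, term):
    """Map each detected sense to the (year, author) of its earliest use."""
    first = {}
    for letter in sorted(letters, key=lambda item: item.year):
        tokens = letter.text.lower().split()
        for i, token in enumerate(tokens):
            if token != term:
                continue
            sense = classify_sense(tokens, i)
            if sense is not None and sense not in first:
                first[sense] = (letter.year, letter.author)
    return first


if __name__ == "__main__":
    for sense, (year, author) in first_attestations(LETTERS, "bund").items():
        print(f"{sense}: first seen in {year}, written by {author}")
```

On real medieval texts the exact-match test `token != term` would miss most occurrences, because spelling was not standardized; that is one reason the researchers describe the task as requiring a trained model rather than an existing text analysis program.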
The introduction of this digital technology will significantly speed up linguistic and historical research. Anton Kotov noted that this work can be done manually, but it is costly, difficult, and very time-consuming. Scientists have spent their entire lives compiling such catalogs, and AI should do it faster, perhaps within a few hours. Historians will only have to identify patterns. The training of the neural network is scheduled to be completed by 2022.