Artificial vision to digitize the secret archives of the Vatican

We came across an interesting project that aims to convert some of the millions of documents held in the Vatican Secret Archive into digital format. The archive contains some 85 kilometers of shelves, all filled with private letters and other documents of kings and popes from the 8th century to the present.

The historical value of these documents is enormous. Some have been made public, such as records of Knights Templar trials, letters from the artist Michelangelo, requests from kings for marriage annulments, and requests for help from great figures in history. There are even letters from Abraham Lincoln and Jefferson Davis, but everything dated after 1939 is secret.

Transcribing all of that information by hand is practically impossible, so a proposed project called In Codice Ratio (the name is Latin) would use an artificial vision system instead. Its platform would be able to automatically transcribe part of the Archive: more than 18,000 pages of thirteenth-century letters between the Catholic Church and kings.

The project needs an optical recognition system capable of handling characters that merge with adjacent letters, as well as old abbreviations, and it is not easy to create data sets to train computers for that, so there is no shortage of problems. The team has now created an optical system that divides each word into a series of strokes that fit together like the pieces of a puzzle. The strokes are then combined to form letters, and finally the result is analyzed to see whether it makes sense, a technique that is producing results.
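The stroke-and-puzzle idea described above can be illustrated with a small Python sketch. It is only a toy under stated assumptions, not the project's actual code: the `STROKE_GROUPS` table stands in for a trained character classifier, the `LEXICON` set stands in for the "does it make sense" check, and both, along with the function names, are hypothetical.

```python
from functools import lru_cache

# Hypothetical table standing in for a trained classifier: it maps a group of
# adjacent strokes to the letters that group could represent.
STROKE_GROUPS = {
    ("|",): ["i"],
    ("|", "|"): ["n", "u"],
    ("|", "|", "|"): ["m"],
    ("o",): ["o"],
    ("c",): ["c"],
}

# Tiny illustrative word list used as the plausibility check (an assumption).
LEXICON = {"in", "uno", "omni"}


def candidate_words(strokes, max_group=3):
    """Return every word obtainable by grouping adjacent strokes into letters."""
    strokes = tuple(strokes)

    @lru_cache(maxsize=None)
    def expand(start):
        # All strokes consumed: one valid (empty) continuation.
        if start == len(strokes):
            return ("",)
        results = []
        for size in range(1, max_group + 1):
            group = strokes[start:start + size]
            if len(group) < size:
                break  # not enough strokes left for a group of this size
            for letter in STROKE_GROUPS.get(group, []):
                for rest in expand(start + size):
                    results.append(letter + rest)
        return tuple(results)

    return list(expand(0))


def transcribe(strokes):
    """Keep only the candidate readings that appear in the lexicon."""
    return sorted(w for w in set(candidate_words(strokes)) if w in LEXICON)
```

In this toy, three vertical strokes can be read six ways ("iii", "in", "iu", "ni", "ui", "m"), and `transcribe(["|", "|", "|"])` keeps only "in" because it is the one reading found in the word list. A real system would presumably score the candidates statistically rather than look them up in a fixed table.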

The team enlisted the help of 120 high school students, who hand-labeled patterns to build training data sets of 15,000 characters in a couple of hours. With that data, the computers have managed to accurately transcribe 65% of the images, that is, the pieces of letters obtained from the scrolls.

Although the results have not been published, the technique can help a lot to keep the information safe, with the hope that in the future we can better understand the history of humanity through the letters of its “protagonists.”