Handwritten Text Recognition for the European Digital Treasures Collections. Hands-on workshop by Joan Andreu Sánchez and Enrique Vidal

The first day of the workshop “New Digital Exponential Technologies Towards The Generation Of Business Models” concluded with a hands-on session led by Joan Andreu Sánchez and Enrique Vidal.

Joan Andreu Sánchez.

Joan Andreu Sánchez is an assistant professor at Universitat Politècnica de València and the Director of the Pattern Recognition and Human Language Technologies (PRHLT) Research Center at this university. His main area of research is machine learning and formal languages applied to text recognition and math recognition.

Enrique Vidal is an emeritus professor at the same university and a former co-leader of the PRHLT research center. For many years Dr. Vidal has focussed his research on handwritten document analysis and recognition, leading the development of the probabilistic indexing technology. Joan Andreu and Enrique are founders of tranSkriptorium, an AI spin-off company.

Enrique Vidal.

The contents of a massive volume of digitised handwritten records in archives and libraries all over the world are practically inaccessible, buried beneath thousands of terabytes of high-resolution images. If perfect or sufficiently accurate transcripts of these text images were available, their textual content could be straightforwardly indexed for plain-text access using conventional information retrieval systems.

However, fully automatic transcription results generally lack the level of accuracy that is required for reliable text indexing and search purposes. On the other hand, the massive volume of the image collections typically considered for indexing renders manual or even computer-assisted transcription entirely prohibitive. Dr. Sánchez and Dr. Vidal explain how very accurate indexing and search can be implemented directly on the images themselves, without explicitly resorting to image transcripts; they present the results obtained using the proposed techniques on several relevant historical data sets. The results have led to considerable interest in these technologies.
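As a rough illustration of the idea, and not the presenters' actual implementation, searching images without a single fixed transcript can be pictured as follows: for each page image the recogniser keeps several word hypotheses, each with an estimated probability, and a query returns every image whose probability for that word exceeds a chosen confidence threshold. The names `Spot`, `build_index` and `search`, the page identifiers and all the probabilities below are made up for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Spot:
    """One word hypothesis for one image (hypothetical structure)."""
    image_id: str
    word: str
    probability: float  # estimated probability that `word` is written in the image


def build_index(spots):
    """Group hypotheses by lower-cased word so each query is a dictionary lookup."""
    index = defaultdict(list)
    for spot in spots:
        index[spot.word.lower()].append(spot)
    return index


def search(index, query, threshold=0.5):
    """Return (image_id, probability) pairs for `query`, most likely first."""
    hits = [(s.image_id, s.probability)
            for s in index.get(query.lower(), [])
            if s.probability >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up data: the recogniser is unsure whether page_002 reads "mill" or "will",
# so both hypotheses are kept instead of committing to one transcript.
spots = [
    Spot("page_001", "mill", 0.92),
    Spot("page_002", "mill", 0.35),
    Spot("page_002", "will", 0.60),
    Spot("page_003", "mill", 0.71),
]
index = build_index(spots)
results = search(index, "mill", threshold=0.5)
```

In this sketch, lowering the threshold returns more images at the cost of more false hits, while raising it does the opposite, so the person searching can choose the balance that suits the task.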

You can watch the session on YouTube here and read the paper presented at the workshop here: Part I & Part II.

Written by Leonard Callus and the European Digital Treasures Team.

ICARUS Convention #28 in Paris with European Digital Treasures workshop


After two years of online conventions and Zoom conferences, we are happy to announce that the upcoming ICARUS Convention #28 will be held in person in Paris from 23 to 25 May 2022, as a hybrid event!

The conference will take place in the conference center of Campus Condorcet in Paris-Aubervilliers and is organised by the Institut de Recherche et d’Histoire des Textes (CNRS) with the support of the French Ministry of Culture and the National Archives (Archives nationales).

As part of the convention programme, the European Digital Treasures project will hold its workshop “New Business & Conceptual models”, led by Yvan Corbat!

One of the key objectives of the Digital Treasures project is to generate greater added value, profitability, visibility and economic return for European archives, through the identification and implementation of new business models and activities.

The workshop will include practical examples of new activities being implemented by some partners of this project:

The programme of the convention will be finalised in the coming days and weeks.
A first overview, further information, details and registration: https://icarus-28.sciencesconf.org/resource/page/id/2

Any questions? Please contact: info@icar-us.eu

More information to come soon – stay tuned!

We are looking forward to seeing you in Paris!

Written by ICARUS & the Digital Treasures team.