Michele Corazza

Investigating the Signs of Cypro-Minoan with Deep Learning

In recent years, the state of the art in computational linguistics, as well as in other fields, has been dominated by machine learning and deep learning methods. These methods have clear advantages for natural language processing, as they do not rely on a set of predetermined rules. Instead, they learn from examples how to perform various tasks, with greater flexibility than classical methods. In this context, larger models that require more data are commonplace, and they often achieve state-of-the-art performance.

While the use of machine learning methods in computational linguistics is an established practice, the same cannot be said for the study of ancient, undeciphered writing systems. There are many reasons for this discrepancy. One of the most challenging is the scarcity of available data: most undeciphered scripts have limited attestations, which is especially restrictive for data-hungry models. Another challenge is that, for some scripts, even the inventory of signs is not agreed upon by experts, meaning that the graphical aspect of signs should also be considered. Additionally, since no gold standard can be obtained for undeciphered scripts, the models can only be unsupervised, as we aim to test hypotheses without biasing the model.

Situated in the emerging field of computational paleography, this presentation concerns the development of a deep learning model that was successfully used to investigate disputed instances of allography in the Cypro-Minoan script, as well as the extension of this method to another closely related writing system from Cyprus, namely the Cypriot-Greek syllabary. The resulting approach, which sits at the intersection of multiple disciplines such as paleography, computational linguistics and computer vision, is one of the first of its kind, and it also yields useful results for the field of paleography.