Reference:
GRANDI, Nicola, BALLARÈ, Silvia, CHIUSAROLI, Francesca, GALLINA, Francesca, PASCOLI, Matteo, PISTOLESI, Elena; Corpus Univers-ITA-ProUniv. 2023, DOI: https://doi.org/10.60760/unibo/univers-ita-prouniv
The two corpora of written texts not produced ad hoc (i.e., Univers-ITA-ProUniv and Univers-ITA-ProGior) were created with the help of interns from the University of Bologna.
The UniverS-Ita-ProUniv corpus consists mainly of theses (in the version prior to the supervisor's corrections) and university reports (773 texts, totaling 6,267,765 tokens). As indicated in the consultation guide, some metadata are available for these texts, including the geographical location of the university, the academic discipline of the writer's degree program, and the writer's gender and region of birth.
A subset of the corpus, balanced to represent the Italian university population, can be consulted using parameters such as the geographical location of the university and the academic discipline of the degree program (as was done for the UniverS-Ita corpus).
The corpus is accessible at this link.