STATE: ongoing
LANGUAGES:
multilingual (English, Chinese, Japanese, Arabic, Spanish, German, Italian, French)
DESCRIPTION:
The KIParla corpus of spoken Italian is an innovative resource, available free of charge to anyone working on spoken Italian. Its main feature is that it contains samples of various types of interaction observable in university settings, in particular lectures, office hours with students, exams, free conversation among students, and semi-structured interviews with students. The KIParla corpus currently consists of two modules (KIP and ParlaTO); a third module (ParlaBO) will start soon.
AIMS/ACTIVITIES:
The internship consists of three main types of activity, plus a side activity focusing on foreign languages spoken in the city of Bologna:
1. Field data collection (Italian and foreign languages in the city of Bologna): interns, individually or in groups, will collect speech data, identifying suitable communicative situations and participants to be recorded
2. Transcription of speech: interns will take part in transcribing interviews according to the Jefferson system, applied to Italian and foreign languages
3. Preparation of data for automatic processing: interns will prepare the transcribed interviews for upload to the NoSketchEngine platform, encode the data in XML format using an XML editor (Oxygen), and if necessary liaise with technical staff for data uploading
4. Focus on foreign languages in the city of Bologna: interns will carry out in-depth analyses of individual languages, which will vary according to the interns' skills and the migration languages present in the data; they will make use of various language resources and speech corpora for the languages in question, both to monitor the use of specific constructions and to identify contact phenomena at the level of the language system and of discourse.
The internship will provide highly interdisciplinary and applied skills that can be usefully employed in the following areas of work:
- Translation for dubbing and subtitling
- Conducting surveys and interviews for private agencies
- Use of language data for commercial purposes (data mining, sentiment analysis, ...)
- Automated processing of language data
- Construction and management of language databases
- In-depth knowledge and use of software in the digital humanities field
OPEN TO:
LT LLS YES
LT LMCA YES
LM LMCP YES
LM LCIS YES
LM LSC YES