The Corpus of Border Karelia

Bibliographic Details
Main Authors: Marjatta Palander, Helka Riionheimo, Vesa Koivisto, Institute for the Languages of Finland, Kotimaisten kielten keskus
Other Authors: FIN-CLARIN
Language: unknown
Published: 2019
Online Access: https://erepo.uef.fi/handle/123456789/7706
Description
Summary: The Corpus of Border Karelia contains audio recordings and transcripts of dialects spoken in Border Karelia, where the very closely related eastern Finnish dialects and Karelian were in contact. The informants are evacuees, most of whom were moved to eastern Finland after World War II. The original interviews were recorded in the 1960s and 1970s and transcribed at the University of Eastern Finland by various researchers using the Finno-Ugrian transcription system. The interviewees are elderly people born in the 1870s–1910s. The original material has been archived by the Institute for the Languages of Finland.

During the FINKA project (funded by the Academy of Finland in 2011–2014), the transcripts were reviewed and reorganized into a machine-readable corpus that is compatible with modern research tools. The written transcripts of the corpus will be made accessible through the Korp concordance tool (https://korp.csc.fi). The transcripts are also aligned with the corresponding audio files, and this subset will be made available through the LAT system (http://lat.csc.fi). More information about the resource: http://www.uef.fi/fi/finka/finka

Log 26.11.2018: the link islrn.org/resources/288-221-011-723-8 was removed, and the access location link http://www.uef.fi/fi/finka/finka was moved to the description part.

The Corpus of Border Karelia (Raja-Karjalan korpus) contains a total of 119 h 4 min 58 s of audio files (.wav) together with their transcripts in Finno-Ugrian phonetic transcription. The transcripts are UTF-8-encoded plain-text files (.txt) and have also been aligned at the utterance level as TextGrid files, stripped of characters that do not correspond to sounds. The total size of the material is about 40 GB. The recordings represent the interview speech of Border Karelians born in the late 19th and early 20th centuries, recorded mainly in the 1960s and 1970s for the collections of the Audio Recordings Archive of the Finnish Language (Suomen kielen nauhoitearkisto) at the Institute for the Languages of Finland (Kotimaisten kielten keskus). The samples represent the Ilomantsi, Korpiselkä, Suojärvi, Suistamo, Impilahti and Salmi ...