Io Manolessou & Notis Toufexis
University of Patras & University of Cambridge
The present paper gives an overview of the branch of corpus linguistics that deals with historical corpora, i.e. electronic text compilations of past forms of language, and discusses their applicability and availability for the study of the history of the Greek language. The methodology for constructing a historical corpus of the Cypriot dialect (Corpus of Medieval Cypriot Texts, CMCT) is presented, with discussion of criteria for text inclusion and of modelling and implementation issues (mark-up languages, metadata, digital transcription methods).