What can you do with the CLARIN research infrastructure?

This half-day workshop will focus on the practical ways in which corpus linguists can benefit from the well-developed CLARIN research infrastructure for language as social and cultural data. CLARIN stands for 'Common Language Resources and Technology Infrastructure'. It is a research infrastructure founded on the vision that all digital language resources and tools from across Europe and beyond should be accessible through an online environment, in support of researchers in the humanities and social sciences. CLARIN's distributed network is made up of 70+ centres, including those offering services or knowledge and expertise in the various domains covered by CLARIN (https://www.clarin.eu/content/overview-clarin-centres). A brief overview of the CLARIN infrastructure will be presented, covering the easy-to-use language resources, the knowledge infrastructure, the participating consortia from across Europe, and a large set of resource families spanning corpora, lexical resources, and NLP tools for tagging and annotating corpora (https://www.clarin.eu/resource-families).

Participants will learn practical steps in two areas. Corpus users will learn how to access and search the existing corpora and tools in the CLARIN research infrastructure (i.e. the Virtual Language Observatory, Switchboard and Resource Families), and corpus creators will learn how to embed or deposit their own resources and tools in one of the infrastructure's centres (with reference to annotation, formats and standards, metadata, licensing and documentation). We will present examples of key resources and tools such as those from the ParlaMint project, the first stage of which has already produced freely available, comparable and interoperable corpora of 17 European parliaments, containing almost half a billion words. The ParlaMint II project (running until May 2023) will upgrade the XML schema and validation, extend the existing corpora to cover data up to at least July 2022, add corpora for new languages, further enhance the corpora with additional metadata, and improve the usability of the corpora (https://www.clarin.eu/parlamint).

Conference registration is expected to open on 3rd April 2023.