The Benefits of CLARIN Collaboration

This article summarizes the tangible benefits to the UK research, teaching and learning communities from our involvement with CLARIN. It should be noted that these benefits have accrued even though the UK has joined only as an observer, not a full member, and despite the absence of any direct funding for CLARIN activities in the UK.


Direct financial benefits have been received in the following cases:

Travel funding: More than 20 places have been funded to attend CLARIN workshops and conferences on a variety of research and technical topics for researchers from the universities of Aberdeen, Coventry, Glasgow, Huddersfield, Lancaster, Nottingham, Oxford, QML, Sheffield, Southampton, and Suffolk, as well as the Alan Turing Institute and the UK Data Archive. These include invitations to senior academics as keynote speakers (e.g. Professor Ian Gregory to the CLARIN Annual Conference 2016) and PhD students attending ‘creative camps’ and workshops.

Event funding: A workshop in Oxford on working with oral history data was fully funded by the CLARIN-PLUS project, as were places for UK researchers in follow-up workshops in Utrecht, Arezzo, and Munich. CLARIN also funded a workshop in Brussels, organized by researchers from Wolverhampton and Oxford, as part of the ‘Digital Youth in East Asia’ conference.

Project funding: Lancaster University was successful in November 2020 in securing €5000 funding for the ParlaCLARIN project, to provide UK parliamentary proceedings data to this CLARIN ERIC project.

Secondments: On two occasions UK researchers have obtained paid secondments to work for CLARIN ERIC: as Director for User Involvement, and as a work package leader for the CLARIN-PLUS project.

Other project participation: SOAS received funding and important benefits in kind after being selected as one of the CLARIN use cases in the EUDAT project, which has offered secure replication and storage of their important endangered languages data, and workshops and discussion on good practice in data management.

Research visits: A CLARIN Mobility grant paid for a team to come from the CLARIN centre at Charles University, Prague to install and configure a new repository platform for the Oxford Text Archive. The Prague team offer ongoing technical support to the Bodleian Library team supporting the OTA. A UK-based researcher was funded by a Mobility Grant to go to the Meertens Institute in the Netherlands to prepare tutorial materials for a digital humanities course.

Discovery and visibility

Web presence: The CLARIN-UK website is an online showcase for CLARIN consortium members to advertise their activities to potential users, to collaborators, and to the European CLARIN community. All UK participants also have a presence on the website, including a feature in the Tour de CLARIN in 2020 and showcases for UK resources (e.g. Graphcoll from Lancaster University). Calls for UK events, job advertisements and other news have been widely publicized in the CLARIN Newsletter.

Knowledge centres: CLARIN’s services for support, advice and knowledge sharing are based on Knowledge Centres, also known as K-Centres. SOAS is part of the CLARIN Knowledge Centre for Linguistic Diversity and Language Documentation (CKLD), a distributed CLARIN K-Centre set up in 2018. It offers researchers, including students and native speakers, expertise on data and data-related methods, technology, and background information on language resources and tools. CKLD is a joint initiative of the Hamburg Centre for Language Corpora, the Academy of Sciences and Humanities in Hamburg, the Endangered Languages Archive and the World Languages Institute at SOAS, and the Data Centre for the Humanities and the Department of Linguistics at the University of Cologne.

Resource discovery: Resources in UK repositories can now appear in CLARIN’s central discovery service, the Virtual Language Observatory (VLO). So far this has been implemented for the Oxford Text Archive, but the same could be done for datasets in other repositories. In an experiment in November 2020, a ‘virtual record’ for the Welsh National Corpus was created, so that the corpus appeared in the VLO while still being curated by the resource creators at the University of Cardiff.

Secure access to restricted resources: Members of UK higher education institutions (HEIs) can use their institutional credentials to log in to data and services at more than forty CLARIN centres. These include major repositories for language resources in Denmark, Finland, Germany, the Netherlands and Norway, and more are being added all the time. This seamless access is available to users in all UK HEIs.

CLARIN has carried out the complex work of setting up this ‘domain of trust’. When the UK joined CLARIN ERIC as an Observer, the necessary configuration changes were made in the CLARIN infrastructure to allow UK users to log in to restricted resources with their institutional single sign-on. There have been at least 1143 accesses of CLARIN data from the UK via the CLARIN identity provider since 2015.

The CLARIN identity provider also provides a route for those who don’t have credentials from a participating institution to register and obtain a valid identifier. A number of UK-based researchers have made use of this facility for ‘waifs and strays’.

CLARIN also offers support for UK service providers who want to provide this kind of access to their own secure resources. This has been implemented at the Oxford Text Archive, where more than 500 users from all around the world have logged in with their institutional credentials since July 2020 to access protected resources.