Lancaster University

ESRC Centre for Corpus Approaches to Social Science (CASS) and University Centre for Computer Corpus Research on Language (UCREL)

Two important units at Lancaster University contribute to the CLARIN-UK consortium.

The ESRC Centre for Corpus Approaches to Social Science is dedicated not simply to corpus linguistics, but to bringing the latest techniques in linguistic analysis to bear on a range of questions in the social sciences. Building on an initial five-year investment by the Economic and Social Research Council (ESRC), CASS is committed to working for 15 years to train a new generation of social science researchers in these techniques and to facilitate the uptake of corpus methods in the social sciences. In doing so, we intend to turn one UK success story – corpus linguistics – into another: corpus-informed social science.

The University Centre for Computer Corpus Research on Language (UCREL) specialises in the automatic or computer-aided analysis of large bodies of naturally occurring language ('corpora') and has pioneered work in this field for more than forty years. UCREL's work focuses on modern English, early modern English, and modern foreign languages, as well as minority, endangered, and ancient languages. It develops new methods and applies them to a range of real-world scenarios, including online child protection, the language of extremism, understanding the quality of financial disclosures, metaphor in end-of-life care, and the early detection of dementia through a combination of data and text mining.

Key resources:

  • BNCweb online service
  • CQPweb online service
  • CLAWS part-of-speech tagger for English
  • VARD variant detector for historical texts in English
  • WMATRIX online service
  • Training and knowledge sharing in corpus-informed social science

Lancaster University was a founding member of the CLARIN-UK consortium in 2015.