Dealing with spelling variation in historical English texts:
VARD 2 is an interactive piece of software, written in Java, designed to assist users of historical corpora in dealing with spelling variation, particularly in Early Modern English texts. The tool is intended as a pre-processor for other corpus linguistic methods and tools, such as keyword analysis, collocation analysis and annotation (e.g. POS and semantic tagging), with the aim of improving the accuracy of their results.
The VARD 2 software uses techniques derived from modern spell checkers to find candidate modern-form replacements for spelling variants found within historical texts. The user can choose to process texts manually, selecting a candidate replacement offered by the system; automatically, allowing the system to use the best candidate replacement found; or semi-automatically, training the tool on a sample of the corpus before processing the remainder automatically.
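
To make the idea of spell-checker-style candidate replacement concrete, the following is a minimal Python sketch. It is not VARD 2's own code or algorithm: it assumes a tiny hypothetical modern word list (MODERN_LEXICON) and uses plain string similarity from Python's standard difflib module as a stand-in for whatever matching methods the tool actually applies. The function names rank_candidates and normalise, and the cutoff and threshold values, are likewise illustrative assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical modern lexicon (an assumption for illustration);
# a real word list would be far larger.
MODERN_LEXICON = {"love", "loved", "virtue", "never", "have", "heaven"}


def rank_candidates(word, lexicon=MODERN_LEXICON, cutoff=0.6):
    """Return (candidate, score) pairs, best first.

    A word already in the lexicon is treated as modern and gets no
    candidates. Scores are difflib similarity ratios between 0 and 1.
    """
    word = word.lower()
    if word in lexicon:
        return []
    scored = [(cand, SequenceMatcher(None, word, cand).ratio())
              for cand in lexicon]
    kept = [(cand, score) for cand, score in scored if score >= cutoff]
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


def normalise(word, threshold=0.7):
    """Automatic mode: use the best candidate if it scores high enough,
    otherwise leave the word unchanged."""
    candidates = rank_candidates(word)
    if candidates and candidates[0][1] >= threshold:
        return candidates[0][0]
    return word
```

In this sketch, rank_candidates corresponds loosely to manual processing, where the ranked list is shown to the user to choose from, and normalise corresponds to automatic processing, where the top candidate is accepted without review. For example, normalise("loue") returns "love" with the word list above. The threshold is the kind of setting that semi-automatic use could adjust after reviewing a sample, though how VARD 2 itself is trained is not described here.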