We created a test set of communications by asking workers on Amazon Mechanical Turk to respond to 10 hypothetical communication situations. Workers wrote one sentence in the form of a statement and one sentence in the form of a question. We manually reviewed the data, discarding unusable responses and correcting obvious spelling and grammar errors.

The zip file below contains the test set in several forms, for use in evaluating predictive text entry interfaces designed to produce conversational-style text. It also contains the list of unique words used by workers and a unigram language model trained on the data. These resources may be particularly useful for researchers in augmentative and alternative communication (AAC).
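
For readers unfamiliar with the term, a unigram language model assigns each word a probability based on how often it occurs, independent of context. The Python sketch below shows one way a unique word list and unigram probabilities could be derived from a set of sentences. It is only an illustration: the function name `train_unigram`, the lowercasing and whitespace tokenization, the use of base-10 log probabilities, and the example sentences are all assumptions, and none of them is taken from the files in the zip.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Return (sorted unique words, log10 unigram probabilities).

    Hypothetical helper: tokenization here is lowercase plus whitespace
    splitting, which may differ from how the distributed model was built.
    """
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    total = sum(counts.values())
    vocab = sorted(counts)
    # Maximum-likelihood estimate: count of the word divided by total tokens.
    log_probs = {word: math.log10(counts[word] / total) for word in vocab}
    return vocab, log_probs


# Made-up sentences for illustration, not drawn from the test set.
sentences = ["are you free tonight", "i am free tonight"]
vocab, log_probs = train_unigram(sentences)
```

With these two example sentences there are eight tokens in total, so a word that appears twice, such as "free", receives a probability of 2/8.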


Page last updated: November 29, 2016