This dataset is a collection of two-sided dialogues invented by Amazon Mechanical Turk workers. Different workers invented each turn in the dialogues.
Further details can be found in this ASSETS '17 poster:
@inproceedings{vertanen_aacdialogue,
author = {Keith Vertanen},
title = {Towards Improving Predictive AAC using Crowdsourced Dialogues and Partner Context},
booktitle = {ASSETS '17: Proceedings of the ACM SIGACCESS Conference on Computers and Accessibility (poster)},
year = {2017},
pages = {347--348},
}
As part of this paper, we created a filtered version of the dialogues with potentially offensive content removed:
@inproceedings{adhikary_speech,
author = {Jiban Adhikary and Robbie Watling and Crystal Fletcher and Alex Stanage and Keith Vertanen},
title = {Investigating Speech Recognition for Improving Predictive AAC},
booktitle = {SLPAT '19: Proceedings of the Workshop on Speech and Language Processing for Assistive Technologies},
location = {Minneapolis, MN},
month = {June},
year = {2019},
pages = {37--43},
}
The Turk dialogue dataset is licensed under a Creative Commons Attribution 4.0 (CC BY 4.0) license.
| Corpus: | 272K | Zip containing the original and filtered dialogues. |
| | 3K | Description of the files in the dataset (contained in the zip file). |
| | 372K | Original set of dialogues (contained in the zip file). |
| | 303K | Filtered set of dialogues (contained in the zip file). |