This dataset is a collection of two-sided dialogues invented by Amazon Mechanical Turk workers. Different workers invented each turn in the dialogues.
Further details can be found in this ASSETS '17 poster:
@inproceedings{vertanen_aacdialogue, author = {Keith Vertanen}, title = {Towards Improving Predictive AAC using Crowdsourced Dialogues and Partner Context}, booktitle = {ASSETS '17: Proceedings of the ACM SIGACCESS Conference on Computers and Accessibility (poster)}, year = {2017}, pages = {347--348}, }
We created a filtered version of the dialogues, removing potentially offensive content, as part of this paper:
@inproceedings{adhikary_speech, author = {Jiban Adhikary and Robbie Watling and Crystal Fletcher and Alex Stanage and Keith Vertanen}, title = {Investigating Speech Recognition for Improving Predictive AAC}, booktitle = {SLPAT '19: Proceedings of the Workshop on Speech and Language Processing for Assistive Technologies}, location = {Minneapolis, MN}, month = {June}, year = {2019}, pages = {37--43}, }
The Turk dialogue dataset is licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License.
Corpus: |
272K |
Zip containing the original and filtered dialogues. |
|
3K |
Description of the files in the dataset (contained in the zip file). |
|
372K |
Original set of dialogues (contained in the zip file). |
|
303K |
Filtered set of dialogues (contained in the zip file). |