Automatically extracting social meaning and intention from spoken dialogue is an important task for dialogue systems and social computing.
We describe a system for detecting elements of interactional style: whether a speaker is awkward, friendly, or flirtatious.
We create and use a new spoken corpus of 991 4-minute speed-dates.
Participants rated their interlocutors for these elements of style.
Using rich dialogue, lexical, and prosodic features, we are able to detect flirtatious, awkward, and friendly styles in noisy natural conversational data with up to 75% accuracy, compared to a 50% baseline.
We describe simple ways to extract relatively rich dialogue features, and analyze which features performed similarly for men and women and which were gender-specific.