It’s no secret that chatbots have a bad reputation: no one enjoys a cyclical, frustrating conversation when all you need is a quick answer to an urgent question. But chatbots can, in fact, be good. Having bad conversations with them before they’re ever deployed can help us get there. This talk will draw on both academic and industry knowledge to discuss questions like: What do users’ reactions to unsuccessful systems tell us about what successful systems should look like? Are we evaluating the right things… or the easy-to-measure things? Do we really have to look at user data? If so, when and how often? When, if ever, should we retire old methods?
I gave (slightly different) versions of this talk twice: once as a keynote at the Workshop on Insights from Negative Results in NLP (co-located with EMNLP) on November 10, 2021, and once in the Oxford Women in Computer Science Distinguished Speaker Series on October 11, 2021.