Abstract: If you’ve ever built (or thought about building) an NLP system, you’ve probably run into a few questions: How can you tell if it’s working? How will you know if it continues to work in the future? How do you know when you should update your models, if ever? Luckily, there are tools to help you! This talk will cover the differences between testing, validation and evaluation, explain why you need all three, and walk through an example with a chatbot system.