Podcast: Play in new window | Download
Subscribe: Apple Podcasts | Google Podcasts | Stitcher | TuneIn | RSS
Carmen Martinez and Paulo Azevedo combine her linguistics and ethnography skills with his computing and product skills to create computer interactions that feel almost human.


Carmen and Paulo collaborate to design conversation experiences for FlixBus, a company that helps millions of travelers around the world book bus travel.
It’s hard to create natural-feeling conversations between humans and computers, but they get better at it with every product launch.
We talked about:
- Carmen’s background as a conversational UX expert and Paulo’s as a product owner, data scientist, informaticist, and developer
- their collaborative process in designing conversational experiences
- Paulo’s moment of insight when he realized that his developer team would benefit from having a human-centered researcher and designer on the team
- how they align human and computer approaches to conversation design
- how complicated a seemingly simple task like providing a bus stop location is in a conversational interaction design
- the eye-opening challenges of helping digital conversationalists interact appropriately with humans
- the wide range of technologies that underlie conversation design
- how they use ethnographic and other research methods in their conversation design process, and how data from real human users feeds into their ongoing research
- the huge differences between graphical user interfaces and voice user interfaces
- the challenges of figuring out what you don’t know when there are conversational misunderstandings
- the importance of having a language person on your conversational design team
- how conversation design is still a work in progress
Carmen’s Bio
Dr. Carmen Martinez is a Conversation Analyst and Ethnographer of Communication working in Conversational Artificial Intelligence at FlixBus. As an expert in human-to-human conversation, she contributes to a cross-disciplinary team by automating customer service interactions, modelling both text- and voice-based human-to-machine conversations, and developing visual solutions for graphical and multimodal conversational agents. She is the author of “Conversar en español: un enfoque desde el Análisis de la Conversación” published by Peter Lang Berlin.
Connect with Carmen on LinkedIn.
Paulo’s Bio
Paulo Azevedo is an IT professional based in Germany, where he’s spent the last few years working on AI and machine learning projects in different capacities. He’s done data analysis and software development, built machine learning models, and lately has been focusing on agile project management. Since March 2017 he’s been working at FlixMobility, a German mobility startup with operations in 30 countries, where he has been responsible for the strategy and implementation of voice platforms.
Connect with Paulo on LinkedIn.
Video
Here’s the video version of our conversation:
Podcast Intro Transcript
When you talk to Siri or Alexa or interact with a support chatbot, you probably don’t give a lot of thought to the work that went into creating those conversational experiences. Carmen Martinez and Paulo Azevedo do think about that work – because they do it all day. They design conversational experiences for FlixBus, a company that helps millions of people book bus travel in countries around the world. Carmen and Paulo combine their linguistic and computing skills to get closer every day to conversational experiences that feel human.
Interview Transcript
Larry:
Hi, everyone. Welcome to Episode Number 83 of The Content Strategy Insights podcast. I’m really happy today to have with us, Carmen Martinez and Paulo Azevedo. They work at a company called FlixBus in, I guess you’re all over Europe, but you’re both based in Germany, I believe. Well, welcome to the show, Carmen and Paulo. Carmen, you’re a Conversational UX expert there. Tell us a little bit more about what that entails and how you ended up in that role?
Carmen:
Well, so you’re saying a conversational UX expert. That means I take care of conversational UX design and conversational UX research for our conversational agents, on different platforms. And yeah, how did I end up at FlixBus? Basically, my career in conversational AI and UX got started about three years ago. Like other conversational designers, I changed over from a previous career in something else. In my case, I have an academic background, and I was working before as a second-language teacher and a university lecturer. Then conversational AI came into my path, at a time when my career was taking a turn. About three years ago, I decided that all the knowledge, let’s say, that I had acquired as a conversation analyst, somebody working with conversational data in human-to-human conversation, would be useful for the software developers who, back then in 2017, were starting to design human-to-machine conversation. So yeah, here I am.
Larry:
Great. I love that there are so many folks coming from academia over to the UX field. And it sounds … Well and we’ll get into this as we go, but it sounds like that’s really necessary in conversation design, but yeah.
Carmen:
Well, yeah. It’s necessary, but there are fewer postdoc positions that pay. So this is . . . why I’ve been going into the industry.
Larry:
I have a lot of friends in academia and I know that problem well. It’s kind of a racket. They turn out more PhDs than there are jobs for in academia, but we’re happy to have you here. Hey, and Paulo-
Carmen:
Thank you.
Larry:
Paulo, you’re a product owner there at FlixBus. Tell us a little bit about your role there, and a little bit about your background as well?
Paulo:
All right. Sure. So first of all, thank you very much for having us here. The way to put it: the job of product owner is essentially to bring together the people who have a need that will be tackled through technology and the people who will build that solution. My role is really to be the bridge between the two sides, so that the people writing the solution, developing the solution, will develop it for the real problem and not just for the problem as they understood it. Because it is often the case that people with different backgrounds might not be on the same page. And it is my job to really bring everybody, from the technical and business sides alike, onto the same page, so that the solutions developed are really satisfactory to everybody involved.
Larry:
Right. And you’ve got a good background for that, because you’ve got a background in data science and in coding and I know people come to product roles from a lot of different directions, but …
Paulo:
Yeah, exactly. So some people do come from a more business-oriented background, and then they learn the tech bits and they learn to build the bridge from that perspective. I come from the other perspective, which is a very technical background. I’ve worked as a developer and as a data scientist, and I’ve done a bachelor’s degree and a master’s degree in informatics, and eventually ended up on this more business-related side. So it’s essentially about the ability to bridge that gap. And I think that there are many diverse backgrounds that can yield similar results in the end.
Larry:
Yep, and the thing that I’m really getting a good glimpse of at FlixBus, and this I think is common probably in a lot of the tech companies nowadays, is the way you bring all that talent together. And you’re sort of at the center of that, but you and Carmen work really closely together on a lot of these things. Tell me a little bit about like, to what you just said about like solving the real problem, that’s kind of your job as the product guy to figure out what that is and keep everything on track. How do you and Carmen work together on articulating those problems and starting to figure out the solutions?
Paulo:
Well, I’d like to tell a story, then. Before Carmen joined our team, we were just developers working on our conversational interfaces, which back then were basically Google Assistant and Alexa. And at some point I came to the realization that what we were developing had its merits, but it was clearly something developed by developers, meaning you could see that the design had been done by developers, and so on and so forth. You start to observe that the product has that tendency to favor a more utilitarian approach rather than what we really need.
Paulo:
And that was the moment when I realized that we needed somebody to bring a more diverse background that would bring us back on track. And I remember reading on LinkedIn a text that Carmen wrote about how to do conversational recovery, how it is done with machines versus how actual humans do it, and how to actually bring a bit of that human approach to the bots. And I was like, that’s the kind of situation that we need to put ourselves into: questioning the decisions that we made, and coming up with ways that feel more natural, more intuitive. So I think that was the very beginning of our relationship. Then we started chatting over LinkedIn and eventually she became a colleague on our team. But I think that was essentially how it came to be.
Larry:
Well, now I’m dying to hear Carmen’s perspective … because Carmen, I didn’t mention this, I don’t think upfront that you’re a sociolinguist. So you’re like a people person and a language person. And now you’re coming into this team with Paulo. Tell me a little bit about your experience with that.
Carmen:
Well, my experience with that is pretty similar, but coming from the other side. My first experience with software developers who were building things with Alexa was already in September 2017. It was a consequence of the accident that brought me into conversational AI in the first place: I ended up at an Alexa Developer Day in Berlin. And I will always remember that, because there was nobody else with a non-technical background coming around at the event. I just went to observe what they were doing. And it fascinated me, because back then Alexa wasn’t available everywhere in Europe, the Amazon Echo was just coming to Germany, and I thought, “Wow, this is the technology.”
Carmen:
But very, very soon, through the designs and also through the use cases that they were showing, I could see, like Paulo said, that they were doing these diagram-style designs: they were clearly favoring the mental models of the machine. And as developers, I think they are supposed to do that. I mean, you shouldn’t demand that developers create good conversational experiences; their job is different. It’s to favor the system, and they assist in that, they have to code the algorithms. But if you want a good conversational product, you have to have the other side of the experience. Actually, from that experience, I realized they were going to need people like me in the future, people who would become conversational designers.
Carmen:
Like UX designers sitting with developers, they will help them to improve and to make the experience more human-like. Because of course it’s a human-to-system interaction, and I would be fooling the audience if I were advocating here that we can build fully natural interfaces with this. That is impossible, and most likely it’s not going to be the case. But that was my first experience, when I decided that they would need the work of people like linguists or sociolinguists. In my case I am a conversation analyst, and we are a very rare specimen, as are people working in sociolinguistics, because we are focused on human-to-human, everyday conversation. And that is the ultimate type of specialization that you can bring to conversational design.
Larry:
But you’re almost like an interpreter coming into this because you mentioned that a lot of the work by necessity has to work around the mental model of the machine, the computer that’s doing all this stuff. But you’re bringing in the actual human models, the mental models of how to craft and participate in these conversations. How do you bring those together? Like, is it just a constant back and forth conversation between the two of you and other people on the team about, “Well, that’s the way computers think about it, but this is how humans do it. Let’s try to make this experience more human.” I mean, we’re all human centered designers, I assume. Is that what’s going on or … either of you can tell me more about that process of aligning the human and computer mindsets?
Paulo:
Well, I wish I had a single answer, but this can actually happen in multiple ways, I think. Sometimes we need to develop something, like extending something that was already there, extending a use case we already supported, for instance. And then, when making the designs for that extension, Carmen will see the legacy design that was there and start questioning things, so that’s one way. Another way is when we start from scratch, or even when going through actual conversations that customers are having, or when measuring the different KPIs of the conversations. I think Carmen is very prolific in coming up with a lot of those ideas, and it’s pretty much what I had hoped for in that moment I was describing before in that story: we need somebody to bring that perspective. I think it’s really happening a lot there.
Larry:
Yeah, no, I love that. And actually, to what you just said, you mentioned the term use cases, and I think maybe that’s another entry point into this, because you have a lot of use cases there. Just a quick piece of background for folks who are just tuning in: FlixBus is sort of, I mean to grossly oversimplify, like Uber for bus travel, mostly in Europe, but I think you’re branching out to other regions as well. And so there’s a huge amount of communication going on between customers booking travel, the agents working with you, and the bus services working to tell you when. . .
Larry:
There’s just a lot of stuff going on there. So yeah, there are many different channels you’re communicating in, and many different modes of conversation in each of those. I guess first of all, how do you manage that scope? That’s a lot, and you’re doing different things. Like you mentioned, Alexa Skills and Google as well. But you’re also doing stuff that’s more native, like chatbot kinds of things on different platforms. That’s just so much. Well, one quick thing about that, Paulo: you’re one of 40 product owners at FlixBus. So anyhow, maybe is there one good example, like one of those channels or one of those kinds of conversation, that’s a good view of how you operate, how you work together?
Paulo:
No, it’s difficult to name a single one. I don’t know. I think one that we’ve been working on a lot in the last few months, every so often, like we go and make some more changes and improve it a bit more, is bus stop info, because we tried several different things there. Maybe that’s a good starting point, because we started under the assumption that the user would simply state the name of the station and then, problem solved. Or the name of the city, and the city only has one stop. But then we started to realize that what we were seeing was actually much more complicated than that. And then Carmen had a lot of ideas on how we could improve it, and we were experimenting a lot, observing the outcome in the metrics, and then rinse and repeat, while alternating with some other stuff as well.
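To make that concrete, here is a minimal, hypothetical sketch of the kind of lookup Paulo describes, where the naive "just state the station name" assumption only holds when a query maps to exactly one stop. The stop data, function names, and responses below are invented for illustration and are not FlixBus code; a real system would also handle multiple languages, fuzzy matching, and thousands of stops.

```python
from typing import Dict, List

# Invented sample data: one unambiguous city, one city with several stops.
STOPS: Dict[str, List[str]] = {
    "freiburg": ["Freiburg main station"],
    "berlin": ["Berlin central bus station (ZOB)", "Berlin Alexanderplatz", "Berlin Südkreuz"],
}

def resolve_bus_stop(query: str) -> str:
    """Return a stop, or a clarifying question when the query is ambiguous."""
    stops = STOPS.get(query.strip().lower(), [])
    if len(stops) == 1:
        # The naive happy path: one city, one stop, problem solved.
        return f"Your stop is {stops[0]}."
    if len(stops) > 1:
        # The complication: several stops, so the bot has to ask a
        # follow-up question instead of guessing.
        return f"{query.title()} has several stops: {', '.join(stops)}. Which one do you mean?"
    return "Sorry, I couldn't find that stop. Could you repeat the city or station name?"

print(resolve_bus_stop("Freiburg"))  # happy path
print(resolve_bus_stop("Berlin"))    # needs a clarifying turn
```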
Larry:
That’s right, because from a pure informatics perspective, it’d just be like, here’s where we are, here’s the information you need. But Carmen, you can probably speak to how much more complicated the actual human communication experience is, right?
Carmen:
Yeah, actually this bus stop info is a very good example of a capability where there was no pre-existing code. When we work in a context where the code has already been produced, my work focuses more on bringing in as much of the human perspective as I can. But with the bus stop information, we were starting the capability from scratch, and also from the call center side. That gave me the opportunity to actually be involved from the very beginning of the design cycle. Because in a company like FlixBus, we know that when a customer calls and asks for the address of their bus stop, they want the bus stop address. That is the query, the question the developers are very much focused on. But what is the background problem?
Carmen:
You know, why are they calling us to ask for the bus stop address? Are they already somewhere near the bus stop? So basically I always say: start from the assumption that we don’t know the use case. We may know the query, but really we don’t know what is in the background. And for that capability, for the research time, the discovery phase, I was working very closely with the human agents in the call center, doing observations on what they thought bus stop info was about. And after that, creating a corpus of conversations of happy-path use cases, with the possibility of fulfillment for the customer, of course, and thinking about how to replicate that interaction with computational parameters.
Carmen:
So the final goal was understanding the context the customer was in for that use case, and how to translate that properly into human-machine interaction for that first installment of the capability. For me the ethnographic research is very important, because it takes us out of whatever preconceptions we may have before starting to work on the capability. And after the first release of the software, you already have metrics, you have KPIs, through which you can cross-check against the data whether your assumptions were approximately correct. So for me it’s an example of a capability successfully developed from the very beginning.
Larry:
I love that so much of it’s about clarifying assumptions. Because I think in normal human face-to-face conversation, it’s safe to make a lot of assumptions about … because you kind of can figure out a lot of the contextual stuff, but in these kind of situations, you have to do like specific ethnographic research to get to the bottom of what those assumptions are. And then how to, again, sort of align the queries they’re making with what the information you have to offer, and somehow crafting out of that experience that has so much computing in it, something that resembles a human conversation. Is that … Paulo, do you have anything on that? I saw you nodding your head.
Paulo:
Yeah, I was thinking exactly along those lines you were mentioning. It’s funny when you start focusing on that: we have to surface even the most basic assumptions, things that are second nature to us, that we’re not really conscious of. So for instance, if I were talking to you and then I were to turn around and start talking to somebody else out of the blue, without even excusing myself or something, that would be extremely rude, and it would probably throw you off your train of thought. This is the sort of stuff that nobody needed to teach us; we kind of know this. But if we think about it differently, if I am talking to Google Assistant and then somebody walks into the room, that person will automatically have priority, because it’s a human versus a robot, so I’ll start talking to them.
Paulo:
And that means that we have to be aware that we humans tend to give this priority to other humans and will not give it to a robot. So our bots will sometimes start listening to conversations that are not meant for them. And I learned this the hard way, by having some conversations that were going wrong. I was reading the transcripts and I was like, yeah, this was obviously not at all related to finding rides, or whichever use case I was working on; the person was just talking to somebody else around them. So we definitely need to bring this experience back, and it’s sometimes not straightforward, but very often it’s eye-opening. I realized some stuff about human nature that maybe I knew intuitively, but I’ve started to put words to it, and I find it fascinating.
Larry:
That’s great. So I love that this is a never ending learning process, and that, it’s kind of like… Well, actually, which reminds me, like how sophisticated is the technology that underlies this? Are we talking about things like machine learning and AI, or is a lot of it just pattern matching kind of … I know that’s a huge question, but …
Paulo:
Oh, there is a bit of everything. I mean, there are statistical methods that are used, many of which are actually pretty deterministic, and those are used, for instance, for fixing typos. You know, if you’re talking about a chatbot instead of a voice bot, then you probably put some typo correction first, before doing the natural language understanding, or NLU. And for that you’ll usually use a thing called deep learning, which is one of the ways that you do artificial intelligence nowadays. And that requires, well, humongous amounts of data and computing power, so that the response comes back in a reasonable amount of time. And there are also some clever tricks in between as well.
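As a rough illustration of the pipeline shape Paulo describes, a deterministic typo-correction pass in front of an NLU step, here is a toy sketch. The vocabulary, the keyword-based "NLU," and the intent names are stand-ins invented for this example; a production system would call a trained deep-learning model rather than matching keywords.

```python
import difflib

# Invented vocabulary for the deterministic typo-correction pass.
VOCABULARY = ["bus", "stop", "ticket", "child", "seat", "address", "departure"]

def fix_typos(text: str) -> str:
    """Deterministic pre-processing: snap each word to the closest known term."""
    corrected = []
    for word in text.lower().split():
        word = word.strip("?!.,")
        match = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.7)
        corrected.append(match[0] if match else word)
    return " ".join(corrected)

def understand(text: str) -> str:
    """Stand-in for the NLU step; a real system would call a trained model here."""
    if "stop" in text or "address" in text:
        return "bus_stop_info"
    if "ticket" in text:
        return "book_ticket"
    return "fallback"

def handle(raw_input: str) -> str:
    cleaned = fix_typos(raw_input)   # step 1: deterministic typo correction
    return understand(cleaned)       # step 2: natural language understanding

print(handle("where is my bus stpo?"))  # -> "bus_stop_info"
```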
Larry:
Yeah. No, I have to say, just as somebody who’s relatively new to exploring this, I’ve had maybe three or four interviews now with folks in this realm, and the amount of stuff that’s going on in the background just boggles my mind, and the fact that you manage that. But Carmen, I want to go back to you about your role in sort of moderating that learning process. Do you do mostly ethnographic research, or do you employ other methods as well?
Carmen:
I basically use a mix-and-match combination of methods that I have learned from conversation analysis and ethnography of communication. The part related to observation, sitting alongside the human agents or interviewing them, understanding their own perspective on the queries and how much they can tell me about the user’s or customer’s vision, that is participant observation, so it’s anthropology, it’s ethnography of communication. Then I build up a corpus of conversations, and if I am interested in scale, that is corpus linguistics: selecting different interactions that have the same kinds of participants, where the participants themselves have the same sociolinguistic profile, so the corpus is homogeneous and can homogeneously represent the use case. And the core of my expertise is conversation analysis: the linguistic knowledge that I have about the structure, the interactional structure, of human-to-human conversation, and how to translate that into human-to-machine conversation.
Carmen:
If the data set is small, I will transcribe it myself. I will listen very carefully to the conversations and transcribe them manually, even though there is speech-to-text software that could do that. It’s like a mannerism of conversation analysis. In conversation analysis it’s very important that you record the interaction. You are present, but you are not a participant in the conversation, because you are observing at the same time. After that, you listen to the recordings several times and you transcribe manually, because that gives the first hint of the structure that is going on. When you write down the conversation, the linguistic aspects start to appear, and you start to pay attention to certain patterns. So it’s part of the qualitative process. And after you have identified the conversational phenomena that you are interested in, in conversational artificial intelligence that means the parameters that could be translated into human-to-machine conversation, then you start to apply more analysis, even statistical analysis, but everything is very qualitative. So as you see, the qualitative approach really helps in the first stage of the design with the structural knowledge, properly organizing the exchange of turns between the human and the machine.
Larry:
Hey, I wonder if you could clarify. You used the word corpus a couple of times. I think that’s familiar to a lot of us, but for folks who don’t know, that’s basically a body of words and texts you capture. So you’re kind of doing, like any UX researcher, a bunch of qualitative work, ethnographic research and transcribing individual conversations. But then you end up with this big corpus, this big body of work that you can then go into. And that’s where you do the more analytics kind of stuff on that corpus, to validate the hypotheses that you’ve come up with on the more qualitative side of it. And then I gather that it’s like this constant back and forth between those. Is that accurate?
Carmen:
No, because after you deploy the design, the customer is interacting with the system, and this is what makes artificial-intelligence-based products different from other kinds of products. From the moment they are deployed, you have fresh, real data on how your model is interacting with humans. So after that, it’s more about analyzing the transcripts with real users.
Larry:
Right, and you have a lot of customers. So you have a lot of empirical data that feeds that system. Like, I don’t know … I know that sometimes with these kinds of things, you run into like statistical significance problems, but it sounds like FlixBus is pretty big. So do you always feel like you have enough data to be making good decisions?
Paulo:
Yeah, I think so. Sometimes we do run experiments where we actually answer things with that level of certainty. Like we say that it’s 95% likely that what we were observing is because of an intervention we made and not because of chance. We have done that even in this project. But sometimes we just cannot possibly know what the actual numbers are, so that we can compare things. So sometimes it’s very, very difficult to really say whether we are already there or not. To give an example: in general, voice interfaces are different from pretty much anything that has a graphical user interface. In a graphical user interface, the user can only click on the buttons you put there, only type into the fields you put there, et cetera. With voice, or even chat, the user input can be anything at any time. And so it sometimes throws any assumptions that you made out of the window. So we cannot always know if what we’re doing is good or not. I mean, our bot is usually able to say if a situation is a so-called “fall-back.”
Paulo:
So the bot “knows,” in quotes, that it didn’t really understand you. But sometimes the bot thinks it did. And it’s very difficult for us to obtain a canonical number that says what our actual fall-back rate is, because the volume of data is just too big for us to have people looking into it. And then specifically, I remember that Carmen brought up an example the other day in a meeting. We have this thing where people might want to book a ticket for a child, and that might have special characteristics in some markets. And the user actually didn’t mean to talk about booking a ticket for a child, but rather about bringing a child seat on the bus. But the vocabulary used in that query was similar enough that the bot misunderstood it. So sometimes it’s very tricky to know what you don’t know, because, yeah, this one came to our attention, but there are conversations that might not have gotten our attention, manual attention, for whatever reason. And therefore we think that the bot did fine when it didn’t, or vice versa.
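As a toy illustration of that "known versus unknown" problem, here is a hypothetical sketch of confidence-based fall-back. The intents, scores, and threshold are invented; the point is only that a confident-but-wrong classification, like the child-seat query, never shows up in a measured fall-back rate.

```python
from typing import Tuple

CONFIDENCE_THRESHOLD = 0.6  # invented value; real systems tune this empirically

def classify(utterance: str) -> Tuple[str, float]:
    """Stand-in for an NLU model that returns (intent, confidence)."""
    text = utterance.lower()
    if "child" in text:
        # Overlapping vocabulary: "bring a child seat" lands here too,
        # confidently and wrongly; the bot *thinks* it understood.
        return "book_child_ticket", 0.81
    if "stop" in text:
        return "bus_stop_info", 0.88
    return "none", 0.2

def respond(utterance: str) -> str:
    intent, confidence = classify(utterance)
    if confidence < CONFIDENCE_THRESHOLD:
        # The bot "knows" it didn't understand: this is the measurable fall-back.
        return "Sorry, I didn't get that. Could you rephrase?"
    # Above the threshold the bot proceeds, whether or not it was actually right.
    return f"(handling intent: {intent})"

print(respond("blah blah"))                             # measurable fall-back
print(respond("can I bring a child seat on the bus?"))  # silent misunderstanding
```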
Larry:
Great.
Carmen:
Yeah, you actually find these conversations in which the human and the system are completely on two different pages. And we have also found examples, the most amazing ones, in which the human and the system are really interacting but talking about different things, and in the end there is conversion … convergence, sorry, like I haven’t seen in human-to-human conversation: you really recover the topic of the conversation and you are able to align on the same topic. So yeah, it’s very difficult to really get to know what you don’t know, because an intent being recognized doesn’t mean that there is not a problem there. But that kind of analysis is qualitative again, and it requires a lot of analysis and human power, actually.
Larry:
That’s great. Well, I just noticed we’re coming close to time. But I think this is … I just want to thank you both. This has been such a great glimpse into how you operate, but I’m realizing that I probably should have set aside five hours for this conversation because there’s so much more we could talk about. But I appreciate … I think folks will get a good feel for just the complexity and the stuff that needs to happen to make these conversations happen. And maybe have just a little more … I hope everybody has a little more patience and appreciation the next time we talk to Alexa or interact with a chatbot. But hey, I always like to give my guests a chance. Is there anything last, anything that’s come up in the conversation or that’s just on your mind about conversation design that you’d like to share with our folks?
Carmen:
Well, although I’m not really a big fan of giving advice, I would advise companies or other teams of developers who are working on conversational artificial intelligence to hire a linguist. It could be a second-language teacher, it could be a sociolinguist, it could be a UX writer, it could be an ethnographer of communication. It could be whatever kind of linguist, but bring somebody with that human expertise onto your team.
Larry:
That makes a lot of sense, since we’re talking about human conversation here. Paulo, anything last from you?
Paulo:
Yeah. I think you actually made a very good point there. By raising awareness of this, people will perhaps be a bit more tolerant when those interfaces are not doing exactly what we want. I know for sure that our team is working very hard to build things that are really good for most users. And we are aware that sometimes some users might not get their situations solved immediately by the automation. But I think it’s important that people see that, in the end, those automations are really built in a way to save people time and money.
Paulo:
Ultimately, it’s the same way that we got used to reaching for our phones to look up information that we once needed to depend on other people for. We’re working on making that sort of information more and more easily available, so that you don’t need to wait for somebody. So yeah, there might be a few glitches along the way, but I certainly think we’re moving very fast toward that future. So let’s just bear with it for a moment, and things will be better for sure.
Larry:
Well, that’s my impression, that it’s very much a work in progress, but it works really well for the most part. I’m pretty impressed with the voice experiences. So thanks to both of you for contributing to that. And hey, one last thing: if folks want to follow you on social media, is there a good place to connect?
Carmen:
For me? The only place to connect is LinkedIn at the moment.
Larry:
Okay, great.
Paulo:
I think it would be the same for me as well. LinkedIn would be the place to go to get in touch.
Larry:
Sweet. Okay, I’ll include links to your profiles in the show notes. Well, thanks so much, Carmen and Paulo. I really appreciate the conversation.
Paulo:
Larry, thank you so much for having us. And anytime you want to talk more about this, for sure. I have the feeling we could have gone on for hours as well.
Larry:
Absolutely.
Carmen:
Yes. Thank you very much for having us.