Taking Testing to the Next Level with Dylan Zwick
Overview: In this episode of Digital Conversations with Billy Bateman, we are thrilled to have voice expert Dylan Zwick, co-founder and CPO of Pulse Labs, join us. He walks us through his testing process and shares best practices for building bots that leverage other tools and create value.
Guest: Dylan Zwick. Dylan is the co-founder and CPO of Pulse Labs, the premier user research, measurement, and analytics solution for developers, designers, and platforms creating the next generation of voice interfaces. Pulse Labs works closely with the major platforms (Amazon, Google, etc.) as they create this new technology, as well as with application developers building on these platforms, to help make their work great through feedback, testing, and data from real-world users.
Before striking out on his own, he was the Director of Data Science at Overstock.com. He led a team of data scientists on projects ranging from automated bidding optimization to supply chain management. Essentially, if there was a facet of the company that could benefit from improved data analysis, visualization, automation, or optimization, his team tried to be of assistance. Finding useful projects wasn’t difficult. Figuring out the most valuable projects was.
Billy: Alright everyone, welcome to the show. Today I have the pleasure of being joined by Dylan Zwick, the chief product officer at Pulse Labs. Dylan how’re you doing man?
Dylan: I’m doing great. Thank you so much for having me today. I’m really looking forward to it.
Billy: Yeah, I’m excited to have you. I saw you went to the University of Utah and I’m a BYU guy so we decided to have a truce.
Dylan: This is over! This interview is over!
Billy: I’m not going to bring up that we haven’t won a football game in like a decade. I’m excited to have you talk about what you guys are doing with voice skills. Before we get into that, though, tell us just a little bit about yourself and your background, and how you got into working with voice.
Dylan: Yeah, so as you mentioned I’m actually from Salt Lake City and grew up in Utah. I spent some time in California, but was finishing my PhD at the University of Utah. That’s actually when I met my co-founder, who was working at Goldman Sachs at the time. He was a programmer for their high-speed trading system. We actually met as gym buddies, as workout buddies. It’s sort of a nontraditional way of meeting your co-founder. Usually it’s somebody that you went to school with or somebody that you had worked with professionally, but I actually got to know and became friends with my co-founder as a workout buddy.
I was finishing my PhD at the University of Utah. Then I worked at Overstock.com as a data scientist and then as the director of data science for them. So, I was kind of the head of the data science group at Overstock. And then in 2016, I started getting into Alexa and voice applications and these voice platforms.
Essentially, people have been wanting to talk to computers and wanting to talk to technology for as long as computers and technology have been around. But it’s always been the domain of science fiction. When I started playing around with those platforms, I thought, OK, the NLU aspect of it and the voice recognition aspect of it are actually reaching a stage where this is transitioning into science fact. That’s going to be a critical part of human-computer interactions, or just human-technology interactions.
Voice, I think, is going to be one of, if not the primary, means of interacting with the Internet. I got into it and I started creating Alexa skills, trying to figure out if there was a way that I could incorporate it into some of the work I was doing professionally. And then Abhishek, my co-founder, and I started researching what the needs were out there in the voice community.
What we found out was that the biggest need that we saw was understanding how real people were interacting with these devices, how they wanted to interact with these devices and what problems they would have. That was because most of the design experience we have with human computer interactions is based around a fundamentally visual paradigm. So designers and developers that had essentially been trained and built their careers on this paradigm really hadn’t learned yet the differences that exist within a conversational type of interface. Particularly within a voice interface.
A lot of things that are really good practices in visual design are actually bad practices in voice design and vice versa.
We built a prototype for what eventually became our product and we started doing a fair amount of customer research. And we were accepted into the Alexa Accelerator, which was a startup accelerator that was a joint project between Techstars Seattle and Amazon. We got into that. It developed from a side project into a full-time job.
Billy: Were you still working your regular jobs?
Dylan: Yeah, so, we were. I was still at Overstock and he had actually moved to New York. He had gotten a promotion. He was in more of a product management role at Goldman Sachs. But yeah, we got into the Alexa Accelerator and realized that we either had to say no, we weren’t doing this, or commit to doing it 100%. So we both moved. I moved from Salt Lake to Seattle, which is where I am right now. “Burn the ships” is the metaphor there, and we just dove in 100% at that point.
The Biggest Challenge of Voice
Billy: Awesome, awesome. I think you’re right where building a voice bot or skill is fundamentally different than a chat bot in a lot of different ways. Where we build chat bots, the user can see everything. You can present options. There’s a lot of visual things that go with that and I think one of the challenges with voice is like how do I learn to use this skill? How do I even find that skill? So, what do you see right now? Before we get into your guys’ process, what are the challenges that voice is facing right now?
Dylan: So, one that you bring up is discoverability. Discoverability can be a problem in that it can be difficult to figure out even what’s available. It’s funny, sometimes we’ll do these surveys. We’ll say, “OK, what are additional things that you’d like to be able to do with your smart speaker,” or something like that. And the responses that we get are frequently things that they can already do. That’s available. It’s been available for a while. Discoverability can be an issue. It is not, in my opinion though, the major issue. The biggest hurdle can be retention. What we’ve seen is there are a few killer applications out there that people use and that people keep coming back to and reusing. And then there’s been a lot of experimentation around additional new things and new ways that we can use them.
Future of Voice
I think that what we’re seeing actually is that so much of it has been built around the smart speaker product. So much of it has been built around the expectation that this is going to be something that people are doing on a smart speaker and not really something that’s going to be integrated with a whole slew of devices. But it’s not the case that we’re going to be talking over the next few years to just our speakers. The big vision behind this is that you’re going to be talking to your phone. You’re going to be talking to your televisions. You’re going to be talking to your cars. You’re even going to possibly have them integrated into your refrigerator. That opens up a whole bunch of new possibilities.
Particularly, I think the big one you’re going to see over the next couple of years is in the car. Things that maybe haven’t made as much sense when you’re talking to a smart speaker in your living room. For example being able to order ahead for food at a restaurant or something like that can make a ton of sense when you’re driving home. I think that’s a big thing that we’re going to be seeing. So much of it is contextual.
And so, as voice expands into a whole bunch of other devices and other contexts, I think we’re going to be seeing a lot of new killer use cases added to it. And that’s really the goal behind the major tech platforms’ play here. Amazon has sold a lot of Echo speakers and it’s been a very successful device for them. But that is really not their big voice play. Their big voice play is that they view this as potentially being the operating system of the Internet of Things, the glue that is the underlying platform upon which all of these connected devices are built.
Billy: Yeah that makes sense. Any bot, whether it’s a voice bot or a chat bot, if it just lives within itself, it’s only so useful and it’s probably not that useful. But the real place where you start to leverage value is that it can leverage all these other tools and this other information that lives outside of it that it’s connected to. So, it makes sense that the voice would run an internet of things and be the tool you leverage to use it. Let’s hop into your testing and your optimization. How did you find this as a niche? I’m always interested in how you get to where you are. For us, we’re at a different place than I thought we were going to be. We started two years ago. So, tell me about the journey.
Pulse Labs Niche
Dylan: Yeah, absolutely. That is absolutely true. What we imagined we’d be doing, and where we thought the world was going to go four years ago, or three years ago, or my gosh, even three months ago, has changed. There’s one thing that I remember hearing at the beginning, when I started my own company. I didn’t pay as much attention to it then, but over time I keep remembering it. You focus on the road and the turn ahead, but don’t focus on 10 miles ahead. Focus on the next mile, because you really don’t know what’s coming 10 miles away.
We originally thought that most of our business would be coming from a lot of the third-party skill developers, from this ecosystem that was being built around Alexa and Google Assistant. What we’ve seen is that we certainly have done a lot of work with those, but the big interest has been from the platforms themselves. So Amazon and Google are both major customers of ours, and they’re also investors. Then Facebook is making a play here. Microsoft is making a play here. Then there are also the major music streaming companies; for Spotify, it’s a very important part of their strategy. Continuous testing and optimization and building around that has been the big focus.
We thought it would mostly be a focus on, as I said, applications that are being built around these platforms. What it’s turned out is that most of the focus, at least that we’ve seen in the last year, has been not so much on just applications but on devices: on new devices that people are building and on new contexts in which people want to build and integrate voice. The most exciting, the most significant one, I think, over the next couple of years is going to be in cars.
So that’s where we’ve seen a ton of work and a ton of what we’re doing. Then, just in terms of the last few months, as I mentioned, our testing solution is a remote, unmoderated solution. So all of our testers, when they’re interacting with the voice devices, use their own devices in their own homes. So it mimics as closely as possible the interaction people are going to see in the real world. A lot of the testing work that is done by the major technology companies tends to be on-premise testing. They build studios and all this stuff for recording these tests. Well, in the last couple of months that’s been a little bit difficult.
Billy: I would think so.
The Testing Process
Dylan: Yeah, that was in the before times; hopefully we’ll get back to those. But we’ve got some time right now where that is not really an option. So the ability to do remote, unmoderated testing has actually been very valuable. We have seen, to be honest, a major increase in the volume of testing work that we’ve been doing, just because we have a solution that actually works just fine under quarantine or social distancing. You mentioned you never know what’s going to happen or what you’re going to see. My goodness, over these last five months I think the world has been seeing the truth of that. It’s actually been rather advantageous for us. I wouldn’t wish this to last another day on the world, but that particular aspect of it has been a silver lining for us.
Billy: That’s good, at least something good is coming out of this. When you’ve got your testers and they’re testing a bot, walk me through what that process looks like for them, and how you guys document that. At a high level, what does that process look like?
Dylan: Yeah, absolutely. So the way that it works is, essentially, if you want to start testing bots, you can come to our website and basically just sign up. We advertise on social media, pretty much anywhere we can. You can come to our site and sign up for testing bots. We ask you for some demographic information to get an idea of who you are. Then you take a practice test where we can gauge whether you provide feedback and whether the quality of your feedback meets a bar that we think is going to be useful for our customers.
Then when customers are building an application or something, we allow them to specify their target demographic. So if you’re building an application for seniors or something like that, we can actually source a panel of usability testers that are going to be seniors.
So you make sure that the feedback that you get is representative of the type of user you can expect to see in the real world. Then we work with our customers to set up the testing. Essentially, we figure out what instructions we want to provide to the users. Those can be as broad or as specific as we want them to be. It really depends upon what facets you are testing. Maybe you just want to say, I want to understand the initial user experience and how people are going to understand this out of the gate. Then we can potentially say, here’s a one-sentence description, or here’s basically what you could expect to understand about this if you saw something like an ad.
Use it as you normally would. Or try to use it to achieve this goal. It also might be that there’s a very specific facet of the application that they are interested in testing. So we’ll say, do this, go here, and then use this part of it to do this. We can pinpoint and say this is the part of the experience we want to test. So we can take the testers to that part and then say, okay, now go. So it really just depends upon what we’re interested in. And then we do these post-testing questions.
So after they’re done interacting with it, they answer questions about the experience. You figure out what questions you want to ask, the ones whose answers are going to let you make this better. And the most effective engagements that we’ve had are actually with companies for whom it’s not just a one-and-done thing.
It’s a process of continuous improvement. They say, okay, we’re going to be testing and then we’re going to modify. And then we’re going to start testing again and we’re going to modify again. It really is best to do it as early and as often as possible.
Because the earlier you do it, the quicker you can catch mistakes, and the earlier you catch mistakes, the easier they are to fix, because they don’t then propagate and affect a lot of things that are downstream. That can be the same.
Challenges of Voice Pt. 2
There are a number of differences between voice and chat, but there are certainly a lot of similarities there. I think a lot of the usability issues that you can see in chat and a lot of the best design practices that you see in chat also apply to voice. I think they both are part of this. When you’re designing a visual user interface, you can put a few buttons on there and you can be pretty sure that users are usually only going to click those few buttons.
Whereas you know if you design a chatbot sometimes you can have buttons and options there. But your user can also sometimes just type whatever they want. You have to be ready for the unconstrained nature of that and that is magnified even more in voice. People can say kind of whatever they want at any given time and ideally you need to be prepared for that. Or you need to be as prepared as possible.
Billy: You’re right, with voice they’re totally unconstrained, the end user. They can say whatever they want. Whereas when you’re building a chat bot you do have that nice almost-crutch where I can say, okay, here it’s only a button. We’ve got 3 or 5 buttons, or we’re only going to allow them to put in an email address or a phone number. If they don’t put that in, the bot can recognize, hey, that’s not a phone number, please give me a number. But with voice you’ve got the Wild West for every question.
Dylan: Exactly, and so you can build in guidance there, but I mean even things like “say yes or no”. Someone will say yes, no, uh-huh, yeah, and all of that will essentially mean yes. Then ideally, and this is something that differs between visual and voice, and I think chat kind of straddles the two, you want to keep things broad and shallow. What I mean by that is that nested menus, for example, can make a lot of sense in visual design, because, okay, you go here and then here and then here and then here.
You can do this filtering process. In voice that’s a lot harder to do. So you basically just have to be ready for whatever they want to do at any given time. And if at step two they jump to what you would expect to be step six, you need to be able to jump with them. It’s not like, answer this, then answer this. It is, OK, what do you want? Tell us in your own words. And as best as possible you need to be able to respond to that and react appropriately.
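The two ideas Dylan describes here, accepting many phrasings of the same answer and letting a user jump ahead in a flow, can be sketched in a few lines of Python. This is only an illustrative sketch, not how Pulse Labs or any voice platform implements it: the synonym sets, the slot names, and the food-ordering flow are all hypothetical, and a real skill would lean on the platform’s NLU instead of word lists.

```python
import re

# Hypothetical synonym lists; a real skill would rely on the platform's NLU.
AFFIRMATIVE = {"yes", "yeah", "yep", "uh-huh", "sure", "okay"}
NEGATIVE = {"no", "nope", "nah", "uh-uh"}


def normalize_yes_no(utterance):
    """Map a free-form reply to 'yes', 'no', or None if unrecognized."""
    # Keep hyphenated fillers such as "uh-huh" together as one token.
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", utterance.lower())
    for word in words:
        if word in AFFIRMATIVE:
            return "yes"
        if word in NEGATIVE:
            return "no"
    return None


# Hypothetical slots for an order-ahead flow. The list gives only the
# default prompt order; users may fill the slots in any order.
SLOTS = ["item", "size", "pickup_time"]


def next_prompt(filled):
    """Return the first slot that is still missing, or None when done."""
    for slot in SLOTS:
        if slot not in filled:
            return slot
    return None


# A user who opens with the pickup time has "jumped ahead"; the flow
# simply asks for whatever is still missing.
order = {"pickup_time": "6pm"}
print(normalize_yes_no("Uh-huh, sounds good"))
print(next_prompt(order))
```

Keeping the flow as a flat set of slots, instead of a fixed sequence of questions, is one way to stay “broad and shallow”: the order of the list only decides what to ask next, never what the user is allowed to say.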
Billy: That’s true. Dang man. It’s a whole different world really even though there’s a lot of similarities. So I want to ask you while I’ve got you, you guys are testing a lot of bots. What do you see are the characteristics of the skills that are getting high adoption and retention like what do those skills have in common?
Voice Skills with High Adoption and Retention
Dylan: Yeah so the skills that are getting high adoption, high retention. So two things, well maybe 3 things. First off it’s a lot more effective when you’re thinking about building these to try to say okay what is something that people are doing right now that we can make easier and better. So if it’s something that people are currently doing that you can then improve. That is huge and you can see major adoption there. As compared to something that they’re not doing that they would kind of need a significant change in human behavior.
The value proposition there has to be much higher. Something like being able to turn on the light bulb with your voice. That’s a really valuable proposition to a lot of people because turning off and on lights is something we have to do anyway. It’s something we’re doing anyway and so being able to make that better in certain contexts is a real win. Another thing is it helps if there is some, we’re talking about discoverability, it helps if there’s some trigger.
So if there’s some kind of external trigger in your life that’s going to make you think, I should do that. One of the most successful skills out there right now is actually the skill for the TV show Jeopardy. Every single day, based upon the Jeopardy clues from that day, there’s an additional set of questions that people can play on Alexa just by saying “let’s play Jeopardy”. What they see is a big spike in use right after the TV show. So that’s a natural trigger for playing that game, which has been very popular. It helps to have some trigger within the rest of the world, the same way that push notifications are a big thing on cell phones.
Make it Conversational
Then the other thing, and this is where testing really comes into it, is that it needs to be really intuitive and natural. Humans have this, I would even say innate, expectation for how conversation is supposed to work. Talking is almost the quintessential human thing. It’s a fundamental aspect of what makes us people. We have a lot of expectations about how a conversation is supposed to work, how talking is supposed to work. Expecting people to change that is a big expectation.
So you need to figure out how to best craft your conversation and craft your design in such a way that it mimics the way that people are already speaking. And a big part of that is just testing it and understanding it and getting real user feedback. That’s where Pulse Labs comes into this, helping with that aspect of it. And I think that applies to chat bots as well.
Billy: Yeah. When we build a new bot, we do the exercise where you sit back to back. One person is the bot, one person is the end user, and you read it out loud. If nobody would say that, let’s rewrite the copy. I think you guys are on to something. I really like it; it makes sense with the usability. I’m excited to see what comes with cars. We had Alexa at our home for a while, but eventually we just weren’t really using it, so we got rid of it. The only things we did use it for were things we were already doing, like playing podcasts, setting timers, and telling us the weather.
Dylan: Those are kind of the standard use cases. Alarms, timers, weather, music streaming. One thing we have seen a little bit more of is people starting to use recipes a lot more, now that people are cooking at home a lot more. So they’re using it as a kind of kitchen aid there. I think that the big changes we’re going to see are going to be around new devices, in new contexts. I think that’s what you’re going to see a lot of over the next couple of years.
Billy: I feel like I live in my car, driving everywhere, and I love it. But I would probably use a voice bot quite a bit once that became really available to me. Awesome, man. Before I let you go, you do have one other product that’s really interesting, and I wanted to let you tell people about it. It’s kind of like a Nielsen rating system for the bots. So I’ll just let you go.
Dylan: Absolutely. So we also have something that you can essentially view as a kind of Nielsen for voice. We have another panel where we can monitor what they’re using voice devices for: what people are doing and what people are not doing. So if we want to get a gauge on what’s popular, what’s trending, anything like that, we can. That’s information and data that we can capture and provide. We have a partnership with Kantar, which is one of the larger market research firms in the world, and we provide a voice component to their market research. So we’ve actually done a fair amount of work there. Compared to our testing work with product teams, this tends to be more with marketing teams.
But we’ve done a lot of work with marketing teams, essentially providing market research there, or even tracking and attribution. If you’ve got an ad campaign and there’s a voice aspect to it, it’s about being able to track that and get a gauge. How popular is that? What sort of changes are we seeing in consumer behavior based on that? We can track that with our measurement panel.
Billy: That’s amazing, man, I love that. You’ve got to have the data to make the smart decisions, and that just makes sense. So they can say, okay, we want to build a bot for our brand: what makes sense, what will people adopt, what do they use, and how do we compare once we’re live? It’s good stuff, man. Well, I appreciate you coming on, Dylan. It’s been really informative and entertaining. So if people want to reach out to you to continue the conversation, where can they get a hold of you?
Dylan: So first off thank you very much for having me on. It has been a pleasure and I really enjoyed it. Our website is just www.pulselabs.ai. Or you can reach out to firstname.lastname@example.org. If there is anything you are curious about or want to know more about just reach out and we can continue the conversation.
Billy: Thanks, we’ll chat later.