Interview with Qualcomm’s Gary Brotman, Part 2: On-Device Machine Learning and AI Hitting Mainstream
Qualcomm is at the forefront of artificial intelligence computing on mobile devices, with many advancements in recent years and many more to come with the launch of the Snapdragon 845. Our very own Editor-in-Chief Mario Serrafero recently had the opportunity to interview Gary Brotman, a product director at Qualcomm who heads up the company’s Artificial Intelligence and Machine Learning Product Management.
In this second part of a two-part series, we talk about the future of AI and the role that Qualcomm is hoping to play in it.
Mario Serrafero: So we sat on this panel about moving machine learning, or at the very least inference, from the cloud to the device and this is what all these new advancements that we’re seeing are enabling, especially if they support all these frameworks that developers are using. They’ve made it easier. So do you actually see kind of that being a catalyst that will kind of stimulate some higher degree of adoption within just regular applications on smartphones?
Gary Brotman: Yeah! Yeah, I think so. One of the things that may seem a little bit cheeky, you look at examples in the market of when things start to tip over into the mainstream, it becomes part of the common dialogue of an average individual.
Do you watch Silicon Valley?
G: You know the Not Hotdog app? When something like that hits […] I have it on my phone actually, I’ve installed it and I’ve tested it at the grocery store, and I’ve gone and I’ve looked in the hot dog section, right? It only identifies a hot dog! Multiple hot dogs don’t work. But something like that is a clear indication that it’s hit a point where it’s understandable.
M: Yeah. The underlying mechanisms are still mystical to most.
G: They should be, by the way! Most users shouldn’t even know they exist.
M: No, you’re right yeah. It’s a closed box and it does its thing, and you don’t have to worry about anything. Yeah, no, that’s true and I hadn’t considered that it is tipping into the mainstream and in a way that you can see it.
G: Yeah but the other thing is […] it’s great that there’s all the attention, and I think it’s appropriate because what’s happening is that it’s changing so many things, and I think that [with the] limits [of] compute and power constraints […] we’re always needing more, like, “We want more, yeah, just give us more, give us more,” so we can advance more, so we can create more, and innovate more.
M: It’s crazy because, like, theoretically these algorithms, these strategies, have been laid for decades and it just took being able to finally, you know, get a proof-of-concept that’s viable for any kind of consumer application and now it just blew up. So that’s interesting too.
G: The term AI is also one that I think is a very broad one. Really, at the end of the day, the AI is the result of the technology. It’s the result of something that is supposed to mimic the human brain.
M: It is kind of also muddied up with general AI. I try to use the term AI as little as possible because I know some people are going to say, “Well that’s not actual AI!”, and well, it’s not meant to be, it’s not general AI.
G: Right, but it is a term that went from being science fiction and creepy, to now being common.
M: Unless you’re Elon Musk.
G: Uh, still creepy!
So you mentioned decades […] A long time ago, I worked in the music business — primarily in digital download, subscription music, MP3 compression. And I worked for a company in San Diego called Musicmatch. They had jukebox desktop software for the PC that you could use to rip your CDs, make your own MP3s, and connect into the Gracenote database to get all your metadata. And we had developed a personalization engine that was designed by some guys who were doing security for financial transactions. This was before Pandora, right? In fact, Pandora’s precursor was something called the Music Genome Project, and we had built our engine around probably the same time that the Music Genome Project had started. And the premise was, you as a consumer, the user of our product, you just let us listen and monitor your behavior. So your playlist will get uploaded to our servers if you give us permission, and we will provide you with rich recommendations. That was based on implicit data, implicitly gathered data — nobody would rate songs up or down, thumbs-up thumbs-down. It was: We observed what you do, not what you say you do, and we will compare your listening behavior with that of the community. So you’re not looking at eras or genres or instrument types or whatever, there’s not one variable. It’s basic, it’s very general. And [in the end] you get these very rich deep recommendations, your fabric, your palate of music is quite different than just a genre-based station on the radio.
The effortlessness of it was what I found amazing, and that was my first introduction to what would be considered AI collaborative filtering. That was back in 1999, so here we are almost twenty years later and now the thing that excites me is that now I can get to talk about the device on the desktop knowing my voice. Not just my voice, but knowing my mood by the tone of my voice, so that a playlist is generated based on that, kind of an implicit offer of information about me, my personal [mood]. I’m not saying, “I’m sad”, my voice, the intonation is [saying] I’m sad […] For [a] vision device like the Amazon Echo Show or whatever, well, the same thing is true. You can map a face, you could detect an emotion, you can provide context relevant to that emotion to alter it or to let it run. That’s all possible. So I look at those two things, twenty years later, the implicit observation is this was all cloud-based and now this is the local intelligence in observing what we do, to provide a very similar experience but not something that’s tethered to the cloud, except for the content of course.
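[Editor’s note: The implicit-feedback collaborative filtering Gary describes, where recommendations come from observed listening behavior compared against the community rather than from explicit ratings, can be illustrated with a minimal sketch. This is not Musicmatch’s engine, whose design the interview does not detail. The play counts, user names, track names, and the choice of cosine similarity are all assumptions made for illustration.]

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse {track: play_count} dicts."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(plays, user, top_n=3):
    """Rank tracks the user has not played, weighting each community
    member's play counts by how similar their listening is to the user's."""
    mine = plays[user]
    scores = defaultdict(float)
    for other, theirs in plays.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue  # no overlap in listening, so no signal to borrow
        for track, count in theirs.items():
            if track not in mine:
                scores[track] += sim * count
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical implicit data: observed play counts only, no ratings.
plays = {
    "ana": {"track_a": 5, "track_b": 3},
    "ben": {"track_a": 4, "track_b": 2, "track_c": 6},
    "cy": {"track_d": 7, "track_e": 1},
}
print(recommend(plays, "ana"))
```

[Note that nothing here looks at genre, era, or instrumentation. The only input is what people actually played, which matches the “we observed what you do, not what you say you do” premise.]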
M: Many of the cutting-edge changes being made through the DSP and HVX are coming through flagship chipsets. We’ve seen, though, in the past few years that a lot of flagship features have been making their way downstream more quickly to Qualcomm mid-range devices. So how do you see, first of all, that evolution for DSP and AI on mid-range devices, and how do you see that enhancing the adoption of these technologies for both consumers and developers?
G: AI is a compute problem and if there’s compute you can do something. Strategically it makes sense for us to be segmented, our portfolio is segmented by competing capabilities, different elements, different quality that you can capture pictures in, or whatever. And that’s not gonna change, and I don’t see that there would be a ubiquity across each of the chips with respect to hardware. We’ll likely keep certain things in high and mid-tier, that won’t change. But because of the fact that there is compute there – we have software tools like the neural processing engine, and that’s designed to actually be agnostic. That’s already running all the way down to the 400 series. If you want to build a connected camera and you have a low tolerance or low threshold for frames per second […] well, we’ve got the tools for you to do that. The same thing’s true for audio, but each tier is going to give you a different level of performance.
It becomes a portfolio strategy question, it’s a sales problem as much as anything else. But the innovation [is] certainly at the top tier and I think it’s easier to do that, to “go big”, than it is to “start small” and try to build on top.
But I think it’s early days from a business standpoint. AI can be a differentiator for things like the premium tier because there are certain use cases where you can’t achieve the performance you need in lower tiers. The horsepower is just not there.
M: Precisely. If the developer ecosystem really wants to go big on this, they need a wider range of devices to be able to compute these features efficiently, to the point they are actually worth implementing.
G: Yes, so you look at application developers like Facebook or Snapchat or whatever, they face [these] problems irrespective of AI. They face the same problems of, “Well if I have something that is resource-dependent, a feature, it’s going to perform differently on different phones.” Particularly with an AI-driven feature like style transfer, you want that to be as seamless as you can [make it] but you’re not going to get that down to the lower-tier, entry-level phones.
M: You know what, I actually haven’t tried that on a low-end phone. Now I’m interested to actually see that.
G: I’d be interested — I’ve never tested them! They have to account for that sort of thing, they have to figure out, “How do I find a happy middle ground, between exceptional performance and acceptable performance?” You can’t bifurcate the apps and have one app for one tier and one for another.