Microsoft Corporation (NASDAQ:MSFT) Goldman Sachs Communacopia + Technology Conference September 10, 2024 3:25 PM ET
Company Participants
Kevin Scott – CTO Microsoft
Conference Call Participants
Kash Rangan – Goldman Sachs
Kash Rangan
Excellent, excellent. Thank you very much. A real delight to be able to host the CTO of Microsoft, Kevin Scott, as our next speaker. I’m sure the discussion is going to be fascinating.
I’m told that, Kevin, this may be the first time ever at an investor conference for the Chief Technology Officer of, I think, still the world’s most valuable company. And I’ll let Kevin jump into his intro in a second, but I have to share with you a little-known piece of trivia.
If we’re talking about generative AI in such a big way and if this thing has become such a mainstream thing, it was a magical meeting that was perhaps facilitated between Sam Altman and Satya Nadella by none other than my guest here. So we owe our interest in generative AI and how it’s become such a mainstream thing to Kevin. So, Kevin —
Kevin Scott
Certainly not entirely.
Kash Rangan
In some part. So tell us a little bit about your background and how you got to be the CTO of Microsoft, and we have some questions to jump into.
Kevin Scott
Yes. It’s sort of a circuitous journey. But the short story is I grew up in the ’70s and ’80s in rural Central Virginia. I was super lucky to come of age when the personal computing revolution was happening. I studied computer science, and I thought I was going to be a college professor for a really long time. And then I left academia. I dropped out of my PhD program and joined this startup called Google about a year before their IPO.
And yes, I did a whole bunch of different things at Google. It’s the place where I first built AI systems. I did a bunch of stuff on both the search and the advertising side, building large-scale machine learning systems. For a while I ran the ads quality system, this big machine learning system that did CTR prediction for advertising, which was the thing that made the ad auction actually work.
Kash Rangan
I believe it’s a small business.
Kevin Scott
Yes, small business. And then I helped build this advertising company called AdMob: I left Google to do a start-up, which got acquired by Google 3 years later. I was at Google for a while, and then I went to LinkedIn, where I ran engineering and operations and helped take the company public. And then I joined Microsoft when Satya acquired LinkedIn.
Kash Rangan
That’s great. And you ended up as CTO of Microsoft. That’s a very unlikely story.
Kevin Scott
Quite unlikely, yes.
Kash Rangan
Yes. For someone who at some point considered a PhD in English Literature, this is quite a fascinating thing.
Question-and-Answer Session
Q – Kash Rangan
Kevin, can you share with us your view of where we are with generative AI today? How do you see it evolving over time? And is there at all a way to think about how this builds upon the old AI that you developed at Google, if that’s even a way to think about it, building one on top of the other?
Kevin Scott
Yes. So I think we’re still in relatively early innings. The interesting thing that’s happened over the past decade in particular is AI systems have started behaving more like proper platforms.
So the first AI systems that I built were relatively narrow. So if you wanted to solve a problem, like how do you calculate the effective CPM for an ad so you can rank them in an auction, you have a very narrow set of data about the ads and how people are clicking on them and you use a relatively simple algorithm that’s running at really big scale, and like you build a model and you run a bunch of experiments and it’s sort of a closed feedback loop.
And the model gets better and better and better over time, but only at the narrow thing that you trained it to do. And if you want to do a lot of machine learning, like in the past, like you had to have a whole bunch of different teams running a bunch of those like little vertical loops.
And I think the thing that has really started to accelerate over the past handful of years is that, with generative AI, you have these frontier models that really are quite useful for a huge variety of different tasks and applications and product contexts, which means that you can think about them as a platform, like a database or any other piece of infrastructure that you use to build things.
It doesn’t mean that you have zero work to do. There’s still a nontrivial amount of work that you have to do to go make something valuable out of this bag of capabilities that is modern generative AI. And I still think, to a certain extent, the hardest part about all of this is just having classic product sensibility, like understanding what a customer need is and what problem you’re really solving and then just attending to that with hair-on-fire urgency to try to do something valuable for somebody else.
But the problem that you don’t have right now is the one that I had when I wrote my first machine learning program, which is I had to sit down and spend 6 months reading a bunch of papers and writing a bunch of really complicated code, just so I could do a relatively basic thing. And that thing that was my first ML program in 2004, so a depressingly long time ago, a high school kid could now do in like 2 hours on a Saturday.
I mean, it’s just sort of shocking how much capability is now in the hands of people who want to create new things with AI. And so I think that’s where we’re at. And the thing not to lose sight of, although I really would encourage folks not to get too swept up in hype, is that there is a very real thing happening with these AI systems, where they’re becoming very much more powerful over time. I’ve said this a bunch of times in public and I say it all the time internally: we are demonstrably not at the point of diminishing marginal returns on how capable these AI systems can get.
And I think Dario was on stage right before us. I think all of the people who are sitting on the frontier, evolving this particular new infrastructure, can really see that there’s still a lot of power yet to be built and a lot of capability to be put in the hands of developers, where 5 years from now, 10 years from now, somebody will be sitting on stage telling some story about the impossible thing that they couldn’t do in 2024 that now a high school kid can do on a Saturday afternoon. That I’m sure of.
Kash Rangan
That’s where you see the things that we take to be very complicated capabilities become table stakes.
Kevin Scott
Yes. I mean, I know you all are investors; I usually say this to product makers. The thing that’s going to be really interesting over the next handful of years is seeing the companies and the entrepreneurs and the creative people prospecting at that boundary where things go from impossible to hard.
I think in every platform shift that you get, whether it’s the PC revolution, the Internet revolution, the mobile revolution, the first things that happen is like you get this amazing new bag of capabilities and people go off and build trivial things. And then they very quickly realize that things that have enduring value are the things that are just on the verge of impossible, like they’re ridiculously hard to do, but at least they’re not impossible anymore. And like that’s where the really interesting stuff lives.
And you can see it model generation by model generation, things phase-shifting from impossible to just hard. And that’s a good spot to focus your attention.
Kash Rangan
Got it. You’ve lived through a few tech cycles. How do you compare and contrast this AI cycle that we’re going through to the Internet, mobile, and cloud?
Kevin Scott
Yes. I think there are a lot of similarities across all of these big revolutions. Each is sort of catalyzed by infrastructure: you’ve got a new thing that makes a whole bunch of things possible that were impossible or extremely difficult or costly before. Because they are all platform revolutions, they’re not narrow. So it’s not a thing that one company is doing, where it’s like, okay, I’ve got some secret infrastructure that only I have and only I can imagine what the possibilities of the infrastructure are.
So instead, in parallel, we have a huge community of people being inspired in a bunch of different ways by what the infrastructure is going to allow them to do, which I think is really interesting and exciting and invigorating. The thing that makes me happiest about being in the spot that I’m in is seeing what creative people are going to choose to do with this.
Also, interestingly, I think all of these things have changed the way that software development happens. It not only opens up new possibilities for what software can be developed; it also changes the means of production of software itself.
So if you think about all of those previous revolutions, you get a brand-new toolset and all of a sudden a type of software gets easier to write. You’re just sort of excited as a software developer: oh, my God, now I’ve got this thing, and all of this stuff that irritated me before is easier now. And so those 2 things constructively interfere with one another. You’re off chasing new ideas, but you’re chasing them with a toolset that’s made you more productive than you were before. And so that’s truly an exciting moment to be in.
And I don’t know; we’ll sort of see over the coming years. These are things that are very hard to predict. But all of this may be happening faster than what we saw in the previous revolutions. One thing that’s relatively certain, though, if you believe that we’re in a big platform shift: brand-new trillion-dollar companies are getting created right now.
And usually, the folks who move early, who latch onto the possibility and get into that learning loop where they are experimenting and learning and understanding what’s valuable versus what’s trivial, are the ones who have real advantages in building those durable, super valuable things.
Kash Rangan
Got it. If this question sounds very intelligent, it is because Marco Argenti, our CIO, asked me to ask this question of you. I wish he’d been here sitting on stage with you, but he has another commitment. It goes like this. We have seen exponential improvements in LLMs so far.
There’s a race for attributes, parameters, modes and data size. Is the rate of change slowing down? Is this generation of models the path to AGI? Or do we need a fundamentally different evolution of the transformer architecture to continue making progress towards that goal? So clearly, that question did not come from me. Marco, thank you, in case you read the transcript of this.
Kevin Scott
Yes. Look, I mean, again, you all will have to be the judge of this over the coming weeks or months, but I think there are some super exciting things coming in the last half of this year that lots of folks have been working super hard on. And as I’ve said over and over again, I don’t think we’re at the point of diminishing marginal returns on the current paradigm.
I think we have a pretty clear path to building infrastructure that stays on the same path of performance gains that we’ve seen, and along multiple dimensions. So it’s capability, it’s cost, it’s the power performance of systems. It’s a bunch of things, and an entire ecosystem of really smart people tackling all of the different parts at all the layers of the stack, just trying to improve things.
That said, you are not wrong to suggest that there probably are some disruptive innovations that will change things again; we should hope for that. I hope the transformer is not the last interesting discovery in ML architectures. I hope not. And we even have a proof point.
We all have a 2-watt AI machine sitting between our ears, which is dramatically more efficient than the things that we’re building right now. So you should hope that we make some discoveries over the intervening years that bridge the gap between what we’re doing and what biology knows how to do.
So these things are not independent. At least from what I can see, we’re not at the point where we’re about to stall on progress in increasing the capability of the infrastructure because we don’t have the next idea for what needs to be done to make it more powerful.
Kash Rangan
Got it. There’s been a lot of talk about small language models versus large language models and the role and relative positioning of these 2. How do you shake out on SLM versus LLM? And a follow-up question that I wanted to ask is about open source. Where does open source fit into all this?
Kevin Scott
Yes. I mean, we can start with the fundamental principle that I think answers them both: I’m pro-developer. Do whatever you need to do to get the interesting application written so that you can do something someone cares about. Being dogmatic about what tool you use to solve a problem is kind of nuts.
And you can practice a bunch of wishful thinking about what you would like developers to do, but I’m sure you all know developers: they’re the orneriest people on the planet, highly opinionated, and they’re going to experiment with a bunch of different things, and they will choose the tool that makes the most sense for them to solve the problem that they’re trying to solve.
So in that sense, even look at Microsoft developers building Copilots. The way that a Copilot is architected, you are typically using a frontier model to do the bulk of the interesting work there. But you also use a bunch of smaller models in parallel: you have a fairly robust orchestration layer that decides how to route requests to which model, to let you achieve the performance that you need on a bunch of different dimensions for the application you’re trying to write.
Sometimes you need to send something to a small model because you’re not going to get a better answer from the large model, and it’s just much cheaper or much faster to make the inference request to the small thing. Sometimes you’re trying to do something on device, locally, and you can’t afford a network call into the cloud to invoke a large model.
And so I think having that flexibility to architect the actual AI applications using the right mix of models is an important thing for developers to have. But the large models are very important, and they sit on the frontier. And so again, if you are looking at building the most ambitious thing possible, you, I think, need to have one of them in your portfolio of tools so that you can attempt the things that only they enable.
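[The orchestration pattern Kevin describes can be sketched in a few lines. This is a toy illustration only: the model names, costs, and routing heuristic below are invented for the sketch and are not Microsoft’s actual Copilot logic.]

```python
# Toy sketch of an orchestration layer that routes requests between a
# small, cheap model and a large frontier model. Model names, prices,
# and the routing rules are hypothetical illustrations.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # hypothetical pricing

    def infer(self, prompt: str) -> str:
        # Stand-in for a real inference call.
        return f"[{self.name}] response to: {prompt[:30]}"

SMALL = Model("small-local", cost_per_1k_tokens=0.01)
LARGE = Model("frontier", cost_per_1k_tokens=1.00)

def route(prompt: str, needs_frontier: bool, on_device: bool) -> Model:
    """Pick a model: the small one when it suffices or when no
    network call into the cloud is possible, else the frontier model."""
    if on_device:
        return SMALL   # can't afford a cloud round-trip
    if needs_frontier:
        return LARGE   # frontier capability required
    return SMALL       # cheaper and faster, answer quality is the same

def handle(prompt: str, needs_frontier: bool = False, on_device: bool = False) -> str:
    return route(prompt, needs_frontier, on_device).infer(prompt)
```

[In a real orchestration layer the `needs_frontier` decision would itself come from a classifier or cost model rather than a caller-supplied flag; this sketch only shows the routing shape.]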
But it’s not a dichotomy, like not either/or. And same thing with open source. I think open source is just making tremendous progress. It’s super encouraging as a developer, I think, to see how fast the open source community is building really interesting things. And there are a bunch of people at Microsoft and Microsoft Research inside of my team who are building things like Phi, which is like a really capable SLM. And it’s open source for folks to use as they choose.
And again, with developers, they just — they want choice. They want to be able to experiment. They want to — they don’t want to be told what their toolset is. They want to be able to experiment and choose.
Kash Rangan
Got it. So at Goldman, we have this acronym IPA: infrastructure build-out, then platforms and applications. That’s how we’ve seen other computing cycles more or less evolve. Do you see generative AI progressing in a similar way, or am I hallucinating?
Kevin Scott
Yes. Look, so what I’m seeing right now, and this is the place where it is a really unusual revolution: you definitely have these things that are sort of independent of one another execution-wise.
So there’s a bunch of infrastructure stuff where you need to go pour concrete, sign leases, get power in place, solve a bunch of electrical engineering problems, solve a bunch of cooling problems, get the right silicon in the right places, design the network fabrics. And all of these things operate on different timelines. And then you have the software stack that sits on top of that, your low-level systems layer. And then you have your middleware and applications stacks that sit on top of that.
Things are moving so fast right now that it is kind of hard to imagine a world where you go do the infrastructure build-out and you wait until it’s done until you make the substantial decisions and deployments on the next layer up.
So all of this stuff is really feeding into each other in a way that I haven’t quite seen before, where you are making big decisions on things that really want to move super slow, but where you have to make them move fast because the technology is evolving at such an incredible pace. It’s really, really interesting. And I will say, I think you guys have Jensen coming on later.
Kash Rangan
Tomorrow morning, yes.
Kevin Scott
Yes. I mean, everybody in the ecosystem is moving materially faster right now than they were 3 or 4 years ago, materially faster.
Kash Rangan
Is it because the science and the technology are moving rapidly, or what is driving that?
Kevin Scott
Look, I think it’s the feedback loop. I think you’ve got a bunch of really smart people who can respond very well to urgency. And the place that we’re in right now with infrastructure is people ask all the time, are you building too much or building too little? And so far, it’s —
Kash Rangan
That’s what they want to know: are we going too fast, too quickly? How much are you going to spend?
Kevin Scott
Yes. I want to know it too. But so far, demand for the infrastructure has materially outpaced our ability to supply it. And so we are building at a pace where, based on our understanding of where demand is going to be, we’re trying to intercept it. Like I just said, there are a bunch of slow-moving things in the equation that we’ve just really had to make move much, much faster than they were moving before.
And the thing that I will say, I think the whole ecosystem is responding incredibly well. Do I wish it were faster? Yes, I wish it were faster. But thank God, it’s so much faster than it was like 4 years ago, because we would really be in a pickle then.
Kash Rangan
Got it. I want to get your views on compute costs. Generally, with tech cycles, the underlying inputs become cheaper: you get mass markets, standardization, et cetera. Given the high cost of compute here, how important do you think it is to bring down compute costs? And if you think it is, what are the developments that might support that view?
Kevin Scott
Yes. So it’s always super important to bring down the cost of compute. One of the things that has supported all of these platform revolutions that we talked about, personal computing, Internet, smartphone, cloud, has been this ability, from silicon to networks to the low-level software layers that empower the layers running on top of them, to get exponentially cheaper over time.
And I think we are definitely seeing that. I don’t know exactly what the number is right now, but back in May when I was giving my keynote at Build, the anecdote that we gave the audience was, at that point in time, GPT-4 had gotten 12x cheaper like per token to do inference on than at launch time.
And so part of that is because the hardware had gotten a lot better. And part of it is because the infrastructure had been tuned within an inch of its life, everything from numeric kernels, where people are writing the moral equivalent of a bunch of assembly code to extract every ounce of performance out of the chips that you’ve got, to foundational improvements to the algorithms that are running on top of the hardware layer. And I think this is just going to continue over time.
So the important thing to realize is that, on a price-performance basis, things are already getting way cheaper than they were. And there’s super good stuff coming, from hardware to system software to algorithms, that should keep that trend moving. We just have to keep pushing super hard on it, because if you really want all of this stuff to be ubiquitous, you need it to be as cheap as possible so everyone can use as much of it as makes sense.
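[To make the 12x figure from the Build anecdote concrete, here is a back-of-the-envelope sketch. The launch price used below is a made-up placeholder; only the 12x ratio comes from the talk.]

```python
# Illustrative only: a hypothetical launch price per million tokens,
# reduced by the ~12x per-token inference cost improvement Kevin cites
# for GPT-4 between its launch and May 2024.
launch_cost_per_1m_tokens = 36.00   # hypothetical $/1M tokens at launch
improvement_factor = 12             # ~12x cheaper, per the talk

current_cost = launch_cost_per_1m_tokens / improvement_factor
print(f"${current_cost:.2f} per 1M tokens")  # prints "$3.00 per 1M tokens"
```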
Kash Rangan
Kevin, you piqued my interest by saying super good stuff is coming. So to the extent that you can share with us, what, at a high level, conceptually, are the things that are giving you that conviction?
Kevin Scott
Unfortunately, very little that I could talk about.
Kash Rangan
That’s all right. I don’t want to create problems. We’ll just take that as a given.
Kevin Scott
Yes, if we were off the record, I could share. So look, we’ll have things coming shortly that will be super visible, and that I think will be very encouraging for people looking for signs of progress. But you can even just look at what’s happening on a week-by-week basis: all of this competition is happening, where Meta is doing great stuff with Llama and Anthropic is doing super good stuff.
And Google is doing the same. So there are these objective benchmarks for how everyone is performing, and because of competition, and because the science and the engineering are happening at such an incredible pace, every week things are getting better.
And the point that I have been trying to make for a while to all of the folks inside of Microsoft is that there is a weird nature to how the frontier progresses. You go build gigantic supercomputing environments, which are big capital projects that take a very long time to build, and then you put them in the hands of people who are going to train a frontier model on them. They optimize their workload on that environment, and then they do this extraordinary work. And then you get a new frontier model.
And because of the way all of this unrolls, you’re just applying dramatically more compute to the problem. And it happens in these increments because of the way that you’re building all of the systems.
And so, yes, the thing that people forget sometimes is that between the updates, you can get into this mode where you’ve convinced yourself that progress is only linear, that this benchmark only got this much better. And you sort of forget what the big updates have been if you look at our partner, OpenAI, like the jump from GPT-2 to 3 and from 3 to 4. I can’t say anything about what’s next and when, but it’s not like work stopped after GPT-4.
So the thing that we have seen for the past 6 years with generative AI is that, every couple of years or so, just because of the lockstep way that all of this stuff gets built, you get a major new update to the frontier.
Kash Rangan
So [Brett and Kendra], when Kevin is ready to officially announce the good stuff, we’d love to host him back at our Goldman Sachs AI symposium. Just putting it out there; always putting in a plug for the firm. How dependent is your AI strategy on OpenAI? Because you also have your internal AI organization with its own CEO of AI. How do these things work with the…
Kevin Scott
Yes. I think OpenAI, by any objective measure, has been one of the most consequential partnerships Microsoft has ever had. And we’re a company that’s had a lot of consequential partners. So we’re super grateful.
And I think we’ve been able to do things in a pretty extraordinary way just because it’s 2 really capable companies trying to push a big platform shift forward rather than one trying to do everything. So we don’t even think about it in terms of being super dependent; it’s a complicated bag of problems that we’re collectively trying to solve.
It’s just like with the PC revolution, where you had Microsoft and Intel and a whole bunch of OEMs doing this together. I mean, this is before my time at Microsoft; I’ve only been there for a little under 8 years now.
The mission of the company, from the point where it was founded, was to put a personal computer on every desk and in every house. And that was at a time when people didn’t even know what a personal computer was. And so through that partnership, the entire ecosystem was able to deliver on that mission, which is just completely audacious.
And I think this is another such mission: really unlocking the full potential of AI to create value for people everywhere is another equally large thing. I just don’t think it gets done by 1 entity. It’s a lot of people working very hard in the same direction.
Kash Rangan
And hence, that’s why you have your own AI CEO internally and then you have —
Kevin Scott
We do. I mean, Microsoft has had researchers working on AI since the 1990s. We were working on artificial intelligence when I was an intern at Microsoft Research in 2001.
Kash Rangan
You interned at Microsoft Research.
Kevin Scott
Yes. Microsoft Research reports to me now, and 23 years ago, I was an intern at Microsoft Research.
Kash Rangan
Any intern at Goldman Sachs, just take that as an inspiration.
Kevin Scott
But there’s a lot of AI that Microsoft is doing that is very complementary to what OpenAI is doing. We were doing it before, and it’s going to continue for the foreseeable future, because it’s a really large surface area. There are a lot of problems that need solving.
Kash Rangan
Good to know that. This, again, I’ll preface by saying it’s a Marco Argenti question, so it’s going to sound very erudite. Right, Belinda? We seem to be moving from chatbots to agents very quickly. What’s the vision with regards to AI performing more and more complex, long-running tasks? Do we see a future where AI-powered agents will be able to perform tasks that require planning, decision-making and execution across multiple environments and systems? This man, what a beautiful question. This is like poetry, right?
Kevin Scott
Yes.
Kash Rangan
That’s why I had to give credit to Marco.
Kevin Scott
So the answer to the question is yes. And I guess why do I believe that?
Kash Rangan
Yes.
Kevin Scott
So look, I think one is like it’s just necessity. So in order for AI systems to be fully useful, they do need to be able to do longer-term planning. They need to have memory. They need to be able to actuate products and services and interfaces on behalf of users.
And I think there’s a bunch of good work that’s been happening, everything from orchestration layers, where you’re giving developers really good frameworks for uplifting the basic capabilities of models, to help them do more of these sort of long-range, multistep tasks.
And then the models themselves, I think, are getting more capable of synthesizing plans on their own. You can even see a little bit of this now: if you go to ChatGPT right now and ask it to give you a plan for something, it can articulate pretty comprehensive plans for very complicated things. And so the next thing that you would want after that is for the agent to be able to say, “Okay, I see the plan, go do that.” And I think that’s…
Kash Rangan
That’s what’s next.
Kevin Scott
Yes. Look, I think lots of people are working on filling that hole in the capability of these systems. So yes, I think lots of good stuff is coming on that front.
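[The plan-then-execute pattern Kevin alludes to can be sketched in miniature. Everything here is hypothetical: the plan format, the tool table, and the executor are stand-ins, not a real agent framework, which would add memory, tool schemas, and error recovery.]

```python
# Toy plan-then-execute agent loop. The planner and the tools it can
# actuate are stand-ins invented for this sketch.
def plan(goal: str) -> list[str]:
    # Stand-in for an LLM call that decomposes a goal into ordered steps.
    return [f"research {goal}", f"draft {goal}", f"review {goal}"]

# Hypothetical tools the agent can actuate on the user's behalf.
TOOLS = {
    "research": lambda task: f"notes on {task}",
    "draft": lambda task: f"draft of {task}",
    "review": lambda task: f"approved: {task}",
}

def execute(goal: str) -> list[str]:
    """Run each planned step through the matching tool, keeping a
    simple memory of results so later steps could build on them."""
    memory: list[str] = []
    for step in plan(goal):
        action, _, task = step.partition(" ")  # "research X" -> ("research", "X")
        memory.append(TOOLS[action](task))
    return memory
```

[The interesting hard parts, long-horizon planning, recovering when a step fails, and actuating real products and services safely, all live outside this toy loop.]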
Kash Rangan
Got it. We have 2 minutes. I don’t have any more questions. Is there a question that you want to create yourself and answer for yourself?
Kevin Scott
Oh, God.
Kash Rangan
You’re a prompt engineer, right? I mean, so.
Kevin Scott
Yes. I don’t know. So I think the — not necessarily a question, but just a thing that I will leave everyone with. I think the thing that we’ve seen over and over again with building hard things is you want to strike the right point between aggressive pessimism and aggressive optimism. You just don’t want to get yourself caught up in hype in either direction.
So the thing inside of Microsoft, since we’re trying to do very hard things with these very complicated models, is you want teams to be as ambitious as humanly possible in how they’re putting this stuff to work. You really want to find the things that just went from impossible to hard.
You probably don’t want to spend a whole bunch of your energy doing incremental things, because optimizing an incremental thing when the underlying model infrastructure is getting more powerful so quickly probably means that the model is soon going to be able to do a whole bunch of that incremental stuff itself.
And this was a lesson we learned years ago, in the very, very early days of our partnership with OpenAI. I would have teams inside of Microsoft that would take GPT-3 and go build this fine-tuned, super optimized thing, and it was like 3% better on some benchmark and a little bit cheaper. And they’d be like, yay, victory. And then GPT-4 came, and it was like, crap, what do we do now?
Kash Rangan
We do better.
Kevin Scott
Yes. So you just — you want to be on the frontier with your ambitions. And like it’s a good spot to be.
Kash Rangan
That’s great. We are right at the top of our allocated time. On that note, thank you so much for giving us your perspective.