FOD#53b: Exclusive Insights: Microsoft CTO on AI, Copilot, & Democratizing Software
Reporting from Microsoft Build in Seattle
A small coffee shop, just a 2-minute walk from the Seattle Convention Center, is running at full speed. A line of people in North Face fleece jackets with lumpy backpacks is calmly shuffling from foot to foot, waiting to order.
“The Convention Center people didn’t notify us about any events!” says the girl behind the coffee machine, moving incredibly fast.
Microsoft Build is in town, meaning that over 4,000 people have flooded downtown Seattle.
The day before the conference officially opened, Microsoft announced an AI personal computer (Copilot+ PC), potentially breaking the spell of the Mac’s dominance in the laptop market. It may represent a fundamentally different way of using a computer.
To qualify as a "Copilot+ PC," a computer must combine its CPU and GPU with an NPU capable of over 40 trillion operations per second (TOPS), and have at least 16 GB of RAM and a 256 GB SSD. This sets a high performance baseline, excluding devices like the MacBook Air, whose NPU delivers only 18 TOPS and which starts with 8 GB of RAM.
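As a rough illustration (not Microsoft’s actual certification logic; the function name and structure are made up), the announced baseline boils down to a simple three-way check:

```python
# Hypothetical sketch of the Copilot+ PC baseline -- illustrative only,
# not Microsoft's real certification process.

def qualifies_as_copilot_plus_pc(npu_tops: float, ram_gb: int, ssd_gb: int) -> bool:
    """Return True if a machine meets the announced Copilot+ PC baseline:
    an NPU delivering 40+ TOPS, at least 16 GB of RAM, and a 256 GB SSD."""
    return npu_tops >= 40 and ram_gb >= 16 and ssd_gb >= 256

# A MacBook Air-class machine (18 TOPS NPU, 8 GB RAM) falls short:
print(qualifies_as_copilot_plus_pc(npu_tops=18, ram_gb=8, ssd_gb=256))   # False
print(qualifies_as_copilot_plus_pc(npu_tops=45, ram_gb=16, ssd_gb=512))  # True
```

The point of the baseline is that all three conditions must hold at once; a fast NPU alone does not qualify a machine.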
On the first day of the conference, May 21, Satya Nadella and the team spent over two hours introducing 50+ releases, updates, and developer previews. Without attempting to cover it all, here are four topics that stood out as the most exciting developments:
Scaling compute
Copilot stack
Phi-3 family
New Models-as-a-Service
As a bonus for Turing Post readers, I’ll provide commentary on some of these from Microsoft CTO Kevin Scott, thanks to Microsoft's communication group, who organized a meeting with him and a few other publishers.
It’s all about scaling compute
Satya Nadella comes on stage, and four thousand people burst into applause. And he is crushing it: none of the awkwardness so common among tech company leaders. He is calm and soothing, but you can see the steel core underneath. He shares that the rate of diffusion is unlike anything he’s seen in his life. Then he summarizes what has been done recently: “What we have built with the activity scaling laws is a new natural user interface that's multimodal, supporting speech, images, and video as both input and output. We have integrated memory that retains important context, recalls our personal knowledge, and accesses data across our apps and devices. Additionally, our new reasoning and planning capabilities help us understand complex contexts and complete complex tasks, all while reducing the cognitive load on us.”
He speaks a lot about scaling laws, and compute is obviously one of Microsoft’s focuses. Satya Nadella's ability to amass compute power and forge powerful partnerships is remarkable. Microsoft uses leading chips from NVIDIA and AMD, along with its own silicon: the Azure Maia AI accelerator (update) and Arm-based Azure Cobalt 100 virtual machines (public preview).
Azure Compute Fleet simplifies managing Azure compute resources, highlighting Microsoft's commitment to optimizing cloud infrastructure for AI. It enables efficient deployment and management of up to 10,000 VMs with a single API call, enhancing performance, cost, and operational efficiency through automation.
This compute power lays the foundation for the Copilot Stack, enabling Microsoft to weave AI into every aspect of its infrastructure. This transformation is driving AI mass adoption on an unprecedented scale. With hundreds of millions of users, including many corporate clients, Microsoft’s AI integration is becoming almost coercive, compelling widespread adoption.
Windows Recall
One of the features introduced with the new Copilot+ PC was Windows Recall, which aims to enhance the user experience by remembering and anticipating user actions on the PC. The feature is designed to act as a "photographic memory" of a user's virtual activities. When I mentioned that this sounds scary, Kevin Scott replied that the interesting thing about Recall is that the system is engineered to provide a lot of transparency and control over what it remembers. It includes controls to help you make the system forget things you'd rather it not remember. “For example, sometimes I write angry emails when upset and regret sending them. Now, I try to delete these drafts instead of sending them, recognizing that not everything should be remembered by my computer.”
He also added that transparency and control are key, but they also need to monitor the state of these systems. As memories grow longer, they can influence current actions in unexpected ways. This is a significant engineering challenge for systems with long-term memories.
Copilot stack
It’s funny to think, but Copilot could have never happened. “It's crazy,” says Kevin Scott. “We almost didn't do Copilot because the first folks who saw GPT-3 writing code thought it was really bad, with an acceptance rate of only 25-30%. People thought this would never work, and no one would want to use it. But we were like, yeah, but it went from zero to 30% instantly! It's not about where it is right now, but the trajectory it's on.”
At Microsoft Build 2024, a key focus was on the expansion and enhancement of Microsoft Copilot across various platforms and services. Copilot, powered by AI, is designed to assist users in increasing productivity, streamlining tasks, and enhancing user experiences through intelligent automation and support. Microsoft assumes that programming in natural language will continue to lower the barrier to entry for anyone who wants to build software.
For the everyday user, Copilot becomes a companion that Clippy could never be. Turns out Clippy was just too ahead of its time, can you believe it?
GitHub marketplace with Copilot extensions
One of the most interesting announcements related to the Copilot family is the GitHub Marketplace with Copilot extensions. Apps can be deployed on Azure directly within GitHub Copilot Chat.
Democratizing software development too much? How will it affect junior developers?
Kevin Scott: “The world needs vast amounts of software, and despite fears over the years, there remains a significant demand for developers. Having programmed for 40 years, I’ve seen recurring concerns about jobs disappearing, but the reality is a persistent deficit in our ability to create necessary software. Junior developers shouldn't worry; instead, embrace change and continuously learn new skills. Although some skills remain fundamental, many others become obsolete as technology evolves. Success in this field comes from being curious, adapting to new trends early, and leading others through changes. Previously, complex tasks required advanced degrees and extensive effort. Today, a high school student can achieve the same results quickly. This democratization of technology excites me, as it allows more people to create software that solves problems.”
Team Copilot
Another announcement worth mentioning was Team Copilot: an expansion from a personal assistant to a team assistant, aiding in meetings, group chats, and project management. New functionalities include managing agendas, taking notes, summarizing important information, and tracking tasks.
Powering the software development and business management of enterprises and small and medium businesses alike, Microsoft is perfectly placed to natively integrate AI into human lives.
Mustafa Suleyman, the newly appointed CEO of Microsoft AI, is responsible for making interactions with copilots even more human. Drawing on his experience building an emotionally supportive AI companion at Inflection, Suleyman is working on a new Copilot mobile app (according to Kevin Scott).
New proprietary models and MaaS expansion
Phi-3 Family
Microsoft is a driving force behind the development of Small Language Models (SLMs), which are designed for efficiency and accessibility, making them ideal for mobile devices. While Microsoft isn't a dominant player in the mobile device market, their work on SLMs has broader implications for AI and could benefit various platforms, including their own Surface Duo and Windows devices.
A standout is the freshly introduced Phi-3-vision, with multimodal capabilities that handle text and image inputs, supporting tasks like analyzing charts and graphs. It’s in preview and sized at 4.2 billion parameters. The other Phi-3 family members are Phi-3-mini (3.8 billion parameters), Phi-3-small (7 billion), and Phi-3-medium (14 billion).
You can find all Phi-3 models on Azure AI and Hugging Face.
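For a sense of how these models are prompted: the Phi-3 model cards describe a simple chat layout with `<|user|>`, `<|end|>`, and `<|assistant|>` markers, and Phi-3-vision additionally uses `<|image_N|>` placeholders for attached images. A small sketch of that layout (in practice you would let `tokenizer.apply_chat_template()` build this for you, and you should verify against the model card):

```python
# Sketch of the Phi-3-style chat prompt layout as described on the model
# cards -- illustrative; the tokenizer's chat template is the authority.

def phi3_prompt(user_message: str, image_slots: int = 0) -> str:
    """Build a single-turn Phi-3-style prompt. For Phi-3-vision, images
    are referenced with <|image_N|> placeholders inside the user turn."""
    images = "".join(f"<|image_{i}|>\n" for i in range(1, image_slots + 1))
    return f"<|user|>\n{images}{user_message}<|end|>\n<|assistant|>\n"

# Text-only prompt, as for Phi-3-mini:
print(phi3_prompt("Summarize this paragraph."))
# Phi-3-vision prompt with one attached image:
print(phi3_prompt("What trend does this chart show?", image_slots=1))
```

The image placeholders are what let a 4.2B-parameter model like Phi-3-vision line up chart or graph inputs with the surrounding text in a single prompt.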
SLMs adoption
When I asked how he envisions SLM adoption, given that Microsoft does not have the kind of access to mobile devices that Apple does, Kevin Scott pointed to two things: the desire for local inference on mobile devices, and Microsoft's success in building AI-powered mobile applications.
Read more about the new Models-as-a-Service and notable partnerships 👇🏼