Microsoft Build 2026: AI Agent Full-Stack Timeline

Microsoft's annual developer conference came back to a live stage in San Francisco. Over two days it shipped 70-plus announcements — at first glance, a pile of unrelated product names. Line them up, though, and a single chain appears: Microsoft is turning "agents that do your work" from demo-ware into infrastructure that actually runs inside a company and stays under control. This report lays out that chain — and every claim links to Microsoft's own source.

🔬 The claim we set out to test

Build 2026's AI announcements aren't separate bets — they're parts of one machine: Microsoft is building, from the model up, a full stack that lets AI agents do reliable work while staying governable by the enterprise. We test two things: whether that chain actually holds, and what's usable today versus still a slide.

👥 Who this report is for

People who use AI to get work done

Developers, PMs, ops — what new tools landed, what you can use today, and what to wait for.

Start with → Models · Apps

People who build AI models / tools

Model vendors, open-source, tool startups — where Microsoft places its own models, and what it left open to third parties.

Start with → Models · Runtime · Governance

Azure partners / sales

Pitching this to customers — what's GA (sellable today), what's preview, and how the pieces combine.

Start with → Runtime · Knowledge · Governance

People tracking where AI is going

Investors, analysts — Microsoft's overall play, and where the hype needs cooling.

Start with → Throughline · Adversarial check · Conclusion

First, two myths going around that we'll clear up

The moment the conference ended, claims started spreading online and in the press. We checked each one against the official source documents and found that the two below are either wrong or overblown. Let's set them straight so you don't get misled:

① Inaccurate

"Copilot was rebuilt into a platform that runs OpenAI, Anthropic, and open-source models all at once"

The thing that lets you freely swap in open-source models is Foundry, the developer product — not Copilot, the consumer one. The two are being conflated.
"Rebuilt" is the media's word; Microsoft itself only said it "added a few more model options." Nowhere near that dramatic.
And Copilot's ability to switch models has been around since March this year — it's not news from the June conference. Using Anthropic's models still requires an admin to flip a switch and is limited to certain regions.

② Debunked

"Windows' AI agent framework officially launched at this conference"

What actually "officially launched" is the cross-platform Microsoft Agent Framework, and it shipped back in April this year — not at this conference.
The Windows-specific stack shown at the conference (Windows Agent Runtime and the like) is still alpha / preview, rolling out only with the Windows 11 update in the second half of the year.
In other words: nothing called a "Windows AI agent framework" actually landed at this conference.

"AI won't transform your business — the system that runs it will."

— Microsoft CEO Satya Nadella, opening Build 2026. His point: a smart AI alone is useless; what actually changes how you work is "the whole system that keeps the AI doing reliable work" — the model, the context it needs to understand, and the tools it can use. You need all three.

Source: Satya Nadella Opening Keynote (official Microsoft Developer) primary · official transcript primary · condensed Highlights · all sessions on demand

This "AI agent stack" runs bottom to top across six layers (drawn by 42-research from official announcements). Orange = parts already generally available / blue = mostly still in preview. We break it down layer by layer below.

01 Model layer · the "brain" at the bottom

Microsoft built 7 AI models of its own

Until now, Microsoft's AI offering was largely resold OpenAI technology. This time, Microsoft's own AI team (MAI, led by Mustafa Suleyman) put out 7 in-house models in one shot, spanning image generation, speech, transcription, reasoning, and coding. Put plainly: Microsoft wants to hold "the brain of AI" in its own hands and stop depending on others at every turn. This is the foundation of the whole system.

The seven MAI models debut on stage at the Build 2026 keynote. Image © Microsoft, via microsoft.ai.

7 new models (5 base + 2 Flash)

5 modalities: image / speech / transcription / thinking / coding

3 third-party distribution channels

The MAI seven-model family (hill-climbing machine) Announced · mostly preview

Led by Mustafa Suleyman / MAI
Vision "humanist super intelligence"
Trained on Maia 200 chips

Microsoft frames the strategy as a "hill-climbing machine" (iterative self-improvement rather than one giant release), spanning five categories: image / speech / transcription / thinking / coding.
The full seven-model lineup: MAI-Image-2.5, MAI-Image-2.5-Flash, MAI-Transcribe-1.5, MAI-Voice-2, MAI-Voice-2-Flash, MAI-Thinking-1, MAI-Code-1-Flash.
★ First time distributing weights to third parties: beyond Foundry, they land on OpenRouter / Fireworks / Baseten, so customers can "tune the weights themselves", a major signal of reducing OpenAI dependence. The voice models are clone-resistant and watermarked throughout. Microsoft claims 1.4× performance/watt on Maia 200 ⚠ not independently verified.

Source: Building a hill-climbing machine · MAI keynote transcript primary

MAI-Thinking-1 (the model that "reasons") still in limited preview

Size mid-range (~35B parameters)
Memory span ~250K-word context
Training not copied from anyone else's model

This is Microsoft's first model built to "think and write code." Microsoft claims it gets about half right on the hardest coding benchmark (SWE-Bench Pro 53%), on par with the top-tier Opus 4.6, and nearly aces competition math problems ⚠ all tested by Microsoft itself.
But right now it's open only as a limited preview — and Microsoft's own two documents conflict (one says preview, one says GA); we go with the more specific one (preview).

Source: MAI keynote transcript primary

The rest: coding / image / transcription / voice some usable now, rolling out

MAI-Code-1-Flash for coding: very small (5B parameters), yet claims to get half right on coding benchmarks (51%) and cost less than peers ⚠ Microsoft's own number; already in VS Code and other dev tools.
MAI-Image-2.5 for image generation: Microsoft claims it ranks #2 on image editing, beating Google's Nano Banana 2 ⚠, and it already works inside PowerPoint.
MAI-Transcribe-1.5 for transcription: supports 43 languages, claimed 5× faster ⚠.
MAI-Voice-2 for speech synthesis: 15 languages.

Source: Building a hill-climbing machine primary

What this means for you

If you use AI

A few more models to pick from — especially MAI-Code, which is small and cheap, perfect to drop into everyday dev tools.

If you build models / tools

Note this one: Microsoft put its own model weights on third-party platforms like Fireworks and OpenRouter for the first time — it's grabbing for "model distribution" turf.

Azure partners

You've got a new card to play: "Microsoft-built, cost-controlled." But most of it is still in preview, so don't rush to promise customers ship dates.

02 Runtime layer · where AI actually "runs"

Foundry: the "factory" that pushes AI agents into production

Plenty of AI agents demo well but never dare go live — they're unstable, insecure, and impossible to debug when something breaks. Foundry is Microsoft's answer: a "production floor" where AI agents run reliably and can be monitored and governed. This is the most concrete product core of the entire conference.

The Build / Deploy / Operate layers of the Microsoft Agent Platform (drawn by 42-research from the Foundry devblog).

11,000+

model catalog (Microsoft's claim)

Model Router · Fireworks

1:1

isolated sandbox per session

Foundry: a platform that runs any model in preview

Models you can run OpenAI · Anthropic · open source · Microsoft's own

It used to be called Azure AI Foundry; this release upgrades it into a unified platform. Last week it added OpenAI's realtime voice and Claude Opus 4.8 too.
"Model Router" (now GA) automatically picks the best-fit model for your need and budget — so Foundry isn't locked to one vendor; it calls whichever works best, and elastically allocates GPU compute on demand.
The open-source model hosting service Fireworks is also fully integrated; Microsoft claims that during preview it handled 176 billion calls and was used by 17 Fortune 500 companies ⚠ Microsoft's own number.
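
The routing idea can be sketched in a few lines. This is a minimal illustration, not Foundry's Model Router: the ModelOption type, the route function, and every model name, quality score, and price below are hypothetical values invented for the example. It picks the cheapest model that meets a quality floor and fits a per-call budget.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str              # hypothetical model name, not a real catalog entry
    quality: float         # made-up quality score between 0 and 1
    cost_per_call: float   # made-up price in dollars per call


def route(options: list[ModelOption], min_quality: float, budget: float) -> Optional[ModelOption]:
    """Return the cheapest option that meets the quality floor and the budget."""
    eligible = [o for o in options if o.quality >= min_quality and o.cost_per_call <= budget]
    if not eligible:
        return None  # nothing satisfies both constraints
    # Cheapest first; break price ties by preferring higher quality.
    return min(eligible, key=lambda o: (o.cost_per_call, -o.quality))


# Invented catalog, for illustration only.
CATALOG = [
    ModelOption("small-fast", quality=0.60, cost_per_call=0.001),
    ModelOption("mid-general", quality=0.80, cost_per_call=0.010),
    ModelOption("large-reasoning", quality=0.95, cost_per_call=0.080),
]

easy = route(CATALOG, min_quality=0.5, budget=0.05)        # picks "small-fast"
hard = route(CATALOG, min_quality=0.9, budget=0.10)        # picks "large-reasoning"
impossible = route(CATALOG, min_quality=0.9, budget=0.01)  # None: no model fits
```

A real router would presumably weigh latency, capacity, and task type as well; the point here is only that "best fit for need and budget" reduces to a filter followed by a ranking.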

Source: What's new in Foundry · Foundry Models primary

Agent Service: the "runtime environment" that hosts AI agents preview · GA timing uncertain

Tool-agnostic connects Microsoft / GitHub / LangGraph and more

Each AI agent runs in its own "isolated room" (dedicated compute, memory, and file space, no interference), able to hold up under overnight, long-running, work-on-its-own tasks.
It can also give AI agents an "identity" like an employee — issue a badge, an email, a presence in Teams, a slot in the org chart; they can also run on a schedule or call one another (all still in preview).
It gives AI a "memory": it can remember how to do things, your preferences, and the current conversation (Microsoft claims the "how-to memory" lifts task success rate by +7–14% ⚠ Microsoft's own number).
⚠ On when it goes GA, Microsoft's own two documents contradict each other (one says "within 30 days," one says "end of June"), and the official docs still marked it "preview" on June 19. As of our check it had not officially launched.

Source: Build and run agents at scale with Foundry primary

What this means for you

If you use AI

Want an AI agent that "works on its own and runs tasks overnight"? There's now an official hosted environment — no need to stand up your own servers.

If you build models / tools

It's "framework-agnostic" — you can plug in LangGraph and others. But it also means Microsoft wants to be the hub "everyone connects to."

Azure partners

This is the best "ready to ship" story to tell customers; Model Router and Fireworks are GA, while hosted agents are still racing toward GA.

03 Knowledge layer · making AI "understand your stuff"

Microsoft IQ: the four kinds of "knowledge" fed to AI

No matter how smart the model is, it doesn't know what's going on inside your company. Microsoft uses four "IQs" to feed different knowledge to AI: Work IQ (your email/meetings/files), Fabric IQ (your company's business logic), Foundry IQ (reusable knowledge bases), and Web IQ (the live web). Put together, that's what lets AI actually answer to the point.

Four IQ legs feed into a single grounded agent (drawn by 42-research).

+54%

recall (combined, vendor-tested ⚠)

6-16

Work IQ API endpoints GA

<165ms

Web IQ latency (Microsoft's claim ⚠)

Foundry IQ (unified knowledge plane) Knowledge Bases GA · Serverless Preview

Knowledge sources Work · Fabric · Azure SQL · File · MCP
Serverless $0.24/CU-hour · scale-to-zero

The "knowledge base" feature is now GA, and it's not limited to Microsoft's own — Claude, ChatGPT, and others can plug straight in. It turns "have the AI look things up before answering" (the industry calls it RAG), once a chore you had to build yourself, into an out-of-the-box service.
⚠ The widely quoted "+54% recall" needs its framing corrected. It refers to "evidence recall": a knowledge base alone gets +46%, and +54% requires a "small model + agentic retrieval" combination. Microsoft ran the benchmark itself, on BrowseComp-Plus with gpt-5.4, and the "up to" figure is the upper bound ⚠ vendor-tested.
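
To make the framing concrete, here is a minimal sketch of how a recall uplift can be computed. It assumes that "evidence recall" means the share of gold evidence documents the retriever returns, and that the uplift is measured relative to a baseline; the source spells out neither definition. The document IDs and numbers are invented for illustration and are not Microsoft's benchmark data.

```python
def evidence_recall(retrieved: set[str], gold: set[str]) -> float:
    """Fraction of gold evidence documents that appear in the retrieved set."""
    if not gold:
        raise ValueError("gold evidence set must not be empty")
    return len(retrieved & gold) / len(gold)


def relative_uplift(baseline: float, improved: float) -> float:
    """Relative change over the baseline, e.g. 0.5 means +50%."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Invented toy data: 4 gold evidence documents for one question.
gold = {"d1", "d2", "d3", "d4"}
baseline_hits = {"d1", "d2", "x9"}        # plain retrieval finds 2 of 4
agentic_hits = {"d1", "d2", "d3", "x9"}   # a second retrieval pass finds 3 of 4

base = evidence_recall(baseline_hits, gold)     # 0.5
better = evidence_recall(agentic_hits, gold)    # 0.75
uplift = relative_uplift(base, better)          # 0.5, i.e. +50% relative
```

Note how much the headline depends on the baseline: the same 0.75 recall would read as "+25 points" in absolute terms, which is why the exact definition behind a vendor figure matters.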

Source: Foundry IQ · the original +54% recall benchmark primary

Work IQ + Work IQ API Preview · API GA 6-16

Captures how work actually happens (M365 email/meeting/file/Teams signals), providing a shared intelligence layer for all agents.
Work IQ API (endpoints GA set for 2026-06-16) is exposed three ways: A2A, remote MCP server, and REST. Its 10 general-purpose tools stand in for hundreds of individual operations, and the API is independent of Copilot licensing and billed by usage.

Source: Work IQ + API primary

Microsoft Web IQ Limited-access Preview

Built on the Bing / Microsoft web index (Microsoft says it already serves 1 billion+ users), it delivers real-time, citation-backed grounding across web / news / images / video / shopping; model-agnostic and MCP-native; Microsoft claims sub-165ms · zero data retention ⚠.

Source: Announcing Microsoft Web IQ primary

What this means for you

If you use AI

AI can finally "understand your company's stuff" — read your email, files, and meetings — so the answers are reliable, not those of a smart stranger.

If you build models / tools

Microsoft turned "feeding data to AI" into a standard service (and it's compatible with Claude/ChatGPT) — part of the build-your-own-RAG work just got absorbed.

Azure partners

Knowledge bases are GA and a key selling point for enterprise adoption; but be clear that "+54%" is a vendor-tested number under a specific combination.

04 Application layer · the AI you'll actually use

AI agents you can see and touch

The layers below are all groundwork; this is the one you'll open every day: Scout (a personal assistant that tracks your tasks and preps your meetings), AI that collaborates inside Teams, and the GitHub Copilot desktop app, where programmers hand work off to agents.

An illustration of Scout proactively surfacing tasks (drawn by 42-research).

Microsoft Scout (always-on personal agent) Announced

A personal agent that lives in M365, proactively surfacing tasks, prepping meetings, and drafting work, operating across apps and data (grounded by Work IQ).
This is the most "visible face" of the Microsoft agent platform — it gathers the capabilities of the three layers below into a single entry point users open every day.

Source: Introducing Microsoft Scout primary

Collaborative agents + Frontier Tuning + RLEs Preview / Announced

Collaborative agents: multi-agent orchestration where work happens (Teams / M365), with agents coordinating among themselves and with people, built on the M365 Agents SDK.
Frontier Tuning: customize MAI models on your own data via reinforcement learning environments (RLEs), which Microsoft calls an "AI training gym." Microsoft claims that on Excel / McKinsey / Land O'Lakes tasks the tuned models match or beat GPT-5.4/5.5 at 10× lower cost ⚠.

Source: Collaborative agents · Frontier Tuning primary

GitHub Copilot app (agent-native desktop) Announced

A brand-new desktop app that's agent-native: delegate coding tasks to agents from the desktop; the keynote positioned GitHub as "the control plane for all agents."

Source: GitHub Copilot app primary

What this means for you

If you use AI

This is what you'll actually open every day: Scout tracks your tasks, preps your meetings, and drafts; developers get a desktop tool dedicated to handing work off to AI.

If you build models / tools

"Frontier Tuning" lets enterprises tune a bespoke model on their own data — customization is the opportunity at this layer, and the competitive battleground.

Azure partners

Scout and collaborative agents are the best story to tell business teams; most are "announced," so be clear with customers about ship dates.

05 OS layer · AI moves into your computer

Windows draws a "safety fence" around AI agents

If an AI agent is going to run on your own computer, the OS has to rein it in first — it can't be allowed to rummage through your files or change settings at will. That's the "fence" Windows is building here. A heads-up: this is also the layer the media overhyped the most — most of it hasn't actually shipped.

Windows gives the agent a permission boundary; it can't freely access the whole system (drawn by 42-research).

Windows platform security for AI agents Announced · dedicated stack is alpha/preview

Introduces a security model for agents running on the OS: scoped identity / permissions / audit / isolation, so an agent can't freely access the whole system.
⚠ Untangle two things that get conflated: what's GA is the cross-platform Microsoft Agent Framework (MAF) (.NET/Python SDK, which hit 1.0 GA on 2026-04-02, before Build); the Windows-specific stack shown at Build (Windows Agent Runtime / MXC / OpenClaw on Windows) is alpha / preview, shipping with Windows 11 26H2 (second half of 2026). "Windows Agent Framework is GA" does not hold.
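
The scoped-permission idea can be illustrated with a short sketch. This is not the Windows API: AgentScope, is_allowed, and the folder paths are hypothetical names chosen for the example. It shows an agent identity bound to an allow-list of folders, with every access decision appended to an audit log.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class AgentScope:
    agent_id: str                      # hypothetical agent identity
    allowed_roots: tuple[str, ...]     # folders the agent may touch
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def is_allowed(self, path: str) -> bool:
        """Allow a path only if it sits under one of the allowed roots."""
        target = PurePosixPath(path)
        # Reject any path containing ".." so it cannot climb out of a root.
        ok = ".." not in target.parts and any(
            target.is_relative_to(root) for root in self.allowed_roots
        )
        self.audit_log.append((self.agent_id, path, ok))  # record every decision
        return ok


scope = AgentScope("agent-42", ("/home/me/projects",))
inside = scope.is_allowed("/home/me/projects/app/main.py")       # True
outside = scope.is_allowed("/home/me/.ssh/id_rsa")               # False
escape = scope.is_allowed("/home/me/projects/../.ssh/id_rsa")    # False
```

Denied requests are logged as well as granted ones, which is what makes the audit trail useful when something goes wrong.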

Source: Windows platform security for AI agents · MAF (1.0 GA on 4/2) primary

Project Solara + Surface RTX Spark Dev Box Announced

Project Solara: a new platform for agent-first devices, rebuilding the device/OS stack around "the agent as the primary actor" — a long-horizon vision.
Surface RTX Spark Dev Box: dedicated local AI development hardware that runs models/agents locally on NVIDIA RTX — the "local counterpart" to cloud Foundry. MAI models will also reach the N1X on Windows "within a few months."

Source: Project Solara · Surface RTX Spark Dev Box primary

What this means for you

If you use AI

In the future an AI agent can run on your own computer, locked in a "safety cage" — it can't rummage through your files and system. But most of this waits for the second-half Windows update.

If you build models / tools

Local AI hardware (Surface Spark) and a device platform (Project Solara) are a new battleground, but it's all early days.

Azure partners

⚠ This layer is the easiest to overhype — aside from one security model, the dedicated stack is basically preview/vision. Don't sell it as an off-the-shelf solution.

06 Governance layer · making the boss comfortable using AI

How to rein in AI so it can come into the company

When an enterprise puts AI to work, its biggest fears are "who's accountable when it errs, will it leak data, is it worth the money." This layer answers exactly those: set rules for AI, log everything, evaluate, and account for the return. And Microsoft open-sourced this toolset without locking it to its own platform — it wants to be the industry's "safety standard."

Agent Control Spec applies deterministic guardrails at 5 checkpoints in the loop (drawn by 42-research).

Two open "AI control standards": ASSERT and ACS ASSERT is open source

Who can use it not limited to Microsoft; any vendor's tools can plug in
Already adopted by KPMG · Zscaler · IBM and others

ASSERT (open source): helps you translate your company's rules into "automated checks," then run those checks against the AI to find its faults. Microsoft explicitly says it's not tied to its own platform — it's meant for AI developers across the whole industry.
ACS: a general standard for setting rules for AI — placing a checkpoint at each of the 5 key steps of an AI doing work (receiving the instruction, calling the model, changing state, using a tool, producing output), with rules portable across different platforms.
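
The five checkpoints can be pictured as a small rule table consulted at each step of the agent loop. The sketch below is an illustration under stated assumptions, not the ACS schema: the Checkpoint enum names, the Guardrails class, and the example rules are all hypothetical. What it takes from the text is only the idea of deterministic rules attached to five named steps.

```python
from enum import Enum
from typing import Callable


class Checkpoint(Enum):
    # Member names are invented; the values repeat the five steps named in the text.
    INPUT = "receiving the instruction"
    MODEL_CALL = "calling the model"
    STATE_CHANGE = "changing state"
    TOOL_USE = "using a tool"
    OUTPUT = "producing output"


Rule = Callable[[str], bool]  # returns True when the payload is acceptable


class Guardrails:
    def __init__(self) -> None:
        self._rules: dict[Checkpoint, list[Rule]] = {c: [] for c in Checkpoint}

    def add(self, checkpoint: Checkpoint, rule: Rule) -> None:
        self._rules[checkpoint].append(rule)

    def check(self, checkpoint: Checkpoint, payload: str) -> bool:
        """Deterministic: the same payload always gets the same verdict."""
        return all(rule(payload) for rule in self._rules[checkpoint])


guard = Guardrails()
# Hypothetical rules: a tool allow-list and a crude output filter.
guard.add(Checkpoint.TOOL_USE, lambda tool: tool in {"search", "calendar"})
guard.add(Checkpoint.OUTPUT, lambda text: "password" not in text.lower())

tool_ok = guard.check(Checkpoint.TOOL_USE, "search")                    # True
tool_blocked = guard.check(Checkpoint.TOOL_USE, "delete_files")         # False
output_blocked = guard.check(Checkpoint.OUTPUT, "Your Password is x")   # False
no_rules = guard.check(Checkpoint.INPUT, "anything")                    # True
```

Because the rules are plain data keyed by checkpoint rather than code buried in one framework, the same table could in principle be carried to another runtime, which is the portability the standard is aiming for.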

Source: Open trust stack (ASSERT/ACS) primary

Cross-framework observability (GA) + in-Foundry guardrails + Agent ROI Tracing & Eval GA · rest Preview

Tracing & Evaluations hit GA (6/3, built on OpenTelemetry, supporting LangChain/LangGraph/OpenAI SDK/MAF), with end-to-end tracing of prompts/models/tools/sub-agents.
Inside Foundry: Guided Guardrail Setup, Rubric Evaluator, Runtime DLP (all Preview), Purview Insights hit GA; Agent Optimizer (private preview → public this month); ROI for Agents (private preview) tracks completion rate / time saved / cost.
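
The three quantities ROI for Agents is said to track (completion rate, time saved, cost) combine into a simple calculation. The sketch below is one plausible way to do that arithmetic; the AgentRun fields, the hourly rate, and the sample numbers are assumptions invented for the example, not the product's actual formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRun:
    completed: bool
    minutes_saved: float   # human time avoided; counted only if the run completed
    cost_usd: float        # model and compute spend for this run


def summarize(runs: list[AgentRun], hourly_rate_usd: float) -> dict[str, float]:
    if not runs:
        raise ValueError("need at least one run")
    done = [r for r in runs if r.completed]
    cost = sum(r.cost_usd for r in runs)   # failed runs still cost money
    value = sum(r.minutes_saved for r in done) / 60 * hourly_rate_usd
    return {
        "completion_rate": len(done) / len(runs),
        "value_usd": value,
        "cost_usd": cost,
        "net_usd": value - cost,
    }


# Invented sample: four runs, one of which failed.
runs = [
    AgentRun(True, 30, 0.50),
    AgentRun(True, 30, 0.50),
    AgentRun(False, 0, 1.00),
    AgentRun(True, 60, 1.00),
]
report = summarize(runs, hourly_rate_usd=60)
# completion_rate 0.75, value 120.0, cost 3.0, net 117.0
```

The result is only as good as the "minutes saved" estimate, which is usually self-reported or modeled, so a figure like this deserves the same caution as any other vendor-side number.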

Source: Observability to ROI primary

Security across the SDLC (Agent 365-style governance) Preview

Protect code / agents / models (Defender + Entra Agent ID + Purview); every agent has an identity and is governed, including posture management and threat protection for AI workloads.

Source: Securing code, agents & models across the SDLC primary

What this means for you

If you use AI

When AI comes into the company, the boss worries most about "who's responsible if it errs, will it leak data." This layer answers that — traceable, governable, with the books balanced.

If you build models / tools

Microsoft released this set in the open (ASSERT as open source, ACS as a portable standard) and didn't lock it to its own platform; it wants to set the industry's "safety standard." Worth watching whether it catches on.

Azure partners

"Secure and governable" is a key chip for winning over large enterprises; cross-framework observability is GA, so you can pitch it directly.

07 Data and science · already shipping

Even databases are being "remade for AI," and R&D is already using it

Beyond the main thread there's another signal worth watching: even databases are being optimized for AI (better storage of memory and retrieval). More concrete still is Microsoft Discovery — one of the few things at the whole event that's "already GA," and it's already helping Mayo Clinic do medical research, not just demos.

Databases become agent-aware; Discovery is first to GA (drawn by 42-research).

Fabric IQ + Cosmos DB + Azure HorizonDB Preview

Fabric IQ models how the business operates (a semantic/ontology layer — the "business" leg of Microsoft IQ); the GPU-accelerated Fabric warehouse claims 7× performance ⚠.
Cosmos DB adds an agent memory toolkit / agentic retrieval / semantic reranking; Azure HorizonDB (managed PostgreSQL): zone-redundant, automatic failover, 128TB/cluster · 15 read replicas · 3× throughput ⚠.

Source: Agentic apps with Fabric & Databases primary

Microsoft Discovery GA

The agentic AI platform for scientific R&D hit GA (one of the few non-preview announcements at the whole event), with the Discovery app entering preview.
Used in the Mayo Clinic frontier medical-AI partnership (whose platform claims to cover roughly 100 million people across four continents) — agentic AI genuinely landing in science and medicine.

Source: Microsoft Discovery GA primary

What this means for you

If you use AI

Even databases are now "optimized for AI" — storing memory and doing retrieval more smoothly. The more tangible signal for everyone: Discovery is already helping hospitals do research.

If you build models / tools

The data layer (Fabric, HorizonDB) is the foundation of AI applications, and Microsoft is betting heavily here too — an opportunity up and down the stack.

Azure partners

Discovery is one of the few things at the event that's "already GA," with a flagship case like Mayo Clinic — the most mature solution to pitch.

We deliberately poked 5 holes

To avoid being swept up by the hype, we deliberately played devil's advocate — picking the 5 loudest claims and checking each against the official source documents to see if it holds. The conclusion: the broad direction is real, but some "highlights" don't survive the question "is it actually usable yet?" — one is outright debunked, and the rest are mostly "still in preview" or "vendor's own numbers."

Exactly 7 MAI models · note this

"Seven" is confirmed word-for-word by two primary sources, but it breaks down as 5 base + 2 Flash variants; "announced" is accurate, "launched / already GA" is overstated.

Copilot spans OpenAI/Anthropic/open-source multi-model · needs correction

Open-source models are a Foundry capability, not in the Copilot product; "rebuilt" is a reporter's phrasing; and this dates to 2026-03, before Build.

Foundry Agent Service GA in early July · time-sensitive

Two official sources conflict on timing ("next 30 days" vs "end of June"); Learn still marked it preview on 6/19.

Windows Agent Framework is GA · debunked

What hit GA is the cross-platform MAF, and it was GA on 4/2; the Windows-specific stack is alpha / preview.

Foundry IQ recall +54% · vendor benchmark

A pure knowledge base only gets +46%; +54% requires a "small model + agentic retrieval" combination; vendor-tested, not independently verified.

✅ The one-line takeaway

The main thread we set out to verify is real: the pile of Build 2026 announcements really does string together into one complete bottom-to-top chain (models → production floor → knowledge → apps → OS → control), and every link can be traced to a Microsoft source document. Nadella's point, that a smart AI alone is useless and you need the whole system that puts it to work, is the design logic of this chain.

But don't let the hype go to your head: most of this chain is still in "preview," not usable right away (only a few things are truly GA: Microsoft Discovery, Foundry knowledge bases, Model Router, cross-framework observability); the eye-catching numbers (+54%, 10× cost savings, etc.) are all tested by Microsoft itself, with no third-party verification; and we corrected two myths going around. The blueprint is clear, but whether it ships remains to be seen. The most honest line we can give you: the vision is "a complete AI agent infrastructure," but what you can actually bet on and use today is only the handful of pieces that have officially shipped.

📚 Citations and sources (primary-first; full list in sources.md)

Build 2026 official news hub (master index) primary

MAI keynote official transcript · the 7 MAI models primary

What's new in Microsoft Foundry · Foundry Agent Service primary

Foundry IQ · +54% recall benchmark primary

Open trust stack (ASSERT/ACS) · MAF (1.0 GA on 4/2) primary

Work IQ + API · Microsoft Web IQ primary

Microsoft Scout · GitHub Copilot app · Copilot multi-model (before Build) primary

Windows platform security for AI agents · Project Solara primary

Agentic apps with Fabric & Databases · Microsoft Discovery GA primary

Build 2026 Satya Nadella Opening Keynote (official full video) · official session catalog primary

⚠️ Limitations and open questions (research honesty)

Whether Foundry Agent Service / hosted agents hit GA on schedule — "next 30 days" vs "end of June" conflict between two official sources; Learn still marked it preview on 6/19; revisit after July to see if it was delivered.
The true status of MAI-Thinking-1 — the transcript calls it private preview, the blog overview calls it GA; the in-source conflict is unresolved.
The maturity and final naming of the Windows-specific agent stack — shipping with Windows 11 26H2; we won't know until the second half of 2026.
All vendor benchmarks (the various MAI figures / +54% recall / 10× savings / 7× warehouse / 3× HorizonDB) have no independent third-party reproduction and are presented only as "Microsoft's claims."
Images marked "© Microsoft" are externally linked official assets (the hero key visual and keynote stage photo, both from microsoft.ai, with source attribution, used for commentary/research); the rest, marked "drawn by 42-research," are self-drawn SVG diagrams (self-contained, vector, no external dependencies).
The official hub has 70+ announcements; this topic focuses on the ~25 on the AI main thread; non-AI-core items (Azure Cobalt 200 / Azure Linux 4.0 / quantum Majorana 2, etc.) are not included in the main body.

42-research · This artifact is self-contained semantic HTML following schema.org/ScholarlyArticle. The source of truth lives in git, the query index in D1; for methodology see docs/methodology. All text traces back to Microsoft's primary sources, collected 2026-06-22.