Understanding AI Costs in Microsoft Fabric & Monitoring Usage
Over the last few months, I’ve been writing a lot about Microsoft Fabric, Data Agents, and the economics behind modern analytics platforms like Fabric and Databricks. You can find all the articles on my LinkedIn profile, or discover them more easily at http://www.thedatamassagist.com/
But there’s one question I never covered in detail — even though it comes up in every customer conversation:
How does Microsoft charge for AI in Fabric? And equally important: How can customers monitor the tokens used by Copilot for Power BI, Copilots in Fabric, and the built‑in capacity that powers Data Agents and Operational Agents?
This article finally answers that question clearly, practically, and without marketing fluff.
Why This Matters Now
AI is no longer an add‑on to analytics platforms — it is the interface. Fabric’s Copilots generate pipelines, SQL, Spark code, DAX, reports, and even entire Data Agents. But AI is also compute. And compute is cost.
Customers want to know:
What exactly am I paying for?
How do I monitor it?
How do I estimate the cost of AI workloads?
How do Data Agents consume capacity?
The good news: Microsoft Fabric uses a single, unified billing model for all AI features.
Before moving forward, it's important to remember that Microsoft Fabric abstracts the underlying LLM (e.g., OpenAI‑based models), so organizations don’t choose the model. Instead, Fabric gives you controls, limits, and monitoring at the capacity and workload level; it does not expose which LLM was used to serve a given request.
You can learn more about this topic by reading edition #9 of The Data Massagist Newsletter: Governing AI Responsibly in Modern Analytics Platforms
The Core Principle: AI in Fabric Is Charged Through Capacity Units (CUs)
Fabric does not charge separately for AI. There is no “Copilot license”, no “per‑prompt fee”, no “AI surcharge”.
Instead:
Every AI interaction consumes Capacity Units (CUs) based on tokens processed.
This applies to:
Copilot for Power BI
Copilots in Fabric (Data Factory, Data Engineering, Data Science)
Data Agents
Operational Agents (AI‑generated pipelines, SQL, Spark, transformations)
Everything runs on your Microsoft Fabric capacity.
The Token Model: The Heart of AI Billing
Microsoft uses a transparent, predictable token model:
400 CU‑seconds per 1,000 input tokens
1,200 CU‑seconds per 1,000 output tokens
This is the same model used across all Fabric Copilots.
Why does this matter? Responses typically contain 2–4× more tokens than prompts, and output tokens are billed at three times the input rate. So the cost driver is not the question; it’s the answer.
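To make the arithmetic concrete, here is a minimal Python sketch of the conversion, using the two rates above. The function name is mine, for illustration; it is not a Fabric API.

```python
# Rates from the Fabric token model described above.
CU_SECONDS_PER_1K_INPUT = 400
CU_SECONDS_PER_1K_OUTPUT = 1200

def cu_seconds(input_tokens: int, output_tokens: int) -> float:
    """Return the CU-seconds billed for one AI interaction."""
    input_cu = input_tokens / 1000 * CU_SECONDS_PER_1K_INPUT
    output_cu = output_tokens / 1000 * CU_SECONDS_PER_1K_OUTPUT
    return input_cu + output_cu

# Example: a short prompt (200 tokens) with a long answer (800 tokens).
# The output dominates: 80 + 960 = 1,040 CU-seconds.
print(cu_seconds(200, 800))  # 1040.0
```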
How Each AI Feature Consumes Capacity
All AI features in Fabric — including Copilot for Power BI, Copilots in Fabric, Data or Operational Agents, and AI Data Operations — are billed through Fabric Capacity Units (CUs) using a token‑based AI consumption model.
There is no separate AI license. AI usage = tokens → CU‑seconds → deducted from your Fabric capacity.
Let's go deeper.
1. Copilot for Power BI
Copilot for Power BI is included with Fabric capacity (F64+) or Premium Per User (PPU). Every Copilot action (generating a report, writing DAX, summarizing a model, explaining a dataset) consumes CUs through the Fabric token model. All AI activity is billed based on input + output tokens, converted into CU‑seconds.
Power BI Pro License Clarification: Even though Copilot’s compute runs on Fabric capacity, Power BI licensing rules still apply:
A Power BI Pro license is required when users publish, share, or collaborate on Copilot‑generated content in non‑Premium workspaces.
A Pro license is not required when using Copilot in Power BI Desktop, in Fabric‑backed workspaces (F64+), or for personal, non‑shared content.
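As a rough sanity check, the two rules above can be encoded as a small decision helper. This is an illustrative sketch of the cases listed, not official Microsoft licensing logic; real licensing has more nuance.

```python
def pro_license_required(in_desktop: bool,
                         fabric_backed_workspace: bool,
                         shared: bool) -> bool:
    """Encode only the rules stated above; all parameter names are mine."""
    if in_desktop:
        return False  # Copilot in Power BI Desktop: no Pro needed
    if fabric_backed_workspace:
        return False  # Fabric-backed workspace (F64+): no Pro needed
    # Non-Premium workspace: Pro is needed only to publish, share,
    # or collaborate, not for personal, non-shared content.
    return shared
```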
Where to Monitor Usage:
Fabric Capacity Metrics App → Copilot workload (Tracks token consumption and CU usage for Copilot actions)
Power BI Admin Portal → Users → Licenses (See which users have Power BI Pro assigned)
2. Copilots in Fabric (Data Factory, Data Engineering, Data Science)
These Copilots generate pipelines, notebooks, Spark code, SQL, and transformations.
All of them use the same token → CU model.
Where to monitor it:
Capacity Metrics App → AI workload
Admin Portal → Workload usage
3. Data Agents and Operational Agents
Data Agents are one of the most powerful — and most misunderstood — AI capabilities in Microsoft Fabric. They run entirely on Fabric capacity, and every natural‑language query they process is converted into tokens, which are then billed as CU‑seconds. Because Data Agents often return long, contextual, multi‑step answers, their output tokens tend to be high, which naturally increases CU consumption.
I recommend reading my previous article: Designing AI‑Ready Analytics with Microsoft Fabric Data Agents
But Data Agents are only one side of the story.
Microsoft Fabric also includes Operational Agents, part of the Real‑Time Intelligence (RTI) workload. These agents are more sophisticated: instead of waiting for a user prompt, they continuously observe signals, events, and data streams, and then trigger actions based on real‑time analysis and predefined business rules.
Operational Agents behave like autonomous digital workers:
AI that runs things — monitoring, remediation, orchestration
Persistent, autonomous, and stateful
Continuously active, not one‑time prompts
Executed inside the Data Agent workload
In other words:
Data Agents → AI that answers
Operational Agents → AI that acts
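Conceptually, an Operational Agent behaves like the loop below. This is a hypothetical sketch of the observe-evaluate-act pattern, not Fabric’s actual API; in practice the Real‑Time Intelligence workload handles this orchestration for you, and `stream`, `rules`, and `act` are placeholders.

```python
import time

def operational_agent(stream, rules, act):
    """Conceptual observe -> evaluate -> act loop of an Operational Agent."""
    state = {}                        # persistent: agents are stateful
    while True:                       # continuously active, not a one-time prompt
        event = stream.read()         # observe signals, events, data streams
        state.update(event)
        for rule in rules:            # predefined business rules
            if rule.matches(state):   # real-time analysis of current state
                act(rule.action(state))  # trigger an action autonomously
        time.sleep(1)
```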
Where to monitor both
All agent activity, including both Data Agents and Operational Agents, is visible in the Fabric Capacity Metrics App, under the Data Agent workload.
4. AI Data Operations
These include:
AI‑generated pipelines
AI‑generated SQL/Spark code
AI‑assisted transformations
These operations often produce long code blocks → high output tokens → higher CU usage.
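To see why output dominates here, consider a rough calculation using the rates above; the token counts are illustrative assumptions.

```python
# Assumption: a short prompt (~100 tokens) asking Copilot to generate a
# pipeline, and a long generated code block (~2,000 output tokens).
input_cost = (100 / 1000) * 400      #    40 CU-seconds
output_cost = (2000 / 1000) * 1200   # 2,400 CU-seconds
print(input_cost + output_cost)      # 2,440 CU-seconds: ~98% of it is output
```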
Where to monitor it: Capacity Metrics App → Data Factory or Data Engineering workloads
How to Monitor AI Usage in Fabric
Microsoft gives you three monitoring surfaces:
1. Fabric Capacity Metrics App
The most important and comprehensive place to monitor AI usage in Fabric is the Fabric Capacity Metrics App. This is where administrators can see exactly how Fabric Copilot and Fabric Data Agents are billed and how their consumption is reported.
Copilot usage is measured through the number of tokens processed. Tokens are small pieces of text — roughly 1,000 tokens ≈ 750 words. Microsoft Fabric calculates AI consumption per 1,000 tokens, and input tokens and output tokens are billed at different CU rates.
When you query a Data Agent in natural language, Microsoft Fabric converts your request into tokens; the exact count depends on the tokenizer, but it scales with the length of the prompt. The same applies to the agent’s response. For every ~750 words of text (input or output), the system processes approximately 1,000 tokens, which are then billed as CU‑seconds inside the Data Agent workload.
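For a quick back-of-the-envelope conversion, the ~750-words-per-1,000-tokens rule of thumb can be scripted. The real tokenizer will differ, so treat this as an approximation; the example prompt is made up.

```python
WORDS_PER_1K_TOKENS = 750  # rule of thumb: 1,000 tokens ≈ 750 words

def approx_tokens(text: str) -> int:
    """Approximate token count from word count (real tokenizers differ)."""
    return round(len(text.split()) * 1000 / WORDS_PER_1K_TOKENS)

question = "Which three products drove the revenue increase in Q3?"
print(approx_tokens(question))  # ~12 tokens for this 9-word prompt
```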
You can see:
CU‑seconds consumed by AI
Token usage
Workload breakdown
Per‑item consumption
2. Admin Portal → Capacity → Workload Usage
Shows:
Which workloads consumed CUs
Which AI features were used
When consumption spiked
3. Activity Logs
Shows:
AI‑generated operations
CU consumption per operation
Pipeline/code generation events
How to Estimate AI Cost (The Practical Way)
The formula is simple:

CU-seconds = (input tokens ÷ 1,000) × 400 + (output tokens ÷ 1,000) × 1,200
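Here is a worked monthly estimate in Python. The usage figures (users, prompts per day, average token counts) are illustrative assumptions, not benchmarks; the budget figure follows from an F64 providing 64 Capacity Units, i.e. 64 CU-seconds every second.

```python
# Rates from the Fabric token model.
CU_SECONDS_PER_1K_INPUT = 400
CU_SECONDS_PER_1K_OUTPUT = 1200

# Illustrative workload assumptions.
users = 50
prompts_per_user_per_day = 10
avg_input_tokens = 300    # short questions plus grounding context
avg_output_tokens = 900   # answers run 2-4x the input
working_days = 22

input_cu = avg_input_tokens / 1000 * CU_SECONDS_PER_1K_INPUT
output_cu = avg_output_tokens / 1000 * CU_SECONDS_PER_1K_OUTPUT
per_prompt = input_cu + output_cu  # 1,200 CU-seconds per prompt

monthly_cu_seconds = users * prompts_per_user_per_day * working_days * per_prompt

# An F64 capacity accrues 64 CU-seconds per second, around the clock.
f64_monthly_budget = 64 * 86400 * 30
print(f"{monthly_cu_seconds:,.0f} CU-seconds "
      f"({monthly_cu_seconds / f64_monthly_budget:.2%} of an F64's monthly budget)")
# 13,200,000 CU-seconds (7.96% of an F64's monthly budget)
```

The point of running numbers like these is not precision; it is seeing which assumption (user count, prompt volume, or answer length) dominates your capacity consumption.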
Why This Model Makes Sense
Fabric’s AI billing model is:
Predictable (token‑based)
Unified (one model for all AI features)
Transparent (visible in metrics)
Aligned with capacity economics
This is fundamentally different from per‑prompt pricing models in other platforms. It rewards customers who consolidate workloads into Fabric and optimize capacity.
Final Thoughts
AI is becoming the default interface for analytics. But AI without cost transparency is a liability. Microsoft Fabric solves this by using a single, unified, token‑based capacity model that applies to every AI feature — from Copilot for Power BI to Data Agents.
Now you know:
How AI is charged
How to monitor it
How to estimate it
How to explain it to customers