01 What actually works - right now
Bulk operations on the model
WORKS This is the confirmed main value of MCP. What used to take hours manually, the agent does in minutes:
- renaming 50+ measures by naming convention
- generating translations for 200+ columns
- adding descriptions to all tables and measures in one pass
- hiding service columns
- creating hierarchies and organizing folders
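For a feel of what "bulk rename by convention" means mechanically, the same pass can be prototyped directly against the JSON a .pbip export produces. A minimal Python sketch, assuming a heavily simplified model.bim layout and a hypothetical "Sum of X" → "Total X" convention:

```python
import json

# Simplified stand-in for the "model" section of a .pbip model.bim file
# (real files carry many more properties; this is an illustrative subset).
model = json.loads("""
{
  "tables": [
    {
      "name": "Sales",
      "measures": [
        {"name": "Sum of Revenue", "expression": "SUM(Sales[Revenue])"},
        {"name": "Sum of Profit",  "expression": "SUM(Sales[Profit])"},
        {"name": "Margin %",       "expression": "DIVIDE([Sum of Profit], [Sum of Revenue])"}
      ]
    }
  ]
}
""")

def rename_by_convention(model: dict, old_prefix: str, new_prefix: str) -> list[tuple[str, str]]:
    """Rename every measure whose name starts with old_prefix; return (old, new) pairs."""
    renamed = []
    for table in model["tables"]:
        for measure in table.get("measures", []):
            if measure["name"].startswith(old_prefix):
                new_name = new_prefix + measure["name"][len(old_prefix):]
                renamed.append((measure["name"], new_name))
                measure["name"] = new_name
    return renamed

pairs = rename_by_convention(model, "Sum of ", "Total ")
print(pairs)  # [('Sum of Revenue', 'Total Revenue'), ('Sum of Profit', 'Total Profit')]
```

Note that this only renames the `name` property: DAX expressions still referencing `[Sum of Profit]` would break, and updating those cross-references is exactly the tedious part the agent automates.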
u/Full_Metal_Analyst: "Added descriptions to 447 measures in one evening."
Reading and direct model modification
WORKS The agent via MCP sees everything: structure, measures, Power Query, relationships. And can make changes directly - create and fix relationships, replace DAX code in measures, create calculated columns and new tables with join logic. Important: breaking changes are intercepted and return errors - the agent sees them and can fix them independently.
Debugging diverging measures
WORKS One of the most impressive scenarios. The agent writes its own DAX queries for testing, runs them, and works backwards - figuring out why two measures diverge. Manually this often takes hours.
u/1234okie1234, r/PowerBI: "Claude can write its own DAX, run it and work backwards to understand what's happening. No need to explain the problem manually."
.pbip without MCP - a smart workaround
WORKS Saving the model in .pbip format gives JSON files that are easily readable by any LLM without setting up an MCP server. Changes need to be made manually, but for analysis, writing measures, and documentation - it works great without MCP risks.
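As a sketch of this workflow, the exported JSON can be scanned for gaps before anything is handed to an LLM, e.g. listing measures that still lack a description (again assuming a simplified model.bim structure):

```python
import json

# Illustrative subset of a .pbip model.bim file; real models carry many more fields.
model = json.loads("""
{
  "tables": [
    {"name": "Sales", "measures": [
      {"name": "Total Sales", "expression": "SUM(Sales[Amount])",
       "description": "Sum of sales amount."},
      {"name": "Net Margin", "expression": "DIVIDE([Profit], [Total Sales])"}
    ]},
    {"name": "Date", "measures": []}
  ]
}
""")

def measures_missing_description(model: dict) -> list[str]:
    """Return 'Table[Measure]' identifiers for measures without a description."""
    missing = []
    for table in model["tables"]:
        for measure in table.get("measures", []):
            if not measure.get("description"):
                missing.append(f"{table['name']}[{measure['name']}]")
    return missing

print(measures_missing_description(model))  # ['Sales[Net Margin]']
```

Because you choose which JSON files to paste into the chat, you also control exactly what data leaves your machine.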
02 What works partially - important caveats
Complex DAX generation
PARTIAL This is the most important point that gets lost in the hype. Simple measures (YTD, PY, basic aggregations) work well. Complex DAX with filter context manipulation and specific business logic carries the risk of code that calculates incorrectly while looking convincing.
Michael Hannecke, Medium, Dec 2025: "Complex DAX generation is unreliable: LLMs produce plausibly-looking code with wrong semantics. Use the agent for mechanical low-risk tasks. Don't use for tasks requiring business logic judgment."
Deneb + AI for custom visuals
PARTIAL For experienced Deneb users - real acceleration. Basic Vega-Lite specs (bar, line, scatter), hover effects, tooltips. But: LLM is non-deterministic (same prompt, different results), complex Vega specs are at the edge of capabilities, and without knowing Vega-Lite it's hard to evaluate quality.
SQLBI, March 2025: "The LLM assumed aggregation wasn't needed - because we didn't tell it what the dataset inside Deneb would look like. You need to provide context not just about measures, but about the data structure inside the visual."
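To illustrate the SQLBI point: a minimal Vega-Lite spec for Deneb should state the aggregation explicitly rather than hoping the LLM infers it. In this sketch the field names are placeholders; Deneb exposes the visual's data under the name "dataset":

```json
{
  "data": {"name": "dataset"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "Category", "type": "nominal"},
    "y": {"aggregate": "sum", "field": "Sales", "type": "quantitative"}
  }
}
```

The explicit "aggregate": "sum" is the decision point: if the measure arrives already aggregated by Power BI, that line must be dropped, and that is precisely the data-structure context the prompt needs to supply.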
Model quality = output quality
IMPORTANT Older models act like passive employees waiting for micromanagement. Premium models act like proactive agents. In practice: Claude Sonnet or equivalent from OpenAI/Google. Paid subscription + API costs.
pbidax, Nov 2025: "Strongly recommend using premium models."
03 Privacy and security
This isn't a paragraph about theoretical risks. This is a practical problem that will close the door for most corporate deployments.
Claude Pro vs API - a critical difference
RISK By default, Claude Pro includes your data in model training, and data may be stored for up to 5 years unless you manually opt out. The API is fundamentally different: commercial data is explicitly excluded from training. But the API requires a corporate contract, a DPA, and infrastructure setup.
MCP protocol vulnerabilities
RISK MCP is a protocol less than one and a half years old. It hasn't passed corporate audit. It has no built-in enterprise authentication. Documented CVEs from January 2026 in official Anthropic and Microsoft servers. Combining multiple MCP servers creates additional attack vectors.
Rule of thumb: keep the MCP server on an isolated network, with TLS and authentication. Never combine it with production data without a compliance review.
Corporate paradox: Microsoft Copilot wins not because it's technically better. It wins because Microsoft is already on the approved vendor list, and IT security waves it through on the inertia of trust in the ecosystem.
04 Ready prompts - copy and use
These templates prompt the LLM to return both code and an explanation, reducing the risk of hallucinations.
DAX measure with validation
1) Describe which tables and fields to use for calculating [NetMargin].
2) Write DAX code for the [NetMargin] measure (Profit/Revenue).
3) Validation query: EVALUATE ROW("CheckNetMargin", [NetMargin], "Expected", 0.12)
Bulk documentation
My Sales table has a Deleted=1 flag for deleted records - ignore them.
Add descriptions to all measures in the format:
- What it calculates
- Which filters it considers
- Example value
Working language: English.
Debugging measure divergence
Two measures return different results with the same filter:
[MeasureA] = 145,000
[MeasureB] = 138,000
Check filter context. Write a DAX query for diagnostics, run it and explain the divergence.
05 Ready DAX tests
Run in DAX Studio or SSMS. If Actual ≠ Expected, the test failed.
DAX - Basic unit test
-- Basic unit test
EVALUATE
ROW(
    "Test", "TotalSales2023",
    "Expected", 12345,
    "Actual", CALCULATE([TotalSales], 'Date'[Year] = 2023)
)
DAX - Multiple scenarios via UNION
EVALUATE
UNION(
    ROW(
        "Test", "Sales2023",
        "Result", IF(CALCULATE([TotalSales], 'Date'[Year] = 2023) = 12345, "PASS", "FAIL")
    ),
    ROW(
        "Test", "SalesQ1",
        "Result", IF(CALCULATE([TotalSales], 'Date'[Quarter] = "Q1") = 3100, "PASS", "FAIL")
    )
)
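When these queries run in CI, the build step only needs to scan the returned rows for failures. A minimal sketch in Python, assuming the results have been exported as a list of {Test, Result} records (the export mechanism itself, e.g. Invoke-DQVTesting, is outside this snippet):

```python
# Hypothetical exported results of the UNION test query above.
results = [
    {"Test": "Sales2023", "Result": "PASS"},
    {"Test": "SalesQ1",   "Result": "FAIL"},
]

def failed_tests(results: list[dict]) -> list[str]:
    """Names of tests whose Result is not PASS; a non-empty list should fail the build."""
    return [r["Test"] for r in results if r["Result"] != "PASS"]

failures = failed_tests(results)
print(failures)  # ['SalesQ1']
# In CI, exit non-zero when failures is non-empty, e.g.:
# sys.exit(1 if failures else 0)
```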
Daily quick check (5-10 min)
01
Check main reports for correctness of key metrics (Sums/Counts)
02
Confirm last CI build passed successfully
03
Review errors in Power BI and CI logs (Data Refresh, DAX tests)
04
Assess deviations: if AI made changes, quickly check them against a known baseline
05
Reconcile metrics with business sources (duplication, fractions)
06 CI/CD process: implementation steps
1
Store in Git
PBIP and TMDL files in the repository. Any commit triggers a build. Feature branches + pull request reviews.
2
Build
Publish to the test environment + synchronous model refresh.
3
Automated DAX tests (DQV Testing)
Invoke-DQVTesting runs a set of test DAX queries with expected values. Any mismatch fails the build.
4
Audit and rollback
CI failure - team notification, merge blocked. Logs saved. Fast rollback to stable version.
5
Production
Only if all tests passed.
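The five steps above can be sketched as an Azure DevOps pipeline. This is a hedged outline, not a tested configuration: stage names and environment variable names are placeholders, wired to the PowerShell commands used elsewhere in this document.

```yaml
trigger:
  branches:
    include: [main]        # any commit to main triggers the build

stages:
  - stage: BuildAndTest
    jobs:
      - job: DeployToTest
        steps:
          - pwsh: |
              Publish-PowerBISemanticModel -WorkspaceId $env:DEV_WS -FilePath "Model.pbip"
              Invoke-SemanticModelRefresh -WorkspaceName $env:DEV_WS -ModelName $env:MODEL_NAME
            displayName: Publish and refresh test model
          - pwsh: |
              Invoke-DQVTesting -WorkspaceName $env:DEV_WS -TestQueriesPath "./Tests/queries.dqv"
            displayName: Run DAX tests    # any mismatch fails the build

  - stage: Production
    dependsOn: BuildAndTest
    condition: succeeded()  # deploy only if all tests passed
```

A failed DAX test fails the BuildAndTest stage, which blocks the Production stage and the merge, matching the audit-and-rollback step above.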
PowerShell - Azure DevOps CI/CD
# Publish model from PBIP
Publish-PowerBISemanticModel -WorkspaceId $devWS -FilePath "Model.pbip"

# Synchronous refresh
Invoke-SemanticModelRefresh -WorkspaceName $devWS -ModelName $modelName

# Run DAX tests
Invoke-DQVTesting -WorkspaceName $devWS -TestQueriesPath "./Tests/queries.dqv"
07 Full capability matrix
| Capability | Status | Comment |
|---|---|---|
| Bulk ops (descriptions, renaming, hierarchies) | WORKS | Main MCP value. Hours → minutes. |
| .pbip + LLM - analysis and documentation | WORKS | No MCP. You control what you send. |
| MCP + Claude - reading and debugging measures | WORKS | Agent sees everything, writes test DAX. |
| GitHub Copilot Agent Mode + MCP | WORKS | Alternative. Easier for corporate. |
| MCP + Claude - complex DAX | PARTIAL | Looks plausible, often semantically wrong. |
| Deneb + Claude - custom visuals | PARTIAL | Works for basics. Need to know Vega-Lite. |
| Built-in Copilot in Power BI | WEAK | Weaker than expected. Shallow access. |
| MCP - creating dashboards and visuals | NOT YET | Semantic model only. Dashboards manual. |
| Claude Pro + corporate data | RISK | Data may go to training by default. |
| MCP without compliance work | RISK | Young protocol, CVEs in official servers. |
The bottom line
MCP + Claude genuinely changes work for routine semantic model operations. For complex business logic in DAX and building reports - still unreliable. For corporate use with sensitive data - requires security work that most teams haven't done yet.
Best audience today
Freelancers, consultants, analysts at smaller companies without strict compliance requirements.
Safe to start with
.pbip + LLM for documentation and analysis. Zero MCP risk, immediate value.
Never skip
DAX unit tests after any AI-generated measure. Always validate before production.
Sources
- SQLBI
- Tabular Editor Blog
- Bulldog Analytics
- Medium (M. Hannecke)
- Dark Reading
- MCP Engine (M. Anatsko)
- pbidax
- Wiz.io
- r/PowerBI
