Incremental Refresh (IR) in Power BI Dataflows Gen2 is a powerful, often misunderstood capability—especially now that Microsoft Fabric positions OneLake as the unified storage layer for ingestion, transformation, and analytics.

This post explains how Dataflow Incremental Refresh works, how it differs from dataset IR, when to use it, and how to design it safely for enterprise‑grade workloads.

🔷 1. What Dataflow Incremental Refresh Is (and Isn’t)

Dataflow IR performs incremental ingestion at the ETL layer. It controls what data is extracted from source systems and loaded into OneLake, not how semantic models partition their imported data.

Key characteristics:

✔ Dataflow IR manages data ingestion, not semantic model partitions

Dataflows apply Store/Refresh windows during data extraction and materialization. Unlike dataset IR, they do not create semantic model partitions.

✔ Query folding is mandatory

If folding breaks, the date filter cannot be pushed down to the source, so IR is effectively ignored and the dataflow pulls the full table on every refresh. Verify that the date filter step folds before relying on incremental logic.
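
To make the cost of a broken fold concrete, here is a minimal Python sketch that uses the standard-library sqlite3 module as a stand-in for a source system. The `sales` table, its columns, and both helper functions are hypothetical and are not part of any Fabric API; the sketch only compares how many rows leave the source when the date filter is part of the source query (folded) versus applied after extraction (not folded).

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical source table: one row per day for 100 days, starting 2026-01-01.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, modified_date TEXT, amount REAL)"
)
start = date(2026, 1, 1)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(i, (start + timedelta(days=i)).isoformat(), 10.0 * i) for i in range(100)],
)

def extract_folded(conn, range_start, range_end):
    """Filter is part of the source query, so only matching rows are transferred."""
    fetched = conn.execute(
        "SELECT id, modified_date, amount FROM sales "
        "WHERE modified_date >= ? AND modified_date < ? ORDER BY id",
        (range_start.isoformat(), range_end.isoformat()),
    ).fetchall()
    return fetched, len(fetched)

def extract_unfolded(conn, range_start, range_end):
    """Filter runs after extraction, so the whole table is transferred first."""
    fetched = conn.execute(
        "SELECT id, modified_date, amount FROM sales ORDER BY id"
    ).fetchall()
    lo, hi = range_start.isoformat(), range_end.isoformat()
    kept = [row for row in fetched if lo <= row[1] < hi]
    return kept, len(fetched)

# A 3-day refresh window at the end of the data (end date is exclusive).
range_start, range_end = date(2026, 4, 8), date(2026, 4, 11)
folded_rows, folded_transferred = extract_folded(conn, range_start, range_end)
unfolded_rows, unfolded_transferred = extract_unfolded(conn, range_start, range_end)
```

Both paths return the same three rows, but the unfolded path transfers all 100 rows from the source to get them. That gap is what grows with table size when a Power Query step stops folding.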

✔ Dataflow IR shapes what lands in OneLake

Semantic models built on top may apply their own Incremental Refresh, giving you a two‑tier IR architecture (best practice).

🔷 2. Dataflow IR Architecture

The following diagram shows how Dataflow IR fits into Fabric’s ingestion pipeline:

📊 Diagram 1 – Dataflow Incremental Refresh Architecture

Flow description:

Source systems provide raw transactional or telemetry data.
Dataflow Gen2 applies Incremental Refresh (Store/Refresh windows).
Data lands in OneLake as Delta tables (Parquet files) after IR processing.
A Semantic Model can optionally add a second layer of Incremental Refresh.

🔷 3. How Dataflow Incremental Refresh Works

A. Store Period

Defines how much history to retain in the Dataflow output.

B. Refresh Period

Defines the window of recent data to re‑ingest (e.g., last 3 days).

C. Rolling window behavior

Dataflow IR:

Pulls only the Refresh period from the source
Overwrites the Refresh region
Drops data older than Store period

This differs from semantic models, which create and preserve partitions.
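
The three rolling-window steps above can be sketched in plain Python. This is a simplified model of the behavior described in this post, not the service's actual implementation: the function name, the dictionary-based "tables", and the inclusive boundary handling are all assumptions made for illustration.

```python
from datetime import date, timedelta

def apply_incremental_refresh(stored, source, today, store_days, refresh_days):
    """Simulate one refresh. Tables are dicts: row id -> (row_date, value)."""
    store_start = today - timedelta(days=store_days)
    refresh_start = today - timedelta(days=refresh_days)

    # 1. Pull only the Refresh period from the source.
    pulled = {k: v for k, v in source.items() if refresh_start <= v[0] <= today}

    # 2. Overwrite the Refresh region: discard stored rows in that window,
    #    then load what was just pulled.
    result = {k: v for k, v in stored.items() if v[0] < refresh_start}
    result.update(pulled)

    # 3. Drop data older than the Store period.
    return {k: v for k, v in result.items() if v[0] >= store_start}

today = date(2026, 3, 10)
stored = {
    1: (date(2026, 1, 1), "old"),      # outside the 30-day Store period
    2: (date(2026, 3, 1), "history"),  # inside Store, outside Refresh
    3: (date(2026, 3, 9), "stale"),    # inside the Refresh window
}
source = {
    2: (date(2026, 3, 1), "changed-at-source"),
    3: (date(2026, 3, 9), "fresh"),
    4: (date(2026, 3, 10), "new"),
}
result = apply_incremental_refresh(stored, source, today, store_days=30, refresh_days=3)
```

In this run, row 1 is dropped, row 3 is overwritten, row 4 is added, and row 2 keeps its old value even though the source changed it. That last case is the practical risk of a Refresh period that is too short for how late your source rows can change.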

📊 Diagram 2 — Store & Refresh Windows in a Dataflow

Blue = historical data retained in OneLake
Green = incremental window re‑processed each refresh

🔷 4. Dataflow IR vs Dataset IR — What’s the Difference?

Feature	Dataflow IR	Dataset IR
Layer	Ingestion (ETL)	Semantic model (BI)
Output	Refreshed OneLake tables	Partitioned semantic model
Partitioning	None	Yes (physical partitions)
Folding requirement	Mandatory	Strongly recommended
Performance impact	Reduces load on source systems	Reduces model processing time
Best used for	Heavy ETL / Delta loads	Report‑optimized partition models
Works with Direct Lake?	Yes	Often not needed

Why both layers matter

Semantic model IR still processes data into VertiPaq partitions, even if Dataflow IR has already trimmed the history.
This two-layer design is recommended for Fabric, and it follows from the ingestion/storage model described in the HSO overview [2].
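
For contrast with the dataflow's single rolling window, the sketch below models the semantic model layer as a set of period partitions, some kept and some reprocessed. It is a simplified illustration only: monthly granularity, the `month_partitions` function, and the "keep"/"refresh" labels are assumptions, and the real service decides partition granularity and merging on its own.

```python
from datetime import date

def month_partitions(today, store_months, refresh_months):
    """Return (label, action) per monthly partition, oldest first."""
    partitions = []
    for offset in range(store_months - 1, -1, -1):
        # Step back `offset` whole months from the current month.
        index = today.year * 12 + (today.month - 1) - offset
        year, month_zero_based = divmod(index, 12)
        action = "refresh" if offset < refresh_months else "keep"
        partitions.append((f"{year}-{month_zero_based + 1:02d}", action))
    return partitions

plan = month_partitions(date(2026, 3, 10), store_months=6, refresh_months=2)
for label, action in plan:
    print(label, action)
```

Only the two most recent partitions are reprocessed here; the four older ones are preserved untouched, which is the partition behavior the dataflow layer does not have.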

🔷 5. When Should You Use Dataflow IR?

Use Dataflow Incremental Refresh when:

Your source system cannot handle full extract queries
You want to pre‑stage cleaned, historical tables in OneLake
You need ETL‑level optimization before semantic model processing
You want smaller semantic models (less memory pressure)

🔷 6. When NOT to Use It

Avoid Dataflow IR when:

Your queries don’t fold (full refresh every time)
You only have small datasets (no benefit)
You rely heavily on Direct Lake—semantic model IR becomes less relevant, and Dataflow IR may add redundancy

🔷 7. Best‑Practice Patterns for Fabric (2026)

⭐ Pattern 1 — Dataflow IR + Semantic Model IR (recommended)

Keeps source extraction light
Keeps semantic model refresh fast
Adds resilience: if one layer's refresh temporarily fails, the other still holds its last successfully loaded data

⭐ Pattern 2 — Dataflow IR feeding Direct Lake

When Direct Lake is used as primary storage:

Dataflow IR keeps your staging area compact
Direct Lake auto-syncs changes to the semantic model via the “Keep your Direct Lake data up to date” setting [3]

⭐ Pattern 3 — Dataflow IR + Warehouse Merge

Use Dataflow IR to ingest incremental data into a Lakehouse or Warehouse, where the MERGE/UPSERT transformations then run.
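
As a rough illustration of the merge step, the sketch below uses SQLite's `INSERT ... ON CONFLICT DO UPDATE` from Python as a stand-in for a warehouse MERGE. The `dim_customer` table, its columns, and the sample rows are hypothetical, and the SQL dialect in a Fabric Warehouse or Lakehouse will differ; only the upsert logic carries over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT, modified_date TEXT)"
)
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?)",
    [(1, "Contoso", "2026-03-01"), (2, "Fabrikam", "2026-03-02")],
)

# Hypothetical increment landed by the dataflow: one changed row, one new row.
increment = [
    (2, "Fabrikam Ltd", "2026-03-09"),
    (3, "Northwind", "2026-03-10"),
]

# Upsert: insert new keys; update existing keys only when the increment is newer.
conn.executemany(
    """
    INSERT INTO dim_customer (customer_id, name, modified_date)
    VALUES (?, ?, ?)
    ON CONFLICT(customer_id) DO UPDATE SET
        name = excluded.name,
        modified_date = excluded.modified_date
    WHERE excluded.modified_date > dim_customer.modified_date
    """,
    increment,
)
conn.commit()

merged = conn.execute(
    "SELECT customer_id, name, modified_date FROM dim_customer ORDER BY customer_id"
).fetchall()
```

The `WHERE` guard on the update keeps a late or replayed increment from overwriting a newer row, which matters because the Refresh window re-delivers recent rows on every run.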

🔷 8. Common Failure Scenarios & Fixes

Failure	Root Cause	Fix
Full refresh unexpectedly triggers	Query folding broken	Re‑write Power Query steps to fold
Missing historical data	Store window too short	Extend store period
Source locking	IR refresh window too large	Reduce Refresh period
Schema drift	Source changed	Add schema validation steps
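
The "schema validation" fix in the last row can be as simple as comparing incoming rows against an expected column list before loading them. The sketch below shows that idea in Python; `EXPECTED_SCHEMA`, `validate_schema`, and the column names are hypothetical, and in a real dataflow the equivalent check would live in Power Query or a downstream notebook.

```python
# Hypothetical expected schema: column name -> Python type.
EXPECTED_SCHEMA = {"order_id": int, "modified_date": str, "amount": float}

def validate_schema(rows, expected=EXPECTED_SCHEMA):
    """Return a list of readable schema problems; an empty list means OK."""
    problems = []
    for i, row in enumerate(rows):
        missing = expected.keys() - row.keys()
        extra = row.keys() - expected.keys()
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
        if extra:
            problems.append(f"row {i}: unexpected columns {sorted(extra)}")
        for col, typ in expected.items():
            value = row.get(col)
            if value is not None and not isinstance(value, typ):
                problems.append(
                    f"row {i}: {col} is {type(value).__name__}, expected {typ.__name__}"
                )
    return problems

good = [{"order_id": 1, "modified_date": "2026-03-10", "amount": 9.5}]
drifted = [{"order_id": "1", "modified_date": "2026-03-10", "total": 9.5}]

good_problems = validate_schema(good)
drift_problems = validate_schema(drifted)
```

Failing the refresh when the problem list is non-empty is usually safer than loading drifted rows into the Refresh window, since the next run would overwrite that window anyway.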

🔷 9. Operational Checklist

Confirm that Power Query steps fold (check the step folding indicators, or right-click the filter step and choose View data source query).
Ensure Date/ModifiedDate columns exist and are reliable.
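Set the Store period to the amount of history the business actually needs.
Set the Refresh period to cover how far back source rows can still change (for example, late-arriving updates), not just how often the source updates.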
Monitor Dataflow runs in Fabric monitoring.
Consider pairing with dataset IR for large‑scale models.
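
For the "reliable date column" item, a quick profile of the column before enabling IR can catch the two problems that most often break windowing: missing values and dates in the future. This is a minimal sketch; `check_date_column` and its report keys are hypothetical names, and which conditions count as "unreliable" is an assumption you should adapt to your data.

```python
from datetime import datetime

def check_date_column(values, now):
    """Summarize problems that make a date column unsafe for windowing."""
    nulls = sum(1 for v in values if v is None)
    future = sum(1 for v in values if v is not None and v > now)
    return {
        "rows": len(values),
        "nulls": nulls,    # rows that can never fall inside any window
        "future": future,  # rows dated after the refresh time
        "reliable": nulls == 0 and future == 0,
    }

now = datetime(2026, 3, 10, 12, 0)
report = check_date_column(
    [datetime(2026, 3, 9), None, datetime(2027, 1, 1)],
    now,
)
```

Rows with a null date are never picked up by a date-filtered refresh, so a non-zero `nulls` count usually means the column needs a fallback (such as a created date) before it can drive IR.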

🔷 10. References

[1] Power BI Refresh Requirements (Import vs other modes), Microsoft Learn
[2] Fabric/Power BI architecture & unified storage, HSO overview
[3] Direct Lake refresh behavior, Inkey Solutions (Fabric Direct Lake)

Microsoft Fabric Best Practice!

Incremental Refresh in Dataflows Gen2 (2026 Edition)