
Your Fabric Data Lake Is Too Slow: The NVMe Fix

Opening: “Your Data Lake Has a Weight Problem”

Most Fabric deployments today are dragging their own anchors. Everyone blames the query, the Spark pool, the data engineers—never the storage. But the real culprit? You’re shoveling petabytes through something that behaves like a shared drive from 2003. What’s that? Your trillion-row dataset refreshes slower than your Excel workbook from college? Precisely.

See, modern Fabric and Power Platform setups rely on managed storage tiers—easy, elastic, and, unfortunately, lethargic. Each request canyon‑echoes across the network before anything useful happens. All those CPUs and clever pipelines are idling, politely waiting on the filesystem to respond.

The fix isn’t more nodes or stronger compute. It’s proximity. When data sits closer to the processor, everything accelerates. That’s what Azure Container Storage v2 delivers, with its almost unfair advantage: local NVMe disks. Think of it as strapping rockets to your data lake. By the end of this, your workloads will sprint instead of crawl.

Section 1: Why Fabric and Power Platform Feel Slow

Let’s start with the illusion of power. You spin up Fabric, provision a lakehouse, connect Power BI, deploy pipelines—and somehow it all feels snappy… until you hit scale. Then, latency starts leaking into every layer. Cold-path queries crawl. Spark operations stutter with I/O stalls. Even “simple” joins act like they’re traveling through a congested VPN. The reason is embarrassingly physical: your compute and your data aren’t in the same room.

Managed storage sounds glamorous—elastic capacity, automatic redundancy, regional durability—but each of those virtues adds distance. Every read or write becomes a small diplomatic mission through Azure’s network stack. The CPU sends a request, the storage service negotiates, data trickles back through virtual plumbing, and congratulations—you’ve just paid for hundreds of milliseconds of bureaucracy. Multiply that by millions of operations per job, and your “real-time analytics” have suddenly time-traveled to yesterday.

Compare that to local NVMe storage. Managed tiers behave like postal services: reliable, distributed, and painfully slow when you’re in a hurry. NVMe, though, speaks directly to the server’s PCIe lanes—the computational equivalent of whispering across a table instead of mailing a letter. The speed difference isn’t mystical; it’s logistical. Where managed disks cap IOPS in the tens or hundreds of thousands, local NVMe easily breaks into the millions. Five GB per second reads aren’t futuristic—they’re Tuesday afternoons.

Here’s the paradox: scaling up your managed storage costs you more and slows you down. Every time you chase performance by adding nodes, you multiply the data paths, coordination overhead, and, yes, the bill. Azure charges for egress; apparently, physics charges for latency. You’re not upgrading your system—you’re feeding a very polite bottleneck.

What most administrators miss is that nothing is inherently wrong with Fabric or Power Platform. Their architecture expects closeness. It’s your storage choice that creates long-distance relationships between compute and data. Imagine holding a conversation through walkie-talkies while sitting two desks apart. That delay, the awkward stutter—that’s your lakehouse right now.

So when your Power BI dashboard takes twenty seconds to refresh, don’t blame DAX or Copilot. Blame the kilometers your bytes travel before touching a processor. The infrastructure isn’t slow. It’s obediently following a disastrous topology. Your data is simply too far from where the thinking happens.

Section 2: Enter Azure Container Storage v2

Enter Azure Container Storage v2, Microsoft’s latest attempt to end your I/O agony. It’s not an upgrade; it’s surgery. The first version, bless its heart, was a Frankenstein experiment—a tangle of local volume managers, distributed metadata databases, and polite latency that no one wanted to talk about. Version two threw all of that out the airlock. No LVM. No etcd. No excuses. It’s lean, rewritten from scratch, and tuned for one thing only: raw performance.

Now, a quick correction before the average administrator hyperventilates. You might remember the phrase “ephemeral storage” from ACStor v1 and dismiss it as “temporary, therefore useless.” Incorrect. Ephemeral didn’t mean pointless; it meant local, immediate, and blazing fast—perfect for workloads that didn’t need to survive an apocalypse. V2 doubles down on that idea. It’s built entirely around local NVMe disks, the kind soldered onto the very servers running your containers. The point isn’t durability; it’s speed without taxes.

Managed disks? Gone. Yes, entirely removed from ACStor’s support matrix. Microsoft knew you already had a dozen CSI drivers for those, each with more knobs than sense. What customers actually used—and what mattered—was the ephemeral storage, the one that let containers scream instead of whisper. V2 focuses exclusively on that lane. If your node doesn’t have NVMe, it’s simply not invited to the party.

Underneath it all, ACStor v2 still talks through the standard Container Storage Interface, that universal translator Kubernetes uses to ask politely for space. Microsoft, being generous for once, even open‑sourced the local storage driver that powers it. The CSI layer means it behaves like any other persistent volume—just with the reflexes of a racehorse. The driver handles the plumbing; you enjoy the throughput.
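To make that concrete, here is a minimal sketch of what "behaves like any other persistent volume" looks like in practice: a PersistentVolumeClaim created with the Kubernetes Python client and pointed at an NVMe-backed storage class. The class name `acstor-ephemeral-nvme` is an assumption for illustration only; check what your Azure Container Storage v2 installation actually registers with `kubectl get storageclass`.

```python
# A minimal sketch, assuming an ACStor v2-provisioned storage class exists.
# "acstor-ephemeral-nvme" is a hypothetical name used for illustration only.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running inside a pod

NVME_STORAGE_CLASS = "acstor-ephemeral-nvme"  # hypothetical; verify with `kubectl get storageclass`

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="spark-shuffle-scratch"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],            # local NVMe is node-local by nature
        storage_class_name=NVME_STORAGE_CLASS,
        resources=client.V1ResourceRequirements(requests={"storage": "200Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```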

And here’s where it gets delicious: automatic RAID striping. Every NVMe disk on your node is treated as a teammate, pooled together and striped in unison. No parity, no redundancy—just full bandwidth, every lane open. The result? Every volume you carve, no matter how small, enjoys the combined performance of the entire set of disks. It’s like buying one concert ticket and getting the whole orchestra. Two NVMes might give you a theoretical million IOPS. Four could double that. All while Azure politely insists you’re using the same hardware you were already paying for.
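If you want a feel for why striping matters, the arithmetic is almost insultingly simple: with no parity, aggregate numbers scale roughly with disk count until the PCIe bus or CPU becomes the ceiling. The per-disk figures below are placeholder assumptions, not benchmarks; substitute the spec sheet values for your actual VM SKU.

```python
# Back-of-the-envelope striping math. Per-disk figures are assumed placeholders,
# not measurements; striping adds no parity overhead, so aggregates are roughly
# additive until the PCIe bus or CPU becomes the bottleneck.
PER_DISK_IOPS = 600_000          # assumed random-read IOPS for one local NVMe
PER_DISK_THROUGHPUT_GBPS = 2.5   # assumed sequential GB/s for one local NVMe

for disks in (1, 2, 4):
    print(f"{disks} disk(s): ~{disks * PER_DISK_IOPS:,} IOPS, "
          f"~{disks * PER_DISK_THROUGHPUT_GBPS:.1f} GB/s")
```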

Let’s talk eligibility, because not every VM deserves this level of competence. You’ll find the NVMe gifts primarily in the L‑series machines—Azure’s storage‑optimized line designed for high I/O workloads. That includes the Lsv3 and newer variants. Then there are the NC series, GPU‑accelerated beasts built for AI and high‑throughput analytics. Even some Dv6 and E‑class VMs sneak in local NVMe as “temporary disks.” Temporary, yes. Slow, no. Each offers sub‑millisecond latency and multi‑gigabyte‑per‑second throughput without renting a single managed block.

And the cost argument evaporates. Using local NVMe costs you nothing extra; it’s already baked into the VM price. You’re quite literally sitting on untapped velocity. When people complain that Azure is expensive, they usually mean they’re paying for managed features they don’t need—elastic SANs, managed redundancy, disks that survive cluster death. For workloads like staging zones, temporary Spark caches, Fabric’s transformation buffers, or AI model storage, that’s wasted money. ACStor v2 liberates you from that dependency. You’re no longer obliged to rent speed you already own.

So what you get is brutally simple: localized data paths, zero extra cost, and performance that rivals enterprise flash arrays. You remove the middlemen—no SAN controllers, no network hops, no storage gateways—and connect compute directly to the bytes that fuel it. Think of it as stripping latency fat off your infrastructure diet.

Most of all, ACStor v2 reframes how you think about cloud storage. It doesn’t fight the hardware abstraction layer; it pierces it. Kubernetes persists, Azure orchestrates, but your data finally moves at silicon speed. That’s not a feature upgrade—that’s an awakening.

Section 3: The NVMe Fix—How Local Storage Outruns the Cloud

OK, let’s dissect the magic word everyone keeps whispering in performance circles—NVMe. It sounds fancy, but at its core, it’s just efficiency, perfected. Most legacy storage systems use protocols like AHCI, which serialize everything—one lane, one car at a time. NVMe throws that model in the trash. It uses parallel queues, directly mapped to the CPU’s PCIe lanes. Translation: instead of a single checkout line at the grocery store, you suddenly have thousands, all open, all scanning groceries at once. That’s not marketing hype—it’s electrical reality.

Now compare that to managed storage. Managed storage is… bureaucracy with disks. Every read or write travels through virtual switches, hypervisor layers, service fabrics, load balancers, and finally lands on far‑away media. It’s the postal service of data: packages get delivered, sure, but you wouldn’t trust it with your split‑second cache operations. NVMe, on the other hand, is teleportation. No queues, no customs, no middle management—just your data appearing where it’s needed. It’s raw PCIe bandwidth turning latency into an urban legend.

And here’s the kicker: ACStor v2 doesn’t make NVMe faster—it unleashes it. Remember that automatic RAID striping from earlier? Picture several NVMe drives joined in perfect harmony. RAID stripes data across every disk simultaneously, meaning reads and writes occur in parallel. You lose redundancy, yes, but gain a tsunami of throughput. Essentially, each disk handles a fraction of the workload, so the ensemble performs at orchestra tempo. The result is terrifyingly good: in Microsoft’s own internal benchmarking, two NVMe drives hit around 1.2 million input/output operations per second with a throughput of roughly five gigabytes per second. That’s the sort of number that makes enterprise arrays blush.
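If you would rather trust your own node than anyone's benchmark slide, a quick sanity check is easy. The sketch below writes a few gigabytes to a mounted path and reports throughput; it is a single buffered stream plus an fsync, so it understates what parallel queues can do, but the gap between an NVMe mount and a remote tier will still be obvious. The mount path is an assumption; point it at wherever your NVMe-backed volume actually lives.

```python
# Rough sequential-write sanity check for a mounted volume. Not a substitute
# for fio: one buffered stream plus fsync understates what parallel queues can
# do, but the local-vs-remote gap still shows up clearly.
import os
import time

TEST_FILE = "/mnt/nvme-scratch/throughput-test.bin"  # hypothetical NVMe mount point
CHUNK = b"\0" * (8 * 1024 * 1024)                    # 8 MiB per write
TOTAL_BYTES = 4 * 1024**3                            # 4 GiB total

start = time.perf_counter()
with open(TEST_FILE, "wb") as f:
    written = 0
    while written < TOTAL_BYTES:
        f.write(CHUNK)
        written += len(CHUNK)
    f.flush()
    os.fsync(f.fileno())          # make sure the data actually reached the device
elapsed = time.perf_counter() - start

print(f"~{TOTAL_BYTES / elapsed / 1e9:.2f} GB/s sequential write")
os.remove(TEST_FILE)
```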

To visualize it, think of Spark running its temporary shuffles, those massive intermediate tables you never see but always wait for. Ordinarily, every shuffle operation bounces through managed storage, throttled by IOPS caps and latency spikes. With local NVMe, that same shuffle barely touches the network; it writes to drives microseconds away from the CPU. One Fabric architect reported Spark job runtimes dropping by fivefold after moving intermediate data to ephemeral NVMe volumes. Fivefold—without changing a single line of code. That’s what proximity buys you.
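The "without changing a single line of code" part is mostly configuration. Here is a minimal PySpark sketch, assuming the ephemeral volume is mounted at /mnt/nvme-scratch in the driver and executor pods; on AKS you would also wire that mount into the pod specs, which this sketch does not show.

```python
# A minimal PySpark sketch: send shuffle and spill files to the local NVMe
# mount. The path is an assumption about where the ephemeral volume is mounted.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("nvme-shuffle-sketch")
    .config("spark.local.dir", "/mnt/nvme-scratch/spark")  # shuffle/spill location
    .getOrCreate()
)

# Any wide transformation (join, groupBy) now shuffles through local NVMe.
df = spark.range(0, 100_000_000)
df.groupBy((df.id % 1000).alias("bucket")).count().show(5)
```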

Now, some will immediately panic—“Ephemeral? But if the node dies, the data disappears!” Correct. And also irrelevant for the right workloads. NVMe ephemeral storage is not for your final transactional ledger; it’s the pit lane, not the race finish. Use it for caches that can regenerate, replicas that can resync, or transient model weights your AI workload can reload on demand. The data here doesn’t need to live forever; it just needs to move fast while it’s alive. You’d never store your only copy of mission‑critical telemetry in volatile memory; same logic applies here. Ephemeral is fine when redundancy lives higher up the stack.
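In code, the regenerable-cache idea is a few lines: serve from the NVMe path when it is warm, rebuild from durable storage when the node is fresh. Below is a sketch using azure-storage-blob, with the account URL, container, and paths as illustrative assumptions.

```python
# Regenerable-cache sketch: NVMe is the hot copy, Blob Storage is the truth.
# Account URL, container, and paths are illustrative assumptions.
import os
from azure.storage.blob import BlobClient

CACHE_DIR = "/mnt/nvme-scratch/cache"                       # hypothetical NVMe mount
ACCOUNT_URL = "https://<your-account>.blob.core.windows.net"

def read_with_cache(container: str, blob_name: str, credential) -> bytes:
    """Return blob contents, serving from local NVMe when the cache is warm."""
    local_path = os.path.join(CACHE_DIR, container, blob_name)
    if os.path.exists(local_path):                          # cache hit: local silicon
        with open(local_path, "rb") as f:
            return f.read()

    # Cache miss (fresh node, evicted file): regenerate from durable storage.
    blob = BlobClient(ACCOUNT_URL, container_name=container,
                      blob_name=blob_name, credential=credential)
    data = blob.download_blob().readall()

    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    with open(local_path, "wb") as f:                       # warm the cache for next time
        f.write(data)
    return data
```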

Here’s a small parable. Picture a data platform team wrestling with Power BI Direct Lake’s refresh times. Reports render slower than users can blink—tragic for dashboards meant to impress executives. They experiment by caching pre‑aggregations locally using NVMe ephemeral volumes on AKS nodes. Suddenly, those same reports refresh like web pages. Users assume someone upgraded the network; in reality, all they did was let the data sit closer to its compute spine. The difference between “we’ll have the numbers soon” and “they’re already on screen” was measured in microseconds.

And this goes beyond convenience. Every byte you don’t send over Azure’s network is a byte you don’t pay egress for. Performance improvement is rarely free in cloud architecture, yet NVMe is the rare exception—it’s already on your node, prepaid. Each managed disk you rent to “buy speed” duplicates something your VM quietly provides. ACStor v2 just exposes the hardware you forgot you owned. It’s like finding out your car has a turbocharger that’s been unplugged since delivery.

Let’s hit the economics again because it’s almost impolite how much you save. Managed ultra disks can cost five, ten, even twenty times what your local NVMe equivalent would, if you price by throughput. Yet the NVMe sitting inside an L‑series or NC‑series VM delivers comparable, sometimes better, performance without incremental billing. Your container workloads stop waiting and start working, at no extra cost. That’s not optimization; that’s correction of a systemic oversight.

And then there’s latency—the invisible tax you’ve been paying without a receipt. Traditional managed disks hover at a few milliseconds per request. NVMe operates in microseconds. That thousand‑fold reduction compounds throughout your system: Spark tasks complete sooner, Power Apps respond faster, Copilot models load on demand instead of buffering. In human terms, your users stop noticing the system and start noticing the results.
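The compounding is worth seeing in numbers. The figures below are illustrative assumptions (a few milliseconds per request for a remote tier, around a hundred microseconds for local NVMe), and real engines overlap requests rather than waiting serially, but the ratio is the point.

```python
# Illustrative latency math, not a benchmark. Per-request figures are assumed;
# real engines overlap requests, but the ratio between tiers is what matters.
REQUESTS = 5_000_000            # I/O requests in one heavy pipeline stage
REMOTE_LATENCY_S = 0.003        # ~3 ms per request on a remote managed tier (assumed)
NVME_LATENCY_S = 0.0001         # ~100 microseconds on local NVMe (assumed)

for label, per_request in (("remote managed", REMOTE_LATENCY_S), ("local NVMe", NVME_LATENCY_S)):
    minutes = REQUESTS * per_request / 60
    print(f"{label:>15}: ~{minutes:,.0f} minutes of cumulative wait")
```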

Every few years, cloud architects rediscover the same ancient truth: data is heavy, distance is costly, and physics doesn’t negotiate. NVMe local storage simply concedes to that truth with elegance. It doesn’t abstract away hardware limitations; it obeys them by staying close.

Of course, perfection has a price, and it’s spelled “ephemeral.” All this speed and thrift trade one thing—durability. When a node deallocates or fails, its NVMe cache evaporates. The solution? Not panic—planning. Because the next problem to solve isn’t performance; it’s persistence. And that’s exactly where we’re headed next.

Section 4: The Durability Dilemma

So let’s address the nervous pause in every architect’s voice: “But what if the node dies?”
Yes—when the node dies. Not “if.” Compute is mortal, disks come and go, and Azure will gladly recycle your container host the moment it feels like reshuffling capacity. That’s the price of leasing hardware in the cloud—it sometimes remembers the hardware part. So, no, local NVMe storage isn’t immortal; it’s astonishingly fast and equally temporary.

Ephemeral means you lose the data the moment the node deallocates or crashes. And while people say that like it’s a deal‑breaker, it’s only a problem if you architect like it’s still the era when storage had to persist everything, forever. Modern workloads don’t need that kind of coddling. They need speed where it matters and resilience where it counts. The entire trick is separating the two instead of demanding both from the same disk.

Let’s quantify that risk clearly. When an NVMe‑backed AKS node stops, the volume literally evaporates. The next node might have fresh drives, different identifiers, clean slates. Your data? Gone. But so what? You were supposed to treat that tier as hot cache—temporary, regenerable, dispensable. If you stored your only copy of production data there, that’s not engineering; that’s gambling with misplaced confidence.

The professional way to handle durability is to escalate responsibility up the stack. Databases do it through replication—PostgreSQL streaming replicas, Redis clustering, or Fabric’s own Lakehouse replication mechanisms. You create multiple copies, often zonally distributed, each ready to take over when the local node disappears in a puff of deallocation. The compute vanishes, the copy elsewhere persists, and the overall system shrugs. That’s redundancy done right.

Some prefer geographic redundancy. Azure zones are perfect for that—think of them as separate silos within the same region, each sitting on different power grids and hardware pools. By spreading pods across zones, you can lose an entire building and still keep your dataset alive. Then there’s scheduled dumping: regular exports, snapshots, or transaction logs pushed to Azure Blob Storage. Blob is slow, yes, but functionally indestructible and scandalously cheap. Perfect for preservation, atrocious for live performance—which makes it an ideal teammate for NVMe’s hot data.

And that’s the design philosophy: speed local, safety global. Treat NVMe as the racetrack, not the museum. Hot data races there; cold data lives elsewhere, sipping tea in Blob archives. Want to minimize exposure? Automate the housekeeping—scheduled commits that export intermediate results at intervals or upon task completion. Lose a node mid‑task, and you only lose moments of work, not months of data.
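Housekeeping like that is unglamorous code. The sketch below walks the NVMe scratch directory on a timer and pushes whatever it finds to Blob Storage with azure-storage-blob; the paths, container URL, and interval are assumptions, and in production you would more likely run it as a sidecar or CronJob than a bare loop.

```python
# "Speed local, safety global" housekeeping sketch: copy whatever landed on
# the NVMe scratch area to Blob Storage on a timer. Paths, container URL, and
# interval are assumptions; run it as a sidecar or CronJob in practice.
import os
import time
from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

SCRATCH_DIR = "/mnt/nvme-scratch/staging"                   # hypothetical NVMe mount
CONTAINER_URL = "https://<your-account>.blob.core.windows.net/checkpoints"

container = ContainerClient.from_container_url(CONTAINER_URL,
                                                credential=DefaultAzureCredential())

def checkpoint_once() -> None:
    """Push every file under the scratch directory to durable Blob Storage."""
    for root, _dirs, files in os.walk(SCRATCH_DIR):
        for name in files:
            path = os.path.join(root, name)
            blob_name = os.path.relpath(path, SCRATCH_DIR)
            with open(path, "rb") as f:
                container.upload_blob(name=blob_name, data=f, overwrite=True)

while True:
    checkpoint_once()     # lose a node mid-task and you lose minutes, not months
    time.sleep(300)       # every five minutes; tune to your tolerance for rework
```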

A lot of people obsess over having storage‑level replication. That’s a comforting illusion that costs performance. If you mirror blocks across nodes in real time, congratulations—you’ve reinvented the very latency you tried to escape. That’s why Microsoft deliberately withheld storage‑level durability in Azure Container Storage v2. It’s pushing you, gently but firmly, to manage resilience in the application layer where you control timing and frequency. Every gigabyte copied is a conscious trade‑off instead of an involuntary tax.

Think of durability like insurance—it’s supposed to live outside the car, not welded onto the engine. You only notice it when you crash. Keeping it separate means your everyday driving stays economical and responsive. The same principle applies here: compute should race without dragging a safety net on the ground.

So the rule of thumb emerges: NVMe for hot, short‑lived, high‑velocity data—Spark shuffles, temporary cache layers, AI inference models, Power Platform staging. Blob or managed disks for records, backups, or anything senior leadership might demand during an audit. You don’t store family photos in your GPU cache; you don’t archive telemetry in ephemeral NVMe.

Durability isn’t free; it’s strategic. Handle it with rotation, redundancy, and replication above the storage tier, and you’ll get the best of both worlds—speed that thrills and safety that sleeps quietly elsewhere. Refuse to adapt, and you’ll keep paying for managed storage just to protect data your workloads churn through and regenerate daily anyway.

The truth? Cloud hardware is transient but predictable. Nodes die, disks vanish, and that’s fine. What matters is designing systems that expect decay and operate calmly through it. ACStor v2 doesn’t make your data invincible; it simply stops pretending that it should be.

Section 5: Fabric Use Cases Reimagined

At this point, you might be wondering, “Fine, it’s fast—but what does that look like in the real world?” Well, let’s stop admiring the wiring and start watching the machinery spin. Because when ACStor v2 meets Microsoft Fabric, Power Platform, and AI workloads, the changes stop being theoretical—they become embarrassingly measurable. Entire workflows you once described as “overnight jobs” start finishing before lunch.

Take Power BI Direct Lake mode. It’s meant to tap your data lake directly, bypassing imports, keeping datasets “live.” A clever concept—until physics crashed the party. The weak link was storage latency: every Direct Lake query had to fetch data over remote, managed tiers, each request politely waiting its turn in the queue. Add concurrency, and refresh speed declined faster than your patience. Now replace that remote tier with local NVMe caching using ACStor v2’s ephemeral volumes. Suddenly, every Direct Lake request stops shouting across the data center and whispers to local silicon instead. What used to be 20‑second refreshes drop into the low single digits. The user thinks you optimized DAX. You just brought the bits home.
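The pre-aggregation trick in that scenario is nothing exotic. Here is a hedged sketch of the pattern, with pandas standing in for whatever actually computes the aggregate; the paths and column names are illustrative assumptions, and none of this is a Fabric or Power BI API.

```python
# Pre-aggregation cache sketch: materialize an expensive aggregate once on the
# NVMe volume, serve it from local disk afterwards. pandas is a stand-in for
# whatever computes the aggregate; paths and columns are assumptions.
import os
import pandas as pd

CACHE_PATH = "/mnt/nvme-scratch/agg/sales_by_region.parquet"  # hypothetical mount

def sales_by_region(load_fact_table) -> pd.DataFrame:
    if os.path.exists(CACHE_PATH):                  # warm: read straight off NVMe
        return pd.read_parquet(CACHE_PATH)

    fact = load_fact_table()                        # expensive remote lakehouse read
    agg = fact.groupby("region", as_index=False)["amount"].sum()

    os.makedirs(os.path.dirname(CACHE_PATH), exist_ok=True)
    agg.to_parquet(CACHE_PATH, index=False)         # cache for the next refresh
    return agg
```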

Or let’s talk Dataflows Gen2—the unsung heroes of Power Platform ETL. They gulp, clean, and transform data, often thrashing managed disks as they stage intermediate files. It’s like cooking dinner but rinsing the utensils in another country. With local NVMe, staging happens directly beside compute; those transformation buffers that used to crawl now sprint. Developers don’t rewrite pipelines—they just notice that jobs complete before Teams finishes syncing. That’s performance translated into working hours, not marketing slides.

Now zoom out to AI and Copilot scenarios. Every Copilot, from text summarizers to custom GPT extensions in Fabric, relies on pre‑loaded model weights. Those files can run into tens or hundreds of gigabytes—and must stream fast to keep inference smooth. Managed storage turns this into a loading‑screen festival. But cache those model weights on local NVMe, and models come alive instantly. An AI Copilot loading a 40‑GB bundle from local PCIe lanes doesn’t “wake up”; it’s just there. Users stop describing your chatbot as “laggy” and start calling it “thoughtful.” You buy goodwill measured in microseconds.
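The warm-up pattern is equally boring in code: download the weights to the NVMe mount once per node, then every later load is a local read. The blob URL and paths below are assumptions for illustration; the streaming call is standard azure-storage-blob.

```python
# Warm-cache sketch for model weights: pull from Blob Storage to the NVMe
# mount once per node, then every load is local. URL and paths are assumptions.
import os
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobClient

WEIGHTS_URL = "https://<your-account>.blob.core.windows.net/models/copilot-extension.bin"
LOCAL_WEIGHTS = "/mnt/nvme-scratch/models/copilot-extension.bin"   # hypothetical mount

def ensure_weights_local() -> str:
    """Download model weights to NVMe once; later loads come off local PCIe."""
    if not os.path.exists(LOCAL_WEIGHTS):
        os.makedirs(os.path.dirname(LOCAL_WEIGHTS), exist_ok=True)
        blob = BlobClient.from_blob_url(WEIGHTS_URL, credential=DefaultAzureCredential())
        with open(LOCAL_WEIGHTS, "wb") as f:
            blob.download_blob().readinto(f)         # stream straight to local disk
    return LOCAL_WEIGHTS

weights_path = ensure_weights_local()
# Hand weights_path to your inference runtime of choice.
```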

For a data engineer, this shift isn’t abstract; it’s operational liberation. In one real-world deployment, Spark workloads processing multimillion-row joins ran five times faster simply by caching intermediate Parquet files on ephemeral NVMe. Costs plummeted because shorter runtimes meant fewer billed compute minutes—and no bloated managed disk bills. Nothing about the data changed; only its distance did. Bandwidth stayed local, latency plummeted, accountants smiled, and spark‑cluster tantrums quietly stopped.

The business translation is simple: speed equals money saved and users retained. Lower query time equals fewer capacity units burned in Fabric, smaller pipeline runtimes, and snappier dashboards that executives actually believe are “real time.” Every millisecond you claw back from storage latency is a millisecond you don’t lease from Azure’s expensive time bank.

Here’s the cruel irony: most organizations already own the horsepower they need. Their AKS clusters run on L‑series or Dv6 virtual machines that ship with gorgeous NVMe drives welded into the motherboard. But because managed services hide that hardware behind abstraction layers, it remains unused—silicon luxury sealed away by convenience. ACStor v2 is the crowbar that opens that vault. As I’ve said before, you’re buying ultra disks when your nodes already have Ferraris bolted inside.

And integration? Painfully easy. ACStor v2 plugs into AKS through an extension; Fabric rides on top through Kubernetes‑backed capacities. No redesign, no grand migration, just smarter configuration. You deploy the storage class once, target it in your workloads, and—poof—everything touching temp data or transformations inherits NVMe performance. It’s modernization without trauma.
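Here is roughly what "target it in your workloads" means, sketched as a pod requesting a generic ephemeral volume backed by the NVMe storage class, so the scratch space is created with the pod and deleted with it. As before, the storage class name is an assumption; verify what your ACStor v2 installation exposes.

```python
# Pod sketch using a generic ephemeral volume backed by the NVMe storage class:
# the PVC is created with the pod and deleted with it. The storage class name
# and container image are assumptions for illustration.
from kubernetes import client, config, utils

config.load_kube_config()

pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "etl-worker"},
    "spec": {
        "containers": [{
            "name": "worker",
            "image": "mcr.microsoft.com/azure-cli",   # stand-in image for the sketch
            "command": ["sleep", "3600"],
            "volumeMounts": [{"name": "scratch", "mountPath": "/mnt/nvme-scratch"}],
        }],
        "volumes": [{
            "name": "scratch",
            "ephemeral": {
                "volumeClaimTemplate": {
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "storageClassName": "acstor-ephemeral-nvme",   # hypothetical name
                        "resources": {"requests": {"storage": "100Gi"}},
                    }
                }
            },
        }],
    },
}

utils.create_from_dict(client.ApiClient(), pod_manifest)
```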

Looking ahead, the trajectory is obvious. Microsoft has already said Elastic SAN support will return to ACStor later, but the message between the lines is clear: NVMe‑first is the future. They’re quietly turning ephemeral storage from “temporary scratch space” into “primary performance tier.” Once SAN support comes back, it’ll likely complement NVMe for hybrid needs—fast local lanes with optional durable spines. But the optimization path won’t reverse. Data gravity demands proximity, and NVMe is physics’ most obedient accomplice.

So rethink what you call “slow.” In a Fabric world, “slow” often means “remote.” The new rule: don’t chase compute; shorten distance. Your lakehouse, dataflows, and AI copilots deserve to sprint on hardware that’s literally under their feet. Azure Container Storage v2 just reminds you that the perfect upgrade was hiding beneath your workloads all along. Stop ignoring the hardware—the fastest acceleration sometimes begins with noticing the brakes were on the whole time.

Conclusion: The Local Secret Microsoft Already Gave You

Here’s the punchline: Azure Container Storage v2 doesn’t add something new; it exposes what you’ve been wasting. The local NVMe drives soldered into your AKS nodes are now your lowest‑latency, zero‑cost data tier—if you choose to use them. That’s the secret Microsoft already handed you and almost no one noticed.

Performance has never been about buying bigger clusters—it’s about smarter proximity. Move the data closer, remove the intermediaries, and let compute breathe. Forget managed everything for a moment; use the velocity you’re already paying for.

If understanding this saved you time—and it should have—repay that efficiency debt properly: subscribe. Click follow, flip notifications on, and let the next update arrive like a scheduled query—automatic, fault‑tolerant, and predictably enlightening. Knowledge delivered, latency eliminated. End of transmission.
