M365 Show with Mirko Peters - Microsoft 365 Digital Workplace Daily

Unit vs. Integration vs. Front-End: The Testing Face-Off

Ever fix a single line of code, deploy it, and suddenly two other features break that had nothing to do with your change? It happens more often than teams admit. Quick question before we get started—drop a comment below and tell me which layer of testing you actually trust the most. I’m curious to see where you stand.

By the end of this podcast, you’ll see a live example of a small Azure code change that breaks production, and how three test layers—Unit, Integration, and Front-End—each could have stopped it. Let’s start with how that so-called safe change quietly unravels.

The Domino Effect of a 'Safe' Code Change

Picture making a tiny adjustment in your Azure Function—a single null check—and pushing it live. Hours later, three separate customer-facing features fail. On its face, everything seemed safe. Your pipeline tests all passed, the build went green, and the deployment sailed through without a hitch. Then the complaints start rolling in: broken orders, delayed notifications, missing pages. That’s the domino effect of a “safe” code change. Later in the video we’ll show the actual code diff that triggered this, along with how CI happily let it through while production users paid the price.

Even a small conditional update can send ripples throughout your system. Back-end functions don’t exist in isolation. They hand off work to APIs, queue messages, and rely on services you don’t fully control. A small logic shift in one method may quietly break the assumptions another component depends on. In Azure especially, where applications are built from smaller services designed to scale on their own, flexibility comes with interdependence. One minor change in your code can cascade more widely than you expect.

The confusion deepens when you’ve done your due diligence with unit tests. Locally, every test passes. Reports come back clean. From a developer’s chair, the update looks airtight. But production tells a different story. Users engage with the entire system, not just the isolated logic each test covered. That’s where mismatched expectations creep in. Unit tests can verify that one method returns the right value, but they don’t account for message handling, timing issues, or external integrations in a distributed environment.

Let’s go back to that e-commerce example. You refactor an order processing function to streamline duplicate logic and add that null check. In local unit tests, everything checks out: totals calculate correctly, and return values line up. It all looks good. But in production, when the function tries to serialize the processed order for the queue, a subtle error forces it to exit early. No clear exception, no immediate log entry, nothing obvious in real time. The message never reaches the payment or notification service. From the customer’s perspective, the cart clears, but no confirmation arrives. Support lines light up, and suddenly your neat refactor has shut down one of the most critical workflows.

That’s not a one-off scenario. Any chained dependency—authentication, payments, reporting—faces the same risk. In highly modular Azure solutions, each service depends on others behaving exactly as expected. On their own, each module looks fine. Together, they form a structure where weakness in one part destabilizes the rest. A single faulty brick, even if solid by itself, can put pressure on the entire tower.

Once you’ve described this kind of failure, that’s exactly the moment for a short code demo or screenshot. Walk through the diff that looked harmless, then reveal how the system reacts when it hits live traffic. That shift from theory to tangible proof helps connect the dots.
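If you’re following along in text rather than watching, here’s a minimal, hypothetical sketch of the kind of change we’re talking about. The names (Order, OrderProcessor, the injected sender) are placeholders rather than the demo’s actual code; the point is how a defensive guard plus a swallowed serialization error lets the method return before anything ever reaches the queue.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

// Hypothetical sketch only: Order, OrderProcessor, and the injected sender
// are placeholders, not the demo's actual code.
public record Order(string Id, decimal Total, object? Metadata);

public class OrderProcessor
{
    private readonly ServiceBusSender _sender;

    public OrderProcessor(ServiceBusSender sender) => _sender = sender;

    public async Task ProcessAsync(Order? order)
    {
        // The "safe" refactor: bail out quietly instead of throwing on bad input.
        if (order is null) return;

        string payload;
        try
        {
            payload = JsonSerializer.Serialize(order);
        }
        catch (NotSupportedException)
        {
            // Swallowed failure: no exception bubbles up, nothing is logged,
            // and the method simply exits before anything is queued.
            return;
        }

        // For the affected orders this line is never reached, so the payment
        // and notification services never hear about them.
        await _sender.SendMessageAsync(new ServiceBusMessage(payload));
    }
}
```

Every unit test that hands this method a well-formed order still passes, which is exactly why the gap only shows up once real traffic produces a payload the serializer chokes on.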

Now, monitoring might eventually highlight a problem like this—but those signals don’t always come fast or clear. Subtle logic regressions often reveal themselves only under real user load. Teams I’ve worked with have seen this firsthand: the system appears stable until customer behavior triggers edge cases you didn’t consider. When that happens, users become your detectors, and by then you’re already firefighting. Relying on that reactive loop erodes confidence and slows delivery.

This is where different layers of testing show their value. They exist to expose risks before users stumble across them. The same defect could be surfaced three different ways—by verifying logic in isolation, checking how components talk to each other, or simulating a customer’s path through the app. Knowing which layer can stop a given bug early is critical to breaking the cycle of late-night patching and frustrated users.

Which brings us to our starting point in that chain. If there’s one safeguard designed to catch problems right where they’re introduced, it’s unit tests. They confirm that your smallest logic decisions behave as written, long before the code ever leaves your editor. But here’s the catch: that level of focus is both their strength and their limit.

Unit Tests: The First Line of Defense

Unit tests are that first safety net developers rely on. They catch many small mistakes right at the code level—before anything ever leaves your editor or local build. In the Azure world, where applications are often stitched together with Functions, microservices, and APIs, these tests are the earliest chance to validate logic quickly and cheaply. They target very specific pieces of code and run in seconds, giving you almost immediate feedback on whether a line of logic behaves as intended.

The job of a unit test is straightforward: isolate a block of logic and confirm it behaves correctly under different conditions. With an Azure Function, that might mean checking that a calculation returns the right value given different inputs, or ensuring an error path responds properly when a bad payload comes in. They don’t reach into Cosmos DB, Service Bus, or the front end. They stay inside the bounded context of a single method or function call. Keeping that scope narrow makes them fast to write, fast to run, and practical to execute dozens or hundreds of times a day—this is why they’re considered the first line of defense.

For developers, getting value from unit tests usually comes down to three clear habits. First, keep them fast—tests that run in seconds give you immediate confidence. Second, isolate your logic—don’t mix in external calls or dependencies, or you’ll blur the purpose. And third, assert edge cases—null inputs, empty collections, or odd numerical values are where bugs often hide. Practicing those three habits keeps mistakes from slipping through unnoticed during everyday coding.

Here’s a concrete example you’ll actually see later in our demo. Imagine writing a small xUnit test that feeds order totals into a tax calculation function. You set up a few sample values, assert that the percentages are applied correctly, and make sure rounding behaves the way you expect. It’s simple, but incredibly powerful. That one test proves your function does what it’s written to do. Run a dozen variations, and you’ve practically bulletproofed that tiny piece of logic against the most common mistakes a developer might introduce.
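As a rough sketch of what that could look like (the TaxCalculator here is a stand-in, not the demo’s actual code), a single xUnit theory covers the typical path, an empty order, and the rounding behavior in one go:

```csharp
using Xunit;

// Stand-in for the function under test; the real demo code may differ.
public static class TaxCalculator
{
    public static decimal ApplyTax(decimal orderTotal, decimal ratePercent) =>
        decimal.Round(orderTotal * (1 + ratePercent / 100m), 2);
}

public class TaxCalculatorTests
{
    [Theory]
    [InlineData(100.00, 19.0, 119.00)]  // typical order
    [InlineData(0.00, 19.0, 0.00)]      // edge case: empty order total
    [InlineData(10.005, 19.0, 11.91)]   // rounding lands where we expect
    public void ApplyTax_ReturnsExpectedTotal(double total, double rate, double expected)
    {
        // Attributes can't hold decimal literals, so the inputs arrive as
        // doubles and are converted before calling the calculator.
        var result = TaxCalculator.ApplyTax((decimal)total, (decimal)rate);

        Assert.Equal((decimal)expected, result);
    }
}
```

Notice what the test never touches: no Cosmos DB, no Service Bus, no HTTP. That’s deliberate, and it’s also exactly the limitation that comes next.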

But the catch is always scope. Unit tests prove correctness in isolation, not in interaction with other services. So a function that calculates tax values may pass beautifully in local tests. Then, when the function starts pulling live tax rules from Cosmos DB, a slight schema mismatch instantly produces runtime errors. Your unit tests weren’t designed to know about serialization quirks or external API assumptions. They did their job—and nothing more.

That’s why treating unit tests as the whole solution is misleading. Passing tests aren’t evidence that your app will work across distributed services; they only confirm that internal logic works when fed controlled inputs. A quick analogy helps make this clear. Imagine checking a single Lego brick for cracks. The brick is fine. But a working bridge needs hundreds of those bricks to interlock correctly under weight. A single-brick test can’t promise the bridge won’t buckle once it’s assembled. Developers fall into this false sense of completeness all the time, which leaves gaps between what tests prove and what users actually experience.

Still, dismissing unit tests because of their limits misses the point. They shine exactly because of their speed, cost, and efficiency. An Azure developer can run a suite of unit tests locally and immediately detect null reference issues, broken arithmetic, or mishandled error branches before shipping upstream. That instant awareness spares both time and expensive CI resources. Imagine catching a bad null check in seconds instead of debugging a failed pipeline hours later. That is the payoff of a healthy unit test suite.

What unit tests are not designed to provide is end-to-end safety. They won’t surface problems with tokens expiring, configuration mismatches, message routing rules, or cross-service timing. Expecting that level of assurance is like expecting a smoke detector to protect you from a burst pipe. Both are valuable warnings, but they solve very different problems. A reliable testing strategy recognizes the difference and uses the right tool for each risk.

So yes, unit tests are essential. They form the base layer by ensuring the most basic building blocks of your application behave correctly. But once those blocks start engaging with queues, databases, and APIs, the risk multiplies in ways unit tests can’t address. That’s when you need a different kind of test—one designed not to check a single brick, but to verify the system holds up once pieces connect.

Integration Tests: Where Things Get Messy

Why does code that clears every unit test sometimes fail the moment it talks to real services? That’s the territory of integration testing. These tests aren’t about verifying a single function’s math—they’re about making sure your components actually work once they exchange data with the infrastructure that supports them.

In cloud applications, integration testing means checking whether code connects properly to the services it depends on. You’re testing more than logic: connection strings, authentication, message delivery, retries, and service bindings all come into play. A function can look flawless when isolated, but if the queue binding doesn’t match the actual queue name in staging, it’s just as broken as if the code never ran at all. Integration tests are where you ask: does this still work when it’s connected to the rest of the system?

Of course, integration tests come with friction. They’re slower and less predictable than unit tests, often depending on deployed infrastructure. Running against a real database or a Service Bus takes longer and adds complexity. That’s one reason teams skip or postpone them until late in the release cycle. But skipping them is dangerous because environment-specific issues—like mismatched queue names or incorrect connection strings—are exactly the kinds of failures that unit tests never see. These problems don’t surface until real services interact, and by then, you’re firefighting in production.

Take a common example: a queue-triggered function in development is tested with perfect local settings. Unit tests confirm that once a message arrives, the handler processes it correctly. Yet in staging, nothing happens. No emails are sent, no triggers fire. The reason? The queue name in staging didn’t match the one the function was expecting. An integration test that actually produced and consumed a message in that environment would have caught the mismatch immediately. Without it, the team only learns when users start noticing missing actions downstream.
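Here’s a minimal sketch of what such a test might look like, assuming the staging connection string and queue name arrive through environment variables and that a small polling helper checks the downstream result. The names and the status endpoint are placeholders; your setup, and the demo’s, will differ in the details.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Xunit;

public class OrderQueueIntegrationTests
{
    // Assumed environment variables for the staging environment;
    // the names are placeholders you'd align with your own pipeline.
    private static readonly string ConnectionString =
        Environment.GetEnvironmentVariable("SERVICEBUS_CONNECTION")!;
    private static readonly string QueueName =
        Environment.GetEnvironmentVariable("ORDERS_QUEUE_NAME") ?? "orders";

    [Fact]
    public async Task SubmittedOrder_IsProcessedDownstream()
    {
        await using var client = new ServiceBusClient(ConnectionString);
        ServiceBusSender sender = client.CreateSender(QueueName);

        var orderId = Guid.NewGuid().ToString();

        // If the staging queue is misnamed, either this send throws or the
        // polling below times out. Either way the test fails, which is the
        // exact class of mistake unit tests can never see.
        await sender.SendMessageAsync(
            new ServiceBusMessage($"{{\"orderId\":\"{orderId}\"}}"));

        bool processed = await WaitForProcessedOrderAsync(
            orderId, timeout: TimeSpan.FromSeconds(30));

        Assert.True(processed, $"Order {orderId} was never processed downstream.");
    }

    // Illustrative polling helper: checks an assumed order-status endpoint
    // until the processed order appears or the timeout passes. Swap in a
    // Cosmos DB query if that matches your architecture better.
    private static async Task<bool> WaitForProcessedOrderAsync(string orderId, TimeSpan timeout)
    {
        using var http = new HttpClient();
        var statusUrl = Environment.GetEnvironmentVariable("ORDER_STATUS_URL");
        var deadline = DateTime.UtcNow + timeout;

        while (DateTime.UtcNow < deadline)
        {
            var response = await http.GetAsync($"{statusUrl}/{orderId}");
            if (response.IsSuccessStatusCode) return true;
            await Task.Delay(TimeSpan.FromSeconds(2));
        }
        return false;
    }
}
```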

That’s where integration tests bring unique value. They surface the invisible mistakes—misconfigurations, authentication mismatches, and latency issues—the kinds of failures that only occur when your app touches the real platform it’s built on. It’s like checking if a train’s engine starts versus watching it run across actual tracks. You want to know whether the train moves smoothly once it’s connected, not just whether the motor turns over.

Running integration tests does cost more in both time and setup, but there are ways to manage that. A practical approach is to run them in a disposable or staging environment as part of your pipeline. This lets you validate real bindings and services early, while containing risk. We’ll show an example later in the demo where this setup prevents a configuration oversight from slipping through. By baking it into your delivery flow, you remove the temptation to skip these checks.

Tooling can help here as well. Many teams stick with xUnit for integration tests since it fits the same workflow they already use. Pairing it with SpecFlow, for example, lets you layer in business rules on top of technical checks. Instead of only writing “a message was consumed,” you can express a scenario as “when an order is submitted, a payment entry exists and a confirmation message is queued.” That way you’re validating both the system’s plumbing and the customer workflow it’s supposed to deliver.

The point of integration testing isn’t speed or neatness—it’s grounding your confidence in reality instead of assumptions. These tests prove your services work together as intended before a customer ever notices something’s off. Without them, you risk deploying an app that works only in theory, not in practice.

And once you trust those back-end connections are solid, there’s still the last surface where things can silently collapse: the moment an actual person clicks a button.

Front-End Tests: Catching What Others Miss

Front-end testing is where theory meets the reality of user experience. Unit tests validate logic. Integration tests prove that services can talk to each other. But front-end tests, using tools like Playwright, exist to confirm the one thing that actually matters: what the user sees and can do in the app. It’s not about server responses or infrastructure wiring—it’s about the login screen loading, the button appearing, the form submitting, and the confirmation page showing up where and when users expect.

What makes front-end testing distinct is that it simulates a real person using your application. Instead of mocking dependencies or calling isolated endpoints, Playwright spins up automated browser sessions. These sessions act like people: opening pages, typing into inputs, clicking buttons, waiting for asynchronous data to finish loading, and verifying what renders on the screen. This approach eliminates the assumption that “if the API says success, the job is done.” It checks whether users get the actual success states promised to them.

The reason this matters becomes clear once you consider how many things can fail silently between the back end and the UI. APIs may return perfect 200 responses. Integration tests can confirm that payloads move cleanly from one service to another. But none of that protects against JavaScript errors that block a page, CSS changes that hide a button, or browser-specific quirks that stop navigation. Playwright finds those UI misfires because it is staring at the same surface your users interact with every day.

Take the checkout flow. Your back end and services show green lights. Responses come back correct, payments get processed, and integration checks all pass. Then Playwright runs a browser scenario. A cart gets filled, the checkout button clicked, and nothing happens. A minor JavaScript error from a dependency update froze the flow, and the browser swallowed the error silently. The server says things succeeded, but the user sees no confirmation, no receipt, no finish line. They refresh, get frustrated, and maybe abandon the order entirely. Which test would have caught this: unit, integration, or Playwright? We’ll reveal the answer in the demo.
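To make that concrete, here’s a minimal Playwright sketch in C# of that kind of browser-level check. The URL, product path, and button names are placeholders for whatever your checkout flow actually uses.

```csharp
using System.Threading.Tasks;
using Microsoft.Playwright;
using Xunit;

public class CheckoutFlowTests
{
    // Placeholder staging URL; swap in your deployed slot's address.
    private const string BaseUrl = "https://staging.contoso-shop.example";

    [Fact]
    public async Task Checkout_ShowsConfirmationToTheUser()
    {
        using var playwright = await Playwright.CreateAsync();
        await using var browser = await playwright.Chromium.LaunchAsync();
        var page = await browser.NewPageAsync();

        // Walk the same path a customer would: fill the cart, then check out.
        await page.GotoAsync($"{BaseUrl}/products/sample-item");
        await page.GetByRole(AriaRole.Button, new() { Name = "Add to cart" }).ClickAsync();
        await page.GotoAsync($"{BaseUrl}/checkout");
        await page.GetByRole(AriaRole.Button, new() { Name = "Place order" }).ClickAsync();

        // The assertion the back end can't make for you: did the user
        // actually see a confirmation render in the browser?
        await Assertions.Expect(page.GetByText("Order confirmed"))
            .ToBeVisibleAsync(new() { Timeout = 10_000 });
    }
}
```

If a dependency update freezes the click handler, the API can keep returning 200 all day and this test still fails, because the confirmation text never renders.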

That checkout scenario highlights the core truth: only front-end validation captures issues visible through the user’s eyes. It’s like building a car where all the systems work on paper—engine, gears, suspension—but the doors won’t open when the driver arrives. Without front-end checks, you risk leaving the actual entry point broken even though the mechanical parts look flawless.

Front-end bugs unfortunately get dismissed too often as “user mistakes” or “rare edge cases.” In practice, they’re often genuine failures in the UI layer—missing validations, broken event handlers, misaligned frameworks. Brushing them off is risky because perception is everything. Customers don’t care if your integrations are technically correct. If the interface confuses or fails them, the product as a whole appears broken. And repeated failures at that layer corrode trust faster than almost any back-end misconfiguration.

By automating UI flows early, Playwright reduces dependence on frustrated users filing complaints after release. It validates critical actions like logging in, submitting data, and completing workflows before production customers ever encounter them. Running across major browsers and devices adds another layer of confidence, capturing variations you would never test manually but your users inevitably face. That broader coverage makes sure a login works as smoothly on one system as it does on another.

Of course, front-end testing comes with its own practical challenges. The most common culprit is flakiness—tests failing intermittently because of unstable conditions. Authentication and test data setup are frequent sources of this. A login may expire mid-run, or a test record may not be reset cleanly between sessions. In our demo, we’ll show a simple strategy to handle one of these issues, so that your Playwright suite stays stable enough to trust.
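One common way to tame the authentication side of that flakiness (not necessarily the exact approach in the demo) is to sign in once, save the browser’s storage state to disk, and let every test reuse it. Here’s a sketch with placeholder URL, labels, and selectors:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Playwright;

// One common way to tame auth-related flakiness: log in once, save the
// session to disk, and let every test reuse it instead of repeating a
// fragile login flow. URL, labels, and button names are placeholders.
public static class AuthSetup
{
    private const string StatePath = "playwright-auth.json";

    public static async Task<IBrowserContext> NewAuthenticatedContextAsync(IBrowser browser)
    {
        if (!File.Exists(StatePath))
        {
            // First run: perform the real login and persist cookies and storage.
            var setupContext = await browser.NewContextAsync();
            var page = await setupContext.NewPageAsync();

            await page.GotoAsync("https://staging.contoso-shop.example/login");
            await page.GetByLabel("Email").FillAsync("e2e-user@example.com");
            await page.GetByLabel("Password")
                .FillAsync(Environment.GetEnvironmentVariable("E2E_PASSWORD")!);
            await page.GetByRole(AriaRole.Button, new() { Name = "Sign in" }).ClickAsync();

            // Wait for a post-login landmark (placeholder pattern) before saving
            // state, so we don't capture a half-finished session.
            await page.WaitForURLAsync("**/account");
            await setupContext.StorageStateAsync(new() { Path = StatePath });
            await setupContext.CloseAsync();
        }

        // Every subsequent test starts already signed in, which removes one
        // of the most common sources of intermittent failures.
        return await browser.NewContextAsync(new() { StorageStatePath = StatePath });
    }
}
```

A test then calls AuthSetup.NewAuthenticatedContextAsync(browser) instead of logging in itself, so an expiring login mid-run stops being every test’s problem.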

The real strength of this testing layer is how it closes a blind spot left open by unit and integration checks. Unit tests catch logic bugs early. Integration tests prove services and connections work as expected. Front-end tests step in to ensure that everything technical translates into a working, satisfying user journey. Without that last layer, you can’t fully guarantee the experience is reliable from end to end.

With all three types of testing in place, the challenge shifts from what to test to how to run them together without slowing down delivery. The coordination between these layers makes or breaks release confidence. And that’s where the structure of your pipeline becomes the deciding factor.

The Azure Pipeline: Orchestrating All Three

In practice, the Azure DevOps pipeline is what ties all the testing layers together into one workflow. Instead of leaving unit, integration, and front-end tests scattered and inconsistent, the pipeline runs them in order. Each stage adds a different checkpoint, so errors are caught in the cheapest and earliest place possible before anyone has to debug them in production.

Unit tests almost always run first. They’re fast, lightweight, and able to stop simple mistakes immediately. In Azure DevOps, this usually means they execute right after the build. If a test fails, the pipeline fails, and nothing moves forward. That’s by design—it prevents wasted time deploying code that a developer could have fixed locally in minutes. Teams typically run these on every commit or pull request because the cost of running them is so low compared to the benefit of instant feedback.

The next stage is integration testing, where things slow down but also get closer to reality. Here, the pipeline pushes your app into a staging slot or a temporary environment. Tests call live Azure services: publishing a message into Service Bus, hitting a Cosmos DB instance, or validating a Key Vault reference. Integration checks don’t just confirm that individual functions behave—they prove the wiring between components works under realistic conditions. Many teams configure these to run in gated builds or as part of pull requests, so bugs like a mismatched queue name or an expired connection string are caught before code ever approaches production.

Once integration tests pass, the pipeline moves into front-end validation. This is where Playwright is often wired in to simulate browser interactions against the deployed app. Instead of mocks, the pipeline launches automated sessions to log in, submit forms, and complete workflows. The value here is perspective: not “did the API return success?” but “did the user see and complete their task?” A small JavaScript regression or a misplaced element will fail this stage, where unit or integration tests would have declared everything fine. Given their heavier runtime, these browser-based flows are often scheduled for nightly runs, pre-production deployments, or release candidates rather than every commit. That cadence balances coverage with speed.

There’s also an optional performance layer. Azure Load Testing can be slotted into the tail end of major releases or milestone builds. Its role is not to run on every pipeline pass but to validate your system under stress in controlled bursts. For example, you might simulate thousands of concurrent sign-ins or a flood of order submissions to see if scaling behavior kicks in correctly. Running these against a pre-production environment exposes bottlenecks without risking live users. The key is using them selectively—when you need assurance about performance trends, not as part of every daily push.

Visualizing this sequence helps. Imagine a YAML snippet or a pipeline view where unit tests gate the build, integration tests kick in after deployment to staging, and UI automation runs against a deployed slot. Adding conditions makes the flow smarter: unit tests on every commit, integration in gated builds, front-end runs nightly, and load tests on pre-production milestones. In this part of the video, we’ll show exactly that—a pipeline snippet with each stage clearly ordered. That way it’s not just theory: you’ll see how the orchestration actually plays out in Azure DevOps.
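As a rough sketch of that ordering in Azure DevOps YAML (stage names, project globs, and the cadence notes in comments are illustrative, not the demo pipeline), it might look something like this:

```yaml
# Illustrative ordering only: stage names, triggers, and cadence comments
# are placeholders, not the demo pipeline itself.
trigger:
  branches:
    include:
    - main

stages:
- stage: BuildAndUnitTests
  jobs:
  - job: UnitTests
    pool:
      vmImage: 'ubuntu-latest'
    steps:
    # Unit tests gate the build: run on every commit and pull request.
    - task: DotNetCoreCLI@2
      displayName: 'Run unit tests'
      inputs:
        command: 'test'
        projects: '**/*UnitTests.csproj'

- stage: IntegrationTests
  dependsOn: BuildAndUnitTests
  jobs:
  - job: RunIntegrationTests
    pool:
      vmImage: 'ubuntu-latest'
    steps:
    # Deployment to a staging slot would happen before this step; the tests
    # then talk to the real Service Bus and Cosmos DB bindings.
    - task: DotNetCoreCLI@2
      displayName: 'Run integration tests against staging'
      inputs:
        command: 'test'
        projects: '**/*IntegrationTests.csproj'

- stage: FrontEndTests
  dependsOn: IntegrationTests
  # Heavier browser runs: many teams schedule these nightly or for release
  # candidates rather than on every commit.
  jobs:
  - job: RunPlaywrightTests
    pool:
      vmImage: 'ubuntu-latest'
    steps:
    # (Playwright browsers would be installed here, e.g. via the
    # playwright.ps1 script generated at build time.)
    - task: DotNetCoreCLI@2
      displayName: 'Run Playwright UI tests against the deployed slot'
      inputs:
        command: 'test'
        projects: '**/*UiTests.csproj'
```

The exact gates and schedules will differ from team to team; the part worth copying is the ordering, with the cheapest feedback first and the heaviest checks closest to release.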

The advantage of this approach is not complexity—it’s balance. Instead of relying too much on one type of test, each layer contributes according to its strengths. Unit tests guard against trivial logic errors. Integration tests validate service wiring. Front-end tests measure usability. Load tests stress the system. Together, they create a flow that catches issues at the most appropriate stage without slowing down everyday development. Teams end up spending fewer hours firefighting regressions and more time delivering features because confidence rises across the board.

The broader effect of orchestrating tests this way is cultural as much as technical. You move from a mindset of patching after complaints to one of structured assurance. Developers get faster feedback. Releases require fewer hotfixes. Customers see fewer failures. Each test layer checks a different box, but the result isn’t box-checking—it’s fewer surprises when code reaches real users.

And that brings us to the bigger picture behind all this. Testing in Azure pipelines isn’t about the mechanics of YAML or automation frameworks. It’s about reducing the chances that hidden flaws reach people who depend on your software.

Conclusion

Testing isn’t about filling a checklist or keeping dashboards green. It’s about building confidence that your app behaves the way you expect once it leaves your hands. The marks on a report matter less than the assurance you feel when releasing changes.

Here’s one concrete next step: if you only have unit tests, add a single integration test that hits a real queue or database this week. If you’re missing front-end coverage, script a simple Playwright scenario for your most critical workflow.

Drop a comment and share which layer you’re missing and why. Stronger testing means fewer surprises in production and more predictable releases.
