M365 Show -  Microsoft 365 Digital Workplace Daily
M365 Show with Mirko Peters - Microsoft 365 Digital Workplace Daily
SharePoint Online Permission Auditing at Scale

Your SharePoint permissions are probably a mess. Not because you don’t manage them, but because nobody can keep up with thousands of sites changing daily. The shocking part? Most organizations have no single report showing who has access to what. In this session, I’ll show you the exact steps to scan every site, every library, every user, without touching a single site manually. By the end, you’ll know how to turn work that normally takes weeks into an automated process that delivers accurate reports daily, and actually sleep better knowing you have control.

Why Traditional Permission Reviews Break at Enterprise Scale

You know that annual permissions review everyone gets so excited about? The spreadsheet goes out, site owners tick through their lists, managers sign off, and for about twenty-four hours it feels like you’ve got everything under control. By the next week, someone’s shared a folder with a new contractor, a project site has been spun up without notice, and the “final” record you just archived is already missing reality by a mile.

On a small collection, it’s still possible to catch changes before they spiral. You pull the list of site members, maybe check a couple of groups, and confirm no one has oddball access. In that world, manual review works. The permissions tree is short enough to see in one screen, and the number of hands making changes is small enough to track. It’s boring, but it’s manageable.

At enterprise scale, that model falls apart fast. You’re no longer looking at a tidy set of five intranet sites. You might be staring down ten thousand sites across departments, regions, and business units — and they’re not static. Teams create new sites daily, archived projects never quite disappear, and content churn means permission changes happen constantly. The window between your review and the next significant change is sometimes measured in hours.

Even worse, SharePoint is deceptive when you try to eyeball it. Permissions can be inherited from the parent site, overridden at the library level, tweaked on a folder, and then patched again on a single file. A user’s access might not be obvious because they’re coming in through a nested group — maybe even through a security group synced from Azure AD that itself holds other groups. One missing click into those layers, and you have no clue they’re in there.

Compliance teams still expect clean audit logs and evidence of regular reviews. The reality is, you’d need an army of admins to manually walk through each site’s structure, note every permission, and confirm it’s valid. That’s without factoring in time to re-check inherited and group-based access, which changes the moment someone moves a user between teams. The practicalities just don’t match the scale.

I worked with an organization that dedicated over 80 admin hours to one quarterly review. They split the workload, pulled membership reports, even had a formal process mapped out. The end file looked thorough — but two weeks later, a penetration test found guests with edit access to confidential folders that had been missed entirely. Not because anyone failed at their job, but because the access came through a nested group that never appeared on the manual report.

That’s the gap that will keep you awake. Stale permissions hiding deep in site structures. Terminated employees whose accounts linger in synced groups. Guest accounts that were supposed to expire but didn’t. They’re easy to miss, and if you’re relying on a manual sweep, you’re counting on luck as much as process.

You start to realise the “snapshot once a year” model isn’t broken because people are lazy — it’s broken because the system it’s trying to capture moves constantly. Permissions are living data. Treating them like a static list means you’re always in the past, never in the live state of your environment.

The solution isn’t throwing more people at the review. It’s building a way to query and consolidate this data automatically, so the moment something changes, your reports reflect that. The next step is connecting to every site without needing to click through them one by one — and that’s where a more capable tool comes in.

Building the Foundation with PnP PowerShell

Imagine opening a PowerShell window, running one command, and being connected to every SharePoint site in your tenant. No browser tabs, no endless clicking through site collections — just a direct line into the entire environment from a single place. That’s exactly what PnP PowerShell gives you, and if you’ve only used it for small ad‑hoc scripts, it can be a bit of a shock how far it can actually stretch.

PnP PowerShell is essentially your bridge between SharePoint Online and your automation environment. It wraps Microsoft’s APIs into commands that are easier to work with, while still giving you access to advanced functionality under the hood. At a small scale, you can get away with running `Connect-PnPOnline` interactively, logging in with your account, and pulling some site data. But at scale, interactive logins become a nightmare — you can’t expect scheduled processes to sit there waiting for someone to type a password or approve MFA.

That’s where the cracks start to show in naïve scripts. You might get halfway through enumerating sites before your token expires. You might hammer the service too quickly and hit throttling limits. Or you discover that not every site fits the same neat structure — some use modern team templates, others are classic collections with oddball permissions and settings in unexpected places. The more you try to brute‑force it, the more brittle it becomes.

A better way is to shift to app‑only connections. In practice, this means creating an Azure AD app registration, granting it the necessary SharePoint and Graph permissions, and authenticating with a certificate rather than a user account. That certificate‑based auth is far more stable for unattended processes. PnP PowerShell supports it out of the box, so once you have the certificate stored securely — preferably somewhere like Azure Key Vault — your scripts can connect without prompts and without risking expired passwords.

Now, how do you actually find all the sites to connect to? At tenant scale, you can’t maintain a hardcoded list. Once `Connect-PnPOnline` has you connected to the admin center, you can enumerate sites with `Get-PnPTenantSite` or a Search‑based query, or integrate with Microsoft Graph to pull every site collection URL dynamically. Graph tends to be better for consistency, but PnP’s Search approach can give you quick wins in smaller tenants. The key is that the enumeration itself has to be tenant‑wide and automated, with no manual curation.

Once you have a list, you still need to be respectful to the service. Batch your requests. Use pauses or throttling controls. It’s not just about avoiding 429 errors from Microsoft; it’s about making sure your process finishes in a realistic timeframe without overwhelming the endpoints. Handling this well means structuring your loops so they process a manageable subset of sites at a time, writing interim results, and resuming gracefully if a session drops.
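To make the batching idea concrete, here is a minimal sketch in Python (the same shape ports to a PowerShell runbook). It assumes you already have a list of site URLs and some per-site audit function. The names `audit_in_batches`, `audit_site`, and the checkpoint file are hypothetical and only there for illustration.

```python
import json
import time
from pathlib import Path
from typing import Callable, Iterable


def audit_in_batches(
    site_urls: Iterable[str],
    audit_site: Callable[[str], dict],   # hypothetical per-site audit function
    checkpoint: Path,                    # hypothetical file holding interim results
    batch_size: int = 50,
    pause_seconds: float = 0.0,
) -> dict:
    """Audit sites in batches, saving progress so a dropped run can resume."""
    # Pick up results from an earlier, interrupted run if a checkpoint exists.
    results = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    pending = [url for url in site_urls if url not in results]

    for start in range(0, len(pending), batch_size):
        for url in pending[start:start + batch_size]:
            results[url] = audit_site(url)
        # Write interim results after every batch, not only at the end.
        checkpoint.write_text(json.dumps(results))
        # Pause between batches (but not after the last one) to go easy on the service.
        if start + batch_size < len(pending):
            time.sleep(pause_seconds)
    return results
```

If the session drops halfway, running the same call again skips every site already in the checkpoint file and carries on with the rest.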

An example of secure handling in action would be using a PowerShell runbook that pulls your certificate from Key Vault at runtime, connects to the admin center, retrieves all site URLs using Graph, and then iterates through them in controlled batches. No login prompts. No hardcoded credentials. Fully repeatable. You could run that on demand today, and tomorrow on a schedule.

At this point, you’ve essentially wired your console into the nervous system of your SharePoint tenant. You can reach every site programmatically without ever touching the UI. That solves the first hurdle for enterprise‑scale permission auditing — discovery and connection. But what you have right now is still surface‑level. You can grab site properties, maybe top‑level groups, but you’re not yet seeing the nested, inherited access that actually matters for compliance.

Getting that depth means tapping into a richer dataset than PnP alone provides. The commands here are great at orchestrating connections and traversing sites, but to unpick the full permission story across every file and folder, we need to bring in another API that was built to expose those relationships cleanly. That’s where the next layer of this approach comes into play.

Mining Permission Data with Microsoft Graph API

If connecting with PnP PowerShell gives you the keys to every site, using Microsoft Graph API is like walking into each one and actually seeing the full guest list — who’s there because they were invited directly, who’s part of a group, and who’s passing through from an inherited door you didn’t even notice. It’s the part where you stop guessing and start getting a clear, unified view across thousands of sites and libraries at once.

Graph sits underneath a lot of Microsoft 365 services. For permissions, it acts as the backbone that lets you query SharePoint, OneDrive, and Teams in a consistent way. The difference is it doesn’t just hand you a flat list. It lets you pull site objects, lists, libraries, files, and the associated permission objects for each. That matters, because nothing in SharePoint permissions lives neatly in one place. Direct assignments live alongside group memberships, which may be sitting in Azure AD groups that have their own nested groups inside.

For example, the `/sites/{site-id}/permissions` endpoint covers grants at the site level, but that doesn’t give you everything; in current Graph versions it mainly reports access granted to applications rather than to individual users. List-level permissions might require `/sites/{site-id}/lists/{list-id}/permissions`, and item or file-level access calls might need `/drives/{drive-id}/items/{item-id}/permissions`, which is where sharing links and direct grants show up. To make sense of who actually has what, you have to stitch those results together. That includes looking up group memberships using `/groups/{group-id}/members`, or `/groups/{group-id}/transitiveMembers` when groups are nested, and resolving user objects so you know exactly who’s behind a group entry.

Where it gets messy is that inheritance is invisible if you only look at direct permissions. A file might say it has no unique permissions, which really means “look up a level.” If you stop there, you’ll miss whole categories of access. So, you need logic in your process that steps up the chain — from file to folder to library to site — checking at each level and consolidating that data until you see the complete inherited path.
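Here is a rough sketch of that "look up a level" logic in Python. It assumes you have already collected two lookup tables: `parent_of`, mapping each resource to its parent, and `unique_grants`, holding only the resources that have their own permissions. Both names, and the function `effective_grants`, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def effective_grants(
    resource_id: str,
    parent_of: Dict[str, Optional[str]],   # hypothetical: child -> parent (None at the top)
    unique_grants: Dict[str, List[str]],   # hypothetical: only resources with unique permissions
) -> Tuple[List[str], List[str]]:
    """Walk upward until a level with its own permissions is found.

    Returns the grants that apply and the path that was walked to find them.
    """
    path: List[str] = []
    current: Optional[str] = resource_id
    # The "not in path" check guards against a malformed, circular parent map.
    while current is not None and current not in path:
        path.append(current)
        if current in unique_grants:
            return unique_grants[current], path
        current = parent_of.get(current)
    return [], path


parent_of = {"file": "folder", "folder": "library", "library": "site", "site": None}
unique_grants = {"site": ["Owners: Full Control"], "library": ["Finance Members: Edit"]}
grants, path = effective_grants("file", parent_of, unique_grants)
# grants comes from the library, the nearest level with unique permissions,
# and path records file -> folder -> library as the inherited route.
```

Keeping the path alongside the grants is a design choice: it lets the final report say not just who has access, but where that access was actually set.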

Pagination and throttling are another reality here. Graph returns large result sets in pages, with a default page size that varies by endpoint (often 100 or 200 items), and you need to follow the `@odata.nextLink` URL in each response to pull the rest. At scale, that means building request loops that can handle thousands of responses without timing out or losing context. Throttling shows up as 429 responses carrying a `Retry-After` header, so your code has to wait for that interval before retrying or you’ll get nowhere fast.
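A request loop along those lines might look like the Python sketch below. To keep it self-contained, the HTTP call is passed in as a `fetch` function that returns a status code, headers, and parsed JSON body. That function, and the name `get_all_items`, are assumptions for illustration; in practice `fetch` would wrap whatever authenticated HTTP client you use.

```python
import time
from typing import Callable, Dict, Iterator, Tuple

# Assumed shape of a fetch result: (status_code, headers, parsed_json_body)
FetchResult = Tuple[int, Dict[str, str], dict]


def get_all_items(
    first_url: str,
    fetch: Callable[[str], FetchResult],   # hypothetical wrapper around your HTTP client
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield every item across pages, waiting when the service says to back off."""
    url = first_url
    retries = 0
    while url:
        status, headers, body = fetch(url)
        if status == 429:
            if retries >= max_retries:
                raise RuntimeError(f"Still throttled after {max_retries} retries: {url}")
            # Assumes Retry-After is given in seconds; fall back to exponential backoff.
            sleep(float(headers.get("Retry-After", 2 ** retries)))
            retries += 1
            continue
        if status != 200:
            raise RuntimeError(f"Unexpected status {status} for {url}")
        retries = 0
        yield from body.get("value", [])
        # A missing nextLink means this was the last page.
        url = body.get("@odata.nextLink")
```

Because it is a generator, the caller can stream results straight into a file or database instead of holding thousands of permission objects in memory.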

One trap admins fall into is only collecting direct permissions. That produces a clean-looking dataset that’s also dangerously incomplete. Using multiple Graph calls together solves that — file-level permissions plus library-level, list-level, and site-level data, cross-referenced with full Azure AD group membership expansions. The end goal is not separate spreadsheets for each type, but one flattened, normalized dataset where each row shows the resource, the resolved user, and the effective access level they have, regardless of how it was granted.

A practical approach is to run collection in two passes. First, enumerate all resources — sites, lists, and critical libraries or document sets. Second, for each resource, query direct permissions and then walk upward to collect inherited entries. During that, resolve any group IDs you find into actual user accounts by calling the group membership endpoints. That way, by the time you run analysis, you’re working only with tangible user and guest objects, not cryptic IDs.
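The group resolution step can be sketched like this in Python. It assumes the group membership lookups have already been collected into a dictionary, `members_of`, keyed by group ID, and that anything not in that dictionary is a user. The names `expand_members` and `flatten_grants` are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def expand_members(
    principal: str,
    members_of: Dict[str, List[str]],   # hypothetical: group ID -> direct member IDs
    _seen: Optional[Set[str]] = None,
) -> Set[str]:
    """Resolve a user or group ID into the set of user IDs behind it."""
    seen = set() if _seen is None else _seen
    if principal not in members_of:
        return {principal}          # not a known group, so treat it as a user
    if principal in seen:
        return set()                # group already expanded; avoids infinite loops
    seen.add(principal)
    users: Set[str] = set()
    for member in members_of[principal]:
        users |= expand_members(member, members_of, seen)
    return users


def flatten_grants(
    grants: Iterable[Tuple[str, str, str]],   # (resource, principal, role)
    members_of: Dict[str, List[str]],
) -> List[Tuple[str, str, str]]:
    """Produce one (resource, user, role) row per resolved user."""
    rows = set()
    for resource, principal, role in grants:
        for user in expand_members(principal, members_of):
            rows.add((resource, user, role))
    return sorted(rows)
```

The `seen` set matters more than it looks: nested groups can reference each other, and without it a circular membership would keep the expansion running forever.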

The result is a dataset that’s usable. You can sort by user and see every resource they touch, or sort by resource and see every account with access. You can apply filters for things like “guest” or “external” and have instant answers without pulling fresh reports. This is the kind of visibility that an annual manual review could never match — because you can run it any time you want and be confident that nothing’s hidden in a group nesting three levels deep.
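As a small illustration of working with that flattened dataset, the Python sketch below pivots rows by user and filters for guests. It assumes rows are `(resource, user, role)` tuples and, as a default, that guest accounts can be recognized by `#EXT#` in the user principal name. That is a common pattern for guest accounts but still an assumption, so the check is a parameter you can swap out. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[str, str, str]   # assumed shape: (resource, user, role)


def index_by_user(rows: Iterable[Row]) -> Dict[str, List[Tuple[str, str]]]:
    """Group rows so each user maps to every resource they can reach."""
    by_user: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for resource, user, role in rows:
        by_user[user].append((resource, role))
    return dict(by_user)


def guests_only(
    rows: Iterable[Row],
    # Assumption: guest UPNs contain "#EXT#"; replace with your own test if not.
    is_guest: Callable[[str], bool] = lambda user: "#EXT#" in user,
) -> List[Row]:
    """Keep only the rows that belong to guest or external accounts."""
    return [row for row in rows if is_guest(row[1])]
```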

With that level of accuracy, the next obvious step is to stop running it manually at all. If you can make the queries run on their own, on a schedule, you’ll always have a fresh picture without someone hovering over a PowerShell window. That’s where orchestration kicks in.

Automating the Audit with Azure Automation

Picture starting your day and finding a complete, up-to-date permissions report sitting in your inbox — no late‑night scripts, no one remoting into a server, no manual exports. That’s the appeal of putting the whole process on autopilot, and Azure Automation is one of the best ways to make it happen. It’s essentially the scheduler and execution engine for all the PnP PowerShell and Microsoft Graph work you’ve already put together, but without you having to be in front of a keyboard.

Azure Automation runbooks are where your scripts live and run in the cloud. Instead of leaving them on a server that someone might reboot or lose access to, you upload them into a managed service. That service handles the execution, logs the results, and lets you trigger them on a schedule. But when scripts run without you watching, things can go wrong in ways that are easy to miss — like an expired certificate stopping authentication, or a long‑running job hitting a timeout halfway through. If you don’t plan for those, you’ll have a report that fails silently, or worse, delivers incomplete data that looks fine at a glance.

The starting point is securing your authentication. Pasting credentials into a script is a quick way to make a security team very unhappy. The right approach is to store your certificate or client secret in Azure Key Vault and have the runbook pull it at runtime. Key Vault keeps the sensitive material encrypted, and role‑based access controls make sure only your automation account can retrieve it. When the certificate expires, you can roll it over in one place without editing every script.

Scheduling in Azure Automation is flexible. Daily runs capture a near‑real‑time picture, but if your environment changes more slowly, a weekly schedule might be enough. You can set exact times, align with off‑peak hours to reduce load on the tenant, and even kick off runs in response to events (through a webhook, for instance) instead of just the clock. If the job needs more resources or run time than the Azure sandbox can offer, which is likely for very long enumerations in extremely large tenants, Hybrid Runbook Workers let you execute those scripts on‑premises or in a dedicated VM while still managing them from Azure Automation.

Logging is just as important as the output itself. Without logs, troubleshooting becomes guesswork. Azure Automation can capture both standard output and error streams into job logs, which you can review in the portal or export for longer‑term storage. Keeping that history means you can prove the audit ran at a given time and see exactly what happened if it didn’t finish. For compliance, that audit trail can be as valuable as the permission report itself.

When the script completes, you have options for where the data lands. You could write the CSV or JSON output to a SharePoint document library, drop it into Azure Blob Storage, or attach it to an automated email via an SMTP relay or a Logic App. Each has trade‑offs — SharePoint is great for team access, Blob Storage handles very large files, email is instant but less secure for sensitive datasets. The point is, you choose the delivery that fits your review process.

One more layer is identifying when something truly needs attention. It’s possible to integrate basic change detection — for example, compare today’s dataset with yesterday’s, and if a new guest user appears in a sensitive site, post an alert in Teams or send a flagged email to the security group. That turns your scheduled job from just a reporting tool into an active early‑warning system.
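The comparison itself can be very small. Here is a minimal Python sketch, assuming both datasets are lists of `(resource, user, role)` tuples and that you supply your own list of sensitive site URL prefixes and your own guest check. The function name and parameters are hypothetical, and posting the alert to Teams or email is left out.

```python
from typing import Callable, Iterable, List, Tuple

Row = Tuple[str, str, str]   # assumed shape: (resource, user, role)


def new_guest_grants(
    yesterday: Iterable[Row],
    today: Iterable[Row],
    sensitive_prefixes: Iterable[str],   # hypothetical: URL prefixes of sensitive sites
    is_guest: Callable[[str], bool],     # hypothetical: your own guest-account check
) -> List[Row]:
    """Return rows that are new today, belong to a guest, and sit under a sensitive site."""
    added = set(today) - set(yesterday)
    prefixes = tuple(sensitive_prefixes)
    return sorted(
        row for row in added
        if is_guest(row[1]) and row[0].startswith(prefixes)
    )
```

Anything this returns is a candidate for an alert; an empty list means nothing new needs attention today.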

By combining Azure Automation’s scheduling and credential management with the data collection you’ve already built using PnP PowerShell and Graph, you move from reactive, ad‑hoc checks to a baked‑in, hands‑off process. Now, those three parts — connection, data retrieval, and automation — work together as one continuous, proactive posture instead of three disconnected tasks.

Conclusion

At enterprise scale, guessing at permissions isn’t an option. Without a live, accurate view, you’re hoping nothing slips through — and hope isn’t a security strategy. The tools are there to make this effortless once you connect them.

If you do one thing this week, set up a PnP PowerShell connection to your tenant. That’s the base you can build on. From there, expand into Graph queries and automation.

When you move from chasing problems to monitoring in near real time, you stop firefighting. You start managing with intent, and that shift changes both your productivity and your peace of mind.
