Deploying Microsoft Copilot across your organization gives users powerful AI-driven insights from Microsoft 365 data. However, Copilot can only surface information it can access. If sensitive or poorly governed SharePoint sites are exposed, Copilot might return confidential data to users who should not see it. This article explains how to identify high-risk sites before deployment so you can apply the right permissions and data governance controls.
High-risk sites are those with broad permissions, stale content, or missing sensitivity labels. Without proactive scanning, these sites become channels for accidental data leakage through Copilot. You will learn the specific site attributes to audit, the tools Microsoft provides for assessment, and the steps to remediate risks before turning on Copilot for your tenant.
By the end of this guide, you will be able to run a site inventory, check permission inheritance, and enforce labeling policies. This preparation helps ensure that Copilot returns only appropriate, well-governed content to authorized users.
Key Takeaways: High-Risk Site Identification for Copilot
- SharePoint admin center > Active sites > Permissions column: Shows sites with broken permission inheritance or external sharing enabled, both high-risk indicators.
- Microsoft Purview > Data Classification > Content Explorer: Lists sensitive content types and their locations across SharePoint and OneDrive.
- Microsoft 365 admin center > Reports > Usage > SharePoint site usage: Reveals stale sites with no activity for 90+ days that may contain outdated permissions.
Why Some SharePoint Sites Are High Risk for Copilot
Copilot grounds its responses on Microsoft Graph data, including SharePoint sites, OneDrive files, and Teams conversations. When a site has overly permissive access, Copilot can return content from that site to any user who has even minimal access to it. The primary risk factors are:
Broad Permission Inheritance
Sites that inherit permissions from a parent site collection or use the default “Everyone except external users” group expose content to all internal users. If a site contains financial projections, HR records, or legal documents, Copilot may surface those details to employees in other departments.
External Sharing Enabled
Sites with external sharing set to “Anyone” or “New and existing guests” allow users outside your organization to access content. Copilot does not bypass sharing settings, so the risk lies in the sharing configuration itself: guests can reach content in these sites, and Copilot can surface the same content to any internal user who has access. This creates a compliance risk if the site holds sensitive internal data.
Missing Sensitivity Labels
Sites without sensitivity labels cannot be automatically classified or protected. Copilot respects sensitivity labels when returning content, but unlabeled sites are treated as unprotected. This means Copilot might include data from an unlabeled HR site in a response to a sales team member who has read access.
Stale Sites with Outdated Permissions
Sites that have not been accessed or updated in more than 90 days often have permissions that were set years ago. These sites may contain legacy data that is no longer relevant but still accessible. Copilot can surface this old data, causing confusion or accidental disclosure.
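The four risk factors above can be expressed as simple checks against a site record. The following is a minimal sketch, not a Microsoft schema: the `SiteRecord` fields, the `risk_factors` helper, and the sample site are all hypothetical names chosen for illustration. The 90-day threshold and the two risky sharing levels come from the descriptions above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical record shape; field names are illustrative, not a Microsoft schema.
@dataclass
class SiteRecord:
    url: str
    has_broad_group: bool             # e.g. "Everyone except external users" has access
    external_sharing: str             # sharing level as shown in the admin center
    sensitivity_label: Optional[str]  # None when no label is assigned
    last_activity: date

RISKY_SHARING = {"Anyone", "New and existing guests"}
STALE_AFTER_DAYS = 90

def risk_factors(site: SiteRecord, today: date) -> List[str]:
    """Return the names of the risk factors that apply to a site."""
    factors = []
    if site.has_broad_group:
        factors.append("broad permissions")
    if site.external_sharing in RISKY_SHARING:
        factors.append("external sharing")
    if not site.sensitivity_label:
        factors.append("missing sensitivity label")
    if (today - site.last_activity).days > STALE_AFTER_DAYS:
        factors.append("stale")
    return factors

# Sample data for illustration only.
hr_site = SiteRecord(
    url="https://contoso.sharepoint.com/sites/hr",
    has_broad_group=True,
    external_sharing="Only people in your organization",
    sensitivity_label=None,
    last_activity=date(2024, 1, 10),
)
print(risk_factors(hr_site, today=date(2024, 6, 1)))
# ['broad permissions', 'missing sensitivity label', 'stale']
```

A site that triggers several factors at once is a reasonable candidate to review first.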
Steps to Identify and Remediate High-Risk Sites
Follow these steps to audit your SharePoint environment and reduce risk before enabling Copilot. Perform these steps in the order listed to build a complete picture of your site health.
- Run a site inventory from SharePoint admin center
Go to SharePoint admin center > Active sites. Click Export to download a CSV of all sites. Open the CSV in Excel and review the Permissions column. Filter for sites with “Inherited” set to No. These sites have unique permissions that may be overly broad. Also filter for External sharing set to Anyone or New and existing guests.
- Check site sensitivity labels in Microsoft Purview
In Microsoft Purview > Information Protection > Sensitivity labels, select Label analytics. Look for sites with no label assigned. These sites are unprotected. Create a report by exporting the Sites with no label list. Prioritize sites that contain sensitive data types, such as credit card numbers or health records.
- Scan for sensitive content with Content Explorer
Open Microsoft Purview > Data Classification > Content Explorer. Filter by location type SharePoint. Review the list of sites that contain sensitive info types. Note the site URLs and the number of files with sensitive data. These sites require immediate permission review and labeling.
- Identify stale sites using usage reports
Go to Microsoft 365 admin center > Reports > Usage > SharePoint site usage. Select Last 7 days and look for sites with zero active users. Also check the Last 90 days view. Sites with no activity for 90 days are stale. Review their permissions and consider archiving or deleting them if they are no longer needed.
- Apply sensitivity labels to high-risk sites
For each site identified as high risk, apply a sensitivity label. In SharePoint admin center > Active sites, select a site, then click Sensitivity. Choose a label that matches the data classification, such as Internal or Confidential. This ensures Copilot respects the label when generating responses.
- Restrict external sharing on sensitive sites
In SharePoint admin center > Active sites, select a site, then click Sharing. Change the external sharing setting to a more restrictive level, such as Existing guests only. For sites with sensitive data, set it to Only people in your organization, which turns off external sharing for the site. This limits who outside your organization can reach content that Copilot draws on.
- Review unique permissions and remove broad access
For sites with broken permission inheritance, go to Site permissions > Advanced permissions settings. Remove groups like Everyone except external users or NT AUTHORITY\authenticated users. Replace them with specific security groups or individual users who need access. Check unique permissions on subsites as well.
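The filtering described in the site inventory step can also be done in a script instead of Excel. The sketch below assumes the exported CSV has columns named `URL`, `External sharing`, and `Sensitivity`. Those column names, the `flag_sites` helper, and the sample rows are assumptions for illustration, so check them against the header row of your own export before relying on it.

```python
import csv
import io

# Stand-in for the exported file; the column names are assumptions.
SAMPLE_EXPORT = """Site name,URL,External sharing,Sensitivity
Finance,https://contoso.sharepoint.com/sites/finance,Anyone,
Intranet,https://contoso.sharepoint.com/sites/intranet,Only people in your organization,Internal
Partners,https://contoso.sharepoint.com/sites/partners,New and existing guests,Confidential
"""

RISKY_SHARING = {"Anyone", "New and existing guests"}

def flag_sites(csv_file):
    """Yield (url, reasons) for rows with risky sharing or no label."""
    for row in csv.DictReader(csv_file):
        reasons = []
        if row["External sharing"].strip() in RISKY_SHARING:
            reasons.append("external sharing")
        if not row["Sensitivity"].strip():
            reasons.append("no sensitivity label")
        if reasons:
            yield row["URL"], reasons

flagged = list(flag_sites(io.StringIO(SAMPLE_EXPORT)))
for url, reasons in flagged:
    print(url, "->", ", ".join(reasons))
```

To run this against a real export, replace the `io.StringIO(...)` argument with a file opened via `open(path, newline="", encoding="utf-8-sig")`, where `path` points to the downloaded CSV.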
If Copilot Still Returns Unexpected Content After Remediation
Even after auditing and fixing sites, you may encounter situations where Copilot surfaces content you did not expect. Here are common issues and how to resolve them.
Copilot Returns Generic Output Instead of Tenant-Specific Data
If Copilot gives vague answers that do not reference your SharePoint content, the most likely cause is that the site's content is not in the search index. In the site's Site settings, open Search and offline availability and confirm that the site is allowed to appear in search results. If content is still missing, select Reindex site on the same page so the site is picked up in the next crawl.
Copilot Shows Content from a Site You Thought Was Restricted
This usually happens when a user has direct access through a file link or a shared folder even though the site-level permissions seem restrictive. Check file-level permissions. Use Microsoft Purview > Data Classification > Content Explorer to find files with broad sharing links. Revoke any “Anyone with the link” sharing and replace it with specific user access.
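If you review file permissions programmatically, “Anyone with the link” sharing can be picked out by its link scope. The sketch below assumes permission entries shaped like Microsoft Graph `permission` resources, where a sharing link carries a `link.scope` value of `anonymous`, `organization`, or `users`. The `anonymous_link_ids` helper and the sample entries are hypothetical, and the code makes no API calls, so verify the field names against the Graph documentation before adapting it.

```python
from typing import Dict, List

def anonymous_link_ids(permissions: List[Dict]) -> List[str]:
    """Return ids of permissions that are 'Anyone with the link' sharing links."""
    return [
        p["id"]
        for p in permissions
        # Direct grants have no "link" facet, so fall back to an empty dict.
        if (p.get("link") or {}).get("scope") == "anonymous"
    ]

# Sample entries for illustration only.
sample_permissions = [
    {"id": "1", "roles": ["read"], "link": {"scope": "anonymous", "type": "view"}},
    {"id": "2", "roles": ["write"], "link": {"scope": "organization", "type": "edit"}},
    {"id": "3", "roles": ["read"], "grantedToV2": {"user": {"displayName": "Sample User"}}},
]
print(anonymous_link_ids(sample_permissions))  # ['1']
```

The returned ids identify the links to revoke and replace with access for specific users.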
Copilot Does Not Respect Sensitivity Labels on Files
Sensitivity labels applied at the site level do not automatically apply to all files within that site. Files must have their own labels. Use Microsoft Purview > Information Protection > Labeling to create auto-labeling policies for SharePoint. Set a policy that applies a label to files containing specific sensitive info types, such as “Credit Card Number” or “Social Security Number.”
High-Risk Site Attributes: Before and After Remediation
| Attribute | Before remediation (high risk) | After remediation (ready for Copilot) |
|---|---|---|
| Permission inheritance | Inherited from parent site collection, often including “Everyone” group | Unique permissions with only required users or groups |
| External sharing setting | Set to “Anyone” or “New and existing guests” | Set to “Only people in your organization” or disabled |
| Sensitivity label | No label assigned | Label assigned matching data classification (e.g., Internal, Confidential) |
| Site activity | No activity for 90+ days | Archived or deleted, or permissions reviewed and updated |
| Content sensitivity | Contains sensitive info types without protection | Sensitive files auto-labeled and access restricted |
Identifying high-risk sites before Copilot deployment is a critical governance step. Use the SharePoint admin center, Microsoft Purview, and usage reports to find sites with broad permissions, external sharing, missing labels, and stale content. Apply sensitivity labels, restrict sharing, and remove broad access groups to reduce the risk of accidental data exposure. After deployment, monitor Copilot responses and use Purview auditing to catch any remaining issues. As an advanced tip, create a recurring Power Automate flow that runs weekly to export site permissions and flag any new sites with external sharing enabled, so you can address risks continuously.
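The comparison at the heart of such a weekly check is small, whichever tool runs it. The following Python sketch shows only that comparison logic, not a Power Automate flow: it takes two snapshots that map site URL to external sharing level and reports sites that have become externally shared since the previous run. The `newly_exposed` helper and the sample snapshots are hypothetical.

```python
from typing import Dict, List

RISKY_SHARING = {"Anyone", "New and existing guests"}

def newly_exposed(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return URLs that are externally shared now but were not in the previous snapshot."""
    return sorted(
        url
        for url, sharing in current.items()
        if sharing in RISKY_SHARING and previous.get(url) not in RISKY_SHARING
    )

# Sample snapshots for illustration only: site URL -> external sharing level.
last_week = {
    "https://contoso.sharepoint.com/sites/finance": "Only people in your organization",
    "https://contoso.sharepoint.com/sites/partners": "New and existing guests",
}
this_week = {
    "https://contoso.sharepoint.com/sites/finance": "Anyone",
    "https://contoso.sharepoint.com/sites/partners": "New and existing guests",
    "https://contoso.sharepoint.com/sites/launch": "Anyone",
}
print(newly_exposed(last_week, this_week))
```

Here the finance site is flagged because its setting was loosened, and the launch site because it is new. The partners site is not flagged because it was already externally shared in the earlier snapshot.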