‘Azure Information Protection’ is a cloud-based solution that helps an organization classify, label, and protect its documents and emails. This can be done automatically by administrators who define rules and conditions, manually by users, or through a combination in which users are given recommendations. The classification is always identifiable, regardless of where the data is stored or with whom it’s shared. The labels include visual markings such as a header, footer, or watermark. Metadata is added to files and email headers in clear text. The clear text ensures that other services, such as data loss prevention solutions, can identify the classification and take appropriate action.

To use ‘Azure Information Protection’, specific subscriptions are required within Office 365 or Azure Active Directory. The following are the licenses and what each one supports:

  • Azure Information Protection P2: Support for all classification, labeling, and protection features.
  • Azure Information Protection P1: Support for most classification, labeling, and protection features, but not automatic classification or Hold Your Own Key (HYOK).
  • Office 365 that includes the Azure Rights Management service: Support for protection but not classification and labeling.

In conjunction with ‘Azure Information Protection’, document and email protection is handled by ‘Azure Rights Management’. This protection technology uses encryption, identity, and authorization policies. Similar to the labels that are applied, protection that is applied by using Rights Management stays with the documents and emails, regardless of location: inside or outside your organization, networks, file servers, and applications. This information protection solution keeps you in control of your data, even when it is shared with other people.

Though this technology is great, it is limited to the content that end users create and store themselves. Once the policies are defined and applied, these two technologies clearly help organizations work toward GDPR compliance. However, one of the first tasks in a GDPR exercise is to find the content you have and then classify it as needed. This is often the biggest problem, as most organizations have content stored in collaboration tools such as SharePoint, on file servers, on local machines, or in other applications. Asking business users to go to these locations and classify everything is a task that will never happen.

So how can we automate this process?

Microsoft recently launched the ‘Azure Information Protection scanner’, which can help with this process. This scanner runs as a service on Windows Server and lets you discover, classify, and protect files in the following data stores:

  • Local folders on the Windows Server computer that runs the scanner
  • UNC paths for network shares that use the Common Internet File System (CIFS) protocol
  • Sites and libraries for SharePoint Server 2016 and SharePoint Server 2013
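
To make the three kinds of data store concrete, here is a minimal Python sketch that sorts a repository string into one of them. It is only an illustration and not part of the scanner; the function name `repository_kind` and the sample paths are hypothetical.

```python
import re


def repository_kind(path: str) -> str:
    """Classify a repository string as 'sharepoint', 'unc', 'local', or 'unknown'."""
    # SharePoint sites and libraries are addressed by URL.
    if path.lower().startswith(("http://", "https://")):
        return "sharepoint"
    # UNC paths look like \\server\share (optionally followed by folders).
    if re.match(r"^\\\\[^\\]+\\[^\\]+", path):
        return "unc"
    # Local folders start with a drive letter, for example C:\Data.
    if re.match(r"^[A-Za-z]:\\", path):
        return "local"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical sample paths, one per data store type.
    for sample in (r"C:\Data", r"\\fileserver\finance", "https://sp.contoso.example/sites/hr"):
        print(sample, "->", repository_kind(sample))
```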

How does the scanner work?

First, as an organization, you need to create the ‘Azure Information Protection’ policies for the labels that you want to apply for automatic classification. More details on creating the policies can be found here:

The basic steps, however, are:

  1. Make sure that you are signed in to the Azure portal by using one of these administrative roles: Information Protection Administrator, Security Administrator, or Global Administrator.
  2. If necessary, navigate to the Azure Information Protection blade. For example, on the hub menu, click All services and start typing Information Protection in the Filter box. From the results, select Azure Information Protection.
  3. The Azure Information Protection – Global policy blade automatically opens for you to view and edit the global policy that all users get.
  4. The Azure Information Protection policy contains the following elements that you can configure:
    1. Labels that let you and users classify documents and emails.
    2. Title and tooltip for the Information Protection bar that users see in their Office applications.
    3. The option to enforce classification when users save documents and send emails.
    4. The option to set a default label as a starting point for classifying documents and emails.
    5. The option to prompt users to provide a reason when they select a label that has a lower sensitivity level than the original.
    6. The option to automatically label an email message, based on its attachments.
    7. The option to provide a custom help link for users.
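
As a rough mental model of these policy elements, the following Python sketch represents them as a small data structure and implements the justification rule from item 5. This is an assumption-laden illustration, not how the service stores its policy; the class name, field names, and label names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GlobalPolicy:
    # Labels ordered from least to most sensitive (hypothetical names).
    labels: List[str]
    bar_title: str = "Sensitivity"               # title shown in the Information Protection bar
    bar_tooltip: str = ""                        # tooltip for the bar
    require_label: bool = False                  # enforce classification on save/send
    default_label: Optional[str] = None          # starting label for new content
    require_justification: bool = True           # ask for a reason when lowering a label
    label_email_from_attachments: bool = False   # label a message based on its attachments
    help_url: Optional[str] = None               # custom help link

    def needs_justification(self, current: str, new: str) -> bool:
        """True when the new label is less sensitive than the current one."""
        if not self.require_justification:
            return False
        return self.labels.index(new) < self.labels.index(current)


if __name__ == "__main__":
    policy = GlobalPolicy(labels=["Public", "General", "Confidential"], default_label="General")
    print(policy.needs_justification("Confidential", "General"))
```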

The scanner discovers files and labels them based on the policies you created. Labels apply classification and can optionally apply or remove protection. The scanner inspects any file that Windows can index, so with the appropriate iFilters installed on the computer, additional file types can be inspected. To determine whether a label or protection should be assigned, the scanner uses the Office 365 built-in data loss prevention (DLP) sensitive information types and pattern detection, or Office 365 regex patterns. Because the scanner uses the Azure Information Protection client, it can classify and protect the same file types as the client.
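
To give a feel for what pattern detection means here, the following Python sketch looks for a credit-card-like number and validates it with the Luhn checksum before suggesting a label. It is a simplified stand-in, not the Office 365 implementation (the built-in sensitive information types are considerably more elaborate), and the label name `Confidential` is a hypothetical example.

```python
import re
from typing import Optional

# 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in the candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:      # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def suggest_label(text: str) -> Optional[str]:
    """Suggest a (hypothetical) label when the text contains a valid card number."""
    for match in CARD_PATTERN.finditer(text):
        if luhn_valid(match.group()):
            return "Confidential"
    return None


if __name__ == "__main__":
    print(suggest_label("Card on file: 4111 1111 1111 1111"))
```

The checksum step matters because a bare regex would flag any long run of digits, such as an order number, as a card number.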

The scanner can also run in ‘discover only’ mode, where reports are generated for later review, showing what would happen if the content were classified and labeled. Note that the scanner does not discover and label in real time. It systematically crawls through the files on the data stores that you specify, and you can configure this cycle to run once or repeatedly.
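
The difference between discovering and enforcing can be sketched in a few lines of Python. This is a conceptual model only: the `crawl` function, its parameters, and the report fields are hypothetical and do not mirror the scanner's internals.

```python
import os
from typing import Callable, Dict, List, Optional


def crawl(
    root: str,
    classify: Callable[[str], Optional[str]],
    enforce: bool = False,
    apply_label: Optional[Callable[[str, str], None]] = None,
) -> List[Dict[str, str]]:
    """Walk every file under root once and report which label each file would get.

    In discover mode (enforce=False) nothing is changed; files are only reported.
    In enforce mode the supplied apply_label callback is invoked for each match.
    """
    report: List[Dict[str, str]] = []
    for folder, _subfolders, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            label = classify(path)
            if label is None:
                continue  # no condition matched, nothing to report
            applied = "no"
            if enforce and apply_label is not None:
                apply_label(path, label)
                applied = "yes"
            report.append({"path": path, "label": label, "applied": applied})
    return report
```

Running the same function with `enforce=False` first and reviewing the report before switching to `enforce=True` mirrors the discover-then-enforce workflow described above.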

How can I install the ‘Azure Information Protection’ scanner?

Right now, the scanner is a separate download using the following URL, but in the future, it will be bundled as part of the ‘Azure Information Protection’ client.

From the download select the ‘AzInfoProtectionScanner’.

Once downloaded, click to install. If needed, you can install a ‘demo policy’ as part of the setup, which can be used for testing until the application is connected to your ‘Azure Information Protection’ service instance.

Once it is installed, you need to launch a PowerShell window so that the scanner commands can be run. The first task is to set up the SQL database that will be used by the scanning tool. This is done by typing one of the following:

For a default instance: Install-AIPScanner -SqlServerInstance SQLSERVER

For a named instance: Install-AIPScanner -SqlServerInstance SQLSERVER\AIPSCANNER

For SQL Server Express: Install-AIPScanner -SqlServerInstance SQLSERVER\SQLEXPRESS

For this example, we will use SQL Server Express, but you could use a full SQL Server implementation if one is installed on your network. First, however, you may need to import the ‘AIP’ DLL for the commands to work.

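Next, we can run the installation command, which for the SQL Server Express instance used in this example is: Install-AIPScanner -SqlServerInstance SQLSERVER\SQLEXPRESS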

When prompted, type the service account you wish to use for the scanner.

Once it has completed successfully, you can check the ‘Services’ console, where the new service will be listed.

Next, within the PowerShell console, we need to run the command that checks the status and mode of the scanner: Get-AIPScannerConfiguration

By default, the scanner is set to ‘Discover’ mode and outputs informational reports only. These settings can be changed easily by using the ‘Set-AIPScannerConfiguration’ command.

Next, you need to set the repository of data that you wish to scan. You can check what is being used by typing ‘Get-AIPScannerRepository’. The first time you run this, it will be blank, as nothing has been set. For this example, I am going to use a network share on a server. To set this we can use the following command:

As part of setting the repository, you can also set some options that determine the configuration of the default label to apply.

Next, you need to get an ‘Azure Information Protection’ authentication token for the scanner. This is done using the ‘Set-AIPAuthentication’ command. When prompted, log in with your Office 365/Azure AD account that has access to your ‘Azure Information Protection’ service. Once completed, this returns an access token that is then used by the scanner.

Now that the scanner is configured, with the mode and a repository set, you are ready to run a scan. This is done by using the following command:

Once done, open the ‘Services’ console and start the ‘Azure Information Protection Scanner’ service. When the scan completes, the scanner service stops automatically. Any reports that were generated are stored within ‘%localappdata%\Microsoft\MSIP\Scanner\Reports’ and have a ‘CSV’ file extension.

Open the ‘CSV’ to review the files found, the label applied, and the condition that matched the content. Of course, for this example, the scanner was only set to ‘Discover’, not ‘Enforce’, which would have applied the labels and protections.
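
If the reports grow large, reviewing them by hand gets tedious. The following Python sketch picks the most recent CSV in a reports folder and counts how many files received each label. The column name `Label` is an assumption for illustration; check the header row of your own report and pass the real column name.

```python
import csv
from collections import Counter
from pathlib import Path
from typing import Optional


def latest_report(reports_dir: str) -> Optional[Path]:
    """Return the most recently modified CSV in the folder, or None if there is none."""
    reports = sorted(Path(reports_dir).glob("*.csv"), key=lambda p: p.stat().st_mtime)
    return reports[-1] if reports else None


def count_labels(report_path: Path, label_column: str = "Label") -> Counter:
    """Count rows per label; the column name is an assumption, not a documented schema."""
    with open(report_path, newline="", encoding="utf-8-sig") as handle:
        return Counter(row.get(label_column) or "(none)" for row in csv.DictReader(handle))
```

On the scanner server, the folder to pass in would be the reports location mentioned above, which can be resolved with `os.path.expandvars` on Windows.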

What file types are supported by the ‘Azure Information Protection’ scanner?

The scanner uses Windows iFilters to scan the core set of Office file types.

Application type   File types
Word               .docx; .docm; .dotm; .dotx
Excel              .xls; .xlt; .xlsx; .xltx; .xltm; .xlsm; .xlsb
PowerPoint         .ppt; .pps; .pot; .pptx; .ppsx; .pptm; .ppsm; .potx; .potm
Project            .mpp; .mpt
PDF                .pdf
Text               .txt; .xml; .csv

For any other file types, due to limitations on opening the files, ‘Azure Information Protection’ will apply the default label.

Application type   File types
Project            .mpp; .mpt
Publisher          .pub
Visio              .vsd; .vdw; .vst; .vss; .vsdx; .vsdm; .vssx; .vssm; .vstx; .vstm
XPS                .xps; .oxps; .dwfx
Solidworks         .sldprt; .slddrw; .sldasm
Jpeg               .jpg; .jpeg; .jpe; .jif; .jfif; .jfi
Png                .png
Gif                .gif
Bitmap             .bmp; .giff
Tiff               .tif; .tiff
Photoshop          .psdv
DigitalNegative    .dng
Pfile              .pfile
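
To check quickly how a given file would be handled, the two lists can be turned into a lookup. The Python sketch below uses abbreviated copies of the lists above (not the full sets), and the function name and return values are hypothetical.

```python
import os

# Abbreviated copies of the two lists above; extend them as needed.
INSPECTED = {".docx", ".docm", ".xls", ".xlsx", ".ppt", ".pptx", ".pdf", ".txt", ".xml", ".csv"}
DEFAULT_LABEL_ONLY = {".pub", ".vsd", ".vsdx", ".xps", ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff"}


def handling(filename: str) -> str:
    """Return 'inspect', 'default-label', or 'unlisted' based on the file extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in INSPECTED:
        return "inspect"          # content is read and matched against label conditions
    if extension in DEFAULT_LABEL_ONLY:
        return "default-label"    # content cannot be opened, so only the default label applies
    return "unlisted"             # not covered by the abbreviated lists in this sketch


if __name__ == "__main__":
    for sample in ("Budget.XLSX", "logo.png", "archive.zip"):
        print(sample, "->", handling(sample))
```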

More details can be found here:

So how does this help with the General Data Protection Regulation (GDPR)?

As you can see, using this scanner, combined with the policies created within the cloud, ALL content can have policies applied which can then be used to control the flow of data, especially when it comes to Personally Identifiable Information (PII).

In the next post, we will look at scanning content within SharePoint, as well as file shares, and then applying live policies instead of just discovery.