February 28, 2022
Topics: Cloud Data Sense, Data Governance | Advanced | 7 minute read
The more tools at your disposal to execute a data governance strategy, the better. With Cloud Data Sense’s new tagging and Microsoft Azure Information Protection (AIP) labeling capabilities, users are getting key tools that can help ensure data quality, completeness, and ownership.
This article will show how the NetApp Cloud Data Sense file tagging feature and integration with AIP can help you classify your data from a single pane of glass and make it easier to govern.
Read on below as we cover:
- Data Sense File Tagging
- Azure Information Protection Labels and Cloud Data Sense
- AIP Integration with Data Sense
- Tagging Files in Cloud Data Sense
- Example Data Governance Workflow: Migrating Non-Sensitive Data to Cloud-Based Storage
- Conclusion
Data Sense File Tagging
Tracking data can be difficult using only the name of the file. Labels and file tagging offer an easy way to sort, organize, and report on data based on unique characteristics of the data.
Data Sense now has the ability to tag files and add labels. This is a useful capability that can make managing data much easier. First, users can run a scan with Data Sense to sort and filter data in any repository based on a wide variety of parameters. When the results of the search are presented, users can then classify the results in two ways:
- Add Data Sense tags: These tags are unique to Data Sense and are fully customizable by the user. These tags are only for internal use within Data Sense.
- Add AIP labels: Labels can be added using Azure Information Protection (AIP), which is now integrated with Data Sense. Labeling files this way writes the label to the file’s metadata, so the label applies to the file itself and can be used outside of Data Sense.
Both options give users an easy way to sort and classify data, which makes the data estate easier to govern.
Azure Information Protection Labels and Cloud Data Sense
AIP is an application hosted in the Azure cloud that helps organizations organize, track, and secure their data through the use of labels. AIP builds on Azure Rights Management Services (RMS) technology, managing file security across Azure and Microsoft 365 clouds.
AIP helps you manage access to information both within and outside of your organization, such as keeping HR documentation only accessible by HR staff. Another example of its use would be in sharing a document with a specific third party for a limited time and restricting access to just that person. Using AIP, the file cannot be opened by anyone else, even if they have the file.
Now that Data Sense integrates with AIP, users can filter information through Data Sense and then add AIP labels to the results directly through the Data Sense interface.
AIP Integration with Data Sense
With NetApp Data Sense's deep integration with AIP, AIP users can use Data Sense to create custom filtered searches and then add AIP labels directly to the results. Data Sense policies can automatically apply AIP labels to groups of files, or labels can be applied manually. All native AIP labels are also fully visible in Data Sense.
The Data Investigation tab.
For example, the screenshot above shows the Data Investigation tab in Data Sense. The AIP label “New Personal” has been used as a filter, returning 19.4k files. Using the Label dropdown on the first file, you are presented with a menu listing all of the existing labels and the option to create a new one. Depending on what you choose, Data Sense will update the label information in the file's metadata. This label action can also be applied to multiple search results at once, with Data Sense updating the metadata for each of the selected files.
Tagging Files in Cloud Data Sense
Whereas AIP labels are structured, Data Sense tags are free-form and can be more meaningful or descriptive, such as "Change Permissions," "Delete Monday," "Possible Duplicates," or any descriptor you can think of. This gives users a lot of flexibility in how data can be understood and governed.
Classification tags, for example, are often used as part of a workflow. One user adds a meaningful tag to a group of files and assigns those files to another Data Sense user for further work. That second user then completes the required actions and either updates the tag or adds a new one.
Unlike AIP labels, Data Sense tags are only visible within Data Sense. Adding, updating, or removing a tag from a file does not change the file on the file system. Since the file is left unchanged, this feature is very useful for read-only files or file systems where changing several thousand files may automatically trigger security measures, such as an antivirus scan.
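To illustrate why tagging leaves files untouched, here is a minimal Python sketch. It assumes tags live in a separate index keyed by file path rather than in the file itself. The `TagIndex` class and the `fingerprint` and `demo` helpers are hypothetical names made up for this example, not part of the Data Sense product or any NetApp API.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


class TagIndex:
    """Hypothetical tag store kept outside the files (not the Data Sense API)."""

    def __init__(self):
        self._tags = defaultdict(set)

    def add(self, path, tag):
        self._tags[path].add(tag)

    def remove(self, path, tag):
        self._tags[path].discard(tag)

    def tags_for(self, path):
        return sorted(self._tags[path])


def fingerprint(path):
    """Return (SHA-256 of the contents, modification time in nanoseconds)."""
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    return digest, os.stat(path).st_mtime_ns


def demo():
    """Tag a temporary file and report whether the file itself changed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.txt")
        with open(path, "w") as fh:
            fh.write("quarterly numbers")

        before = fingerprint(path)
        index = TagIndex()
        index.add(path, "Possible Duplicates")
        index.add(path, "Delete Monday")
        after = fingerprint(path)

        # The tags exist only in the index; content and mtime are unchanged.
        return index.tags_for(path), before == after
```

Calling `demo()` returns the file's tags along with a flag confirming that its content hash and modification time are the same before and after tagging. That is the property that makes this style of tagging safe for read-only file systems.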
Usually, files are tagged in the Data Investigation view of Data Sense. Here users can apply any number of filters to find the relevant files, select one, many, or all of the results, and give them a meaningful tag. Users can also perform other actions in this view, such as assigning items to another user or moving them to another storage area.
If you need to use the same configuration of filters again, you can create a custom policy containing those filters, which then appears alongside the Data Sense predefined policies. You can also have the policy run regularly and perform actions on matching files, from simply tagging all matching files to sending email alerts when new files match the policy. The email does not contain a list of the files; instead, it includes a link to the policy, which opens with the filtered results.
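As a rough illustration of the idea, a policy can be thought of as a saved set of filters plus an action to run on the matches. The Python sketch below is only a conceptual model. `FileRecord`, `Policy`, and the placeholder link text are all hypothetical and do not reflect how Data Sense implements or exposes policies.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical scan result for one file."""
    path: str
    label: str = ""
    tags: set = field(default_factory=set)


@dataclass
class Policy:
    """Hypothetical saved search: filters plus a tag to apply on each run."""
    name: str
    filters: list          # predicates taking a FileRecord, returning bool
    tag: str
    results_link: str = "<link to policy results>"  # placeholder, not a real URL
    seen: set = field(default_factory=set)

    def run(self, files):
        """Tag every match; return alert text only if new files matched."""
        matches = [f for f in files if all(check(f) for check in self.filters)]
        for f in matches:
            f.tags.add(self.tag)

        new_paths = [f.path for f in matches if f.path not in self.seen]
        self.seen.update(new_paths)
        if not new_paths:
            return None
        # The alert carries a count and a link, never the file list itself.
        return (f"{len(new_paths)} new file(s) matched policy "
                f"'{self.name}'. View results: {self.results_link}")
```

For example, a policy built with `filters=[lambda f: f.label == "New Personal"]` and `tag="Change Permissions"` would tag every matching file on each run and return alert text only on runs where previously unseen files match.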
Using the Data Investigation tab to add Data Sense tags.
The screenshot above shows the same Data Investigation tab in Data Sense we looked at previously. The AIP label “New Personal” has again been used as a filter, and again we have picked the first file, but this time we chose to add a Data Sense tag. You can see the file already has two existing tags, so we would click on “New Tag” and type in something descriptive. If we had selected multiple files, they would all get the new tag.
Example Data Governance Workflow: Migrating Non-Sensitive Data to Cloud-Based Storage
Imagine tasking a user with planning the migration of their organization's data to cloud-based storage while keeping any sensitive data in-house. To be cost-effective, you would also like to move any stale data to an archive tier in the cloud. The workflow for such a project using Data Sense file tagging would look like this:
- First, the user would connect Data Sense to the company’s storage systems and then scan the data using the Data Investigation view of Data Sense.
- Data Sense can discover the sensitive data in the scanned files, which avoids the painstaking task of identifying such data manually. Once the results appear, the user would add the AIP label "Sensitive" to them.
- The user would then assign the “Sensitive” labeled files to another user for action.
- Next, the user would filter for stale files, meaning data that hasn’t been accessed within a certain amount of time, and add a “Stale Data” tag to all the results.
- Finally, the user would assign the “Stale Data” files to a different user for action.
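The steps above can be sketched as a small Python model. This is a sketch under stated assumptions, not Data Sense code: `ScannedFile`, `plan_migration`, and the destination names are hypothetical, the sensitivity flag stands in for whatever the scan detects, and the example assumes a sensitive file stays in-house even if it is also stale.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class ScannedFile:
    """Hypothetical scan result; field names are illustrative only."""
    path: str
    contains_sensitive_data: bool
    last_accessed: date
    aip_label: str = ""
    tags: set = field(default_factory=set)
    assignee: str = ""


def plan_migration(files, today, stale_after_days, sensitive_owner, stale_owner):
    """Label, tag, and assign files, then group them by destination."""
    cutoff = today - timedelta(days=stale_after_days)
    plan = {"keep in-house": [], "cloud archive tier": [], "cloud storage": []}

    for f in files:
        if f.contains_sensitive_data:
            # Assumption: sensitivity takes precedence over staleness.
            f.aip_label = "Sensitive"
            f.assignee = sensitive_owner
            plan["keep in-house"].append(f.path)
        elif f.last_accessed < cutoff:
            f.tags.add("Stale Data")
            f.assignee = stale_owner
            plan["cloud archive tier"].append(f.path)
        else:
            plan["cloud storage"].append(f.path)
    return plan
```

Passing `today` in explicitly, rather than reading the system clock inside the function, keeps the staleness cutoff reproducible when the plan is reviewed or re-run later.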
Workflows can be created for many different uses, from complex compliance workflows to simply notifying users waiting for file uploads. The workflows can also leverage Data Sense’s adjustable RBAC policies.
Conclusion
NetApp Data Sense can enable agile data governance across many different storage systems and databases, all viewed and managed from a single pane of glass. Files can be classified, tagged, labeled, assigned, copied, moved, or deleted within one view, reducing the complexity of managing data in different locations and optimizing data governance time and storage budget.
Building workflows around Data Sense classification can streamline processes and reduce complexity across any type of storage system, whether in the cloud or on-prem, as well as a wide range of databases. Read how Data Sense helped this company pinpoint sensitive data on their systems that they didn’t even know was there.