October 24, 2021
Topics: Cloud Data Sense, Data Protection, Database, Advanced | 7 minute read
Databases have played a major role in business systems for decades. In early use, a database was a monolithic repository: software focused on a narrow range of tasks was built around it, with the database acting somewhat like the nucleus of a single-cell organism. Fast forward a few decades, and databases have become real-time, infinite, ever-expanding sets of data interacting with other systems and subsystems, like the bloodstream in a sophisticated organism, where data branches off in workloads to nurture the various organs and subsystems that make the business’ body run.
The challenge now is finding a way to govern this huge volume of data, while protecting confidential and private information. This blog post is part of a series that will cover the data security features that help protect information in MongoDB, PostgreSQL, and other database engines now supported by NetApp Cloud Data Sense to show you how to solve data security problems using their feature sets.
- Conceptual Overview
- MySQL Data Encryption
- MySQL Data Masking
- MySQL Data De-Identification
- MySQL with Cloud Data Sense
Oracle acquired the MySQL relational database engine in 2010. It pledged to keep maintaining the free (Community) edition of MySQL. It then added more robust tools for the paid (Enterprise) version. What you’ll find as we walk through the topics of data masking, data de-identification and data encryption is that Community MySQL offers some support for data privacy, but (unsurprisingly) Enterprise MySQL offers more.
Customers require database vendors to house ever-increasing volumes of data from various sources, such as transactional systems, analytics, third-party vendors, and even IT infrastructure log files. Much of the data collected is confidential and/or personal information. Because of global privacy regulations, most businesses are obligated to protect this type of data from misuse.
Toward that goal, we’ll look at three database platform security features that assist businesses with implementing data protection and adhering to global privacy regulations:
- Data Encryption: Simply put, encryption scrambles data, whether at rest or in transit, so that only an actor with the proper credential and key can access it.
- Data Masking: Data masking is a process that creates a version of the data that is structurally similar to the actual information but hides or “masks” the real data.
- Data De-Identification: Data de-identification is any process that prevents a person’s private information from being traced back to them.
Let’s see what MySQL has to offer for these important data security features.
MySQL Data Encryption
Protecting data stored in a database has been on the wish list for system builders for many years. To begin with, this had to be done outside of the database engine. The options included:
- Encrypted file system: The database engine ultimately has to store its data in files on the operating system. Operating system vendors have provided encryption for decades, and third-party tools provide performance enhancements and other features.
- Encrypting data prior to storage: At the application level, data could be encrypted before it is inserted into the database. This is a tedious process for application developers, though one advantage is that they can select which parts of the data actually need encryption.
Database vendors began offering at-rest database-level encryption around 2015, and MySQL was among the leaders. It provides database encryption both at rest and in transit, which helps businesses implement encryption security in order to meet privacy obligations under various national and international privacy rules and regulations.
MySQL has had in-flight encryption options using SSL (Secure Sockets Layer) for much longer. There are also third-party tools that can be used in place of MySQL SSL and that offer performance enhancements and additional features.
Oracle has added to the at-rest MySQL encryption options since MySQL 5.7. Here’s how at-rest support breaks down between the two editions.
Community Edition Data Encryption
Community Edition provides you with the following set of encryption features:
- File data: Encryption can be applied per tablespace and per table, which gives you flexibility over what gets encrypted.
- Redo logs: MySQL also keeps files other than table data to support various operations, such as redo logs. Encryption can be applied to the files used by the redo log mechanism.
- Undo logs: Similar file-level encryption support is available for undo logs.
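As a rough illustration of the three options above, the following Python sketch only builds the SQL statements an administrator might run; it does not connect to a server. It assumes MySQL 8.0 with a keyring already configured, and the helper name `encryption_statements` and the `shop`.`customers` table are hypothetical. Check the statements against the documentation for your MySQL version before using them.

```python
from typing import List


def encryption_statements(schema: str, table: str) -> List[str]:
    """Build SQL that enables InnoDB at-rest encryption.

    Assumption: MySQL 8.0 with a keyring component or plugin already set up.
    """

    def quote(identifier: str) -> str:
        # Backtick-quote an identifier, doubling any embedded backticks.
        return "`" + identifier.replace("`", "``") + "`"

    return [
        # File data: encrypt one table (its file-per-table tablespace).
        f"ALTER TABLE {quote(schema)}.{quote(table)} ENCRYPTION='Y'",
        # Redo and undo logs: encryption is switched on through server variables.
        "SET GLOBAL innodb_redo_log_encrypt = ON",
        "SET GLOBAL innodb_undo_log_encrypt = ON",
    ]


# Hypothetical schema and table names, for illustration only.
for statement in encryption_statements("shop", "customers"):
    print(statement + ";")
```

Keeping the statements in one helper makes it easy to review exactly what would be applied before anything touches a live server.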
Enterprise Edition Data Encryption
MySQL Enterprise edition support for encryption adds the following capabilities:
- The Enterprise Transparent Data Encryption (TDE) product provides additional protection by storing keys separately from the data, using a two-tier encryption architecture, and supporting zero-downtime implementation.
- More in-flight encryption options, such as asymmetric cryptography, the ability to create public and private keys, and support for digital signatures and cryptographic hashing.
MySQL’s encryption offering allows application developers to encrypt and decrypt data using functions at the database level.
MySQL Data Masking
Oracle recently introduced data-masking functionality for MySQL exclusively for Enterprise edition.
Community Edition Data Masking
You can still get data masking in the Community edition, but you’ll have to look to the diverse third-party tool market to find a suitable implementation. A number of plugins are available from vendors such as Percona and DataSunrise.
Enterprise Edition Data Masking
The MySQL data-masking feature installs as a plugin. Once installed, you apply the masking functionality as part of SELECT statements using the database functions supplied by the plugin. By extension, you can also create views that implement your business rules for masking.
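The pattern described here, a masking function called from SELECT and then wrapped in a view, can be sketched without a MySQL server. The example below uses Python's built-in `sqlite3` module purely as a stand-in: SQLite is not MySQL, and `mask_all_but_last4`, the `customers` table, and the `customers_masked` view are hypothetical names chosen for illustration, not functions supplied by the MySQL plugin.

```python
import sqlite3


def mask_all_but_last4(value: str) -> str:
    """Hypothetical masking rule: hide everything except the last four characters."""
    return "X" * max(len(value) - 4, 0) + value[-4:]


conn = sqlite3.connect(":memory:")

# Register the Python function so SQL can call it. This stands in for a
# function that a database plugin would supply.
conn.create_function("mask_all_but_last4", 1, mask_all_but_last4)

conn.execute("CREATE TABLE customers (name TEXT, card_number TEXT)")
conn.execute("INSERT INTO customers VALUES ('Mary', '4111111111111111')")

# The view encodes the masking rule once; callers query the view, not the table.
conn.execute(
    "CREATE VIEW customers_masked AS "
    "SELECT name, mask_all_but_last4(card_number) AS card_number FROM customers"
)

rows = conn.execute("SELECT name, card_number FROM customers_masked").fetchall()
print(rows)
```

Granting users access to the view rather than the underlying table is one way to make sure the masking rule cannot be skipped by accident.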
Examples for masking include:
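- Inner: Masks the interior of a string and leaves the ends visible, so “Mary” becomes “MXXy” and “Barbara” becomes “BXXXXXa”
- Outer: Masks the ends of a string and leaves the interior visible, so “Mary” becomes “Xary” and “Patricia” becomes “##tric##”
- Pan: Useful for things like credit card numbers, where only the “Last N” digits should remain visible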
As with encryption, developers control how masking is applied at the SQL statement level, which also makes it possible to expose masked data through views as needed.
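To make the three masking styles concrete, here is a minimal Python sketch. The functions are hypothetical stand-ins written for this post, not the plugin's own implementation: inner masking hides the middle of a string and keeps the given margins visible, outer masking hides the margins and keeps the middle, and PAN masking keeps only the trailing digits of a card number.

```python
def mask_inner(text: str, left: int, right: int, mask_char: str = "X") -> str:
    """Mask the interior, leaving `left` and `right` characters visible at the ends."""
    if left + right >= len(text):
        return text  # the margins cover the whole string, so nothing is masked
    hidden = len(text) - left - right
    return text[:left] + mask_char * hidden + text[len(text) - right:]


def mask_outer(text: str, left: int, right: int, mask_char: str = "X") -> str:
    """Mask `left` and `right` characters at the ends, leaving the interior visible."""
    if left + right >= len(text):
        return mask_char * len(text)  # the margins cover the whole string
    return mask_char * left + text[left:len(text) - right] + mask_char * right


def mask_pan(number: str, visible: int = 4, mask_char: str = "X") -> str:
    """Mask all but the last `visible` characters of a payment card number."""
    hidden = max(len(number) - visible, 0)
    return mask_char * hidden + number[hidden:]


print(mask_inner("Mary", 1, 1))           # MXXy
print(mask_outer("Patricia", 2, 2, "#"))  # ##tric##
print(mask_pan("4111111111111111"))       # XXXXXXXXXXXX1111
```

The margins are parameters rather than fixed values because how much of a value may stay visible is a business decision, not a technical one.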
MySQL Data De-Identification
MySQL data de-identification comes as part of the same package as data masking, and is therefore only available in the Enterprise edition.
The approach MySQL takes on de-identification is to leverage the same tools as data masking while relying on the internal business processes to dictate the appropriate form of de-identification, whether that is to pseudonymize or anonymize data. MySQL does not offer a feature to bifurcate data to de-identify it but rather provides the developer with options that should be used in line with existing privacy regulations and corporate policy and procedure.
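The difference between pseudonymizing and anonymizing can be sketched in a few lines of Python. This is an illustration of the two ideas only, under the assumption that a keyed hash is an acceptable pseudonym and that bucketing is an acceptable generalization; the function names and the key are hypothetical, and whether either technique satisfies a given regulation is a question for your privacy team.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a value with a keyed token.

    The same input and key always give the same token, so records stay
    joinable, but the token cannot be recomputed without the key.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability in this sketch


def anonymize_age(age: int, bucket: int = 10) -> str:
    """Generalize an exact age into a range, discarding the precise value."""
    low = (age // bucket) * bucket
    return f"{low}-{low + bucket - 1}"


key = b"example-key-not-for-production"  # hypothetical key for illustration
print(pseudonymize("mary@example.com", key))
print(anonymize_age(37))  # 30-39
```

Pseudonymized data can still be linked back to a person by whoever holds the key, while the generalized age has lost the original value for good. That trade-off is why the choice is left to business process rather than to the database.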
Community Edition De-Identification
You can look at the same third-party plugins that you find for data masking to see what they offer in terms of data de-identification at the database level.
Enterprise Edition De-Identification
Additional SQL functions that you can apply for the purpose of data de-identification are as follows:
- Random data substitution: This function replaces the actual values of the data with randomized values that continue to maintain the same format.
- Blurring: Varies the actual data by introducing random variance to the data. An example of this would be randomizing the numeric range for something like employee benefits.
- Dictionary substitution: Replaces values at random from a predefined task-related word bank.
- Blocklisting and substitution: Any blocklisted data is replaced, while non-blocklisted data is left as is.
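The four techniques in this list can be sketched in plain Python as follows. These are hypothetical helpers written to show the ideas, not the Enterprise functions themselves; the word bank, the sample values, and the choice of uniform noise for blurring are all assumptions made for the example.

```python
import random
import string
from typing import Iterable, Sequence


def random_substitute(value: str, rng: random.Random) -> str:
    """Swap each digit or letter for a random one of the same kind, keeping the format."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # separators such as "-" are kept as is
    return "".join(out)


def blur(amount: float, spread: float, rng: random.Random) -> float:
    """Vary a number by a random fraction of up to +/- `spread` (0.1 means 10%)."""
    return amount * (1 + rng.uniform(-spread, spread))


def dictionary_substitute(word_bank: Sequence[str], rng: random.Random) -> str:
    """Pick a replacement at random from a predefined word bank."""
    return rng.choice(word_bank)


def blocklist_substitute(
    value: str, blocklist: Iterable[str], word_bank: Sequence[str], rng: random.Random
) -> str:
    """Replace the value only if it is blocklisted; otherwise return it unchanged."""
    return rng.choice(word_bank) if value in set(blocklist) else value


rng = random.Random(42)  # seeded so the sketch is repeatable
cities = ["Springfield", "Riverton", "Lakeside"]  # hypothetical word bank
print(random_substitute("555-12-3456", rng))
print(round(blur(1000.0, 0.1, rng), 2))
print(dictionary_substitute(cities, rng))
print(blocklist_substitute("Paris", ["Paris"], cities, rng))
```

Passing the random generator in explicitly keeps the helpers testable; in real use you would not seed it with a fixed value.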
MySQL with Cloud Data Sense
Database administrators and system builders have a lot to think about when it comes to securing data. MySQL provides tools for at-rest and in-flight data. The rich MySQL third-party vendor community offers many plugins to help in your efforts to secure data as well. Performance depends on the features used and how they are configured.
NetApp Cloud Data Sense supports MySQL, as well as a number of other popular databases, including Postgres, MSSQL, Oracle, SAP HANA, and MongoDB. Cloud Data Sense gives database deployments an additional utility for data governance and privacy: AI-driven data mapping that identifies the data stored in your database, so you can pinpoint and report on the data that needs the highest level of care, where it’s stored, and how it’s used.