In this post we are going to perform a case study using the Oracle Data Masking and Subsetting Pack.
Data masking is the process of permanently replacing sensitive data with dummy data to remove sensitive data from test, development, analytics and other non-production environments.
Before using it, keep in mind that unless you are on an architecture where the licenses are included (e.g. Oracle ExaCC), this pack requires a separate license.
Oracle Data Masking and Subsetting features:
Oracle Data Masking and Subsetting is pre-installed with Oracle Enterprise Manager.
To use Oracle Data Masking and Subsetting, you must purchase the full license for the "Oracle Data Masking and Subsetting Pack".
The version of Oracle Data Masking and Subsetting is based on the version of the Plug-in installed in Oracle Enterprise Manager.
Practice Case
For this case study we will use the sample schemas included with Oracle (the HR schema).
If you did not include it during installation, or need to recreate it, you can follow this link:
https://docs.oracle.com/en/database/oracle/oracle-database/19/comsc/installing-sample-schemas.html#GUID-1E645D09-F91F-4BA6-A286-57C5EC66321D
Data model creation
First of all, we need to log in to the OEM 13c console and create an Application Data Model to discover the data.
Enterprise -> Quality Management -> Application Data Modeling
We start filling in the data:
Screen 1
We choose the schema where we are going to perform our data masking.
At this point we can launch the job that will discover the data model and the relationships between tables, as indicated in the first step (Screen 1).
We launch the process:
If everything is correct, we will find ourselves in this situation:
If we click on the "Open" button, we can see the result of the data discovery performed by the job.
At this point, we can proceed to the discovery of Sensitive data.
Sensitive Data Discovery
Oracle ships with a set of standard sensitive column types by default, and you can also define new, custom sensitive column types of your own.
Execution of the sensitive data discovery job in the data model:
We can see the columns that the job has flagged as sensitive.
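To give an idea of what sensitive data discovery does, here is a minimal Python sketch of name-based matching. The patterns below are hypothetical, loosely modeled on Oracle's sensitive column types; the real tool also inspects column comments and sample data, which this sketch does not attempt.

```python
import re

# Hypothetical patterns (not Oracle's actual definitions) for three
# sensitive column types found in the HR schema.
SENSITIVE_PATTERNS = {
    "PHONE": re.compile(r"PHONE|MOBILE|TELEPHONE", re.IGNORECASE),
    "EMAIL": re.compile(r"E?MAIL", re.IGNORECASE),
    "SALARY": re.compile(r"SALARY|WAGE|COMPENSATION", re.IGNORECASE),
}

def discover_sensitive_columns(columns):
    """Return (column, sensitive_type) pairs whose names match a pattern."""
    hits = []
    for col in columns:
        for sens_type, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(col):
                hits.append((col, sens_type))
                break  # first matching type wins
    return hits

# Some columns from the HR.EMPLOYEES table
hr_columns = ["EMPLOYEE_ID", "FIRST_NAME", "EMAIL", "PHONE_NUMBER", "SALARY"]
print(discover_sensitive_columns(hr_columns))
# → [('EMAIL', 'EMAIL'), ('PHONE_NUMBER', 'PHONE'), ('SALARY', 'SALARY')]
```

In the real product, a column flagged this way can then be accepted or rejected by the reviewer before masking criteria are assigned to it.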
Data Subsetting
We can choose which data set to mask: either all of the data, or only a subset of it.
Example Jobs table at 50%:
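The idea behind a percentage-based subsetting rule can be sketched as follows. This is an illustration only, not what the pack executes internally; the function name and toy rows are made up for the example.

```python
import random

def subset_rows(rows, percent, seed=42):
    """Keep roughly `percent`% of the rows, like a percentage-based
    subsetting rule applied to a table."""
    random.seed(seed)  # fixed seed so the subset is reproducible
    k = round(len(rows) * percent / 100)
    return random.sample(rows, k)

# Toy stand-in for the HR.JOBS table
jobs = [{"job_id": f"JOB_{i}"} for i in range(10)]
half = subset_rows(jobs, 50)
print(len(half))  # → 5
```

Real subsetting rules can also follow referential integrity, so that child rows are kept or dropped consistently with their parents, which a simple row sample like this does not do.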
Define subsetting criteria
Once the set of data to be masked has been chosen, we proceed to define the anonymization criteria to be applied to the sensitive data.
We choose anonymization criteria.
For example, for the telephone field we will apply a "Random Numbers" format of 7 characters, and for the e-mail field we will choose "Shuffle".
There are many criteria, and they can be customized, so it is highly recommended to read the documentation and run different tests to obtain the desired result.
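To make the two chosen formats concrete, here is a minimal Python sketch of what "Random Numbers" and "Shuffle" do to a column. The function names are invented for illustration; only the behavior mirrors the masking formats described above.

```python
import random

def random_numbers(length):
    """Replace a value with a random number of the given length,
    like the 'Random Numbers' masking format (7 characters for the
    telephone field in our example)."""
    return "".join(random.choice("0123456789") for _ in range(length))

def shuffle_column(values):
    """Randomly reassign the existing values among the rows, like the
    'Shuffle' masking format: the set of values is preserved, but the
    link between a row and its original value is broken."""
    shuffled = values[:]
    random.shuffle(shuffled)
    return shuffled

# Toy values resembling HR.EMPLOYEES data
phones = ["515.123.4567", "515.123.4568"]
emails = ["SKING", "NKOCHHAR", "LDEHAAN"]

masked_phones = [random_numbers(7) for _ in phones]
masked_emails = shuffle_column(emails)
```

Note the trade-off: "Random Numbers" destroys the original values outright, while "Shuffle" keeps realistic-looking values in the column but detaches them from the rows they belonged to.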
Options - script export
Once the criteria have been defined, there are several options for generating the script. One is to export the data with the masking applied and import it into a different database. Another option, the one we will choose here, is to generate a PL/SQL script that applies the masking directly on the database where it is executed; as a general rule, that database is a copy of the production one.
To do so, click on the "Generate Script" button.
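Conceptually, the generated PL/SQL script updates each sensitive column in place on the target copy of the database. The following Python sketch illustrates that idea with toy data; it is not the code the tool generates.

```python
# Toy rows standing in for a copy of HR.EMPLOYEES; the real generated
# script issues UPDATE statements against the actual table.
employees = [
    {"employee_id": 100, "phone_number": "515.123.4567", "email": "SKING"},
    {"employee_id": 101, "phone_number": "515.123.4568", "email": "NKOCHHAR"},
]

def mask_in_place(rows, column, mask_fn):
    """Overwrite one sensitive column in every row: the conceptual
    equivalent of the script's per-column UPDATE."""
    for row in rows:
        row[column] = mask_fn(row[column])

# Hypothetical mask: replace every phone number with a fixed dummy value.
mask_in_place(employees, "phone_number", lambda old: "0000000")
```

Because the change is destructive, this style of script should only ever be run against a non-production copy, never against the source database.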
Options:
We perform the data export and import into another database on the same schema:
Options - export / import
Original data
Data after import data pump
Data masking steps with SQL script (PL/SQL):
Options - export script
Original data:
Data after launching anonymization scripts, in test environment.
In this post we have only scratched the surface of this tool. It offers many options, so I advise exploring it in depth to get the most out of it.
You can find more information at this official link: https://www.oracle.com/uk/security/database-security/data-masking/