As data protection regulations evolve, securely managing your cloud data requires a multifaceted approach. It’s no longer enough to control external data risks: internal misuse is just as important and requires a consistent approach to keep your data protected. Alongside tools such as data masking and tokenization, data anonymization is a critical first line of defense against accidental or malicious data misuse and, when used correctly, helps ensure your data is accessed only by authorized parties for its intended purpose. Data anonymization can also be used to meet data minimization requirements in global regulations. In combination with other security measures, SAP Data Custodian’s new data anonymization feature can help your team mitigate internal data risks more effectively while meeting evolving local and global data regulations.
The data anonymization feature in SAP Data Custodian allows you to analyze structured data and replace it with anonymized data generated by machine learning (ML) algorithms for use in a non-production system. This non-production system is a copy of your production system and allows you to complete quality assurance (QA) operations, ML training, and other development tasks without compromising compliance with data privacy regulations.
Comparison of the original data in the production system and data that’s been anonymized for use in a non-production system.
The data anonymization feature is available for SAP Data Custodian Transparency and Control Service, which supports SAP HANA and SAP HANA Cloud databases.
The SAP Data Custodian data anonymization process is completed in three steps:
Step 1: Define the information detector and the sources of structured data that will be anonymized.
Step 2: Schedule your analysis job, which takes the information detector and data source defined in Step 1 and scans for sensitive data. The job recommends generation methods for anonymizing this data, which you review in Step 3.
Step 3: Review the recommended generation methods and parameters from your analysis job and decide whether to use the default settings or create a custom generation job. Multiple generation methods and parameters are available.
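The three steps above can be illustrated with a minimal Python sketch. This is not the SAP Data Custodian API: the names `DETECTORS`, `analyze`, and `generate` are hypothetical, regular expressions stand in for the information detector, and simple random generators stand in for the ML-based generation methods.

```python
import random
import re

# Step 1 (assumption): regex patterns stand in for an "information detector".
DETECTORS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?\d[\d\s-]{6,}\d$"),
}


def analyze(rows, threshold=0.8):
    """Step 2: scan each column and recommend a generation method
    when at least `threshold` of its values match a detector."""
    recommendations = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [str(row[col]) for row in rows]
        for kind, pattern in DETECTORS.items():
            hits = sum(1 for v in values if pattern.match(v))
            if hits / len(values) >= threshold:
                recommendations[col] = f"random_{kind}"
                break
    return recommendations


def generate(rows, recommendations, seed=None):
    """Step 3: return copies of the rows with flagged columns replaced
    by synthetic values; unflagged columns are left unchanged."""
    rng = random.Random(seed)
    generators = {
        "random_email": lambda: f"user{rng.randrange(10**6):06d}@example.com",
        "random_phone": lambda: "+1" + "".join(str(rng.randrange(10)) for _ in range(10)),
    }
    anonymized = []
    for row in rows:
        new_row = dict(row)  # copy so the source rows are not modified
        for col, method in recommendations.items():
            new_row[col] = generators[method]()
        anonymized.append(new_row)
    return anonymized


# Hypothetical sample data standing in for a production table.
production_rows = [
    {"id": 1, "email": "ana@corp.example", "phone": "+49 30 1234567", "country": "DE"},
    {"id": 2, "email": "raj@corp.example", "phone": "+1 415-555-0100", "country": "US"},
]
plan = analyze(production_rows)
anonymized_rows = generate(production_rows, plan, seed=42)
```

Here `plan` plays the role of the recommended generation methods that are reviewed before the generation job runs; editing it before calling `generate` corresponds loosely to choosing custom settings over the defaults.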
With adequately robust parameters, data anonymization is not reversible. For situations where your team wants to add reversible safeguards to your field-level data, either at the database or application level, SAP Data Custodian offers data tokenization and data masking services to provide advanced support for your SAP production systems.
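The difference between a reversible safeguard and irreversible replacement can be shown with a small Python sketch. It is a conceptual illustration only and does not reflect how SAP Data Custodian implements either service; `TokenVault` and `anonymize` are hypothetical names.

```python
import secrets


class TokenVault:
    """Hypothetical reversible tokenization: the vault stores the
    token-to-value mapping, so authorized callers can restore the value."""

    def __init__(self):
        self._by_token = {}
        self._by_value = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to one token.
        if value not in self._by_value:
            token = "tok_" + secrets.token_hex(8)
            self._by_value[value] = token
            self._by_token[token] = value
        return self._by_value[value]

    def detokenize(self, token):
        return self._by_token[token]


def anonymize(value):
    """Irreversible replacement: a fresh random value is returned and
    no mapping back to the input is stored anywhere."""
    return "anon_" + secrets.token_hex(8)


vault = TokenVault()
token = vault.tokenize("ana@corp.example")
restored = vault.detokenize(token)       # the original value comes back
anon = anonymize("ana@corp.example")     # nothing links this to the input
```

Replacing individual values at random does not by itself make a dataset anonymous, since combinations of the remaining fields may still identify a person; that is one reason the robustness of the chosen parameters matters.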
Once anonymized, data is no longer classified as personal data. Below are a few examples of global regulations that directly reference data anonymization, along with how it affects data protection compliance:
The EU General Data Protection Regulation (GDPR) addresses anonymized data in Recital 26, which states that the regulation does not apply to anonymous data, defined as “information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.”
Brazil’s Lei Geral de Proteção de Dados (LGPD) addresses anonymized data in Section II: Processing of Sensitive Personal Data, Article 12, stating that “Anonymized data shall not be considered personal data, for purposes of this Law, except when the process of anonymization to which the data were submitted has been reversed, using exclusively its own efforts, or when it can be reversed applying reasonable efforts.”*
In Section 1798.140 Definitions, the California Consumer Privacy Act (CCPA) refers to anonymized data as deidentified data, where “‘Deidentified’ means information that cannot reasonably be used to infer information about, or otherwise be linked to, a particular consumer provided that the business that possesses the information:
(1) Takes reasonable measures to ensure that the information cannot be associated with a consumer or household.
(2) Publicly commits to maintain and use the information in deidentified form and not to attempt to reidentify the information, except that the business may attempt to reidentify the information solely for the purpose of determining whether its deidentification processes satisfy the requirements of this subdivision.
(3) Contractually obligates any recipients of the information to comply with all provisions of this subdivision (Section m).”
It goes on to state that “‘Personal information’ does not include consumer information that is deidentified or aggregate consumer information (Section v.1.3).”
In Chapter I: General Provisions of China’s Personal Information Protection Law (PIPL), Article 4 states that “Personal information is all kinds of information, recorded by electronic or other means, related to identified or identifiable natural persons, not including information after anonymization handling.” The law then defines anonymization in Chapter VIII: Supplemental Provisions, Article 73, as “the process of personal information undergoing handling to make it impossible to distinguish specific natural persons and impossible to restore.”*
Additional functionality for unstructured data anonymization is planned for an upcoming release.
While data anonymization can assist in meeting some data protection requirements, it does not act as a substitute for a comprehensive data management plan. Every security plan is different and requires careful consideration of local and global data regulations as well as any internal requirements to mitigate evolving cybersecurity risks.
*This quotation is taken directly from the English translation. Please see the official version for the exact wording.
Disclaimer: The views and opinions expressed in this blog are for informational purposes only and should not be considered legal advice. The content in this blog does not constitute any representation or commitment on the part of SAP. This blog is not intended to create any legal relationship between SAP and the reader, and SAP is not responsible for any actions taken based on the information provided in this blog.