Personal Data Identification

Article written by Amina Belhassena, October 3, 2025

The GDPR requires companies to ensure the protection of personal data belonging to their users and customers. The first essential step toward compliance is to identify the personal data stored within your systems. In this article, discover why this step is crucial, the challenges you may face when detecting data, and how to choose the right solutions to achieve optimal GDPR compliance.

1. Data Discovery: An Essential Step for GDPR Compliance

When you begin a GDPR compliance project, you do not always know where your data is stored:

  • You do not know the exact location of the data: Your software packages (ERP, CRM, etc.) contain data, but you don’t necessarily know where it is physically stored.
  • Software vendors may be unable to locate the data: In some cases, the vendor cannot identify where the data resides, often due to customizations or limited knowledge of the storage framework.

In this context, personal data discovery becomes essential. By scanning databases and files according to defined algorithms, data discovery software identifies where personal data resides and helps you build the data inventory you need to start compliance activities.
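As a rough illustration of what "scanning a database according to a defined algorithm" can mean, the sketch below samples every column of a SQLite database and flags columns whose values mostly look like email addresses. It is a minimal sketch, not the method used by any particular product: the function name `find_email_columns`, the simplified email pattern, the sample size, and the 80% threshold are all assumptions chosen for the example.

```python
import re
import sqlite3

# Assumption: a deliberately simple email pattern, good enough for a demo.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def find_email_columns(conn, sample_size=100, threshold=0.8):
    """Return (table, column) pairs whose sampled values mostly look like emails."""
    hits = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            rows = conn.execute(
                f'SELECT "{column}" FROM "{table}" '
                f'WHERE "{column}" IS NOT NULL LIMIT ?', (sample_size,)).fetchall()
            values = [str(row[0]) for row in rows]
            if not values:
                continue
            matches = sum(1 for value in values if EMAIL_RE.match(value))
            if matches / len(values) >= threshold:
                hits.append((table, column))
    return hits


def build_demo_db():
    """Create a small in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, contact TEXT, city TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
        (1, "alice@example.com", "Lyon"),
        (2, "bob@example.org", "Paris"),
    ])
    conn.execute("CREATE TABLE products (sku TEXT, label TEXT)")
    conn.execute("INSERT INTO products VALUES ('A-1', 'Widget')")
    return conn


if __name__ == "__main__":
    print(find_email_columns(build_demo_db()))
```

Note that the column is named `contact`, not `email`: sampling the values rather than relying on column names is what lets a scan find data whose location nobody documented.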

2. How to Create and Maintain a GDPR-Compliant Data Register?

According to the CNIL, “the record of processing activities allows you to list your data processing operations and provides an overview of how you use personal data.” This record is required by Article 30 of the GDPR and contributes to the documentation of compliance.

To build your register, identify and document the following information:

  • The Who?

    • Data controller
    • Data processors
    • Sub-processors
  • The What?

    • Categories of data
    • Data sensitivity levels
  • The Why?

    • Purpose of data collection
  • The Where?

    • Data storage locations
    • Countries where the data may be transferred
  • Until when?

    • Data retention periods
  • The How?

    • Access methods and security measures in place
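The questions above map naturally onto a structured record. The following is a minimal sketch of one register entry as a Python dataclass, with a helper that reports which questions are still unanswered. It is not an official CNIL template: the class name `ProcessingRecord`, the field names, and the choice of which fields may stay empty are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields

# Assumption: these fields may legitimately be empty or False.
OPTIONAL_FIELDS = frozenset({
    "processors", "sub_processors",
    "contains_sensitive_data", "transfer_countries",
})


@dataclass
class ProcessingRecord:
    """One entry of a record of processing activities (illustrative fields only)."""
    name: str
    controller: str = ""                                    # Who
    processors: list = field(default_factory=list)          # Who
    sub_processors: list = field(default_factory=list)      # Who
    data_categories: list = field(default_factory=list)     # What
    contains_sensitive_data: bool = False                   # What
    purpose: str = ""                                       # Why
    storage_locations: list = field(default_factory=list)   # Where
    transfer_countries: list = field(default_factory=list)  # Where
    retention_period: str = ""                              # Until when
    security_measures: list = field(default_factory=list)   # How

    def missing_fields(self):
        """Return the names of required fields that are still empty."""
        return [f.name for f in fields(self)
                if f.name not in OPTIONAL_FIELDS and not getattr(self, f.name)]


if __name__ == "__main__":
    record = ProcessingRecord(
        name="Customer invoicing",
        controller="Example Corp",
        data_categories=["name", "postal address"],
        purpose="Issue and send invoices",
        storage_locations=["ERP database"],
        security_measures=["role-based access"],
    )
    print(record.missing_fields())
```

In this made-up entry the retention period has not been documented yet, so it is the one field the helper reports.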

Creating this register is much easier with data discovery software, which provides a technical foundation for the work ahead. Knowing your data sources and the types of data they hold makes categorization easier and reduces the risk of overlooking part of the data.

Detect your personal and identifying data with DATA Discovery

3. What Challenges Must Be Overcome When Searching for Personal Data?

The main challenge in data detection is ensuring that both the inventory of data sources and the set of detection rules are exhaustive.

Data sources to inventory

A “data source” refers to any place where information is stored:

  • Databases: SQL, NoSQL

  • Application storage outside DBs: XML, files

  • Emails: Server and user machines

  • Hidden data: Excel files, CSVs, shared documents

Do not neglect this inventory: the more detailed it is, the more effective the detection will be.

Types of data protected by the GDPR

Personal data protected by the GDPR is not limited to sensitive data (political opinions, racial origin, sexual orientation, religion…), whose collection is in principle prohibited except under specific exceptions. Any information that can identify a person is considered personal data, whether the person is:

  • Directly identifiable: name, photo, fingerprint, postal address, email address, phone number, social security number, internal ID, IP address, login, voice recording, etc.

  • Indirectly identifiable: information that, when combined with other data, allows identification of a person.
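Indirect identification can be made concrete with a small check: count how many records share the same combination of seemingly harmless attributes. Any combination that occurs only once singles out one person. The sketch below assumes records are plain dictionaries; the function name `unique_combinations` and the sample attributes (postal code, birth year, gender) are hypothetical.

```python
from collections import Counter


def unique_combinations(records, quasi_identifiers):
    """Return the records whose combination of quasi-identifier values is unique."""
    def key(record):
        return tuple(record[name] for name in quasi_identifiers)

    counts = Counter(key(record) for record in records)
    return [record for record in records if counts[key(record)] == 1]


if __name__ == "__main__":
    # Made-up records: no name or email, yet one of them is still identifiable.
    records = [
        {"postal_code": "69003", "birth_year": 1985, "gender": "F"},
        {"postal_code": "69003", "birth_year": 1985, "gender": "F"},
        {"postal_code": "69003", "birth_year": 1962, "gender": "M"},
    ]
    for record in unique_combinations(
            records, ["postal_code", "birth_year", "gender"]):
        print(record)
```

Here the third record is the only one with its combination of values, so anyone who knows those three facts about a person can pick that record out even though no direct identifier is stored.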

Data detection

Detection rules help identify these data types, whether directly or indirectly identifying, so that they can be protected in accordance with the GDPR.

Such data often follow recognizable formats that can be detected through various computing methods, for example:

  • Physical address

  • Postal code

  • Name

  • Date of birth

  • Face in an image

  • GPS position
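For the items in this list that follow a strict format, a detection rule can be as simple as a pattern plus a validity check. The sketch below covers three of them under stated assumptions: postal codes are taken to be French-style five-digit codes, dates are expected as DD/MM/YYYY, and GPS positions as decimal "latitude, longitude" pairs. The rule names and the `detect` function are hypothetical. Names, street addresses, and faces generally need dictionaries or machine-learning models and are not covered here.

```python
import re
from datetime import datetime

# Assumption: French-style five-digit postal codes.
POSTAL_CODE_RE = re.compile(r"^\d{5}$")
# Assumption: decimal "latitude, longitude" pairs such as "45.7640, 4.8357".
GPS_RE = re.compile(r"^(-?\d{1,2}(?:\.\d+)?)\s*,\s*(-?\d{1,3}(?:\.\d+)?)$")


def looks_like_date(value):
    """True if the value parses as a DD/MM/YYYY calendar date."""
    try:
        datetime.strptime(value, "%d/%m/%Y")
        return True
    except ValueError:
        return False


def looks_like_gps(value):
    """True if the value is a latitude/longitude pair within valid ranges."""
    match = GPS_RE.match(value)
    if not match:
        return False
    latitude, longitude = float(match.group(1)), float(match.group(2))
    return -90 <= latitude <= 90 and -180 <= longitude <= 180


RULES = {
    "postal_code": lambda value: bool(POSTAL_CODE_RE.match(value)),
    "date": looks_like_date,
    "gps_position": looks_like_gps,
}


def detect(value):
    """Return the labels of every rule that the value satisfies."""
    value = value.strip()
    return [label for label, rule in RULES.items() if rule(value)]


if __name__ == "__main__":
    for sample in ["69003", "14/07/1985", "45.7640, 4.8357", "Widget"]:
        print(sample, detect(sample))
```

A format match is only a hint: a date is not necessarily a date of birth, and a five-digit number is not necessarily a postal code, so results like these usually need context (column name, neighbouring fields) or human review before being recorded in the inventory.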

Once detected, these personal data must be protected and used only for the purpose for which they were collected. They must not be reused for other purposes unless they are decoupled from identifying attributes (i.e., anonymized).

4. How to Choose a Data Discovery Solution?

Choosing a personal data detection solution depends on several criteria:

1. The purpose of detection

  • Deletion of an individual: Locate all data related to a person to fulfill a right-to-erasure request.

  • Extraction for testing: Extract targeted datasets for testing without compromising confidentiality.
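For the first purpose, the detection step boils down to finding every table that mentions a given person. The following minimal sketch does this for a SQLite database by searching every column for an exact identifier value, here an email address. The function name `locate_person` and the demo schema are assumptions, and a real erasure request would also need to cover files, emails, and the other data sources listed earlier.

```python
import sqlite3


def locate_person(conn, identifier):
    """Return {table: matching_row_count} for tables where any column equals identifier."""
    found = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        # Compare the identifier against every column of the table.
        where = " OR ".join(f'"{column}" = ?' for column in columns)
        count = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE {where}',
            [identifier] * len(columns)).fetchone()[0]
        if count:
            found[table] = count
    return found


def build_orders_db():
    """Create a small in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER, customer_email TEXT, total REAL)")
    conn.execute("CREATE TABLE products (sku TEXT, label TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", [
        (1, "alice@example.com"), (2, "bob@example.org")])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
        (10, "alice@example.com", 42.0),
        (11, "alice@example.com", 13.5),
        (12, "bob@example.org", 7.0)])
    conn.execute("INSERT INTO products VALUES ('A-1', 'Widget')")
    return conn


if __name__ == "__main__":
    print(locate_person(build_orders_db(), "alice@example.com"))
```

The result is a list of places to review, not a deletion plan: rows linked to the person only through an internal ID would be missed by an exact-value search and need a second pass on that ID.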

2. Cost and complexity

  • It is unnecessary to deploy complex detection processes for low-quality data or data requiring only partial anonymization.

  • Reducing detection complexity enables faster and more cost-effective processing.

3. The scope of the requirement

The GDPR does not require anonymizing all company data, only data that is used outside the purpose for which it was collected. It is therefore essential to define the scope clearly to avoid excessive costs.

Conclusion

Personal data detection and management are essential steps for any organization aiming to comply with the GDPR. Choosing detection and data management solutions adapted to your company’s specific needs is crucial to achieving compliance. Contact us to learn more about our GDPR data discovery tools and start securing your personal data today.

About the Author

Amina Belhassena

Solution Architect, ARCAD Software

Amina holds a PhD in Computer Science and Technologies, with a specialization in Big Data Processing. She worked for several years in various data-focused companies, where she gained solid experience in data processing, management, and value creation. She joined ARCAD Software in 2024 as a Product Manager before moving into the role of DOT Solution Architect, where she now supports clients in their data anonymization and sampling projects.