INHUMAIN.AI
The Watchdog Platform for Inhuman Intelligence
Documenting What Happens When Intelligence Stops Being Human
AI Incidents (2026): 847 ▲ +23% | Countries with AI Laws: 41 ▲ +8 YTD | HUMAIN Partnerships: $23B ▲ +$3B | EU AI Act Fines: €14M ▲ New | AI Safety Funding: $2.1B ▲ +45% | OpenAI Valuation: $157B ▲ +34% | AI Job Displacement: 14M ▲ +2.1M | HUMAIN Watch: ACTIVE 24/7 |

AI and GDPR: How Data Protection Law Governs Machine Learning

Complete analysis of how GDPR applies to AI and machine learning — Article 22 automated decision-making, Data Protection Impact Assessments, training data legality, Schrems implications, and EDPB guidelines.

The General Data Protection Regulation predates the AI Act by six years, but it remains the single most consequential law governing AI systems that process personal data in Europe. Every AI system that processes the personal data of individuals in the EU must comply with GDPR. There are no exceptions for machine learning. There are no carve-outs for foundation models. There is no innovation exemption.

GDPR does not mention artificial intelligence by name. It does not need to. Its Article 5 principles — lawfulness, fairness, and transparency; purpose limitation; data minimization; accuracy; storage limitation; integrity and confidentiality; and accountability — apply to all personal data processing, regardless of the technology used. The challenge is applying principles designed for traditional data processing to the distinctive characteristics of machine learning.

This guide analyzes every significant intersection between GDPR and AI, from training data collection to automated decision-making to cross-border data transfers.


Article 22: Automated Decision-Making

Article 22 is the GDPR provision most directly relevant to AI. It provides that:

“The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her.”

Scope: What Triggers Article 22

Four conditions must be met for Article 22 to apply:

1. A decision: There must be an actual decision, not merely a recommendation. If an AI system generates a recommendation that a human then evaluates and makes an independent decision, Article 22 may not apply (though this depends on whether the human review is meaningful — rubber-stamping AI output is not genuine human involvement).

2. Based solely on automated processing: The decision must be made without meaningful human intervention. If a human is substantively involved in the decision-making process — reviewing the AI output, applying independent judgment, and having the genuine authority to override the AI — the processing may not be “solely” automated.

3. Including profiling: Profiling means any form of automated processing used to evaluate personal aspects of a natural person, including analysis of work performance, economic situation, health, personal preferences, interests, reliability, behavior, location, or movements.

4. Producing legal effects or similarly significant effects: The decision must produce legal effects (e.g., denial of a credit application, termination of an insurance contract) or similarly significantly affect the individual (e.g., denial of employment, significant impact on access to services).
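The conditions above can be read as a screening checklist. The sketch below (an illustrative helper, not anything defined by the GDPR; all field names are invented) encodes them for a first-pass triage of processing scenarios. Note that "including profiling" in the Article's text describes a form of automated processing rather than a separate necessary condition, so it appears only as a comment:

```python
# Hypothetical Article 22 screening helper. A positive result flags a
# scenario for legal review; it is not itself a legal determination.
from dataclasses import dataclass

@dataclass
class ProcessingScenario:
    produces_decision: bool             # an actual decision, not a mere recommendation
    meaningful_human_review: bool       # substantive review with real authority to override
    legal_or_significant_effect: bool   # legal effects or similarly significant impact
    # Profiling is "included" in automated processing; it is not a separate gate.

def article_22_applies(s: ProcessingScenario) -> bool:
    """Rough first-pass screen for the Article 22 prohibition."""
    return (
        s.produces_decision
        and not s.meaningful_human_review   # "based solely on automated processing"
        and s.legal_or_significant_effect
    )

# A fully automated credit denial with no human review triggers the screen:
assert article_22_applies(ProcessingScenario(True, False, True))
# The same pipeline with genuine human review does not:
assert not article_22_applies(ProcessingScenario(True, True, True))
```

The second assertion illustrates the rubber-stamping caveat: `meaningful_human_review` must be set to `True` only where the review is substantive, which is a factual question the code cannot answer for you.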

Exceptions to Article 22

Automated decision-making that meets the Article 22 criteria is prohibited unless one of three exceptions applies:

  • Necessary for a contract: The decision is necessary for entering into or performing a contract between the data subject and the controller
  • Authorized by law: EU or member state law authorizes such decision-making with appropriate safeguards
  • Explicit consent: The data subject has given explicit consent

Even where an exception applies, the controller must implement suitable safeguards, including at minimum the data subject’s right to obtain human intervention, to express his or her point of view, and to contest the decision.

The Right to Explanation

Article 22 is read together with Articles 13(2)(f), 14(2)(g), and 15(1)(h), which require controllers to provide data subjects with “meaningful information about the logic involved” in automated decision-making. This has been interpreted by many commentators and the Article 29 Working Party (now EDPB) as creating a qualified right to explanation of automated decisions.

What this means in practice: Organizations using AI to make or substantially support decisions with significant individual impact must be able to explain, in terms the affected individual can understand:

  • That automated processing is involved
  • The logic behind the processing (not the full algorithm, but a meaningful description of the decision-making criteria)
  • The significance and envisaged consequences

Practical challenge: Explaining the logic of complex machine learning models — particularly deep neural networks — in terms that are meaningful to individual data subjects is technically difficult. This tension between GDPR’s explanation requirements and AI’s inherent opacity has driven significant investment in explainable AI (XAI) techniques.


Legal Bases for Processing Personal Data

Every collection and use of personal data under GDPR requires a legal basis under Article 6. For AI systems, this applies at multiple stages: collection of training data, processing during model training, inference (applying the model to new data), and any secondary use of data.

Legitimate interests (Article 6(1)(f)): The controller’s legitimate interest in developing AI systems, balanced against the data subjects’ rights and freedoms. This is the most commonly invoked basis for AI training using personal data, but it requires a balancing test (Legitimate Interest Assessment) that considers the nature and purpose of the processing, the reasonable expectations of data subjects, and the safeguards in place.

Consent (Article 6(1)(a)): Freely given, specific, informed, and unambiguous consent. Consent for AI training must cover the specific purposes of training and the potential uses of the resulting model. Consent must be withdrawable, which creates practical challenges for AI training where data has already been incorporated into model weights.

Contract performance (Article 6(1)(b)): Where AI processing is necessary for the performance of a contract with the data subject (e.g., AI-powered features within a contracted service).

Public interest (Article 6(1)(e)): Where processing is necessary for a task in the public interest or in the exercise of official authority. Relevant for government AI applications and public health AI.

The Training Data Problem

Machine learning models are trained on large datasets. When those datasets contain personal data, GDPR applies to the entire training pipeline:

Collection: Personal data used for training must have been collected with a lawful basis. If data was originally collected for a different purpose (e.g., service provision) and is now being used for AI training, purpose compatibility must be assessed under Article 6(4).

Purpose limitation (Article 5(1)(b)): Personal data must be collected for specified, explicit, and legitimate purposes and not further processed in a manner incompatible with those purposes. Using personal data collected for customer service to train a general-purpose AI model may violate purpose limitation unless the training purpose is compatible with the original collection purpose.

Data minimization (Article 5(1)(c)): Only personal data that is adequate, relevant, and limited to what is necessary for the processing purpose may be used. AI training often uses large, broad datasets, which may conflict with data minimization principles.

Storage limitation (Article 5(1)(e)): Personal data should be kept in identifiable form no longer than necessary. If personal data is embedded in model weights (as is the case with large language models), the storage limitation principle raises difficult questions about when the data can be considered “deleted.”

Data in Model Weights

A fundamental question in applying GDPR to AI is whether personal data embedded in trained model weights remains personal data. If a model was trained on personal data, and that data now influences the model’s outputs, does the model itself “contain” personal data?

Several data protection authorities have taken the position that models trained on personal data may themselves be subject to GDPR obligations, particularly if the model can reproduce or reveal personal data from its training set. This has implications for data deletion requests under the right to erasure (Article 17) — if a data subject requests deletion of their data, and that data is embedded in a model’s weights, the controller may need to retrain the model or demonstrate that the data cannot be extracted.
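One practical mitigation for the erasure problem is to maintain training-data lineage keyed by data-subject identifier, so an Article 17 request can exclude that subject's records before the next scheduled retrain. This is a minimal sketch of that bookkeeping (an assumed internal practice, not something the GDPR prescribes); it does not address data already baked into a deployed model's weights:

```python
# Hypothetical lineage store: every training record carries the ID of the
# data subject it relates to, so erasure requests can be honored at the
# dataset level before retraining.
training_records = [
    {"subject_id": "s1", "text": "support ticket from subject s1"},
    {"subject_id": "s2", "text": "review written by subject s2"},
    {"subject_id": "s1", "text": "second ticket from subject s1"},
]
erasure_requests = {"s1"}  # subjects who exercised the right to erasure

def filter_for_retraining(records: list, erased_ids: set) -> list:
    """Drop all records belonging to subjects who exercised Article 17."""
    return [r for r in records if r["subject_id"] not in erased_ids]

retrain_set = filter_for_retraining(training_records, erasure_requests)
assert all(r["subject_id"] != "s1" for r in retrain_set)
```

Lineage of this kind also supports the alternative path some authorities accept: demonstrating which subjects' data a given model version was, or was not, trained on.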


Data Protection Impact Assessments (DPIAs)

Article 35 of GDPR requires a Data Protection Impact Assessment when processing is “likely to result in a high risk to the rights and freedoms of natural persons.” AI systems frequently trigger this requirement.

When a DPIA Is Required for AI

The EDPB has identified criteria that, when met in combination, are likely to trigger a DPIA requirement. AI systems commonly meet several:

  • Evaluation or scoring: AI systems that evaluate individuals (credit scoring, insurance pricing, health risk assessment)
  • Automated decision-making with legal or significant effect: AI systems triggering Article 22
  • Systematic monitoring: AI-powered surveillance or tracking systems
  • Sensitive data or data concerning vulnerable groups: AI processing health data, biometric data, or data about children
  • Large-scale processing: AI systems processing data from large populations
  • Innovative use or applying new technological or organizational solutions: Novel AI applications
  • Matching or combining datasets: AI systems that combine data from multiple sources

Practical guidance: Most AI systems processing personal data at any significant scale should undergo a DPIA. When in doubt, conduct one. The cost of a DPIA is trivial compared to the regulatory and reputational risk of processing without one.
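The criteria above lend themselves to a simple screening function. The guidance endorsed by the EDPB (WP248) suggests that processing meeting two or more criteria will in most cases require a DPIA; the sketch below encodes that rule of thumb (the threshold and criterion names are this example's own, and a positive result means "conduct a DPIA", not "high risk confirmed"):

```python
# Screening sketch: count how many of the listed DPIA-trigger criteria a
# proposed AI system meets, using the two-criteria rule of thumb.
DPIA_CRITERIA = {
    "evaluation_or_scoring",
    "automated_decision_significant_effect",
    "systematic_monitoring",
    "sensitive_or_vulnerable_data",
    "large_scale",
    "innovative_use",
    "dataset_matching",
}

def dpia_likely_required(met: set) -> bool:
    unknown = met - DPIA_CRITERIA
    if unknown:
        raise ValueError(f"unknown criteria: {unknown}")
    return len(met) >= 2  # two or more criteria: DPIA required in most cases

# A credit-scoring model applied at scale meets at least two criteria:
assert dpia_likely_required({"evaluation_or_scoring", "large_scale"})
# A single criterion alone does not automatically trigger the rule of thumb,
# though the "when in doubt, conduct one" advice above still stands:
assert not dpia_likely_required({"innovative_use"})
```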

DPIA Content Requirements

A DPIA for an AI system should include:

  1. Systematic description of the processing operations and their purposes
  2. Assessment of the necessity and proportionality of the processing
  3. Assessment of the risks to the rights and freedoms of data subjects
  4. Measures envisaged to address risks, including safeguards and mechanisms to ensure data protection

For AI systems, the DPIA should also address:

  • Training data sources and governance
  • Model explainability and transparency measures
  • Bias assessment and mitigation
  • Human oversight mechanisms
  • Data subject rights implementation (access, rectification, erasure, objection)
  • Data retention and deletion procedures (including model retraining)

Cross-Border Data Transfers and AI

Schrems II Implications

The Court of Justice of the EU’s 2020 Schrems II judgment invalidated the EU-US Privacy Shield and imposed stricter requirements for international data transfers using Standard Contractual Clauses (SCCs). For AI, this has significant implications:

Training data transfers: If personal data from the EU is transferred to third countries for AI model training (e.g., to US-based cloud infrastructure), the transfer must comply with Chapter V of GDPR. Organizations must conduct Transfer Impact Assessments to evaluate whether the destination country’s legal framework provides essentially equivalent protection to EU law.

EU-US Data Privacy Framework (2023): The European Commission adopted an adequacy decision for the EU-US Data Privacy Framework in July 2023, providing a mechanism for transatlantic data transfers to certified US organizations. This partially addresses the Schrems II gap for US-based AI companies that have certified under the framework.

AI model as data export: If an AI model trained on EU personal data is deployed in a third country, and the model can reveal personal data from its training set, the deployment may constitute a data transfer subject to GDPR transfer restrictions.

Cloud Infrastructure and Data Localization

Major AI training workloads run on cloud infrastructure. The location of cloud data centers determines the applicable data protection jurisdiction. Organizations must ensure that cloud infrastructure used for AI training and inference complies with GDPR data transfer rules, particularly when cloud providers operate data centers in multiple jurisdictions.
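One way engineering teams operationalize this is a pre-submission guardrail that rejects training or inference jobs targeting cloud regions outside an approved list. The sketch below is an assumed internal policy check, not a GDPR requirement as such, and the region names are hypothetical:

```python
# Guardrail sketch: block jobs that would move EU personal data to a
# region without an approved transfer basis. Region names are invented.
APPROVED_REGIONS = {"eu-west-1", "eu-central-1"}

def check_region(region: str) -> None:
    """Raise if the target region is not approved for EU personal data."""
    if region not in APPROVED_REGIONS:
        raise RuntimeError(
            f"region {region!r} not approved for EU personal data; "
            "a Chapter V GDPR transfer mechanism and a Transfer Impact "
            "Assessment would be needed first"
        )

check_region("eu-west-1")       # passes silently
try:
    check_region("us-east-1")   # blocked pending a transfer assessment
except RuntimeError:
    pass
```

In practice the approved list would be driven by adequacy decisions, SCC coverage, and completed Transfer Impact Assessments rather than hard-coded.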


EDPB and DPA Guidance on AI

EDPB Guidelines

The European Data Protection Board has issued and is developing several guidelines relevant to AI:

  • Guidelines on Automated Individual Decision-Making and Profiling (WP251): Adopted by the Article 29 Working Party (EDPB predecessor), these guidelines interpret Article 22 and its interaction with AI systems
  • Guidelines on targeting of social media users: Address AI-driven advertising and profiling on social media platforms
  • Ongoing work on AI and data protection: The EDPB has an AI task force developing guidance on the intersection of the AI Act and GDPR

Notable DPA Enforcement Actions

Several national data protection authorities have taken enforcement actions involving AI:

Italian DPA (Garante) and ChatGPT (2023): The Italian Garante temporarily banned ChatGPT in Italy, citing GDPR violations including lack of legal basis for processing personal data for model training, lack of age verification, and inaccuracy of outputs. OpenAI subsequently implemented measures to address the Garante’s concerns, and the ban was lifted.

CNIL (France) and AI Training: The French DPA has published guidance on AI training data, including requirements for legitimate interest assessments, information obligations, and data subject rights implementation. CNIL has adopted a relatively pragmatic approach, providing detailed guidance to enable GDPR-compliant AI development.

ICO (UK, post-Brexit): While no longer under GDPR directly, the UK Information Commissioner’s Office has published extensive guidance on AI and data protection under the UK GDPR, addressing fairness in AI, explainability, and Data Protection Impact Assessments.


The AI Act and GDPR Interaction

The EU AI Act and GDPR create overlapping obligations for AI systems processing personal data. Key interactions:

Complementary requirements: The AI Act does not replace or reduce GDPR obligations. AI systems must comply with both frameworks simultaneously.

Data governance alignment: The AI Act’s data governance requirements (Article 10) complement GDPR’s data quality and minimization principles. However, the AI Act’s requirement for sufficiently representative training data may create tension with GDPR’s data minimization principle — more representative data may require more personal data.

Special category data for bias testing: Article 10(5) of the AI Act permits providers of high-risk AI systems to process special category data (Article 9 GDPR) to the extent strictly necessary for bias detection and correction, subject to appropriate safeguards. This is a limited exception to GDPR’s general prohibition on processing sensitive data.

Supervision coordination: The AI Act requires national AI authorities and data protection authorities to cooperate. Where an AI system’s non-compliance relates to personal data processing, the data protection authority retains primary jurisdiction.


Practical Compliance for AI Developers

Pre-Training

  1. Identify legal basis for processing personal data in training data
  2. Conduct Legitimate Interest Assessment if relying on Article 6(1)(f)
  3. Conduct DPIA before processing begins
  4. Implement data governance — document data sources, curation decisions, and quality controls
  5. Provide transparency — inform data subjects about AI training use (privacy notice updates)
  6. Assess data minimization — use anonymization, pseudonymization, or synthetic data where feasible
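For step 6, a common pseudonymization technique is replacing direct identifiers with a keyed hash (HMAC), which preserves joinability across records without exposing the raw value. A minimal sketch, with an illustrative key that in practice must be stored separately from the dataset; note that pseudonymized data is still personal data under GDPR, so this reduces risk but does not remove the data from scope:

```python
# Pseudonymization sketch: deterministic keyed hashing of identifiers.
import hashlib
import hmac

SECRET_KEY = b"rotate-and-store-me-separately"  # hypothetical key material

def pseudonymize(identifier: str) -> str:
    """Replace an identifier with a stable, key-dependent pseudonym."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

record = {"email": "alice@example.com", "purchase": "book"}
record["email"] = pseudonymize(record["email"])

# Deterministic: the same identifier always maps to the same pseudonym,
# so records can still be linked without holding the raw value.
assert pseudonymize("alice@example.com") == record["email"]
```

Without the key, reversing the pseudonym requires guessing identifiers, which is why key custody (separate storage, rotation) is the critical safeguard here.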

During Training

  1. Implement access controls restricting access to training data
  2. Log processing activities in the Record of Processing Activities (ROPA)
  3. Apply privacy-enhancing technologies — differential privacy, federated learning, secure computation where appropriate
  4. Monitor for memorization — test whether models memorize and can regurgitate training data
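Step 4 can start with a crude verbatim-overlap probe: prompt the model with prefixes drawn from the training set and check whether its output reproduces long runs of training text word-for-word. The sketch below is a minimal version (real audits use stronger techniques such as canary insertion or training-data extraction attacks, and `generate` here is a hypothetical stub standing in for the model under test):

```python
# Memorization probe sketch: flag outputs that reproduce any n-word run
# from the training corpus verbatim.
def reproduces_training_text(output: str, corpus: str, n: int = 8) -> bool:
    """True if the output contains any n-word sequence found in the corpus."""
    words = output.split()
    corpus_text = " ".join(corpus.split())
    return any(
        " ".join(words[i:i + n]) in corpus_text
        for i in range(len(words) - n + 1)
    )

training_corpus = "Jane Doe lives at 12 Example Street and her phone number is listed"

def generate(prompt: str) -> str:  # stub model that leaks its training data
    return training_corpus

leaked = reproduces_training_text(generate("Jane Doe lives at"), training_corpus)
assert leaked  # this stub regurgitates training data: flag for review
```

A positive result is evidence that personal data survives in extractable form, which bears directly on the erasure and "data in model weights" questions discussed earlier in this guide.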

Post-Deployment

  1. Implement data subject rights — access, rectification, erasure, restriction, portability, objection
  2. Provide Article 22 safeguards for automated decision-making
  3. Monitor for data breaches involving AI systems
  4. Update DPIAs as the AI system evolves or its use changes
  5. Respond to DPA inquiries with comprehensive documentation

This guide is maintained by INHUMAIN.AI. For related coverage, see our EU AI Act Complete Guide, Global AI Regulation Tracker, AI Liability Guide, and AI Audit Guide.