
What Happens to the Right to Be Forgotten When AI Never Forgets? Is Data Erasure an Illusion?

by Rafiga Malikova

In a world where digital footprints shape how we are perceived, remembered, or even misjudged, the question of whether we can truly be “forgotten” by AI is becoming increasingly pressing. The Right to Be Forgotten (RTBF) allows us to request the deletion of our personal data when it is no longer necessary or has been unlawfully processed. The CJEU’s landmark Google Spain judgment demonstrated that erasure was technically feasible through the removal or delisting of URLs and personal records. However, the rise of Large Language Models (LLMs) such as ChatGPT, Gemini, and Claude since late 2022 has made this right increasingly difficult to enforce in practice: the way these models store and process information fundamentally complicates erasure and introduces new challenges under the GDPR.

What are the Legal Challenges to the Right to Be Forgotten in LLMs?

One of the key challenges is defining what constitutes personal data. LLMs are trained on large-scale, unstructured datasets that often blend personal and non-personal information, making it difficult to distinguish between the two. Even when data appears anonymized, indirect identifiers or contextual clues can lead to re-identification. Once our data is absorbed into a model’s parameters, it can no longer be easily located or erased, which complicates compliance with Article 17 GDPR.

Another difficulty stems from the fact that the GDPR was designed for traditional, structured data storage systems, where information is easily identifiable, accessible, and deletable. In contrast, LLMs absorb and transform data into parameter weights, making it difficult to trace or extract specific personal information once it has been processed. Even when personal data is no longer visible in its original form, it may still leave behind residual effects that influence how a model functions. What further complicates the matter is the absence of a clear answer in legal texts and case law on how LLMs should “forget.” The guidance we rely on today, such as the EDPB’s recommendations and existing case law, was mainly designed for a more conventional internet landscape that predates the rise of LLMs.

Additionally, Article 12(3) GDPR requires that when a data subject exercises the right to erasure, the controller must act “without undue delay” and, in any event, within one month. While this seems straightforward in traditional data systems, it becomes highly problematic in the context of LLMs. LLMs are, by design, trained on massive datasets, often scraped from publicly available sources. This means that even if the original dataset is deleted, the model may still retain residual influence from that information, making true erasure practically and technically unfeasible. Notably, the GDPR also expects controllers to notify third parties who may be processing the same data, creating a further “relational” obligation. But in the decentralised and opaque ecosystem of AI development, identifying and informing every downstream user of a dataset is nearly impossible.

What about the Technical Aspects of Data Erasure?

You have probably come across the debate about whether generative AI will soon replace search engines, given its advanced use in daily and business life. The most important technical challenge stems from the difference between search engines and LLMs, since the RTBF was established as a fundamental right in the context of search engines.

The two technologies are increasingly interwoven: LLMs are being incorporated into search tools, while search functionalities are being built into LLMs. For instance, Microsoft has integrated GPT-4 into its Bing search platform under the Copilot brand, offering users responses in natural language enriched with contextual relevance. Still, their core functions differ substantially: a search engine maintains an index of links that can be delisted one URL at a time, whereas an LLM encodes information diffusely across billions of parameters, leaving no discrete record to delete.

What are the Technical Solutions for Data Erasure from AI?

    1. Machine Unlearning: Let’s start with the question of what machine unlearning is. Machine unlearning is a process in machine learning that focuses on removing the effects of certain data from a trained model. Instead of erasing the model’s ability to perform a task, it specifically targets and undoes how particular data points, features, or labels have shaped the model’s behavior. Current methods, such as ensemble learning and distillation-based approaches, allow partial data removal but often compromise model accuracy, consistency, or fairness. Despite these advances, algorithmic fairness remains a critical and largely unaddressed aspect. A minimal sketch of the idea appears after this list.
    2. Differential Privacy: It offers a proactive way to protect personal data in large language models. Instead of deleting information after training, DP adds a layer of mathematical “noise” during training, ensuring that no single person’s data can be traced or reconstructed. This makes it nearly impossible to tell whether someone’s information was part of the dataset at all, indirectly supporting the Right to Be Forgotten. While DP helps prevent data leaks and supports GDPR principles like privacy by design, it can reduce model accuracy and remains hard to verify in opaque AI systems. A training-step sketch also follows this list.
    3. Federated Learning: It changes the way AI learns: instead of gathering everyone’s data in one place, it lets models train directly on users’ devices, sharing only learned updates rather than raw data. This decentralized setup protects privacy and aligns with GDPR principles like data minimization. However, when it comes to the Right to Be Forgotten (RTBF), FL faces major challenges. Once data contributes to a shared global model, its traces can remain deeply embedded even after deletion requests. Removing a user’s influence often requires retraining and still might not guarantee full erasure. A bare-bones aggregation sketch closes the examples below.
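To make the unlearning idea concrete, here is a minimal sketch in the spirit of sharded (“SISA”-style) training: the data is split into disjoint shards, one model is trained per shard, and an erasure request is honoured by retraining only the affected shard. The data, shard count, and model choice are toy assumptions for illustration, not a production recipe.

```python
# Toy SISA-style unlearning: shard the data, train one model per shard,
# and honour an erasure request by retraining only the affected shard.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))               # toy feature matrix
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # toy labels

N_SHARDS = 3                                # assumed shard count
shards = list(np.array_split(np.arange(len(X)), N_SHARDS))
models = [LogisticRegression().fit(X[i], y[i]) for i in shards]

def predict(x):
    """Majority vote across the per-shard models (the ensemble step)."""
    votes = [m.predict(x.reshape(1, -1))[0] for m in models]
    return int(round(sum(votes) / len(votes)))

def unlearn(record_idx):
    """Erase one record by retraining only the shard that held it."""
    for s, idx in enumerate(shards):
        if record_idx in idx:
            shards[s] = idx[idx != record_idx]
            models[s] = LogisticRegression().fit(X[shards[s]], y[shards[s]])
            return  # other shards are untouched, so forgetting stays cheap

unlearn(42)  # simulate an Article 17 erasure request for record 42
```

The appeal of this design is that the cost of forgetting one record is bounded by the size of a single shard rather than by a full retraining run, which is exactly the cost problem described above.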
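The next sketch shows the core mechanism of differential privacy during training, in the style of DP-SGD: each example’s gradient is clipped to bound its influence, and Gaussian noise is added to the aggregate update. The clipping bound and noise multiplier are illustrative values; real deployments use dedicated libraries and track a formal privacy budget.

```python
# Toy DP-SGD for logistic regression: clip each example's gradient, then
# add Gaussian noise so no single record dominates the update.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))               # toy data
y = (X[:, 0] > 0).astype(float)
w = np.zeros(5)

CLIP = 1.0    # per-example gradient norm bound (the "sensitivity")
SIGMA = 1.0   # noise multiplier; larger means more privacy, less accuracy
LR = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(100):
    grads = (sigmoid(X @ w) - y)[:, None] * X           # per-example gradients
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads / np.maximum(1.0, norms / CLIP)       # cap each person's influence
    noisy_sum = grads.sum(axis=0) + rng.normal(0, SIGMA * CLIP, size=w.shape)
    w -= LR * noisy_sum / len(X)                        # noisy aggregate step
```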
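Finally, a bare-bones federated averaging (FedAvg) round with simulated clients: each “device” trains on its own private data, and only the resulting weights are sent to the server for averaging. The client data, learning rate, and round count are assumptions; the point is that raw records never leave the device, which is also why a later erasure request is so hard to honour once the averaged model exists.

```python
# Toy FedAvg: clients train locally on private data; only weight vectors
# are shared with the server, never the raw records themselves.
import numpy as np

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(4)]
global_w = np.zeros(3)

def local_update(w, X, y, lr=0.01, steps=20):
    """Plain linear-regression gradient descent on one client's data."""
    for _ in range(steps):
        w = w - lr * X.T @ (X @ w - y) / len(X)
    return w

for _ in range(5):  # five federated rounds
    local_ws = [local_update(global_w.copy(), X, y) for X, y in clients]
    global_w = np.mean(local_ws, axis=0)  # server-side federated averaging

# Every client's data has now shaped global_w, yet there is no per-record
# trace to delete -- the RTBF difficulty described in item 3 above.
```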

Can the EU AI Act Bridge the Gap Between Innovation and Privacy?

The EU Artificial Intelligence Act (AI Act), in force since August 2024, complements the GDPR on privacy. We can see from Recital 69 and Article 2(7) of the AI Act that it acknowledges the complexity of LLMs relative to traditional models, affirming that AI systems must respect privacy and data protection rights throughout their entire lifecycle, including compliance with GDPR-established rights such as erasure.

Although the AI Act does not itself codify the RTBF, it reinforces the right by requiring AI systems, especially high-risk ones, to respect fundamental privacy and data protection principles throughout their lifecycle, in line with the GDPR.

While the AI Act does not explicitly require machine unlearning, its structure and risk-based framework provide new regulatory incentives for its adoption, particularly in high-risk applications. The AI Act’s attention to general-purpose AI models (GPAI) like ChatGPT further supports this shift. These models are subject to horizontal obligations around data accuracy, cybersecurity, and respect for fundamental rights. Importantly, the Act not only demands stronger accountability from developers and deployers but also creates regulatory momentum for integrating machine unlearning as a default mechanism in future AI systems.

What about the Role of International Standards in AI Governance and Data Erasure?

International standards play a crucial role in shaping responsible AI governance, safeguarding privacy, and supporting effective data erasure practices in both traditional and cutting-edge technologies like LLMs. For example, ISO/IEC 42001:2023 provides a structured framework for establishing, implementing, maintaining, and continuously improving an AI management system, with specific attention to transparency, accountability, and privacy protection. ISO/IEC 23894 offers guidance on AI-specific risk management, helping organizations identify, assess, and address the unique challenges posed by AI. Together, these standards aim to bridge the gap between legal expectations, such as those found in the GDPR, and practical, technical safeguards in the deployment of AI systems.

Although these standards do not explicitly address the Right to Be Forgotten, they offer practical tools for mitigating privacy risks in AI systems. Applied together, they can help AI developers and organizations structure their compliance strategies more systematically by offering traceable mechanisms to demonstrate risk awareness and proactive data governance. While these frameworks remain voluntary and lack binding legal force, their adoption may still serve as a persuasive indicator of good faith and due diligence in responsible AI practice regarding data erasure.

Conclusion

From a risk governance point of view, emerging instruments such as the EU AI Act and international standards such as ISO/IEC 42001:2023 and ISO/IEC 23894 demonstrate growing global recognition of the need for harmonised risk and privacy management frameworks in AI systems. Even though these instruments do not yet provide a direct mechanism for the right to be forgotten in LLMs, it would be beneficial to build on their foundational principles of transparency, accountability, and risk mitigation.

Given the evolving nature of this right, legislators should first consider updating existing provisions, or drafting new ones, that explicitly address the complexities of data erasure in LLMs by clarifying what erasure means in trained models and incorporating the principles of privacy-by-design and accountability. Secondly, promoting transatlantic collaboration through formal agreements and unified global standards can play a crucial role in harmonising privacy regulations across jurisdictions and reducing regulatory fragmentation. Bridging the gap between the normative ideals of the RTBF and the realities of AI architecture will require sustained cooperation between lawmakers, technologists, and international standard-setting bodies.

Rafiga Malikova is a graduate of the European Master in Law, Data and Artificial Intelligence (EMILDAI), where she held a full Erasmus Mundus scholarship and specialised in data and AI governance. She also holds an LL.M. in Global Business Law and Regulation from Central European University and earned her LL.B. in Azerbaijan with a Presidential Scholarship for academic excellence. Rafiga is a certified Privacy and Data Protection Specialist and Manager (CIPP/E, CIPM) and a member of the International Association of Privacy Professionals (IAPP).