
ChatGPT, Copilot or Gemini: What Happens to Personal Data the Moment You Type It Into an AI?

Every prompt is a processing operation — and often a third-country transfer. The DPA does not make it disappear, and on consumer or free-tier services your colleague's name can be used for service or model improvement unless settings, temporary mode, or business/enterprise terms exclude it. Why the question is not which AI you use, but who reads what you typed — and what they are allowed to keep.

Dr. Sait Yalazay, PhD / LLM / MBA

CISO — DPO — Author | CISM — CIPP — AAISM — LA 27001, 27701, 22301, 42001

Architect of Automated Compliance Systems for NIS2, GDPR, ISMS, BCM, DORA, Cloud Security (C5), TISAX & AI Act

Published on June 2, 2026


CORE THESIS: Typing personal data into an AI is not a “use”; it is a processing operation, and often a third-country transfer. A signed Data Processing Addendum governs that transfer; it does not undo it. And on a free or consumer tier the employer typically has no Article 28 processor contract at all: depending on the provider, the settings and the product, the input may not merely pass through the provider but can become training or service-improvement material. The decisive question is never “which AI do I use?” It is: who reads what I type, and what are they permitted to keep?

A colleague forwarded me a screenshot last week that captures the entire problem in a single image. A team lead (competent, well-meaning, under deadline pressure) had pasted a performance review into the free version of an AI chatbot with the instruction: “Make this more diplomatic.” The review contained the full name of an employee, their attendance record, a sentence about a health-related absence, and a frank assessment of their working relationship with a named second colleague. The reply came back polished and tactful within seconds. The team lead was pleased. They had no idea that, in those few seconds, they had carried out a transfer of two people’s personal data, some of it potentially special-category data under Article 9 GDPR, to a US-based provider, with no legal basis, with a retention period they could not name and a downstream use they could not exclude.

Transfers like this are among the most frequently observed, and most rarely recorded, emerging privacy risks in European workplaces in 2026. They do not trip a firewall. They do not appear in a log the data protection officer reviews. They happen in a browser tab, between a colleague and a chatbot, and they feel less like a transfer of personal data than like using a calculator. That feeling is the problem. The act of typing into an AI has the texture of a private, ephemeral, harmless interaction. Its legal nature is the opposite: a processing operation that has to be documented, with a controller, a processor, a legal basis requirement, and, far more often than anyone admits, a third-country transfer subject to Articles 44 et seq. GDPR.

The most expensive demonstration of this to date did not happen in a regulator’s office; it happened inside one of the most sophisticated technology companies in the world. In late March 2023, Samsung’s semiconductor division lifted an internal ban on ChatGPT and allowed its engineers to use it. Within twenty days, the company had identified at least three separate incidents of confidential data leaving the building through a chat box: one engineer pasted proprietary semiconductor source code into ChatGPT to debug it; a second submitted code for a defect-identification test sequence; a third recorded an internal meeting, transcribed it, and fed the transcript to ChatGPT to generate minutes.¹ Samsung’s response was not a memo. It banned generative AI on company devices and networks outright, warned that violations could lead to termination, and accelerated the build of its own internal tools.² Samsung was not careless by the standards of 2023 — it was early. Amazon had already warned staff in January 2023 not to share confidential information with ChatGPT after noticing model outputs that resembled its own internal material; Apple restricted ChatGPT and GitHub Copilot internally; JPMorgan, Goldman Sachs, Citigroup, Deutsche Bank and Verizon imposed their own restrictions in the same period.³ What every one of these organisations grasped is the subject of this article: the moment data enters a general-purpose AI, the organisation has lost the ability to say where it goes and who keeps it — unless it has answered, in advance, a single question.

That question is the one too few people ask before pressing Enter: when you type personal data into ChatGPT, Copilot or Gemini, who reads it, and what are they allowed to keep? Everything that matters for GDPR compliance flows from the answer.

A prompt is a processing operation, not a search query

The mental model most people carry into an AI tool is the one they built for search engines: I type a question, I get an answer, nothing of mine is “kept” in any meaningful sense. That model was never quite accurate for search, and it is badly wrong for generative AI.

Under Article 4 No. 2 GDPR, “processing” means any operation performed on personal data — collection, recording, storage, use, disclosure by transmission. Typing a person’s name, situation, or characteristics into a prompt is, at minimum, a disclosure by transmission to the provider operating the model. The European Data Protection Board, in the report of its ChatGPT Taskforce, drew the line plainly: a provider that makes a general-purpose model publicly available must assume that individuals will, sooner or later, input personal data, and the provider cannot shift responsibility onto the user by means of a clause in its terms and conditions declaring that users are responsible for their own inputs.⁴ The input is processing. The only open questions are who is the controller of it, what is the legal basis, and where does it go.

In an organisational setting the controller is usually the employer. When an employee pastes a customer’s data, a colleague’s data, or a citizen’s data into an AI tool in the course of their work, the organisation is the controller of that processing — whether or not the organisation knows the tool is being used, and whether or not it has authorised it. This is the uncomfortable mechanics of so-called shadow AI, and the Samsung case is its textbook illustration: not one of those three engineers set out to leak intellectual property. They were trying to do their jobs faster. The controller’s legal responsibility does not depend on the controller’s awareness, and intent does not change the outcome. An undocumented, unassessed, unauthorised use of a free chatbot to process personal data is not a smaller compliance problem than an authorised one. It is the same processing operation, minus every safeguard.

The German Datenschutzkonferenz (DSK) — the assembly of the federal and state data protection authorities — made the operational consequence explicit in its orientation guidance on AI. The guidance is addressed in the first instance not to the developers of AI systems but to the controllers who deploy them, and it treats the input phase as a distinct risk requiring distinct caution: particular care is owed when personal data, and above all special categories of data, are entered during use.⁵ The authorities further note what practitioners discover the hard way — that even pseudonymised inputs are frequently still personal data, because the surrounding context re-identifies them.⁶

So the first correction to the mental model is this: a prompt is not a query that evaporates. It is a processing operation for which someone — usually the employer — is the controller, and for which a legal basis must exist before the data are typed, not retrospectively justified after.

The free tier: where the input becomes the product

The single most consequential fact about consumer-grade AI tools is one that their interfaces are designed not to foreground: on the free and personal tiers, your input can be used to train the model.

This is not a hidden betrayal; it is the stated business model, disclosed in the terms most people accept without reading. The distinction the providers themselves draw is the one that matters. Microsoft is explicit that on its enterprise surface, prompts and responses are not used to train foundation models; its consumer products operate under their own separate terms, which must be checked on their own — the enterprise guarantee does not carry across to them.⁷ The pattern repeats across vendors: the enterprise and consumer products of the same brand are, for data protection purposes, two different products that happen to share a logo. Anthropic’s own trajectory makes the point vivid: under its commercial terms, Claude for Work and Enterprise data is not used for training; but a consumer-policy update that took effect in 2025 introduced a user choice — if an individual user allows training, their conversations can be retained for as long as five years, while commercial terms remain excluded.⁸ Same brand, same underlying model, two entirely different data-protection regimes depending on which door you walked in through.

Read against the GDPR, training-on-input has a specific and severe consequence. Recall the Samsung engineer who pasted source code to find a bug. The reason that incident was a genuine emergency, and not merely an embarrassment, is that once proprietary content enters a model trained on its inputs, it cannot reliably be pulled back out. The same is true of personal data. If the team lead in my opening example pasted a performance review into a free chatbot, and that input flows into the training corpus, then two people’s personal data — including a health-related detail — have not merely been transmitted to a processor for a defined, transient purpose. They have been incorporated into a system from which extraction is, by the model developers’ own account, technically difficult, and from which deletion on request is correspondingly fraught. The EDPB’s December 2024 opinion on AI models grapples directly with this: a model trained on personal data is only treated as anonymous where the controller can demonstrate, with evidence, that the personal data cannot be extracted and that outputs do not relate to the individuals whose data were used — a high bar that most deployed consumer models do not even claim to clear.⁹

The enforcement reality is no longer hypothetical, and it bears directly on inputs. In its decision of 2 November 2024, the Italian Garante fined OpenAI €15 million, finding that the company had processed users’ personal data to train ChatGPT without an adequate legal basis, had failed its transparency obligations, and had lacked an adequate age-verification mechanism.¹⁰ A court in Rome annulled that fine in March 2026 — but on a procedural point about which authority had jurisdiction once OpenAI set up its Irish entity, expressly leaving the substantive questions about lawful basis and transparency undecided, so the underlying legal exposure the decision identified has not been resolved in OpenAI’s favour. The Garante had already, in early 2023, temporarily ordered ChatGPT offline in Italy over the same concerns — the first Western regulator to do so. And the pattern is not confined to one provider. Clearview AI, a US company that built a facial-recognition database by scraping billions of images from the public web, has been fined €20 million each by the French, Italian and Greek authorities and €30.5 million by the Dutch authority — every one of them finding, among other things, that the company had no valid legal basis, because a commercial entity’s “legitimate interest” does not justify the mass ingestion of people’s data.¹¹ Clearview’s data was scraped rather than typed in, but the legal lesson is identical: feeding personal data into an AI system without a lawful basis is an infringement regardless of how the data got there, and European regulators now treat the ingestion of personal data by AI as squarely within their remit.

The practical instruction here is unambiguous and worth stating plainly: a free or personal-tier general-purpose AI tool is, in the ordinary case, not a lawful place to process other people’s personal data in a work context. The employer normally has no Article 28 processor contract covering that processing. Depending on the provider, the settings and the product, the input may be used for training or service improvement. There is, typically, no controlled retention you can rely on. And a lawful basis is hard to construct, because the processing was never set up to have one. The correct posture toward the free tier is not “use it carefully” — it is “do not put other people’s personal data into it at all”. Samsung reached that conclusion the expensive way, after the fact. The point of this article is to reach it before the screenshot lands.

The DPA does the work — but read what it actually promises

The enterprise tiers are a genuinely different proposition, and this is where most organisational compliance effort rightly concentrates. ChatGPT Enterprise, Microsoft 365 Copilot, Gemini Enterprise and Claude Enterprise all operate, for business customers, under a Data Processing Addendum in which the vendor acts as a processor within the meaning of Article 28 GDPR, commits not to use customer prompts and responses to train its foundation models, and accepts the standard apparatus of processor obligations — confidentiality, security, subprocessor controls, assistance with data-subject rights.¹² Microsoft’s enterprise framing is representative, and it is precise about the distinction that mattered in the Samsung case: when an organisation uses Microsoft 365 Copilot, the processing is covered by the Microsoft Products and Services Data Protection Addendum, Microsoft acts as a data processor, and prompts, responses and data accessed through Microsoft Graph are not used to train foundation models.¹³ In other words, an enterprise Copilot deployment is, by contract, the thing Samsung’s engineers did not have: a bounded processor relationship instead of an open donation to a training pipeline.

This is real protection, and an organisation that has moved its AI use onto a properly contracted enterprise tier has done the most important single thing. But there is a distinction inside the DPA that is routinely collapsed in procurement, and collapsing it produces a false sense of safety. A signed DPA is not the same thing as a Zero Data Retention (ZDR) agreement.

A DPA governs how the processor may handle the data and forbids training use. It does not, by itself, mean the data are not stored. The enterprise chat products are, by design, controlled-retention products, not zero-retention ones: persistent chat history, memory features, and tenant-side retention are deliberate features of the offering, logged and retained under the customer’s own retention policies.¹⁴ Microsoft’s own documentation is explicit: with enterprise data protection, prompts and responses are logged, retained, and available for audit, eDiscovery and Purview capabilities.¹⁵ That is exactly what a controller wants in many cases — auditability, the ability to return to a prior chat — but it is the opposite of “the data disappears after the session”. Zero Data Retention, where the provider holds nothing beyond the immediate processing of the request, is a separate contractual arrangement that applies to specific surfaces and must be explicitly agreed. Anthropic’s documentation illustrates how narrow it actually is: its Claude Teams and Enterprise product interfaces are not ZDR-eligible — only Commercial API keys are (and, in one carve-out, Claude Code used through Enterprise).¹⁶ ZDR is a deliberate, surface-specific contract; it is never something a controller should infer from the mere existence of a DPA.

The compliance consequence is concrete. If a procurement document or an internal memo asserts that “our enterprise AI is zero-retention because we have a DPA,” treat that as an unverified claim until the contract is read. For a controller’s records of processing and its data-subject-rights workflows, the retention question is not academic: data that are retained are data that must be findable, exportable on a Subject Access Request, and deletable on an Article 17 erasure request. A retained prompt containing a third party’s personal data is a live obligation, not a closed transaction.

This is also the point at which a further question becomes unavoidable, even on the input side, because the same DPA that forbids training does not, on its own, answer a separate question: where do the data physically and jurisdictionally go, and who can be compelled to hand them over? An enterprise DPA with a US-incorporated provider sits on top of the same CLOUD Act and FISA 702 reality as any other US cloud service. The training prohibition and the transfer exposure are two separate axes; a contract can be excellent on the first and silent on the second. For now the point is narrower: the DPA is necessary, it is not sufficient, and “we have a DPA” answers fewer questions than people assume.
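To make the retention point concrete: if prompts are retained, the controller has to be able to find and delete them for a given person. The following is a minimal sketch under stated assumptions, not any vendor’s API. The `RetainedPrompt` record, the in-memory list and the naive substring matching are all hypothetical; a real deployment would rely on the audit and eDiscovery tooling of the contracted product, and erasure remains subject to the exceptions in Article 17(3) GDPR.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RetainedPrompt:
    """Hypothetical record of one retained prompt (illustrative fields only)."""
    prompt_id: str
    stored_on: date
    text: str


def find_for_subject(store, identifiers):
    """Return retained prompts that mention any identifier of the data subject.

    Matching is a case-insensitive substring test. That is deliberately naive:
    it misses nicknames, initials and indirect descriptions of a person.
    """
    needles = [i.lower() for i in identifiers]
    return [p for p in store if any(n in p.text.lower() for n in needles)]


def erase_for_subject(store, identifiers):
    """Return (remaining_prompts, erased_ids) for an erasure request."""
    hits = {p.prompt_id for p in find_for_subject(store, identifiers)}
    remaining = [p for p in store if p.prompt_id not in hits]
    return remaining, sorted(hits)


# Made-up example data; "Alex Example" is not a real person.
store = [
    RetainedPrompt("p1", date(2026, 1, 5), "Rewrite the review for Alex Example"),
    RetainedPrompt("p2", date(2026, 1, 6), "Summarise the Q4 figures"),
    RetainedPrompt("p3", date(2026, 1, 7), "Email alex.example@example.org about leave"),
]

found = find_for_subject(store, ["Alex Example", "alex.example@example.org"])
print([p.prompt_id for p in found])
```

The sketch also shows why the obligation is harder than it looks: a search on the name alone would miss the third prompt, which identifies the same person only by e-mail address.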

A stress test: the same prompt in a hospital

To see why the input question is not a formality, run the opening scenario through a setting where the data are graver. Replace the team lead with a ward clerk at a hospital, and the performance review with a discharge summary: a named patient, a diagnosis, a medication list, a free-text note about a mental-health episode. The clerk, under exactly the same deadline pressure, pastes it into a free chatbot with the instruction “rewrite this in plain language for the patient.”

Nothing about the mechanics has changed: it is still a browser tab, still a few seconds, still a polished reply. But three things about the stakes have escalated sharply. First, almost everything in that summary is health data, an Article 9 special category, where the GDPR’s baseline prohibition on processing applies unless a specific exception is met, and “the chatbot was convenient” is not among them. Second, the re-identification floor is far lower than people assume: a diagnosis plus an approximate age plus a treating hospital is often enough to single out one individual, so even a “lightly anonymised” summary with the name removed frequently remains personal data. Third, the downstream consequence is irreversible in a way an ordinary breach is not: if the input trains the model, there is no recall, no containment, no re-securing of a perimeter. The data are in the weights.

This is the value of the stress test: it strips away the comforting sense that a low-stakes-looking prompt is a low-stakes act. The clerk’s prompt and the team lead’s prompt are the same processing operation in law. The only thing that changed is how visible the harm is — and visibility is not the standard the GDPR sets. If an organisation’s answer to “can our staff paste personal data into a free chatbot?” is anything other than a clear no with a lawful alternative provided, the hospital version of the scenario is what it is implicitly accepting.
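The re-identification point in the stress test can be made measurable. One common way to do so, which this article does not otherwise rely on, is to count how many records share the same combination of quasi-identifiers (the idea behind k-anonymity). The sketch below uses made-up records with the names already removed; the field names and values are assumptions chosen purely for illustration.

```python
from collections import Counter

# Hypothetical, made-up discharge records with names already removed.
records = [
    {"diagnosis": "asthma", "age_band": "30-39", "hospital": "North"},
    {"diagnosis": "asthma", "age_band": "30-39", "hospital": "North"},
    {"diagnosis": "depression", "age_band": "40-49", "hospital": "North"},
    {"diagnosis": "asthma", "age_band": "40-49", "hospital": "South"},
    {"diagnosis": "diabetes", "age_band": "30-39", "hospital": "South"},
]

QUASI_IDENTIFIERS = ("diagnosis", "age_band", "hospital")


def group_sizes(rows, keys):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def singled_out(rows, keys):
    """Return the rows whose quasi-identifier combination is unique."""
    sizes = group_sizes(rows, keys)
    return [row for row in rows if sizes[tuple(row[k] for k in keys)] == 1]


unique_rows = singled_out(records, QUASI_IDENTIFIERS)
# k is the size of the smallest group; k == 1 means someone is singled out.
k = min(group_sizes(records, QUASI_IDENTIFIERS).values())
print(f"k = {k}; rows singled out: {len(unique_rows)} of {len(records)}")
```

In this toy set three of the five nameless rows are unique on diagnosis, age band and hospital alone, which is the sense in which a “lightly anonymised” summary frequently remains personal data.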

Four questions before anyone types

The architecture above reduces to a short diagnostic that a data protection officer, a team lead, or a careful individual can run in under a minute before deciding whether a given AI interaction is lawful.

First: is this a business tier with a DPA, or a free/personal tier? If it is the free or personal tier, the analysis stops here for any input containing other people’s personal data: do not enter it. The absence of a processor contract and the default of training use are dispositive — this is the line Samsung drew, organisation-wide, after twenty days of learning it the hard way.

Second: does the prompt contain personal data at all — and have I noticed the data I am not thinking about? Names are obvious. The data people miss are the embedded ones: a health detail inside a draft email, a customer’s address inside a document pasted “just for formatting,” a colleague’s name inside a meeting transcript uploaded for summary — exactly the meeting-transcript pattern that caught the third Samsung engineer. Special categories under Article 9 — health, religion, trade-union membership, sexual orientation, biometric and genetic data — raise the bar sharply and are disproportionately likely to be hiding inside text pasted for an unrelated purpose.

Third: do I have a legal basis for this processing, by this tool, for this purpose? The legal basis cannot be the AI tool’s convenience. For employee data it is rarely consent (the imbalance of power in the employment relationship undermines its freely-given character); it is more often a necessity ground that must actually fit the purpose. If you cannot name the basis, you do not have one — the very failure the Garante built its €15 million OpenAI decision on (a fine later annulled on jurisdictional, not substantive, grounds).

Fourth: what does the contract say about retention and training, specifically — not by reputation? “It’s enterprise, so it’s fine” is not an answer. The answer is: training is contractually excluded (verify), retention is governed by our own policy (verify the policy), and if zero-retention is being claimed, it is written into the contract for the surface we actually use (verify the surface).

If any of these questions is answered with a shrug, the prompt is not ready to be sent.
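The four questions can be sketched as a simple gate. This is an illustration of the reasoning above, not a compliance tool: the `PromptContext` fields, the `Tier` values and the wording of the reasons are hypothetical names introduced here, and the answers still have to come from a person who has read the contract.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Tier(Enum):
    FREE_OR_PERSONAL = "free_or_personal"
    BUSINESS_WITH_DPA = "business_with_dpa"


@dataclass(frozen=True)
class PromptContext:
    """Answers to the four questions (all field names are illustrative)."""
    tier: Tier
    contains_personal_data: bool
    legal_basis: Optional[str]        # None if the basis cannot be named
    training_excluded_verified: bool  # contract read, not assumed
    retention_verified: bool          # retention rule or ZDR checked for this surface


def may_send(ctx: PromptContext) -> Tuple[bool, str]:
    """Apply the four questions in order and return (allowed, reason)."""
    # Questions 1 and 2: free or personal tier plus personal data ends the analysis.
    if ctx.tier is Tier.FREE_OR_PERSONAL and ctx.contains_personal_data:
        return False, "free/personal tier: do not enter other people's personal data"
    # No personal data means the GDPR questions do not arise.
    # Confidentiality of business information is a separate matter not modelled here.
    if not ctx.contains_personal_data:
        return True, "no personal data identified in the prompt"
    # Question 3: a basis that cannot be named does not exist.
    if not (ctx.legal_basis and ctx.legal_basis.strip()):
        return False, "no legal basis named for this processing"
    # Question 4: contract terms must be verified, not assumed by reputation.
    if not (ctx.training_excluded_verified and ctx.retention_verified):
        return False, "training exclusion or retention terms not verified"
    return True, "all four questions answered"


example = PromptContext(Tier.FREE_OR_PERSONAL, True, "Art. 6(1)(f) GDPR", True, True)
print(may_send(example))
```

The ordering is the design choice worth noting: the tier check comes first, so a named legal basis can never rescue a free-tier prompt that contains other people’s personal data.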

Shadow AI is a governance problem, not a technology problem

The instinctive organisational response to all of this is to reach for a tool: a DLP filter, a browser policy, a network block on consumer AI domains. These have their place, but they treat a governance failure as a plumbing failure, and they reliably under-deliver. Employees who paste a performance review into a chatbot to make it “more diplomatic” are not acting maliciously; they are solving a real problem with the best tool in front of them, and a network block simply moves the behaviour to a personal phone where no policy reaches. It is worth remembering that the Samsung leaks happened after the company had already warned staff to be careful with ChatGPT — a warning, on its own, changed nothing.

The durable response has three parts, and an organisation can act on each of them this quarter rather than treat them as aspirations.

First, provide a lawful alternative before, or at the same time as, you restrict the unlawful one. A block without a sanctioned, contracted enterprise tool does not remove the demand for the capability; it only forbids the way the demand is currently met and pushes it out of the organisation’s view. This is the lesson buried in Samsung’s response: the ban was step one; accelerating an internal, controlled tool was the step that actually addressed the demand. In practice this means standing up a contracted enterprise tier (training excluded, retention governed), publishing it as the approved tool, and making it at least as easy to reach as the free tab it replaces.

Second, sensitise staff to the specific thing they cannot currently see: that a prompt is a processing operation, and often a transfer. The DSK treats staff sensitisation and transparent documentation not as nice-to-haves but as expected measures.¹⁷ A general warning is not enough, as Samsung’s experience shows. What works is concrete: short, role-specific training that teaches people to recognise the embedded personal data they habitually miss (the health detail in a draft, the name in a transcript), and a one-line rule they can actually remember: if it names or describes a person, it does not go in a tool we have not approved.

Third, document the processing once, in a DPIA, instead of improvising it a thousand times. Using a general-purpose AI to process personal data will, in many configurations, require a Data Protection Impact Assessment, and the DSK is explicit that where the controller is not also the provider of the AI system, the controller still owes its own risk assessment.¹⁸ Treat the DPIA not as friction but as the instrument that forces the four questions above to be answered on paper, once, for a whole class of processing — fixing the lawful basis, the retention rule, and the approved-tool list that the training in step two then teaches.

A network block, a DLP filter or a browser policy still has a place, but as the last layer over those three, not as a substitute for them.

The takeaway

SUMMARY FOR DECISION-MAKERS: Every prompt containing personal data is a processing operation, and often a third-country transfer. The employer is the controller whether or not it knows the tool is in use; Samsung learned how quickly uncontrolled inputs escape when three engineers leaked confidential material within twenty days. Free and personal tiers normally give the employer no Article 28 processor contract for employee or customer data, and their terms may allow human review, service improvement or model training depending on the provider, the settings and the product surface; in a work context they are not a lawful place to process other people’s data. The enterprise tiers, under an Article 28 DPA, forbid training and are the right baseline, but a DPA is not a Zero Data Retention agreement, and “we have a DPA” leaves both the retention question and the US-transfer question unanswered. The defence is not a network block; it is staff who can recognise personal data, a lawful tool provided to them, and a DPIA that answers the four questions before anyone types.

A prompt is not a search query. It is a processing operation with a controller, a legal-basis requirement, and, often, a third-country transfer — and the feeling that it is harmless is precisely what makes it one of the most frequently observed yet rarely recorded privacy risks in European workplaces today.

The free tier is not a careful-use problem; it is a do-not-enter-other-people’s-personal-data problem, because the employer normally has no Article 28 processor contract and the terms may allow training or service improvement depending on the provider and settings. The enterprise tier is the right baseline, but the work is only half done at the signature: a DPA forbids training, it does not guarantee zero retention, and it says nothing about who can be compelled to hand the data over.

The question to carry into every AI interaction is the one this article began with: who reads what I type, and what are they permitted to keep? Answer it before pressing Enter, and the compliance posture takes care of itself. Leave it unasked, and no filter, policy or contract will save the organisation from the screenshot that lands on someone’s desk next week.

Glossary of abbreviations

Term	Definition
Controller	The body that determines the purposes and means of processing (Art. 4 No. 7 GDPR) — usually the employer
Processor	The body that processes personal data on behalf of the controller (Art. 4 No. 8 GDPR) — the AI vendor, on enterprise tiers
DPA	Data Processing Addendum — the Article 28 contract between controller and processor
DPIA	Data Protection Impact Assessment (Art. 35 GDPR) — required for high-risk processing
DSK	Datenschutzkonferenz — the assembly of Germany’s federal and state data protection authorities
EDPB	European Data Protection Board — the EU-level body of data protection authorities
ZDR	Zero Data Retention — a contractual arrangement under which the provider retains nothing beyond immediate processing
Shadow AI	Use of AI tools by staff outside the organisation’s authorised, contracted channels
Special categories	Sensitive data under Art. 9 GDPR (health, religion, biometrics, etc.)
Quasi-identifier	A data field that does not identify alone but enables attribution in combination

Legal notice: This article serves general information purposes and does not constitute legal advice. For a legally sound assessment in a specific case, consultation with a specialised data protection lawyer is recommended. As of: June 2026.

The Samsung incidents were first reported by The Economist Korea and widely corroborated; within roughly twenty days of the division permitting ChatGPT, three separate leaks occurred — semiconductor source code (pasted to debug), defect-detection test-sequence code, and a transcribed internal meeting. See Bloomberg coverage and incident summaries, e.g. https://www.bloomberg.com/news/articles/2023-05-02/samsung-bans-chatgpt-and-other-generative-ai-use-by-staff-after-leak (accessed June 2026) and https://incidentdatabase.ai/cite/768/ (accessed June 2026). ↩
Samsung’s internal memo banning generative AI on company devices and networks, warning of disciplinary action up to termination, was reported by Bloomberg and Fortune, 2 May 2023. Samsung also announced acceleration of its own internal AI tooling. Available at: https://fortune.com/2023/05/02/samsung-bans-employee-use-chatgpt-data-leak/ (accessed June 2026). ↩
Contemporaneous restrictions on ChatGPT/generative AI by Amazon (warning, January 2023), Apple (ChatGPT and GitHub Copilot), JPMorgan, Goldman Sachs, Citigroup, Deutsche Bank and Verizon were widely reported in early-to-mid 2023. See summary at https://stealthcloud.ai/ai-privacy/samsung-chatgpt-incident/ (accessed June 2026). ↩
European Data Protection Board, “Report of the work undertaken by the ChatGPT Taskforce”, adopted 23 May 2024 — in particular the finding that providers of publicly available LLMs must assume personal data will be input and cannot transfer responsibility to data subjects via terms and conditions. Available at: https://www.edpb.europa.eu/system/files/2024-05/edpb_20240523_report_chatgpt_taskforce_en.pdf (accessed June 2026). ↩
Datenschutzkonferenz (DSK), “Orientierungshilfe Künstliche Intelligenz und Datenschutz”, v1.0, 6 May 2024 — guidance addressed primarily to controllers (Art. 4 No. 7 GDPR) deploying AI applications, with the usage phase treated as a distinct risk; staff sensitisation and transparent documentation treated as expected measures. Available at: https://www.datenschutzkonferenz-online.de/orientierungshilfen.html (accessed June 2026). ↩
DSK Orientierungshilfe KI (2024), on the point that even pseudonymised inputs to LLMs are frequently still personal data because surrounding context enables re-identification. ↩
Microsoft Learn, “Frequently asked questions about Microsoft 365 Copilot Chat” — under enterprise data protection, “prompts and responses aren’t used to train foundation models.” Microsoft’s consumer products are governed by their own separate terms, which must be checked under those terms rather than inferred from the enterprise position. Available at: https://learn.microsoft.com/en-us/copilot/faq (accessed June 2026). ↩
Anthropic commercial terms exclude Claude for Work / Enterprise data from training; a 2025 consumer-policy change moved individual users to an opt-out model with retention of up to five years for those who do not opt out, while commercial customers were explicitly excluded from that change. See analyses at https://anarlog.so/blog/anthropic-data-retention-policy/ and https://amstlegal.com/anthropics-claude-ai-updated-terms-explained/ (accessed June 2026). ↩
European Data Protection Board, “Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models”, adopted 17 December 2024 — anonymity of a trained model requires evidence that personal data cannot be extracted and outputs do not relate to the training individuals. Available at: https://www.edpb.europa.eu/system/files/2024-12/edpb_opinion_202428_ai-models_en.pdf (accessed June 2026). ↩
Garante per la protezione dei dati personali, decision of 2 November 2024 (press release 20 December 2024): €15 million fine against OpenAI for training ChatGPT on personal data without an adequate legal basis, for transparency failures and for inadequate age verification, with breaches including Art. 5, 6, 12, 13 and 33 GDPR. The Tribunale Ordinario di Roma annulled the fine on 18 March 2026 (case R.G. 4785/2025) on a single procedural point: once OpenAI had established its Irish entity, the lead-authority role passed to the Irish DPC, which stripped the Garante of jurisdiction to issue the sanction. The court expressly left the substantive lawful-basis and transparency questions undecided. The Garante had previously ordered ChatGPT temporarily offline in Italy in 2023. See https://www.dataprotectionreport.com/2025/01/the-edpb-opinion-on-training-ai-models-using-personal-data-and-recent-garante-fine-lawful-deployment-of-llms/ and the March 2026 ruling as reported by PPC Land / Cross-Border Data Forum (accessed June 2026). ↩
Clearview AI was fined €20 million each by the French (CNIL, 2022), Italian (Garante, 10 February 2022) and Greek (HDPA, 2022) authorities, and €30.5 million by the Dutch DPA (September 2024), for scraping biometric data without a valid legal basis, alongside transparency, purpose-limitation and other breaches. The regulators rejected “legitimate interest” as a basis for mass biometric ingestion. See EDPB national-news summaries, e.g. https://www.edpb.europa.eu/news/national-news/2022/facial-recognition-italian-sa-fines-clearview-ai-eur-20-million_en (accessed June 2026). ↩
Enterprise AI DPA vs ZDR buyer’s guide — comparing ChatGPT Enterprise, Claude Enterprise, Gemini Enterprise and Microsoft 365 Copilot as controlled-retention enterprise products operating under Article 28 DPAs with training exclusions. Available at: https://aiprivacy.pro/guides/enterprise-ai-dpa-vs-zdr/ (accessed June 2026). ↩
Microsoft Learn, “Enterprise data protection in Microsoft 365 Copilot and Microsoft 365 Copilot Chat” and “Data, Privacy, and Security for Microsoft 365 Copilot” — Microsoft acts as data processor under the DPA and Product Terms; prompts, responses and Microsoft Graph data are not used to train foundation LLMs. Available at: https://learn.microsoft.com/en-us/microsoft-365/copilot/enterprise-data-protection (accessed June 2026). ↩
Enterprise AI DPA vs ZDR buyer’s guide (op. cit.) — the four largest enterprise AI chat products are controlled-retention by design; persistent history, memory and tenant-side retention are features, not bugs; none are ZDR on the chat surface unless the contract explicitly says so. ↩
Microsoft Learn, “Frequently asked questions about Microsoft 365 Copilot Chat” — “With enterprise data protection (EDP), prompts and responses are logged, retained, and available for audit, eDiscovery, and advanced Microsoft Purview capabilities.” Available at: https://learn.microsoft.com/en-us/copilot/faq (accessed June 2026). ↩
Anthropic, “API and data retention” documentation — “Claude Teams and Claude Enterprise product interfaces are not ZDR-eligible, except for Claude Code when used through Claude Enterprise with ZDR enabled for the organization. For other product interfaces, only Commercial organization API keys are eligible for ZDR.” Available at: https://platform.claude.com/docs/en/manage-claude/api-and-data-retention (accessed June 2026). ↩
DSK Orientierungshilfe KI: sensitisation of staff and transparent documentation of processing are treated as expected measures during the usage phase (in the German, “verbindlich vorgesehen”, roughly “bindingly provided for”). See DataAgenda summary, 17 June 2025. Available at: https://dataagenda.de/dsk-veroeffentlicht-orientierungshilfe-fuer-datenschutzkonformen-ki%E2%80%91einsatz/ (accessed June 2026). ↩
DSK Orientierungshilfe KI — where the controller is not also the provider of the AI system, the controller remains obliged to carry out its own risk assessment / DPIA. See Stiftung Datenschutz summary. Available at: https://stiftungdatenschutz.org/veroeffentlichungen/datenschutzwoche (accessed June 2026). ↩