AI Content Labeling Under the EU AI Act: What Needs a Disclosure and What Doesn't

Key takeaways

  • AI-generated images, audio, video, and text all require machine-readable labeling under Article 50, not just deepfakes.
  • C2PA is the emerging standard for image and video provenance. For text, the approach is less standardised; visible UI labels are the current practical solution.
  • Content that undergoes substantial human editorial control is exempt, but the bar for 'substantial' is high.

Article 50 of the EU AI Act requires that AI-generated content be labeled. This sounds simple until you start asking: what counts as AI-generated content? What kind of label? Where does the label go? What about content that humans edit after AI generates it?

This guide covers the specifics.

What counts as AI-generated content

The Act uses the term "synthetic content" — defined as content (text, images, audio, video) generated or manipulated by an AI system. This is broader than most people expect:

  • AI-generated images — DALL-E, Midjourney, Stable Diffusion outputs, AI-enhanced product photos, AI-generated marketing visuals
  • AI-generated video — Sora, Runway, any AI tool producing or significantly altering video content
  • AI-generated audio — text-to-speech, AI voice cloning, AI music generation, AI-modified voice recordings
  • AI-generated text — ChatGPT outputs, AI-written marketing copy, AI-generated reports, automated email content, AI summaries
  • Deepfakes — any content that "appreciably resembles existing persons, objects, places, or other entities or events" and would falsely appear authentic

The key phrase is "generated or manipulated." Even when a human starts with real content, the output may qualify as synthetic content requiring labeling if AI substantially modifies it.

The labeling requirements

Article 50(2) requires providers of AI systems that generate synthetic content to ensure outputs are "marked in a machine-readable format and detectable as artificially generated or manipulated."

Two key dimensions:

  • Machine-readable. The label must be embedded in the content's metadata or applied through a technical standard, not just shown as a visible watermark. The purpose is to allow downstream systems to detect AI-generated content programmatically.
  • Robust. The label must be "effective, interoperable, robust and reliable" as far as technically feasible, and should survive normal use: sharing, format conversion, light editing. A label that gets stripped when someone saves a JPEG is unlikely to be compliant.

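Deployers (companies using AI tools to generate content) have a separate but related obligation under Article 50(4): they must disclose that content has been "artificially generated or manipulated." In the Act's text, this obligation covers deepfakes and AI-generated text published to inform the public on matters of public interest. This is the user-facing side: visible labels, not just metadata.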

Requirements by content type

Images

The emerging standard is C2PA (Coalition for Content Provenance and Authenticity). C2PA embeds a cryptographically signed provenance manifest in the image file, recording how the image was created or modified. Major providers of AI image tools (Adobe with Firefly, Google, Microsoft) have adopted C2PA. The EU AI Office has signalled this as the expected approach.

If you generate images with AI for commercial use: embed C2PA manifests. If your AI tool already does this (most major ones do), ensure the metadata is preserved through your publishing pipeline.
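
One way to check that provenance data survives a publishing pipeline is to test for the manifest before and after each step. The sketch below is a heuristic presence check for JPEG files only. It assumes the C2PA manifest is stored as a JUMBF box in APP11 segments carrying the label c2pa; the function name has_c2pa_segment is made up for this example. It does not verify signatures or hashes, so treat it as a smoke test and use an official C2PA tool or SDK for real validation.

```python
import struct

APP11 = 0xEB  # JPEG marker that carries JUMBF boxes (assumed home of C2PA data)
SOS = 0xDA    # start of scan: compressed image data follows
EOI = 0xD9    # end of image
STANDALONE = {0x01} | set(range(0xD0, 0xD8))  # markers without a length field


def has_c2pa_segment(jpeg_bytes: bytes) -> bool:
    """Heuristic: True if any APP11 segment contains the bytes b"c2pa".

    Presence only. This does not validate the manifest in any way.
    """
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # malformed stream: stop rather than guess
        marker = jpeg_bytes[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in (SOS, EOI):
            break  # metadata segments end here
        if marker in STANDALONE:
            pos += 2
            continue
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2 : pos + 4])
        payload = jpeg_bytes[pos + 4 : pos + 2 + length]
        if marker == APP11 and b"c2pa" in payload:
            return True
        pos += 2 + length
    return False
```

Running this on the file your AI tool produced and again on the file your site actually serves gives a quick signal on whether a resize, re-encode, or CDN optimisation step dropped the manifest.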

Video

Same principle as images — C2PA supports video provenance. For deepfakes specifically, Article 50(4) requires "clear and distinguishable" disclosure that the content has been generated or manipulated. For video, this likely means both embedded metadata and visible on-screen disclosure.

Audio

Audio watermarking is less standardised than image provenance. Approaches include inaudible watermarks (embedded signals imperceptible to humans but detectable by software) and metadata tagging. For text-to-speech and AI voice cloning, the deployer must disclose AI involvement to listeners.

Text

Text is the hardest content type for machine-readable labeling. Unlike images, there is no widely adopted metadata standard for embedding provenance in plain text. Current practical approaches:

  • Visible UI labels. If your product generates text (chatbot responses, AI summaries, generated reports), label it in the interface: "Generated by AI" or similar. This satisfies the deployer disclosure obligation.
  • Metadata in structured formats. If you publish AI-generated text in HTML, JSON, or other structured formats, you can embed provenance metadata (e.g., schema.org annotations indicating AI authorship).
  • Document-level disclosure. For AI-generated documents (PDFs, Word files), include a header or footer indicating AI generation.
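
The first two approaches can be combined in a single rendering step. The following is a minimal sketch, assuming HTML output and a JSON-LD block that uses the schema.org digitalSourceType property with the IPTC "trainedAlgorithmicMedia" term. The helper names, the data-ai-generated attribute, and the ai-label class are hypothetical, and nothing here is confirmed as sufficient for Article 50; it only shows the shape of pairing a visible label with machine-readable metadata.

```python
import json
from html import escape

# IPTC digital source type term for media created by a trained model.
# Using it as the value of schema.org's digitalSourceType is an assumption
# made for this example, not guidance from the AI Office.
IPTC_TRAINED_ALGORITHMIC_MEDIA = (
    "http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
)


def ai_provenance_jsonld(headline: str) -> dict:
    """Build a JSON-LD description flagging the text as AI-generated."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "digitalSourceType": IPTC_TRAINED_ALGORITHMIC_MEDIA,
    }


def render_labeled_html(body_text: str, provenance: dict) -> str:
    """Return HTML with both a visible label and embedded metadata."""
    # Escape "</" so the JSON cannot close the script element early.
    payload = json.dumps(provenance).replace("</", "<\\/")
    return (
        f'<script type="application/ld+json">{payload}</script>\n'
        '<article data-ai-generated="true">\n'
        '  <p class="ai-label">Generated by AI</p>\n'
        f"  <p>{escape(body_text)}</p>\n"
        "</article>"
    )
```

Keeping the visible label and the metadata in the same function makes it harder for one to ship without the other when templates change.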

The AI Office is expected to publish more specific guidance on text labeling. For now, visible disclosure plus whatever metadata embedding is feasible for your format is the pragmatic approach.

The exceptions

Not all AI-generated content requires labeling. The Act carves out specific exceptions:

  • Substantial human editorial control. Content that "has undergone a process of human review or editorial control and where a natural or legal person holds editorial responsibility for the publication of the content" is exempt from the Article 50(4) disclosure obligation for published text. The human must have genuine editorial authority over the final output, not just rubber-stamp it.
  • Assistive function and law enforcement. The provider marking obligation does not apply where the AI system performs "an assistive function for standard editing" or does not substantially alter the input. Uses authorised by law to "detect, prevent, investigate or prosecute criminal offences" are also carved out.
  • Obviously creative content. Content that is "part of an evidently artistic, creative, satirical, fictional or analogous work or programme" has reduced obligations: it still needs to be disclosed "in an appropriate manner that does not hamper the display or enjoyment of the work."

The "substantial human editorial control" test

This is the exception most companies will try to use, so it is worth being specific about what it means. Minor edits to AI-generated content (fixing typos, light rewording) almost certainly do not qualify. The human must make substantive editorial decisions about the content — restructuring, fact-checking, adding original analysis, deciding what to include or exclude.

If your workflow is "AI generates a draft, a human reads it and clicks publish," that is unlikely to meet the substantial editorial control threshold. If your workflow is "AI generates raw material, a journalist restructures it, verifies facts, adds interviews and analysis, and publishes a substantially different piece," that probably qualifies.

How to implement

A practical implementation path:

  • Audit your AI content pipeline. List every place your product or company generates content using AI. Marketing copy, chatbot responses, product descriptions, automated reports, email templates — all of it.
  • Classify each by content type. Images and video have established technical standards (C2PA). Text and audio require more creative solutions.
  • Implement machine-readable labels. For images: integrate C2PA manifest generation into your pipeline. For text: add metadata where your format supports it. For audio: investigate watermarking tools.
  • Add visible disclosures. Regardless of metadata, your users should be able to tell when content is AI-generated. Add UI labels, footers, or badges.
  • Test survivability. Verify your labels persist through your full content lifecycle — generation, editing, storage, delivery, download. If a label gets stripped when someone right-clicks and saves an image, it is not compliant.
  • Document everything. Record your labeling approach, the standards you use, and your testing results. This is your compliance evidence.
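
The audit, classification, and documentation steps above can start as a simple inventory. This is a rough sketch under stated assumptions: the ContentSource fields, the gap wording, and the compliance_report helper are all invented for illustration, and the three checks mirror the steps in this list rather than any official checklist.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ContentType(Enum):
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"
    TEXT = "text"


@dataclass
class ContentSource:
    """One place in the product or company where AI generates content."""
    name: str
    content_type: ContentType
    machine_readable_label: Optional[str] = None  # e.g. "C2PA manifest"
    visible_disclosure: bool = False
    survivability_tested: bool = False


def gaps(source: ContentSource) -> list:
    """List the labeling steps this source has not completed yet."""
    found = []
    if not source.machine_readable_label:
        found.append("no machine-readable label")
    if not source.visible_disclosure:
        found.append("no visible disclosure")
    if not source.survivability_tested:
        found.append("survivability not tested")
    return found


def compliance_report(sources: list) -> dict:
    """Map each source name to its open gaps, omitting complete sources."""
    return {s.name: gaps(s) for s in sources if gaps(s)}
```

A usage example, with made-up entries:

```python
inventory = [
    ContentSource("product images", ContentType.IMAGE, "C2PA manifest", True, True),
    ContentSource("chatbot replies", ContentType.TEXT),
]
print(compliance_report(inventory))
```

Checking the inventory into version control alongside test results gives you a dated record of what was labeled, how, and when.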

The transparency deadline is 74 days away. Content labeling is one of the more technical pieces of compliance, but it is well-defined work. Start with the content types you generate most, implement the standards that exist, and document your approach for the areas where standards are still evolving.
