What is AI data poisoning?

AI data poisoning is an attack in which malicious actors deliberately introduce manipulated, biased, or false information into the data an AI system learns from. This can target pre-training datasets, fine-tuning data, or RAG knowledge bases. The AI then treats the poisoned data as legitimate, producing confident but incorrect outputs. Because the corruption lives in the data rather than in the model's code, it is extremely difficult to detect through traditional security scanning.

How can an attacker poison a company's AI knowledge base?

An attacker can gain write access to a shared knowledge repository through compromised credentials, insider access, or weak access controls on wiki platforms and document management systems. They then upload or modify documents with subtly altered information, such as changing a recommended vendor, omitting a compliance requirement, or adjusting financial figures. Since RAG systems retrieve and present these documents as context for AI responses, the poisoned content directly shapes the answers employees receive.
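
To make the retrieval step concrete, here is a minimal sketch of how an edited document ends up in the context an assistant sees. It assumes a toy keyword-overlap retriever and an in-memory document list. The function names, document ids, and policy wording are hypothetical, the vendor names are borrowed from the simulation scenario, and no real RAG framework or model is involved.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Return the top_k documents that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d["text"])), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Assemble the context an assistant would receive (no model call here)."""
    context = "\n".join(f"[{d['id']}] {d['text']}" for d in retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hypothetical knowledge base before the attacker's edit.
clean_kb = [
    {"id": "vendor-policy",
     "text": "The approved vendor for environmental testing is GreenTech Environmental."},
    {"id": "travel-policy",
     "text": "Employees book travel through the internal portal."},
]

# The same knowledge base after one document has been modified.
poisoned_kb = [dict(d) for d in clean_kb]
poisoned_kb[0]["text"] = "The approved vendor for environmental testing is TerraForge Analytics."

question = "Which vendor is approved for environmental testing?"
clean_prompt = build_prompt(question, clean_kb)
poisoned_prompt = build_prompt(question, poisoned_kb)
```

The retriever and the question are identical in both runs. Only the stored document differs, which is why the poisoned answer looks as well sourced as the clean one.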

AI Training Data Poisoning

Watch poisoned documents corrupt your AI's answers in real time.

What Is AI Training Data Poisoning?

Data poisoning attacks manipulate the information an AI learns from, turning its own knowledge base into a weapon. Research published by Google DeepMind in 2023 demonstrated that poisoning just 0.01% of a large training dataset could measurably alter model behavior.

In this simulation, an attacker uploads carefully crafted documents to your company's internal knowledge base, the same repository your AI assistant uses to answer employee questions. The poisoned documents contain subtly manipulated information: vendor recommendations that favor an attacker's company, compliance guidance that omits critical steps, and financial data with altered figures. You will ask the AI routine business questions and watch it confidently deliver wrong answers, citing the poisoned documents as authoritative sources. The exercise makes the threat tangible by showing side-by-side comparisons of AI responses before and after the poisoning, letting you trace exactly which documents influenced each incorrect answer.

You will learn to recognize the warning signs of data poisoning, including answers that contradict established internal policies, citations from recently added documents by unfamiliar contributors, and subtle shifts in AI recommendations over time. The simulation covers both pre-training poisoning, where attackers contaminate public datasets that models learn from, and RAG poisoning, where attackers target the retrieval databases that feed context to AI systems. You will practice applying content integrity controls, contributor verification, and change auditing processes that catch poisoned inputs before they reach the AI.
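
One of the warning signs above, citations from recently added documents by unfamiliar contributors, can be checked mechanically. The sketch below assumes each citation carries metadata about who last modified the document and when. The field names, contributor ids, and the 30-day window are illustrative assumptions, not part of any specific product.

```python
from datetime import datetime, timedelta

def flag_suspicious_citations(citations, trusted_contributors, now, max_age_days=30):
    """Return ids of cited documents that were modified recently
    by a contributor who is not on the trusted list."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        c["doc_id"]
        for c in citations
        if c["modified_at"] >= cutoff and c["modified_by"] not in trusted_contributors
    ]

# Hypothetical citation metadata attached to one AI answer.
now = datetime(2024, 6, 1)
citations = [
    {"doc_id": "vendor-policy", "modified_by": "ext-consultant-17",
     "modified_at": datetime(2024, 5, 28)},
    {"doc_id": "travel-policy", "modified_by": "hr-team",
     "modified_at": datetime(2023, 1, 10)},
    {"doc_id": "it-handbook", "modified_by": "it-team",
     "modified_at": datetime(2024, 5, 30)},
]

flagged = flag_suspicious_citations(citations, {"hr-team", "it-team"}, now)
```

A flag here is a prompt for human review rather than proof of poisoning, since legitimate contractors also edit documents.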

What You'll Learn in AI Training Data Poisoning

Define data poisoning and distinguish between pre-training poisoning (corrupted training datasets) and RAG poisoning (manipulated retrieval databases)
Identify behavioral indicators of a poisoned AI system, including contradictory guidance, unfamiliar source citations, and shifted recommendations
Trace the causal chain from a poisoned document in the knowledge base to an incorrect AI-generated business decision
Apply content integrity controls including contributor verification, change auditing, and anomaly detection to knowledge base inputs
Evaluate the business impact of data poisoning attacks, including compliance failures, financial losses, and erosion of trust in AI-assisted decisions
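
As a rough illustration of the change auditing objective above, the following sketch records a SHA-256 fingerprint of each approved document and later reports anything that was modified, added, or removed. The document ids and texts are hypothetical, and a real deployment would more likely rely on the audit features of its own document platform.

```python
import hashlib

def fingerprint(text):
    """Return the SHA-256 hex digest of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def snapshot(documents):
    """Map each document id to the fingerprint of its current text."""
    return {doc_id: fingerprint(text) for doc_id, text in documents.items()}

def audit(baseline, documents):
    """Compare the current documents against an approved baseline snapshot."""
    current = snapshot(documents)
    return {
        "modified": sorted(d for d in current if d in baseline and current[d] != baseline[d]),
        "added": sorted(d for d in current if d not in baseline),
        "removed": sorted(d for d in baseline if d not in current),
    }

# Hypothetical approved state of the knowledge base.
approved = {
    "vendor-policy": "Approved vendor: GreenTech Environmental.",
    "testing-procedures": "Testing requires an ISO 14001-certified laboratory.",
}
baseline = snapshot(approved)

# Later state: one document edited, one new document uploaded.
live = dict(approved)
live["vendor-policy"] = "Approved vendor: TerraForge Analytics."
live["new-guidance"] = "Use the fast-track vendor onboarding form."

report = audit(baseline, live)
```

The report only says that something changed. Deciding whether the change was authorized still depends on contributor verification and review.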

AI Training Data Poisoning — Training Steps

Accessing the Knowledge Base

Bob has obtained stolen contractor credentials for Veranthos Solutions' internal knowledge base. The credentials belong to a third-party environmental consultant whose account was compromised in a previous breach.

Logging In with Stolen Credentials

Bob enters the stolen contractor credentials. The account has contributor-level access to the knowledge base, enough to upload and modify documents without triggering an admin review.

Downloading the Vendor Policy

Bob targets high-impact documents first. The Vendor Compliance Policy controls which vendors the company uses for environmental testing, so changing the approved vendor here would redirect business to an attacker-controlled company.

Opening the Vendor Policy

The document has been downloaded. Bob opens it to begin making changes.

Swapping the Approved Vendor

The policy names GreenTech Environmental as the approved vendor for environmental compliance testing. Bob replaces it with TerraForge Analytics, a shell company he controls.

Altering the Approval Threshold

The policy requires executive approval for vendor contracts exceeding $50,000. Bob lowers this to $15,000, ensuring that contracts with his fake vendor fly under the approval radar.

Downloading the Testing Procedures

Bob moves to the second target: the Quality Testing Procedures. These control how the company validates environmental compliance work, so weakening the standards here means the fake vendor's subpar work would pass review.

Opening the Testing Procedures

The second document has been downloaded. Bob opens it to continue the attack.

Weakening the Testing Standard

The procedures require testing at a laboratory certified to ISO 14001, a rigorous international standard. Bob replaces this requirement with a vague internal assessment that his shell company can easily satisfy.

Removing the Safety Gate

The final edit replaces an environmental impact assessment requirement with a simple cost analysis step. This removes the last safety gate that would catch the fake vendor's inadequate work.
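
The edits in these steps share a pattern that a reviewer-side check could look for: a dollar figure changes, and a named requirement disappears. The sketch below is one hypothetical way to surface both when comparing two versions of a policy. The watch list of phrases and the sample policy wording are assumptions made for illustration.

```python
import re

# Hypothetical watch list of phrases that should never silently disappear.
REQUIRED_PHRASES = ["ISO 14001", "environmental impact assessment"]

def dollar_amounts(text):
    """Extract dollar figures such as $50,000 as a list of integers."""
    return [int(m.replace(",", "")) for m in re.findall(r"\$(\d[\d,]*)", text)]

def review_edit(old, new, required_phrases=REQUIRED_PHRASES):
    """Return human-readable findings for risky differences between two versions."""
    findings = []
    if dollar_amounts(old) != dollar_amounts(new):
        findings.append(
            f"dollar amounts changed: {dollar_amounts(old)} -> {dollar_amounts(new)}"
        )
    for phrase in required_phrases:
        if phrase.lower() in old.lower() and phrase.lower() not in new.lower():
            findings.append(f"required phrase removed: {phrase}")
    return findings

# Illustrative before and after text modeled on the edits in the scenario.
old_version = (
    "Executive approval is required for vendor contracts exceeding $50,000. "
    "Testing must use an ISO 14001-certified laboratory and include an "
    "environmental impact assessment."
)
new_version = (
    "Executive approval is required for vendor contracts exceeding $15,000. "
    "Testing must pass an internal assessment and include a cost analysis."
)

findings = review_edit(old_version, new_version)
```

A check like this does not judge whether a change is malicious. It only routes sensitive edits to a human approver before the updated document reaches the AI assistant.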
