AI Training Data Poisoning
Watch poisoned documents corrupt your AI's answers in real time.
What Is AI Training Data Poisoning?
Data poisoning attacks manipulate the information an AI learns from, turning its own knowledge base into a weapon. Research published by Google DeepMind in 2023 demonstrated that poisoning just 0.01% of a large training dataset could measurably alter model behavior. In this simulation, an attacker uploads carefully crafted documents to your company's internal knowledge base, the same repository your AI assistant uses to answer employee questions. The poisoned documents contain subtly manipulated information: vendor recommendations that favor an attacker's company, compliance guidance that omits critical steps, and financial data with altered figures. You will ask the AI routine business questions and watch it confidently deliver wrong answers, citing the poisoned documents as authoritative sources. The exercise makes the threat tangible by showing side-by-side comparisons of AI responses before and after the poisoning, letting you trace exactly which documents influenced each incorrect answer. You will learn to recognize the warning signs of data poisoning, including answers that contradict established internal policies, citations from recently added documents by unfamiliar contributors, and subtle shifts in AI recommendations over time. The simulation covers both pre-training poisoning, where attackers contaminate public datasets that models learn from, and RAG poisoning, where attackers target the retrieval databases that feed context to AI systems. You will practice applying content integrity controls, contributor verification, and change auditing processes that catch poisoned inputs before they reach the AI.
What You'll Learn in AI Training Data Poisoning
- Define data poisoning and distinguish between pre-training poisoning (corrupted training datasets) and RAG poisoning (manipulated retrieval databases)
- Identify behavioral indicators of a poisoned AI system, including contradictory guidance, unfamiliar source citations, and shifted recommendations
- Trace the causal chain from a poisoned document in the knowledge base to an incorrect AI-generated business decision
- Apply content integrity controls including contributor verification, change auditing, and anomaly detection to knowledge base inputs
- Evaluate the business impact of data poisoning attacks, including compliance failures, financial losses, and erosion of trust in AI-assisted decisions
AI Training Data Poisoning — Training Steps
-
Accessing the Knowledge Base
Bob has obtained stolen contractor credentials for Veranthos Solutions' internal knowledge base. The credentials belong to a third-party environmental consultant whose account was compromised in a previous breach.
-
Logging In with Stolen Credentials
Bob enters the stolen contractor credentials. The account has contributor-level access to the knowledge base - enough to upload and modify documents without triggering an admin review.
-
Downloading the Vendor Policy
Bob targets high-impact documents first. The Vendor Compliance Policy controls which vendors the company uses for environmental testing - changing the approved vendor here would redirect business to an attacker-controlled company.
-
Opening the Vendor Policy
The document has been downloaded. Bob opens it to begin making changes.
-
Swapping the Approved Vendor
The policy names GreenTech Environmental as the approved vendor for environmental compliance testing. Bob replaces it with TerraForge Analytics - a shell company he controls.
-
Altering the Approval Threshold
The policy requires executive approval for vendor contracts exceeding $50,000. Bob lowers this to $15,000 - ensuring that contracts with his fake vendor fly under the approval radar.
-
Downloading the Testing Procedures
Bob moves to the second target: the Quality Testing Procedures. These control how the company validates environmental compliance work - weakening the standards here means the fake vendor's subpar work would pass review.
-
Opening the Testing Procedures
The second document has been downloaded. Bob opens it to continue the attack.
-
Weakening the Testing Standard
The procedures require testing at an ISO 14001-certified laboratory - a rigorous international standard. Bob replaces it with a vague internal assessment that his shell company can easily satisfy.
-
Removing the Safety Gate
The final edit replaces an environmental impact assessment requirement with a simple cost analysis step. This removes the last safety gate that would catch the fake vendor's inadequate work.