AI Denial-of-Service Attack

Launch a denial-of-wallet attack against an unprotected AI API.

What Is an AI Denial-of-Service Attack?

AI services consume compute at a rate that changes the economics of denial-of-service. A single complex prompt to a large language model can cost 100 to 1,000 times more to process than a standard web request, which makes AI APIs uniquely vulnerable to resource exhaustion attacks. In 2024, multiple organizations reported 'denial-of-wallet' incidents in which attackers exploited AI endpoints to generate five- and six-figure cloud bills within hours.

In this simulation, you discover an AI-powered API endpoint exposed by your organization. You craft a series of prompts designed to maximize resource consumption: extremely long inputs that push context window limits, recursive generation requests that produce massive outputs, and concurrent requests that overwhelm the inference infrastructure. You watch in real time as the cloud cost dashboard climbs from dollars to thousands, API response times degrade from milliseconds to minutes, and legitimate users lose access to the AI service entirely.

The exercise covers both external attacks, where an unauthorized party discovers and abuses the endpoint, and internal abuse, where an authenticated user accidentally or deliberately triggers excessive consumption. You will learn to implement multi-layered defenses: input length validation, output token limits, per-user and per-session rate limiting, spending caps and alerts, request queuing with priority tiers, and monitoring dashboards that detect consumption anomalies before costs spiral. The simulation makes the financial impact tangible by showing how each defensive control reduces the blast radius of an unbounded consumption attack.
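Several of the defenses listed above can be combined into a single admission check that runs before a prompt reaches the model. The following is a minimal sketch, not a production design: the class name `RequestGuard`, every limit value, and the in-memory bookkeeping are assumptions made for illustration. It covers input length validation, an output token cap, a per-user sliding-window rate limit, and a per-user spending cap.

```python
import time
from collections import defaultdict, deque

# All limits below are hypothetical values chosen for illustration.
MAX_INPUT_CHARS = 8_000      # input length validation
MAX_OUTPUT_TOKENS = 1_024    # output token limit
REQUESTS_PER_MINUTE = 20     # per-user rate limit
DAILY_BUDGET_USD = 50.0      # per-user spending cap


class RequestGuard:
    """Admission checks to run before a prompt is sent to the model."""

    def __init__(self):
        self._recent = defaultdict(deque)  # user -> timestamps of recent requests
        self._spent = defaultdict(float)   # user -> dollars spent so far

    def check(self, user, prompt, requested_tokens, now=None):
        """Return (allowed, reason, granted_output_tokens)."""
        now = time.monotonic() if now is None else now

        if len(prompt) > MAX_INPUT_CHARS:
            return False, "input too long", 0
        if self._spent[user] >= DAILY_BUDGET_USD:
            return False, "spending cap reached", 0

        # Sliding window: forget requests older than 60 seconds.
        window = self._recent[user]
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) >= REQUESTS_PER_MINUTE:
            return False, "rate limit exceeded", 0

        window.append(now)
        # Never grant more output tokens than the cap, whatever was requested.
        return True, "ok", min(requested_tokens, MAX_OUTPUT_TOKENS)

    def record_cost(self, user, usd):
        """Call after each completed request with its billed cost."""
        self._spent[user] += usd
```

A caller would invoke `check()` before every model request and `record_cost()` after it. Clamping the output token count, rather than rejecting the request, keeps legitimate users working while bounding the cost of any single call.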

What You'll Learn in AI Denial-of-Service Attack

AI Denial-of-Service Attack — Training Steps

  1. Setting Up the Scan

    Bob opens his credential scanning dashboard – a tool that monitors public code repositories for exposed API keys, tokens, and cloud secrets. He is about to target CypherPeak Technologies' public GitHub organization.

  2. Running the Scan

    Bob enters CypherPeak's GitHub organization URL into the scanner and starts a credential sweep across all their public repositories.

  3. A Critical Finding

    The scanner analyzed 847 repositories and 12,403 recent commits. Among six total secrets found, one stands out: a production OpenAI API key exposed in a configuration file committed just minutes ago to CypherPeak's AI gateway project.
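Defenders can run the same kind of sweep against their own repositories before an attacker does. The sketch below is a simplified illustration of pattern-based secret detection, not the scanner used in the simulation: the function name `scan_text` and both patterns are assumptions, and the key pattern is a deliberately loose approximation. Real scanners combine many more rules with entropy checks and verification.

```python
import re

# Simplified, illustrative patterns (assumptions, not a complete rule set).
SECRET_PATTERNS = {
    "openai_style_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def scan_text(text, filename="<memory>"):
    """Return one finding per suspected secret, with only a short preview."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append({
                    "file": filename,
                    "line": lineno,
                    "rule": rule,
                    # Never log the full secret; keep a short prefix only.
                    "preview": match.group(0)[:6] + "...",
                })
    return findings
```

Running a check like this in a pre-commit hook or CI job catches a hardcoded key before it reaches a public repository, which is far cheaper than rotating it afterwards.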

  4. Examining the Commit

    Bob clicks through to the source commit to examine the exposed credential in its original context. The GitHub commit diff shows the full configuration file with the API key in plain text.

  5. The Exposed API Key

    The commit diff reveals a production API key hardcoded directly in a Python configuration file. This key provides full access to CypherPeak's AI platform API with no rate limiting or budget restrictions attached.
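The usual alternative to a hardcoded key is to read it from the environment (or a secrets manager) at startup, so the value never appears in version control. A minimal sketch follows; the variable name `AI_GATEWAY_API_KEY` and the helper `load_api_key` are hypothetical and not taken from CypherPeak's project.

```python
import os


def load_api_key(env_var="AI_GATEWAY_API_KEY"):
    """Read the API key from the environment; fail loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Refuse to start rather than fall back to a key baked into the code.
        raise RuntimeError(f"{env_var} is not set; refusing to start without a key")
    return key
```

Failing at startup when the variable is missing removes the temptation to commit a "temporary" default value to the configuration file.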

  6. Preparing the Attack

    Bob opens a terminal to test whether the stolen API key is still active. If the key works and has no rate limiting, he can launch a denial-of-wallet attack to drain CypherPeak's entire AI budget.

  7. Testing the Stolen Key

    Bob sends a simple API request using the stolen key to verify it works. A successful response with no rate limit headers will confirm the key is exploitable.

  8. The Key Works

    The API responds successfully. The response confirms the key is valid – and critically, the rate_limit and budget_cap fields are both null. There are no protections on this key whatsoever.
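The same signal Bob reads off the response can be turned into a defensive audit that flags any key lacking limits. This sketch assumes the response body is a flat JSON object containing the `rate_limit` and `budget_cap` fields named in this step; the actual response shape is not shown in the exercise, and the function name `missing_protections` is hypothetical.

```python
import json

# Field names as described in this step; the flat JSON layout is an assumption.
PROTECTION_FIELDS = ("rate_limit", "budget_cap")


def missing_protections(response_body):
    """Return the protection fields that are null or absent in the response."""
    data = json.loads(response_body)
    return [field for field in PROTECTION_FIELDS if data.get(field) is None]
```

An empty list means both controls are set. Anything else is a key that should be restricted or rotated before it is used in production.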

  9. Launching the Attack

    The key works and has no protections. Bob launches an automated attack script that sends hundreds of carefully crafted recursive expansion prompts – each designed to consume the maximum 32,768 tokens per request – across 50 concurrent threads.

  10. Attack in Progress

    The attack script initializes 50 concurrent worker threads, each sending recursive expansion prompts at maximum token output. Within seconds, the cost rate hits $12.40 per minute – over $700 per hour.
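The cost figures in this step follow from simple arithmetic, and the same arithmetic shows how quickly a budget disappears. In the sketch below, the $12.40 per minute rate comes from the step above, while the $10,000 monthly budget and both function names are hypothetical values chosen for illustration.

```python
def hourly_cost(cost_per_minute):
    """Convert a per-minute burn rate to dollars per hour."""
    return round(cost_per_minute * 60, 2)


def minutes_until_exhausted(budget_usd, cost_per_minute):
    """How long a budget lasts at a constant burn rate."""
    if cost_per_minute <= 0:
        raise ValueError("cost_per_minute must be positive")
    return budget_usd / cost_per_minute


rate = 12.40        # dollars per minute, from the simulation
budget = 10_000.0   # hypothetical monthly AI budget

print(f"Hourly cost: ${hourly_cost(rate):,.2f}")
print(f"Budget lasts: {minutes_until_exhausted(budget, rate) / 60:.1f} hours")
```

At this rate the hourly cost is $744, consistent with "over $700 per hour", and the assumed $10,000 budget would be gone in roughly 13.4 hours. A spending alert that only fires daily would therefore arrive too late.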