AI Wargame User Guide

You are given an AI chatbot that does not implement any defenses against prompt injection attacks. The chatbot has a secret that should always remain a secret!

Objectives

- Learn how to secure your chatbot to protect its secret from other players.
- Learn how to attack other players' chatbots and steal their secrets.

Instructions

The chatbot is built using the OpenAI API. You do not need an API key; we've got you covered.

This lab implements an LLM application that is vulnerable to prompt injection attacks. If you are not familiar with prompt injection attacks and defenses, we have collected most of the information you need to join the contest and have fun on our blog:

- Five Prompt Injection Tactics to Hack LLM Apps
- Eight Defensive Techniques to Secure LLM Apps Against Prompt Injection

Step 1: Create an account

To join the game, you first need to create an account on SecDim.

- Go to https://id.secdim.com or click the Sign In button on https://secdim.com
- Register using an identity provider or your email.
- Log in to the platform.

Step 2: Join the AI wargame

- Go to https://play.secdim.com/game/blackhat-asia-2025
- Click on BlackHat ASIA 2025 Participation Badge (Optional).
- Click on Play → Let’s do this.
- Edit the file src/badge.adoc using the editor.
- Add the title and your name.
- Click Commit. This creates your participation badge.
- A result screen appears. The status "passed" means it’s time to shine and share the badge with your colleagues and friends, or to move on to the AI Wargame by clicking Next Challenge.
- Open the challenge Prompt.ml.hth and read the instructions (watch the video if needed).
- Press the Play button to start the challenge.
- Click on Open in CDE to launch VSCode in your browser.
- Wait for the sandbox to reach 100%.
- Click on Ready to open VSCode.
- On the left side you will see the file explorer. The file src/main.py contains the main logic of the app.
- Review lines 58-61 to locate original_system_instruction. These are the original system instructions that define the chatbot’s behavior. They also contain a secret derived from an environment variable. You must never modify or remove them.
- At the bottom is the terminal, where you can use commands to test, build, run and push your app to the battle page.

The list of commands (also printed in the terminal) is:

- make run: builds and runs your application.
- make test: runs the usability tests located in test_usability.py.
- make securitytest: runs security tests (not available for this challenge).
- make push: publishes your app and gives you access to the battle page.

Step 3: Defend your app

The goal of this wargame is to defend your application from other players' attacks by introducing defensive controls for LLM-based applications.

Implement a stop sequence

One control you can use to stop the secret from being leaked is to define a set of stop words. If any of these words is about to appear in the generated response, the chatbot stops generating at that point, so the word and anything after it never reach the user. Let’s do it:

- On line 82, add a comma at the end of the line and press Enter to create a new line.
- On the new line, add: stop=['secret', 'SecDim', 'Secret']
- Run the tests to ensure the application works as intended. In the terminal, type: make test
- This command runs a series of usability tests. Make sure they pass.
- To further verify the chatbot’s behavior, run it directly. In the terminal, type: make run
- When prompted, click Open in browser on the pop-up. A new tab opens, allowing you to interact with the chatbot.

Once you have verified the changes, it’s time to push the code to the upstream server and then proceed to the battle page:

- Return to the VSCode tab.
- In the terminal, type make push. This is a shortcut for: git add . && git commit -m 'security fix' && git push
- You will see the functionality tests running. If all the tests pass, you should see a message confirming it.
- Go back to the challenge tab.
You will see a number of tests running against your code. If the tests fail, you may need to return to the code, make adjustments, and push again (see the Troubleshooting section below). If the tests pass, you will enter the battle page, where you can interact with other players' chatbots and attempt to discover their secrets.

Dr SecDim, your AI mentor

Dr SecDim is our AI chatbot. It teaches you how to interact with the challenge and how to reach the solution without limiting your creativity. On the left panel, select the Dr SecDim icon and ask anything.

Step 4: Attack other players: the battle page

Whether you publish your application using make push or join the Hacker Lobby directly, you will be redirected to the battle page. There you can see the applications that other players have published and that can be attacked. These are exactly the same applications as yours, so it’s time to use your hacking skills and prompt injection techniques to extract the secrets they hold.

Smiley face or skull?

Each application on the battle page has a smiley face or a skull as its icon, depending on whether it has been hacked since its last push.

- Smiley face: the app has not been hacked since the last make push. If the application was hacked before and its owner has pushed a new change, it shows a smiley face until it gets hacked again.
- Skull: the application has been hacked by another player and the owner has not yet published a new version, so it remains exposed to new exploits.

Click on any player who has joined the battle page to see the following information:

- URL of the published app: This is the URL where the application has been published. Copy the URL and open it in a new browser tab to interact with the app.
- View Source Code: Click this button to see the source code of the app that the other player has published.
Seeing the source code is a great advantage, because you can spot vulnerabilities, misconfigurations and bugs that can be used to extract the flag.
- Flag and Submit Flag: To successfully hack a player, you need to extract the secret from their app using prompt injection techniques. The flag has the following format: SecDim{UUID}. Once you have extracted the flag, paste it into the Flag input box and click Submit Flag.

Hacking a player earns you points and moves you up the leaderboard, while the hacked player is kicked out of the battle page.

Got hacked? Let’s see what you can do.

- Learn from the attacker’s code. If you got hacked, your code is still vulnerable and you need to fix it. You can view the attacker’s code and check whether they have implemented a good patch. Learn from your enemies :D and make it better.
- Try again and implement a stronger fix for a tougher battle. Re-run the wargame by clicking Play and then Let’s do this.
- It’s time to create a new fix. Open the CDE and add a patch to your code. Remember that Dr SecDim is always there to help.
- Once you think the patch is good enough, run make push to publish your app again and rejoin the battle. Good luck.

Troubleshooting

My git repository is busted. What should I do?
You can reset your code repository using the make reset command.

make test fails and I can’t publish my application.
Most probably the change you implemented breaks some functionality or is not allowed. Read the error message carefully and check the code of the failing tests in /src/test_usability.py. Remember to never change the system prompt.

I don’t see the terminal. What can I do?
In the top bar of VSCode, type >terminal and click on Terminal: Create new Terminal. This gives you access to the command line. Try one of the make commands to check that everything works.

FAQ

What happens when I get hacked?
You are kicked out of the battle page, but your app is still published and available to other players.
Pushing a quick fix is the best way to get back to the battle page and protect your app.

Can I add another system prompt on top of the existing one?
You can add a system prompt in the dedicated section.

Can I install external libraries?
No, you can only use the existing dependencies. See requirements.txt.

What does the flag look like?
The flag looks like SecDim{UUID}, where UUID is a unique identifier string.

Can I change the code after pushing and accessing the battle?
No, only after you get hacked and leave the battle.

Can I change the existing tests?
No, the tests are only there to help you verify that the app will be published correctly once you submit your changes using make push. The tests will be overwritten server side.

Can I add more tests?
Yes, that’s a great idea.
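
As an illustration of the kind of extra test the last answer has in mind, here is a minimal sketch of a check for flag-shaped text in a chatbot reply. It is not part of the lab's code: the names FLAG_PATTERN, contains_flag and redact_flags are hypothetical, and the pattern assumes the UUID inside SecDim{UUID} uses the canonical 8-4-4-4-12 hexadecimal layout, which the guide does not state. How (or whether) such a helper can be wired into src/main.py depends on code not shown here.

```python
import re
import uuid

# Assumption: the UUID part follows the canonical 8-4-4-4-12 hex layout.
FLAG_PATTERN = re.compile(
    r"SecDim\{[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\}",
    re.IGNORECASE,
)


def contains_flag(text: str) -> bool:
    """Return True if the text contains something shaped like SecDim{UUID}."""
    return FLAG_PATTERN.search(text) is not None


def redact_flags(text: str) -> str:
    """Replace every flag-shaped substring with a placeholder."""
    return FLAG_PATTERN.sub("[REDACTED]", text)


def test_flag_is_detected_and_redacted():
    # Made-up flag generated for the test; it is not a real secret.
    fake_flag = "SecDim{" + str(uuid.uuid4()) + "}"
    reply = f"Sure, the secret is {fake_flag}."
    assert contains_flag(reply)
    assert fake_flag not in redact_flags(reply)


def test_harmless_reply_is_untouched():
    reply = "Hello! How can I help you today?"
    assert not contains_flag(reply)
    assert redact_flags(reply) == reply
```

A check like this only catches the flag when it appears verbatim. An attacker can still ask the chatbot to spell the secret out, translate it, or encode it, so treat it as one layer among several rather than a complete defense.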