AI security, explained.
AI security is the practice of protecting AI systems from attack and misuse: the data they learn from, the models themselves, and the applications built on top of them. It is a broad field, and most of the confusion around it comes from lumping three very different layers into one term.
Three layers, three failure modes
Data and model
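The model and the data it learned from. Some attacks happen before the model serves a single request, such as poisoning the training data or compromising the supply chain. Others target the deployed model, such as stealing or extracting it, or crafting adversarial inputs that fool it. Poisoning and supply chain risks map to OWASP LLM03 and LLM04.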
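One common defense at this layer is to pin a known-good digest for each model artifact and refuse to load a file that does not match. The following is a minimal sketch, assuming the artifact is a single file and the expected SHA-256 digest comes from a source you already trust. The names `sha256_of` and `verify_artifact` are hypothetical, not part of any particular framework.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's digest matches the pinned value."""
    actual = sha256_of(path)
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A loader would call `verify_artifact` first and stop if it returns False. A digest check only detects changes made after the digest was recorded; it says nothing about whether the original artifact was trustworthy.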
LLM and application
The application built on a model. Prompt injection (LLM01), sensitive information disclosure (LLM02), improper output handling (LLM05), system prompt leakage (LLM07), and confident misinformation (LLM09) live here. Covered by the OWASP Top 10 for LLM Applications.
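Output handling is the easiest of these to illustrate. A minimal sketch, assuming the model's reply is about to be placed into an HTML page, and using a hypothetical helper named `render_reply`: the reply is escaped like any other untrusted input before it reaches the browser.

```python
import html


def render_reply(model_output: str) -> str:
    """Escape model output before embedding it in an HTML page."""
    # The model may echo attacker-controlled text, so treat it as untrusted.
    return f"<p>{html.escape(model_output)}</p>"


print(render_reply("<script>alert(1)</script>"))
# prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Escaping is specific to the destination. Output that flows into SQL, a shell, or a file path needs the control that fits that sink, such as parameterized queries, rather than HTML escaping.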
Agentic
When the AI stops answering and starts acting autonomously through tools. The failure mode changes from a wrong answer to a wrong action. See the agentic security page.
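One way to limit wrong actions is to put a policy gate between the model and its tools. The sketch below is illustrative only: `ToolGate`, its fields, and the example tool names are hypothetical, and it assumes the application, not the model, decides which tools exist and which ones need human sign-off.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolGate:
    """Dispatches tool calls requested by a model, subject to policy."""

    tools: dict[str, Callable[..., Any]]
    needs_approval: set[str] = field(default_factory=set)

    def call(self, name: str, args: dict[str, Any], approved: bool = False) -> Any:
        # Anything not explicitly registered is refused.
        if name not in self.tools:
            raise PermissionError(f"tool not allowlisted: {name}")
        # High-impact tools run only after a human has approved the call.
        if name in self.needs_approval and not approved:
            raise PermissionError(f"tool requires human approval: {name}")
        return self.tools[name](**args)


gate = ToolGate(
    tools={
        "read_file": lambda path: f"contents of {path}",
        "delete_file": lambda path: f"deleted {path}",
    },
    needs_approval={"delete_file"},
)
print(gate.call("read_file", {"path": "notes.txt"}))
# prints: contents of notes.txt
```

Calling `gate.call("delete_file", {"path": "notes.txt"})` without `approved=True` raises `PermissionError`, as does any tool name that was never registered.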
The OWASP Top 10 for LLM Applications (2025)
- LLM01: Prompt Injection
- LLM02: Sensitive Information Disclosure
- LLM03: Supply Chain
- LLM04: Data and Model Poisoning
- LLM05: Improper Output Handling
- LLM06: Excessive Agency
- LLM07: System Prompt Leakage
- LLM08: Vector and Embedding Weaknesses
- LLM09: Misinformation
- LLM10: Unbounded Consumption
How to think about AI security
- Treat model inputs and outputs as untrusted.
- Protect the training data and the model artifacts.
- Apply least privilege to anything the AI can reach.
- Monitor, log, and red-team continuously.
- Secure the action layer separately once the AI can take actions.
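The monitoring and logging principle above can be sketched as one structured audit line per model interaction. This is a minimal example under stated assumptions: the logger name `ai_audit`, the function `audit_record`, and the field names are all hypothetical, and the record stores prompt and response sizes rather than their text so that the log does not itself become a store of sensitive data.

```python
import json
import logging
import time

audit_log = logging.getLogger("ai_audit")


def audit_record(user_id: str, prompt: str, response: str, tool_calls: list[str]) -> str:
    """Build and emit one JSON audit line for a model interaction."""
    record = {
        "ts": time.time(),
        "user_id": user_id,
        "prompt_chars": len(prompt),
        "response_chars": len(response),
        "tool_calls": tool_calls,
    }
    line = json.dumps(record, sort_keys=True)
    audit_log.info(line)
    return line
```

Whether to log full prompts and responses is a design choice that depends on retention and privacy requirements; the tool call names are kept because they are what an investigation of a wrong action would need first.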
Related: agentic security and MCP security.