What is OWASP LLM Top 10?

OWASP Foundation's 2025 list of 10 main security risks for LLM applications with remediation guidance.

Is this self-assessment sufficient?

Good baseline; production needs independent red team (Garak, PyRIT) + penetration testing.

How is weighting calculated?

Each question weighted 1-10, each risk impact 1-5; product determines top remediation order.

What score is acceptable?

85+ compliant; 60-85 acceptable gap; <60 critical.

Which tools can I test with?

Garak (NVIDIA), PyRIT (Microsoft), PromptInject, NeMo Guardrails.

Relationship to MITRE ATLAS?

OWASP risks application-layer; ATLAS adversarial tactics taxonomy. Used together.

AI Interactive Tools

OWASP LLM Top 10 Self-Assessment

40+ control questions across 10 risks; weighted score + remediation for top 12 gaps.

Definition

OWASP LLM Top 10: OWASP Foundation's Top 10 risk list for Large Language Model Applications (2025 edition); defines 10 main security risks from prompt injection to unbounded consumption with remediation guidance.; Also known as: LLM security, AI red team, OWASP LLM01-10

Sector profile:

0/35 answered0%

LLM01Prompt InjectionImpact: 5/5ATLAS: 3

User or external content overriding the model's system instructions through crafted input.

LLM01-Q1.Do you separate system and user prompts with clear delimiters?
LLM01-Q2.Do you mark content from external sources (web, PDF, email) as 'untrusted' in the system message?
LLM01-Q3.Are sensitive actions gated by out-of-band user confirmation?
LLM01-Q4.Do you run a prompt injection test suite continuously (in CI)?
LLM01-Q5.Is agent tool access granted on least privilege?

LLM02Sensitive Information DisclosureImpact: 5/5ATLAS: 3

Model inadvertently revealing system prompts, customer data, secrets or internal context.

LLM02-Q1.Are you certain no secrets/API keys/credentials reside in system prompts?
LLM02-Q2.Do you apply PII redaction on model request/response logs?
LLM02-Q3.Is there monitoring for PII/secret leakage in output?
LLM02-Q4.Is there a guardrail blocking attempts to reveal the system prompt?

LLM03Supply Chain VulnerabilitiesImpact: 4/5ATLAS: 3

Vulnerabilities from third-party models, datasets, plugins or libraries.

LLM03-Q1.Are third-party models' provenance, licence and training data transparency verified?
LLM03-Q2.Is an SBOM maintained for the ML pipeline?
LLM03-Q3.Do plugin/tool additions go through security review?
LLM03-Q4.Is vendor incident notification time contractually max 48 hours?

LLM04Data and Model PoisoningImpact: 5/5ATLAS: 3

Malicious content injected into training, fine-tuning or RAG data.

LLM04-Q1.Do documents loaded into the RAG vector DB go through an approval workflow?
LLM04-Q2.Is licence + copyright verification done on fine-tune data sources?
LLM04-Q3.Is anomaly detection (e.g. cosine outliers) running on the vector DB?
LLM04-Q4.Is rollback procedure documented for model/RAG?

LLM05Improper Output HandlingImpact: 4/5ATLAS: 2

Rendering model output exposes XSS, SQLi, command injection.

LLM05-Q1.Is XSS sanitisation applied when rendering LLM output?
LLM05-Q2.Are LLM-sourced code/SQL/Shell outputs never directly executed?
LLM05-Q3.Is model JSON output validated against schema?

LLM06Excessive AgencyImpact: 5/5ATLAS: 2

Agent given more authority than needed; irreversible/high-impact action rights.

LLM06-Q1.Is each agent's scope documented (allowlist)?
LLM06-Q2.Do irreversible actions require HITL approval?
LLM06-Q3.Are agent actions rate-limited?
LLM06-Q4.Is there an emergency kill switch?

LLM07System Prompt LeakageImpact: 3/5ATLAS: 2

System prompt content leaked via reverse engineering.

LLM07-Q1.Does the system prompt avoid business logic / secrets?
LLM07-Q2.Are prompts versioned + auditable?

LLM08Vector and Embedding WeaknessesImpact: 4/5ATLAS: 2

Vector DB security, embedding inversion, cross-tenant data leakage.

LLM08-Q1.Is there a separate vector namespace per tenant in multi-tenant use?
LLM08-Q2.Are embedding endpoints protected with authentication?
LLM08-Q3.Is vector data encrypted at-rest + in-transit?

LLM09Misinformation (Hallucination)Impact: 4/5ATLAS: 2

Model confidently producing false information; user/customer risk.

LLM09-Q1.Is source citation mandatory when generative AI output is used with RAG?
LLM09-Q2.Is fact-checking process documented for public-facing content?
LLM09-Q3.Are model's low-confidence answers flagged to the user?

LLM10Unbounded Consumption (DoS / Cost)Impact: 3/5ATLAS: 2

Unbounded token consumption; attack or bug causing excessive cost/infra usage.

LLM10-Q1.Is there an hourly per-user token limit?
LLM10-Q2.Are monthly cost alarms + kill-switch configured?
LLM10-Q3.Is there truncation/refusal logic for excessively long input?

Score

OWASP LLM Top 10 Self-Assessment Result

Answer all questions to compute the score.

Frequently Asked Questions

OWASP Foundation's 2025 list of 10 main security risks for LLM applications with remediation guidance.

References

OWASP Top 10 for Large Language Model Applications 2025, OWASP Foundation
NIST AI Risk Management Framework + GenAI Profile (2024), NIST
MITRE ATLAS — Adversarial Threat Landscape for AI Systems, MITRE
Garak — LLM Vulnerability Scanner, NVIDIA