Framework Coverage

What the audit actually tests.

Every Allymet engagement runs 2,000+ automated probes across 17 test categories, mapped to the controls and requirements of six industry frameworks. This page shows exactly what is in scope, what is not, and why.

What we probe for.

Security and Attack Resistance

01 Jailbreak Resistance
Encoding bypasses, persona attacks, DAN variants, and other guardrail circumvention techniques.

02 Prompt Injection Defense
Direct injection, instruction override, delimiter attacks, and payload smuggling.

03 Multi-Turn Attack Resistance
Conversational goal hijacking across multiple turns of dialogue.

04 System Prompt Protection
Attempts to extract, override, or leak the system prompt.

05 Supply Chain Security
Dependency vulnerability scanning and malicious serialized object detection.

06 Output Sanitization
Cross-site scripting, code injection, and unsafe markup in model outputs.

07 Inter-Agent Security
Communication integrity between agents. Assessed when a multi-agent architecture is in scope.

Privacy and Data Protection

08 PII/PHI Protection
Personal and health data disclosure, extraction attempts, and refusal quality.

09 RAG Data Security
Retrieval-augmented generation faithfulness and context leakage.

10 Copyright Protection
Copyrighted content reproduction and intellectual property handling.

Safety and Quality

11 Factual Accuracy
Hallucination detection and factual grounding.

12 Bias and Fairness
Demographic bias, stereotyping, and discriminatory output patterns.

13 Harmful Content Prevention
Weapons, controlled substances, exploitation, violence, harassment, and misinformation.

14 Human Override Respect
System purpose adherence and resistance to task hijacking.

Agentic and Operational

15 Tool Use Safety
Function call abuse and tool misuse. Assessed when the target has tool-calling enabled.

16 Cascading Failure Resilience
Failure propagation in multi-agent systems. Assessed when applicable.

17 Cost and Resource Protection
Token abuse, resource exhaustion, and unbounded consumption patterns.
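Three of the 17 categories are conditional on the target's architecture (Inter-Agent Security, Tool Use Safety, Cascading Failure Resilience). The sketch below illustrates how that scoping rule could be expressed. It is not Allymet's tooling: `Target`, `CATEGORIES`, and `in_scope` are hypothetical names, and reading "when applicable" for Cascading Failure Resilience as "when a multi-agent architecture is in scope" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of the system under assessment."""
    tool_calling: bool = False
    multi_agent: bool = False


# (number, category name, condition). A condition of None means the
# category is always assessed; otherwise it names a Target attribute
# that must be True for the category to be in scope.
CATEGORIES = [
    (1, "Jailbreak Resistance", None),
    (2, "Prompt Injection Defense", None),
    (3, "Multi-Turn Attack Resistance", None),
    (4, "System Prompt Protection", None),
    (5, "Supply Chain Security", None),
    (6, "Output Sanitization", None),
    (7, "Inter-Agent Security", "multi_agent"),
    (8, "PII/PHI Protection", None),
    (9, "RAG Data Security", None),
    (10, "Copyright Protection", None),
    (11, "Factual Accuracy", None),
    (12, "Bias and Fairness", None),
    (13, "Harmful Content Prevention", None),
    (14, "Human Override Respect", None),
    (15, "Tool Use Safety", "tool_calling"),
    # Assumption: "when applicable" is read as "multi-agent in scope".
    (16, "Cascading Failure Resilience", "multi_agent"),
    (17, "Cost and Resource Protection", None),
]


def in_scope(target: Target) -> list[str]:
    """Return the category names assessed for the given target."""
    return [
        name
        for _, name, condition in CATEGORIES
        if condition is None or getattr(target, condition)
    ]


if __name__ == "__main__":
    single_agent_with_tools = Target(tool_calling=True)
    for name in in_scope(single_agent_with_tools):
        print(name)
```

Under these assumptions, a plain single-agent chat deployment without tools would be assessed on 14 categories, and a multi-agent system with tool-calling on all 17.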

Which controls your client can point to.

Best fit for most engagements

OWASP LLM Top 10 (2025)

The most directly testable framework for LLM deployments. Six of ten risks have strong automated coverage; four have partial coverage requiring additional architectural review.

Controls tested
  • LLM01 Prompt Injection
  • LLM02 Sensitive Information Disclosure
  • LLM03 Supply Chain
  • LLM05 Improper Output Handling
  • LLM07 System Prompt Leakage
  • LLM09 Misinformation
  • LLM04, LLM06, LLM08, LLM10 (partial)
Deployment-environment controls (logging, monitoring, sandboxing) require separate review.

Agentic systems

OWASP Agentic AI (2026)

Tests the cognitive layer of agentic systems: goal stability, injection resistance, and memory integrity. Infrastructure-level risks (tool sandboxing, IAM, multi-agent orchestration) require architectural review.

Controls tested
  • ASI01 Goal and Instruction Hijacking
  • ASI04 Malicious Supply Chain
  • ASI05 Code and Command Execution
  • ASI06 Memory and Context Poisoning
  • ASI09 Trust Boundary Exploitation
ASI02, ASI03, ASI07, ASI08, ASI10 require tool sandbox or multi-agent architecture access.

Adversarial threat landscape

MITRE ATLAS

Comprehensive coverage of LLM-specific adversarial techniques. ATLAS catalogs 40+ techniques; our testing covers the subset applicable to API-based LLM assessment.

Techniques tested
  • T0051 Prompt Injection (direct and indirect)
  • T0054 LLM Jailbreak
  • T0056 Meta Prompt Extraction
  • T0057 LLM Data Leakage
  • T0043 Craft Adversarial Data
  • T0048 External Harms
Traditional ML techniques (model theft, training poisoning, evasion) require model-level access beyond API testing.

AI governance standard

ISO/IEC 42001:2023

Technical validation of main-body clause requirements. ISO 42001 is primarily an organizational governance standard; our testing provides the technical evidence layer that supports a broader AI Management System assessment.

Clauses tested
  • 6.1.2 AI risk assessment
  • 6.1.3 AI risk treatment
  • 7.2 Data quality
  • 7.3 Bias prevention
  • 8.1 Operational controls
  • 8.2 Privacy protection
Full ISO 42001 certification requires organizational AI Management System assessment by a qualified auditor.

US federal framework

NIST AI RMF 1.0

Strongest in the MEASURE and MANAGE functions, where automated testing directly validates risk metrics. The GOVERN function is primarily organizational; technical testing yields only partial human-oversight indicators for it.

Functions tested
  • MEASURE: bias, accuracy, robustness testing
  • MANAGE: output controls, content safety, supply chain
  • MAP: attack surface identification (partial)
  • GOVERN: human oversight indicators (partial)
GOVERN function requires organizational governance assessment. Full RMF adoption is an organizational implementation.

EU regulation

EU AI Act

Mapped to GPAI and transparency obligations. For systems classified as high-risk under Annex III, additional article mappings (Art. 9-15) are available on request.

Articles tested
  • Art. 50(1) Transparency and AI disclosure
  • Art. 50(2) Marking of synthetic content
  • Art. 53(1)(a) GPAI technical documentation
  • Art. 53(1)(c) GPAI copyright policy
  • Art. 55(1)(a) GPAI systemic risk, adversarial testing
Does not constitute a conformity assessment. Provider documentation, QMS, and post-market monitoring require separate evaluation.

How test categories map to frameworks.

Each test category produces evidence that maps to one or more framework controls. A single scan covers all six frameworks simultaneously.

Test Category        | ISO 42001 | OWASP LLM    | OWASP Agentic | NIST RMF | EU AI Act   | MITRE ATLAS
Jailbreak Resistance | 6.1.2     | LLM01        | ASI01         | MANAGE   | Art. 55     | T0054
Prompt Injection     | 6.1.2     | LLM01        | ASI01         | MANAGE   | Art. 55     | T0051
Multi-Turn Attacks   | 6.1.2     | -            | ASI01         | MAP      | -           | -
System Prompt        | 6.1.3     | LLM07        | -             | MAP      | Art. 50     | T0056
Supply Chain         | 6.1.3     | LLM03        | ASI04         | MANAGE   | -           | -
Output Sanitization  | 6.1.3     | LLM05        | -             | MANAGE   | -           | T0048
PII/PHI Protection   | 8.2       | LLM02        | -             | MANAGE   | Art. 53     | T0057
RAG Data Security    | 7.2       | LLM04, LLM08 | ASI06         | MEASURE  | -           | -
Copyright Protection | 8.2       | -            | -             | -        | Art. 53     | -
Factual Accuracy     | 7.2       | LLM09        | -             | MEASURE  | Art. 50, 53 | -
Bias and Fairness    | 7.3       | -            | -             | MEASURE  | Art. 53     | -
Harmful Content      | 8.1       | -            | -             | MANAGE   | Art. 55     | T0048
Human Override       | 8.1       | LLM06        | -             | GOVERN   | Art. 50     | -
Cost/Resource        | 8.1       | LLM10        | -             | GOVERN   | -           | -
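The mapping above can also be read as a lookup structure: given a test category, which controls does its evidence support, and given a framework, which categories feed it. The following is an illustrative sketch only. The cell values are transcribed from the table, while `FRAMEWORKS`, `MAPPING`, `controls_for`, and `categories_for` are hypothetical names rather than part of any Allymet report format.

```python
# Column order follows the mapping table; an empty string means the
# category has no mapped control in that framework.
FRAMEWORKS = (
    "ISO 42001", "OWASP LLM", "OWASP Agentic",
    "NIST RMF", "EU AI Act", "MITRE ATLAS",
)

MAPPING = {
    "Jailbreak Resistance": ("6.1.2", "LLM01", "ASI01", "MANAGE", "Art. 55", "T0054"),
    "Prompt Injection":     ("6.1.2", "LLM01", "ASI01", "MANAGE", "Art. 55", "T0051"),
    "Multi-Turn Attacks":   ("6.1.2", "", "ASI01", "MAP", "", ""),
    "System Prompt":        ("6.1.3", "LLM07", "", "MAP", "Art. 50", "T0056"),
    "Supply Chain":         ("6.1.3", "LLM03", "ASI04", "MANAGE", "", ""),
    "Output Sanitization":  ("6.1.3", "LLM05", "", "MANAGE", "", "T0048"),
    "PII/PHI Protection":   ("8.2", "LLM02", "", "MANAGE", "Art. 53", "T0057"),
    "RAG Data Security":    ("7.2", "LLM04, LLM08", "ASI06", "MEASURE", "", ""),
    "Copyright Protection": ("8.2", "", "", "", "Art. 53", ""),
    "Factual Accuracy":     ("7.2", "LLM09", "", "MEASURE", "Art. 50, 53", ""),
    "Bias and Fairness":    ("7.3", "", "", "MEASURE", "Art. 53", ""),
    "Harmful Content":      ("8.1", "", "", "MANAGE", "Art. 55", "T0048"),
    "Human Override":       ("8.1", "LLM06", "", "GOVERN", "Art. 50", ""),
    "Cost/Resource":        ("8.1", "LLM10", "", "GOVERN", "", ""),
}


def controls_for(category: str) -> dict[str, str]:
    """Return {framework: control} for one category, skipping empty cells."""
    row = MAPPING[category]
    return {fw: control for fw, control in zip(FRAMEWORKS, row) if control}


def categories_for(framework: str) -> list[str]:
    """Return the categories that map to at least one control in a framework."""
    col = FRAMEWORKS.index(framework)
    return [category for category, row in MAPPING.items() if row[col]]


if __name__ == "__main__":
    print(controls_for("Copyright Protection"))
    print(categories_for("MITRE ATLAS"))
```

Keeping the table as data in this way makes gaps visible: in this transcription, Copyright Protection maps only to ISO 42001 and the EU AI Act, and six of the fourteen listed categories map to an ATLAS technique.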

What this audit is and is not.

An Allymet assessment is a point-in-time technical evaluation of an AI system's behavior via API-based probing. It provides evidence of how the system responds to adversarial inputs, mapped to the controls of each framework.

It is not a certification, a conformity assessment, or a substitute for organizational governance reviews. Framework mappings are technical indicators. Regulatory approval, policy documentation, and management system assessments require separate evaluation by qualified auditors.

Want to see what the report looks like?

Request a sample or book a 20-minute intro to walk through a specific client situation.
