Public methodology · Evidence-based, not absolute · Vendor-neutral

What we check.
And what we won't claim.

Vettd reviews five kinds of agentic AI assets (skills, prompts, MCP servers, agents, and agentic apps) for the signals reviewers and regulators actually rely on. This page is the public version of our methodology. No black-box scores, no “trust us” stamps. Just the evidence we surface, the frameworks we map to, and the limits we accept.

5 asset categories reviewed against distinct evidence rubrics
/ Skills · Prompts · MCP · Agents · Apps
6 reference frameworks the directory cross-walks against
/ NIST · ISO 42001 · EU AI Act · OWASP · CMMC · CISA
100% of verdicts published with sources, timestamps, and reviewer trail
/ signed evidence bundle, every record
0 internal thresholds or scoring formulas published
/ by design, see "what we won't claim"
The promise

Three rules we hold ourselves to.

A trust layer is only useful if its limits are public. Before any of our checks, these are the rules our own work has to clear. Every verdict in the directory is built on top of them.

RULE 01 · HIGH LEVEL BY DESIGN

We describe categories, not internal thresholds.

Reviewers can see what kinds of evidence we look at, but not the exact rules, weights, or cut-offs. Publishing those would teach submitters how to game them, and that erodes trust faster than opacity does.

"references env-var dependencies"
"score = 0.83 if > 4 envs"
RULE 02 · EVIDENCE, NOT ABSOLUTES

We report what we observed, at the time we observed it.

A pass means the evidence we could see at scan time cleared our rubric. It is never a claim that a system is universally safe in every environment. Missing evidence is recorded as missing — never extrapolated into a verdict either way.

"no embedded secrets in submitted package"
"this skill is safe to install"
RULE 03 · VENDOR-NEUTRAL

We don't sell what we review.

Agentic Highway does not ship the agents in the directory and does not take placement fees. The same rubric runs against our partners, our customers, and assets we host ourselves. The reviewer trail is public on every verdict.

"reviewed by — V. Tanaka, AH staff"
"reviewed by — submitter (paid tier)"
What we check

Five asset types. Five rubrics.

We don't run the same checklist against a 200-line skill and a production multi-agent app, because their failure modes look nothing alike. Each category has its own rubric, listed in plain language. Click through to read what each one looks at and where its limits sit.

Skills — package hygiene & safety signals

Asset · 01

For public skills in the directory, Vettd focuses on package hygiene, documentation quality, and obvious safety signals in the submitted files. Skills are small, so we read them closely.

What we look at

  • Required structure — SKILL.md is present, with the supporting directories (scripts, references, assets, evals) you'd expect for a maintainable package.
  • Description quality — clear enough to explain what the skill does and when it should be invoked.
  • Workflow guidance — the body contains concrete examples, validation steps, or checklists, not just abstract description.
  • Obvious red flags — embedded secrets, unsafe shell patterns, destructive commands, or environment files that should never have shipped.
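As a rough illustration of the four kinds of evidence listed above, here is a minimal sketch of a package scan. Everything in it is an assumption made for the example: the `scan_skill` function, the report keys, and especially the regular expressions, which are deliberately simple stand-ins and not Vettd's actual rules or thresholds.

```python
import re
from pathlib import Path

# Illustrative patterns only; real checks would be broader and more careful.
RED_FLAG_PATTERNS = {
    "possible embedded secret": re.compile(
        r"(?i)\b(api[_-]?key|secret|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
    "pipe-to-shell install": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(ba)?sh\b"),
    "recursive delete on an absolute path": re.compile(r"\brm\s+-rf\s+/"),
}
EXPECTED_DIRS = ("scripts", "references", "assets", "evals")


def scan_skill(package_dir):
    """Collect structural and red-flag observations for one skill package."""
    root = Path(package_dir)
    report = {
        "has_skill_md": (root / "SKILL.md").is_file(),
        "missing_dirs": [d for d in EXPECTED_DIRS if not (root / d).is_dir()],
        "env_files": [],
        "red_flags": [],
    }
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # Environment files are reported regardless of their contents.
        if path.name == ".env" or path.name.startswith(".env."):
            report["env_files"].append(rel)
        text = path.read_text(encoding="utf-8", errors="ignore")
        for label, pattern in RED_FLAG_PATTERNS.items():
            if pattern.search(text):
                report["red_flags"].append((rel, label))
    return report
```

The function returns observations rather than a score, which matches the page's framing: a reviewer reads what was found and where, and no composite number is produced.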

Why it matters

  • Helps users judge whether a skill looks maintainable, testable, and worth installing before running it.
  • Separates well-documented submissions from packages that need more review or cleanup before production use.
  • A documented skill can still be harmful if it ships credentials — security signals matter as much as documentation polish.
Important · the floor
A skill verdict is not a guarantee that the skill is safe in every environment. It is a summary of the evidence Vettd can see in the submitted package, on the date it was submitted.
Sample report

What a Vettd verdict actually looks like.

Every published verdict carries the evidence underneath it. The asset, the scan date, the rubric outcomes, the framework cross-walk, and the reviewer who signed off. No mystery score, no opaque numeric grade.

  • 01 · Verdict + provenance. Pass · Warn · Fail, with the reviewer initials and ISO scan date inline.
  • 02 · Rubric outcomes. Each plain-language check shown with its observed value, never a black-box composite.
  • 03 · Framework cross-walk. Which standards were used as reference, and which were intentionally not claimed.
  • 04 · Signed bundle hash. The full evidence package is hashed and signed; you can verify it offline.
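To make the offline hash check concrete, here is a minimal sketch of recomputing a bundle digest and comparing it with a published value. The bundle layout, the choice of SHA-256 over canonical JSON, and the function names `bundle_digest` and `matches_published_digest` are all assumptions for illustration; the page does not specify Vettd's actual bundle format or signing scheme, and signature verification itself is left out because it depends on that scheme.

```python
import hashlib
import hmac
import json


def bundle_digest(bundle):
    """SHA-256 hex digest over a canonical JSON encoding of the bundle."""
    # Sorted keys and fixed separators make the encoding independent of key order.
    canonical = json.dumps(
        bundle, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_published_digest(bundle, published_digest):
    """True if the locally recomputed digest equals the published one."""
    return hmac.compare_digest(bundle_digest(bundle), published_digest)


# Hypothetical bundle mirroring the four parts of a verdict listed above.
example_bundle = {
    "asset": "example-skill",
    "verdict": "Warn",
    "reviewer": "V.T.",
    "scan_date": "2025-01-15",
    "rubric_outcomes": [
        {"check": "no embedded secrets in submitted package", "observed": "none found"}
    ],
    "framework_crosswalk": {"referenced": ["OWASP"], "not_claimed": ["CMMC"]},
}
```

A reviewer who downloads the bundle could call `matches_published_digest(example_bundle, digest_from_directory)` and treat any mismatch as a sign that the evidence differs from what was reviewed.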
What we won't claim

Where the line actually sits.

Compliance theatre is what happens when vendors claim certainty they can't deliver. We'd rather lose a deal than ship a stamp we can't defend. These are the claims you will never read on a Vettd verdict.

Reference frameworks

How to read framework labels in the directory.

The directory surfaces framework labels as reference context — they tell you which standards or policy lenses a submitter or reviewer was working from. They are not automated certifications. The matrix below states what each label means inside Vettd, and what it deliberately does not.

OWASP (Application security)
  • What we surface: A broad application security reference for common software and web risk patterns: injection, broken access control, unsafe defaults, weak validation.
  • What it doesn't claim: Not an automated certification claim. Reference context for reviewers and submitters.
  • Signal strength: Surfaced on findings · Cross-linked to evidence

NIST 800-53 (Control catalog)
  • What we surface: A control catalog covering access control, audit logging, configuration management, incident response, system integrity. Frequent in regulated environments.
  • What it doesn't claim: Should be read as alignment context or reviewer intent, not proof Vettd has mapped every control.
  • Signal strength: Surfaced on findings · Family-level only

CMMC (Maturity model · DIB)
  • What we surface: A maturity model used in the defense industrial base for cybersecurity practices and process rigor. Relevant when software touches controlled defense data.
  • What it doesn't claim: Vettd does not currently perform a formal CMMC assessment on assets in the public directory.
  • Signal strength: Reference label only · No assessment

ISO 42001 (AI management system)
  • What we surface: An AI management system standard focused on governance, accountability, risk, and organizational controls. More about process than code.
  • What it doesn't claim: A directory tag here is reference framing for AI governance conversations, not a completed ISO 42001 audit.
  • Signal strength: Reference label only · No certification

EU AI Act (Regulation · risk-tiered)
  • What we surface: A regulatory framework classifying AI use cases by risk and imposing obligations based on deployment context.
  • What it doesn't claim: A public label is not a legal determination that an asset satisfies the Act in any deployment.
  • Signal strength: Use-case signals surfaced · Not a legal opinion

CISA (Operational guidance)
  • What we surface: A practical security reference associated with US cyber defense guidance, advisories, and operational best practices.
  • What it doesn't claim: Descriptive context only, not evidence of a formal CISA review of the asset.
  • Signal strength: Hygiene signals surfaced · No formal review
FAQ

The questions that come up first.

Most of the pushback we get on the methodology comes from one of these places. Quick, direct answers — and a link to the longer version where the longer version exists.

Why don't you publish your thresholds or scoring formulas?
Publishing thresholds turns a methodology into a checklist for evasion. Submitters optimize for the formula, not for the underlying risk. We publish the categories of evidence we look at. That's enough for any reviewer to evaluate whether our verdicts are reasonable, without giving bad actors a target to shape against.

Is a Vettd verdict a certification or an audit?
No. A Vettd verdict is a structured review against a public rubric, on a date, against a submitted package. It is not a substitute for a penetration test, formal audit, or compliance certification. It is meant to be the first artifact a reviewer reads before deciding whether to do any of those.

Can we pay for a better verdict?
No. Verdicts are never for sale. Enterprise customers can pay for a private review queue and dedicated reviewer time, but the same rubric runs and the same outcomes are possible. We will never quietly upgrade a Warn or Fail because of who's paying.

What happens when an asset changes after review?
Each verdict is pinned to a scan date and a content hash. When the upstream package changes meaningfully, the prior verdict is marked stale and a re-review is queued. The history is preserved, so you can always see how an asset's posture changed across versions.

How is this different from a CVE database?
A CVE database is reactive: it logs known vulnerabilities after disclosure. Vettd is structural. We describe what an asset is, what it can reach, and what frame it sits in, so reviewers can reason about it before there's a CVE to log. The two are complementary, and Vettd findings often turn into CVEs.

What if we disagree with a verdict?
File a finding on the asset's directory page. A second reviewer is assigned, the original evidence is re-checked, and the verdict either holds or is updated with a public delta. All of this is part of the asset's reviewer trail, so disputes don't disappear into a CRM.
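The stale-verdict behaviour described above (a verdict pinned to a scan date and content hash, marked stale when the upstream package changes) can be sketched as a small state update. This is a simplified illustration under stated assumptions: `Verdict`, `AssetHistory`, and `observe_upstream` are hypothetical names, and the sketch treats any byte-level change as meaningful, whereas the page leaves the definition of a meaningful change unspecified.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Verdict:
    outcome: str       # "Pass", "Warn", or "Fail"
    scan_date: str     # ISO 8601 date of the scan
    content_hash: str  # SHA-256 of the package bytes that were reviewed
    stale: bool = False


@dataclass
class AssetHistory:
    verdicts: list = field(default_factory=list)      # oldest first, never deleted
    review_queue: list = field(default_factory=list)  # content hashes awaiting re-review

    def record(self, verdict):
        self.verdicts.append(verdict)

    def observe_upstream(self, package_bytes):
        """Mark the latest verdict stale if the upstream content hash differs."""
        new_hash = hashlib.sha256(package_bytes).hexdigest()
        current = self.verdicts[-1] if self.verdicts else None
        if current is None or current.content_hash == new_hash:
            return False
        current.stale = True  # the old verdict stays in the history, flagged
        if new_hash not in self.review_queue:
            self.review_queue.append(new_hash)
        return True
```

Because verdicts are appended and flagged rather than overwritten, the history of an asset's posture across versions remains readable.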
What to do next

Three doors. Pick the one that fits today.

Most teams arrive here with one of three jobs: ship an asset that reviewers will trust, evaluate one that someone else built, or set up governance for a fleet. We have a starting point for each.

/ For builders

Submit an asset for review.

List a skill, prompt, MCP server, agent, or app in the public directory. Free for individuals. Verdict is signed, dated, and linkable.

Open Vettd Directory →
/ For reviewers

Look up a specific asset.

Search the directory by name, vendor, or hash. Read the verdict, the rubric outcomes, and any findings filed by other reviewers.

Browse the directory →
/ For enterprise

Set up governance for a fleet.

Private review queue, dedicated reviewer time, framework cross-walks against your compliance frame. Outcomes still public-rubric.

Book a 30-min briefing →