Explore Hub: AI Agents

Vetting Base AI agent protocols is a discovery problem, not a hype problem. Radar research should identify whether a protocol has durable demand, a defensible category role, and enough on-chain evidence to deserve deeper watchlist attention before social momentum turns noisy.

Quick Discovery Answer

Vet Base AI agent protocols by separating demo clicks from repeated task execution, then checking wallet safety, auditability, and whether users return without campaign prompts.

Core Comparison Criteria

  • A real agent workflow should complete a task users would otherwise do manually.
  • Permission design should limit damage if the agent behaves incorrectly.
  • Logs and receipts should make execution review possible.
  • Growth should include repeat users, not only first-time demo wallets.

A useful comparison reference for this guide is Glorb, but the framework is designed to work even before a category has one obvious leader.

What To Verify On-Chain

Compare new wallets against returning wallets and the tasks they actually complete. If usage resets after each campaign, the protocol may have attention but not adoption.

Early discovery is strongest when it combines product context with observable behavior. Wallet growth, repeat users, fee routes, contract upgrades, and partner dependencies all matter more than one high TVL snapshot. The question is whether users would still return if incentives slowed down.
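One way to make that check concrete is to bucket completed tasks by week and split them into first-time and returning wallets, so campaign spikes that never compound become visible. This is a minimal sketch, assuming a simple (wallet, timestamp) event shape; in practice the data would come from an indexer or the protocol's own task logs.

```python
# Hypothetical sketch: split completed-task events into new vs returning
# wallet activity per week. A returning share that never grows is the
# "attention without adoption" pattern described above.
from collections import defaultdict
from datetime import datetime, timedelta

def repeat_usage_by_week(events):
    """events: iterable of (wallet: str, ts: datetime) for completed tasks."""
    seen = set()
    weekly = defaultdict(lambda: {"new": 0, "returning": 0})
    for wallet, ts in sorted(events, key=lambda e: e[1]):
        week = ts.date() - timedelta(days=ts.weekday())  # Monday of that week
        if wallet in seen:
            weekly[week]["returning"] += 1
        else:
            weekly[week]["new"] += 1
            seen.add(wallet)
    return dict(weekly)

# Example: two wallets come back in week two, one wallet is campaign-only.
sample = [
    ("0xaaa", datetime(2024, 5, 6)), ("0xbbb", datetime(2024, 5, 7)),
    ("0xccc", datetime(2024, 5, 8)), ("0xaaa", datetime(2024, 5, 14)),
    ("0xbbb", datetime(2024, 5, 15)),
]
for week, counts in repeat_usage_by_week(sample).items():
    print(week, counts)
```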

Red Flags

  • Task success is described qualitatively without on-chain or product-level evidence.
  • The same wallets cycle through incentives without deeper permissions or workflows.
  • The protocol cannot explain how users recover from agent mistakes.

Decision Loop

Shortlist protocols that make safety and repetition visible. Keep speculative agents on monitor until the same users trust them with increasingly useful tasks.

A useful Radar note ends with a classification: monitor only, shortlist for weekly review, or reject until the protocol publishes clearer data. That classification should change only when a new contract, integration, user cohort, or risk disclosure changes the evidence.
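A minimal sketch of that loop, assuming trigger names drawn from the sentence above: the label only moves when an evidence-changing event actually fires, never on attention alone.

```python
# Sketch of the classification loop. Trigger names are assumptions; the point
# is that a reclassification without an evidence-changing event is rejected.
from enum import Enum

class Classification(Enum):
    MONITOR = "monitor only"
    SHORTLIST = "shortlist for weekly review"
    REJECT = "reject until clearer data"

EVIDENCE_TRIGGERS = {"new_contract", "new_integration", "new_user_cohort", "risk_disclosure"}

def update_classification(current, triggers, proposed):
    """Accept a reclassification only if an evidence-changing trigger fired."""
    if triggers & EVIDENCE_TRIGGERS:
        return proposed
    return current

# A viral thread does not move a protocol off monitor; a new contract can.
print(update_classification(Classification.MONITOR, {"viral_thread"}, Classification.SHORTLIST))
print(update_classification(Classification.MONITOR, {"new_contract"}, Classification.SHORTLIST))
```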

Follow-Up Diligence

Recheck permission scopes, revoke flows, task logs, and repeat-user cohorts after each major product update.

Keep the research trail simple: category, chain, protocol role, trigger for attention, biggest risk, and the next metric that would prove adoption. This makes it easier to compare protocols across ecosystems without letting the loudest launch dominate the board.
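As a sketch, that trail can live in one record per protocol with exactly those six fields; the field names below are assumptions, not a fixed schema.

```python
# Hypothetical note structure for the research trail described above.
from dataclasses import dataclass

@dataclass
class RadarNote:
    category: str        # e.g. "AI agents"
    chain: str           # e.g. "Base"
    protocol_role: str   # what the protocol actually does in its category
    trigger: str         # why it earned attention in the first place
    biggest_risk: str    # the single risk most likely to invalidate the thesis
    next_metric: str     # the observable that would prove adoption

note = RadarNote(
    category="AI agents",
    chain="Base",
    protocol_role="task execution agent with scoped wallet permissions",
    trigger="integration announcement",
    biggest_risk="usage resets after each incentive campaign",
    next_metric="returning wallets completing tasks without a campaign",
)
print(note)
```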

Simple Scoring Model

Use a five-part score before moving a protocol from watchlist to shortlist. Give one point each for clear user demand, transparent contracts or permissions, repeat activity, credible distribution, and visible risk disclosure. A protocol with three points can stay on the watchlist. Four points deserves recurring review. Five points earns deeper category comparison. Anything below three should wait until the evidence improves.
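A minimal sketch of that score, assuming each criterion has already been judged true or false by the analyst; the threshold logic mirrors the paragraph above.

```python
# Five-part score with the thresholds described above. How each criterion is
# judged true or false is left to the analyst.
CRITERIA = ("clear_user_demand", "transparent_contracts", "repeat_activity",
            "credible_distribution", "visible_risk_disclosure")

def score_protocol(checks):
    """checks: dict mapping criterion name -> bool."""
    score = sum(1 for c in CRITERIA if checks.get(c, False))
    if score >= 5:
        verdict = "deeper category comparison"
    elif score == 4:
        verdict = "recurring review"
    elif score == 3:
        verdict = "stay on watchlist"
    else:
        verdict = "wait for better evidence"
    return score, verdict

print(score_protocol({"clear_user_demand": True, "repeat_activity": True,
                      "transparent_contracts": True}))  # (3, 'stay on watchlist')
```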

The score is not meant to predict token performance. It is meant to prevent research from being captured by launch noise. A protocol can have strong branding and still fail the repeat-activity test. Another can have modest attention but excellent usage quality. Radar coverage should reward the second case when the evidence is cleaner.

Cluster Context

Compare each protocol with the rest of its cluster before making a conclusion. Payments protocols should be judged by payment cadence and settlement fit. DePIN protocols should be judged by real service demand. Risk curators should be judged by mandate discipline. AI agents should be judged by safe repeat execution. The category defines the evidence that matters.
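A small sketch of that mapping, assuming the metric phrasing from the paragraph above; it is a reminder of what evidence each cluster is judged by, not a fixed taxonomy.

```python
# Cluster -> primary evidence mapping, paraphrased from the paragraph above.
PRIMARY_EVIDENCE = {
    "payments": "payment cadence and settlement fit",
    "depin": "real service demand",
    "risk curation": "mandate discipline",
    "ai agents": "safe repeat execution",
}

def evidence_for(category):
    # Fall back to an explicit prompt rather than guessing a metric.
    return PRIMARY_EVIDENCE.get(category.lower(), "define the category's evidence before scoring")

print(evidence_for("AI agents"))
```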

When the evidence is mixed, keep the note conservative. Discovery research is strongest when it says exactly what is known, what is missing, and what would change the view. That makes future updates easier and prevents a weak launch from becoming permanent coverage just because it was early.

Research Cadence

Set a review date instead of leaving the protocol in an undefined watch state. Early-stage protocols can be checked weekly when launches, integrations, or funding events are active. More mature categories can be checked monthly unless a contract upgrade, incident, or partner rollout changes the evidence. The cadence keeps discovery work from becoming a pile of stale bookmarks.

Each review should answer one concrete question: did usage repeat, did risk fall, did distribution improve, or did the protocol drift away from its claimed category? If none of those changed, the classification should stay the same.
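A sketch of the cadence rule, assuming weekly review while launches, integrations, or funding events are active, monthly review otherwise, and an immediate review when the evidence itself changes.

```python
# Hypothetical scheduling rule for the review cadence described above.
from datetime import date, timedelta

def next_review(last_review, active_events, evidence_changed):
    if evidence_changed:
        return date.today()                       # review now, then reschedule
    interval = timedelta(weeks=1) if active_events else timedelta(days=30)
    return last_review + interval

print(next_review(date(2024, 6, 1), active_events=True, evidence_changed=False))   # 2024-06-08
print(next_review(date(2024, 6, 1), active_events=False, evidence_changed=False))  # 2024-07-01
```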

Continue this cluster

Stay inside the AI agent protocol discovery cluster: