Detection & Scoring System

SpotBlock uses two complementary engines: a classic scanner + quality workflow and a Super-Analyst AI pipeline.

🛡️ Reliability & Governance Controls

SpotBlock’s detection pipeline is designed to be explainable, auditable, and risk-aware by default. Outputs are only published once they meet strict promotion criteria, ensuring conservative labeling and minimizing false positives. Every decision is fully traceable, with end-to-end provenance stored for each contract, including scanner signals, enrichment data, AI-generated assessments, and human validation actions.

The system operates across three controlled stages. The Scam Scanner performs initial detection, producing a calibrated 0–100 confidence score alongside structured risk signals. These signals are then enriched by the Super-Analyst worker, which incorporates external data sources and applies a structured LLM-based evaluation to standardize reasoning. Cases that fall into defined uncertainty thresholds are escalated to human analysts, ensuring that ambiguous or higher-risk classifications receive manual review.

Governance is enforced through a unified promotion workflow that acts as a gating mechanism across all stages. This workflow ensures separation of duties between detection, enrichment, and approval, and enforces consistent publication standards. Contracts are scanned in real time as requests arrive, while scanner upgrades are versioned and documented when changes could impact results. Each analysis is stored with enough context to be reproducible and reviewable, and can be rolled back or re-processed with the relevant scanner version, which supports internal audits, external due diligence, and continuous model monitoring. This layered approach balances automation efficiency with human oversight, aligning with institutional expectations for transparency, accountability, and controlled risk exposure.

1) 🔍 Scam Scanner (1st detection layer)

The classic scam scanner is the front line: fast, signal-driven, and optimized for continuous intake. It assigns a 0–100 confidence score from bytecode + token heuristics, then routes outcomes into auto-approval or human review.

Signal scoring: the scanner analyzes bytecode patterns and token/ownership/liquidity signals, then sums configured severity scores into a final confidenceScore (capped at 100).
Hard threshold for staging: only contracts with a minimum score (10+) are inserted into staging.
Label mapping: the scanner converts the final score + evidence into labels. For non-phishing cases, thresholds apply from highest to lowest: >=80 maps to severity critical, with label scam if malicious evidence is found and suspicious otherwise; >=60 applies the same critical severity and evidence rule; >=40 maps to suspicious; >=20 and >=10 both map to dangerous; anything lower maps to unknown. Token name/symbol phishing is handled as a special case: it sets the score to 99, severity to critical, label to phishing scam, and can trigger auto-approval behavior.
Auto-approve vs human review: high confidence (>=80) is auto-promoted; medium confidence (40–79) is routed to human analyst review; low confidence (below 40) stays in staging as low-confidence until the workflow handles it.
Provenance captured: staging records store detectionReasons, tags, and metadata (so promotion can be traced back to specific signals).
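
The scoring and routing rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SpotBlock's actual implementation: the Signal type, the signal names, the severity values, and the route names are hypothetical. Only the cap of 100 and the 10 / 40 / 80 thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the description above.
STAGING_MIN_SCORE = 10    # below this, the contract is not inserted into staging
REVIEW_MIN_SCORE = 40     # 40-79 is routed to human analyst review
AUTO_PROMOTE_SCORE = 80   # >=80 is auto-promoted
MAX_SCORE = 100           # confidenceScore is capped at 100


@dataclass(frozen=True)
class Signal:
    """Hypothetical shape of one scanner signal."""
    name: str       # illustrative identifier, not a real SpotBlock signal name
    severity: int   # configured severity score for this signal


def confidence_score(signals):
    """Sum the configured severity scores and cap the total at 100."""
    return min(MAX_SCORE, sum(s.severity for s in signals))


def route(score):
    """Map a confidence score to a workflow route (route names are assumptions)."""
    if score < STAGING_MIN_SCORE:
        return "not_staged"
    if score >= AUTO_PROMOTE_SCORE:
        return "auto_promote"
    if score >= REVIEW_MIN_SCORE:
        return "human_review"
    return "low_confidence_staging"


signals = [
    Signal("hidden_mint", 45),
    Signal("owner_can_blacklist", 30),
    Signal("no_liquidity_lock", 40),
]
score = confidence_score(signals)   # raw sum is 115, capped to 100
print(score, route(score))
```

Keeping the thresholds as named constants makes the routing boundaries easy to audit against the documented values.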

2) 🧠 Super-Analyst + 🧩 Vulnerability Scanner (2nd layer)

The second layer combines Super-Analyst AI and a dedicated vulnerability scanner for deeper contract-level analysis. Super-Analyst is used when richer context is needed, while the vulnerability scanner focuses on known smart-contract risk patterns; together they improve confidence before promotion and review decisions.

📦 Retrieval Layer

Verified retriever: verified source and ABI inputs, plus lightweight on-chain and provenance enrichment when useful.
Unverified retriever: bytecode-first retrieval with contract-structure, proxy, and activity context when source is unavailable.
Both retrievers use bounded, best-effort enrichment so deeper scans gain context without exposing full internal detection logic.

⚙️ Analysis Layer

Super-Analyst AI mode: structured contract risk reasoning from retrieved evidence.
Vulnerability scanner mode: pattern-based security findings for common exploit classes and dangerous logic.
Simple/advanced AI depths are available based on required analysis intensity.
Request deduplication avoids repeated expensive analyses for the same input/version window.
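
One way to picture request deduplication is a cache keyed on the request input plus the analyzer version, so an identical request reuses the stored result instead of triggering another run. The sketch below assumes the key is built from a chain ID, contract address, analysis depth, and analyzer version; these field names and the in-memory AnalysisCache class are hypothetical and only illustrate the idea.

```python
import hashlib


def dedup_key(chain_id, address, depth, analyzer_version):
    """Build a stable key for one analysis request (key fields are assumptions)."""
    # Lowercase the address so differently-cased forms map to the same key.
    normalized = f"{chain_id}:{address.lower()}:{depth}:{analyzer_version}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class AnalysisCache:
    """In-memory stand-in for whatever store tracks completed analyses."""

    def __init__(self):
        self._results = {}

    def get_or_run(self, key, run):
        """Return the stored result for key, calling run() only on a miss."""
        if key not in self._results:
            self._results[key] = run()
        return self._results[key]


cache = AnalysisCache()
runs = []


def expensive_analysis():
    runs.append("ran")  # record each real execution
    return {"rating": "unknown"}


key_a = dedup_key(1, "0xABCdef", "simple", "v1")
key_b = dedup_key(1, "0xabcdef", "simple", "v1")   # same request, different casing
cache.get_or_run(key_a, expensive_analysis)
cache.get_or_run(key_b, expensive_analysis)        # served from the cache
print(len(runs))
```

Because the analyzer version is part of the key, a version bump naturally produces a new key and allows a fresh run.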

🚀 Promotion output

Super-Analyst returns structured JSON including a rating in safe | suspicious | threat | unknown. Only suspicious / threat results are promoted into production entries for public explorer display; safe / unknown results stay in history for traceability.
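
The promotion rule can be expressed as a small filter over the returned JSON. In this sketch only the rating field and its four values come from the text; the function name and the choice to reject unrecognized ratings are assumptions made for illustration.

```python
import json

PROMOTED_RATINGS = {"suspicious", "threat"}             # promoted to the public explorer
KNOWN_RATINGS = PROMOTED_RATINGS | {"safe", "unknown"}  # the four documented values


def should_promote(raw_json):
    """Return True if a Super-Analyst result should become a production entry."""
    result = json.loads(raw_json)
    rating = result.get("rating")
    if rating not in KNOWN_RATINGS:
        # Assumption: anything outside the documented ratings is treated as an error.
        raise ValueError(f"unexpected rating: {rating!r}")
    return rating in PROMOTED_RATINGS


print(should_promote('{"rating": "threat"}'))   # True
print(should_promote('{"rating": "safe"}'))     # False
```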

3) 👥 Human Analyst Review

Human analysts review medium-confidence candidates to reduce false positives and to finalize production labels using auditable verdicts. They can also verify additional contracts when deeper investigation is warranted, using specialist expertise to detect complex threats and strengthen the overall detection system.

Review queue: medium-confidence scanner findings are placed into staging with fields like confidenceScore, label, and detectionReasons.
Human verdict: analysts submit safe or threat.
Safe outcome: marks the staging record as rejected (false positive) without promoting to the public malicious explorer.
Threat outcome: approves the staging record and promotes it into the public malicious explorer, recording analyst identity, timestamps, and review metadata for transparency.
Audit trail: analyst comments and verdicts are stored under metadata.humanReview.latest (with timestamps and reviewer metadata) and promotion metadata is persisted alongside scanner provenance.
Discord & API workflow: validators work from configured channels (routine vs requested queue when both exist); the first analyst to resolve a staging ticket earns programme rewards. Payout structure (contributor points → SPOT claim, plus USD-stable pool where applicable) is documented in the contributor manual.
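
The verdict handling described in this section might look roughly like the following. The metadata.humanReview.latest path and the safe / threat verdicts come from the text; the status and promoted fields, the reviewer and comment field names, and the function itself are hypothetical placeholders rather than the real schema.

```python
from datetime import datetime, timezone

VALID_VERDICTS = ("safe", "threat")


def apply_verdict(record, verdict, reviewer, comment=""):
    """Return a copy of a staging record with the analyst verdict applied."""
    if verdict not in VALID_VERDICTS:
        raise ValueError(f"verdict must be one of {VALID_VERDICTS}")

    # Copy the nested dicts so the original staging record is left untouched.
    updated = dict(record)
    metadata = dict(record.get("metadata", {}))
    human_review = dict(metadata.get("humanReview", {}))

    human_review["latest"] = {
        "verdict": verdict,
        "reviewer": reviewer,    # hypothetical field name
        "comment": comment,      # hypothetical field name
        "reviewedAt": datetime.now(timezone.utc).isoformat(),
    }
    metadata["humanReview"] = human_review
    updated["metadata"] = metadata

    # Assumed status values: threat approves and promotes, safe rejects.
    updated["status"] = "approved" if verdict == "threat" else "rejected"
    updated["promoted"] = verdict == "threat"
    return updated


staged = {
    "confidenceScore": 55,
    "label": "suspicious",
    "detectionReasons": ["example_reason"],   # placeholder value
    "metadata": {},
}
reviewed = apply_verdict(staged, "threat", reviewer="analyst-1", comment="confirmed")
print(reviewed["status"], reviewed["promoted"])
```

Returning a new record instead of mutating the input keeps the original scanner provenance intact next to the review metadata.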

4) 📘 Compliance-Friendly Data Usage

SpotBlock is built so institutions can use the underlying data safely: each entry is traceable, evidence-backed, and conservatively published.

Conservative publication: only suspicious / threat outcomes reach the public malicious explorer.
Evidence & provenance: promoted entries carry scanner detectionReasons plus Super-Analyst output fields (including structured evidence lists) under stored enrichment fields.
Traceable lifecycle: scan requests and promotions are stored with explicit timestamps and queue states (for reproducible investigation and internal recordkeeping).
Human accountability: manual analyst verdicts and comments are recorded with reviewer metadata and timestamps, enabling internal governance workflows.

5) 🧭 User Scan Experience

The explorer experience is built to be simple for users while preserving backend safeguards against redundant runs.

Search address in explorer.
If not found, connect wallet and sign scan request.
Request enters queue and worker processes retriever + analyzer.
If promoted, user is redirected to the full explorer card with details.
Re-runs for the same depth/version are blocked; a re-run becomes available after the analyzer version is updated.
Super-Analyst (Deep AI) runs typically finish in about 15 seconds but can take longer under load or for heavy contracts.