security · solidity · python · deterministic

TrustGraph

Deterministic trust-boundary analysis for Solidity contracts with executable PoC generation.

Status: Shipped
Role: Solo project
Stack: Python · LangGraph · Foundry · Gemini 2.5 Flash

01 Problem

Trust-boundary vulnerabilities — externally callable functions that accept unverified input and mutate critical state without a caller guard — are a recurring source of smart contract exploits. The CrossCurve exploit pattern illustrates this: an external function modifies protocol state based on caller-supplied data, with no check on who is calling. The result is arbitrary state manipulation by any address.

Manual auditing is slow and doesn't scale to CI. LLM-based static analysis is fast but non-deterministic: it can flag a vulnerability on one run, miss it on the next, and report a non-issue on a third. Neither gives a CI pipeline what it needs: a reproducible signal that produces the same result on every run.

02 Predicate model

A function is treated as vulnerable when four deterministic predicates hold together: E, external visibility (callable from outside the contract); P, unverified payload (accepts external data without validation); V, critical state mutation (modifies state with economic or control-flow significance); and ¬G, absent caller guard (no modifier, require, or access control restricts the caller).

A function satisfying E ∧ P ∧ V ∧ ¬G is flagged high-severity. A function satisfying three of the four predicates is flagged medium-severity. The model is narrow by design: it catches a specific, auditable vulnerability class without requiring symbolic execution or cross-file reasoning.
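The scoring rule above can be sketched in a few lines of Python. This is an illustration rather than the project's code: the PredicateResult type, its field names, and the "none" label for functions matching fewer than three predicates are assumptions, since the source only defines the high and medium tiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredicateResult:
    """Outcome of the four checks for one function (hypothetical shape)."""
    external: bool            # E: callable from outside the contract
    unverified_payload: bool  # P: accepts external data without validation
    critical_mutation: bool   # V: writes economically or control-relevant state
    guarded: bool             # G: a caller guard is present


def severity(result: PredicateResult) -> str:
    """Map predicate outcomes to a severity label."""
    # The model counts E, P, V and the *absence* of a guard (not G).
    satisfied = sum([
        result.external,
        result.unverified_payload,
        result.critical_mutation,
        not result.guarded,
    ])
    if satisfied == 4:
        return "high"
    if satisfied == 3:
        return "medium"
    return "none"  # assumed label; the source does not name this tier
```

Because the function is a pure mapping from four booleans to a label, the same input always yields the same severity, which is the property a CI gate depends on.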

03 Scanner architecture

An 8-node LangGraph pipeline: AST parse → function node extraction → four predicate evaluations (one node each) → severity scoring → Foundry PoC generation → report emission.

The AST parser uses solidity-parser-antlr to extract function signatures, visibility modifiers, access control patterns, and state write operations. Each predicate node is a deterministic check over the extracted AST — no LLM involvement at the scan or scoring stage.
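As a rough sketch of what "a deterministic check over the extracted AST" could look like, the example below evaluates the four predicates against a flat record of facts about one function. Everything here is assumed for illustration: the FunctionRecord shape, the GUARD_MODIFIERS and CRITICAL_STATE sets, and the matching rules. The scanner's real data model is not described in the source.

```python
from dataclasses import dataclass

# Assumed configuration, not taken from the project.
GUARD_MODIFIERS = frozenset({"onlyOwner", "onlyRole"})
CRITICAL_STATE = frozenset({"owner", "balances", "bridgeRoot"})


@dataclass(frozen=True)
class FunctionRecord:
    """Facts extracted from one function's AST node (hypothetical shape)."""
    name: str
    visibility: str                        # "external", "public", "internal", "private"
    parameters: tuple = ()                 # parameter names
    validated_params: frozenset = frozenset()  # parameters checked before use
    modifiers: tuple = ()                  # modifier names on the function
    sender_checks: int = 0                 # require/if checks that mention msg.sender
    state_writes: tuple = ()               # state variables assigned in the body


def evaluate(fn: FunctionRecord) -> dict:
    """Return the four predicate outcomes using only set and string lookups."""
    return {
        "E": fn.visibility in ("external", "public"),
        "P": any(p not in fn.validated_params for p in fn.parameters),
        "V": any(w in CRITICAL_STATE for w in fn.state_writes),
        "G": fn.sender_checks > 0
        or any(m in GUARD_MODIFIERS for m in fn.modifiers),
    }
```

Each check is a lookup over already-extracted facts, so there is no sampling or model call that could change the answer between runs.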

04 Foundry exploit generation

For each high-severity finding, the pipeline generates a Foundry test that attempts to reproduce the trust-boundary violation: import the vulnerable contract, call the flagged function from an unpermissioned address with a crafted payload, and assert that the critical state mutation occurs.
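One way to produce such a test is to render a fixed Solidity template from the finding's fields. The sketch below is an assumption about how that could be done, not the project's generator: render_poc and its parameters are hypothetical, and the template assumes the target has a no-argument constructor and a public getter for the mutated state. The Foundry pieces it relies on (forge-std's Test base contract, makeAddr, vm.prank, assertEq) are standard.

```python
from string import Template

# Solidity test skeleton; $-placeholders are filled per finding.
POC_TEMPLATE = Template("""\
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";
import {$contract} from "$import_path";

contract ${contract}TrustBoundaryTest is Test {
    $contract internal target;

    function setUp() public {
        // Assumes a constructor with no arguments.
        target = new $contract();
    }

    function test_${function}_callableByUnpermissionedAddress() public {
        address attacker = makeAddr("attacker");
        vm.prank(attacker); // next call is sent from the attacker address
        target.$function($payload);
        // Assumes a public getter exposes the mutated state.
        assertEq(target.$state_getter(), $expected);
    }
}
""")


def render_poc(contract: str, import_path: str, function: str,
               payload: str, state_getter: str, expected: str) -> str:
    """Fill the template; substitute() raises KeyError on a missing field."""
    return POC_TEMPLATE.substitute(
        contract=contract,
        import_path=import_path,
        function=function,
        payload=payload,
        state_getter=state_getter,
        expected=expected,
    )
```

For example, render_poc("Bridge", "../src/Bridge.sol", "setRoot", "bytes32(uint256(1))", "root", "bytes32(uint256(1))") yields a test that passes only if an arbitrary address can change the root, which is exactly the violation being claimed.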

The README demo shows two findings and a passing Foundry exploit test — the PoC confirms the vulnerability is reachable, not just analytically plausible. A passing exploit test is a stronger signal than a static annotation alone.

05 VS Code diagnostics

A VS Code extension surfaces findings as inline diagnostics — flagged functions are underlined in the editor, and the hover tooltip shows the predicate match, severity, and a link to the generated PoC test.

The extension reads the JSON report emitted by the scanner. It does not re-run the analysis — the editor view is a display layer over the deterministic output.
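The report format is not specified in the source, so the following is only a guess at a workable shape. It shows, in Python, a scanner-side emitter that sorts findings and keys so the JSON is byte-identical across runs, and a reader that turns the report into diagnostic-like records without recomputing anything. The field names (file, line, function, severity, predicates, poc_path) are hypothetical.

```python
import json


def emit_report(findings: list) -> str:
    """Serialise findings with a stable order so repeated runs match exactly."""
    ordered = sorted(findings, key=lambda f: (f["file"], f["line"], f["function"]))
    return json.dumps({"findings": ordered}, indent=2, sort_keys=True)


def to_diagnostics(report_json: str) -> list:
    """Convert a report into display records; no analysis is repeated here."""
    report = json.loads(report_json)
    return [
        {
            "file": f["file"],
            "line": f["line"],
            "message": f"{f['severity']}: {f['function']} matches "
                       f"{', '.join(f['predicates'])}",
            "poc_path": f["poc_path"],
        }
        for f in report["findings"]
    ]
```

Keeping the reader a pure transformation of the report means the editor can never disagree with what CI saw.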

06 Optional AI explanations

Gemini 2.5 Flash is optionally invoked after severity scoring to generate a trust-assumption explanation: given the flagged function and the predicate match, it explains what trust the function incorrectly assumes and what an attacker could do.

Gemini receives the completed severity assessment as input. It cannot add or remove findings, raise or lower severity, or modify predicate scores. Its output is explanation text only — the finding is already determined.
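The separation described above can be enforced structurally. In this sketch, which is an assumption about the design rather than the project's code, findings are frozen dataclasses and the explainer is an injected callable that returns text only. The Finding and ExplainedFinding types and the explain function are hypothetical, and no real Gemini client call is shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Finding:
    """A completed, immutable result of the deterministic stages."""
    function: str
    severity: str
    predicates: tuple


@dataclass(frozen=True)
class ExplainedFinding:
    finding: Finding
    explanation: Optional[str]  # None when explanations are disabled


def explain(findings: list,
            explainer: Optional[Callable[[Finding], str]] = None) -> list:
    """Attach explanation text; the set of findings and their fields are unchanged."""
    if explainer is None:
        return [ExplainedFinding(f, None) for f in findings]
    # The explainer sees each finding but can only contribute a string.
    return [ExplainedFinding(f, explainer(f)) for f in findings]
```

Since the output list is built one-to-one from the input and Finding is frozen, an explainer cannot add, drop, or rescore a finding, whatever text it returns.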

07 Tradeoffs

The four-predicate model catches trust-boundary vulnerabilities matching E ∧ P ∧ V ∧ ¬G and misses everything outside that scope. This is intentional — a narrow, auditable definition is more useful for CI integration than a broad heuristic with a high false-positive rate.

The predicate checks are AST-level, not semantic. Complex modifier patterns, or access control implemented through state variables rather than direct require statements, may not be recognised as a caller guard, so the guard predicate can be evaluated incorrectly.

08 Limitations

No symbolic execution. The scanner cannot reason about value flows, integer overflow conditions, or reentrancy paths. These require dedicated tools (Slither, Echidna, Halmos).

No cross-file reasoning. Cross-contract trust relationships — proxy patterns, delegatecall chains, inherited permission models — are out of scope. The scanner analyses a single Solidity file.

Narrow scope by design. Expanding the predicate set increases coverage but increases false-positive rate and reduces auditability. The current model is a deliberate point on that tradeoff curve.