Methodology

Every number on this site can be traced back to a source. This page is the trail. If a stat is wrong, this is also the page you'd email us to argue about.

Last updated: 2026-04-30 (UTC)

Sections

Data sources
How disclosure lag is computed
How simulator returns are computed
How conflict scoring works
How the benchmark works
Limitations
Reproducibility

Data sources

Primary: Capitol Trades (capitoltrades.com), which aggregates House and Senate STOCK Act filings. We scrape it every 30 minutes.
Secondary (currently degraded): the House Clerk PTR XML feed and Senate EFD search results. Both are upstream of Capitol Trades and are ingested for redundancy.
Disclosure data is exactly what Congress members file under the STOCK Act. We do not collect non-disclosed trades, private investments without ticker symbols (~15,902 rows in our DB), or trades by family members not legally required to be disclosed.

How disclosure lag is computed

Lag = disclosure_date - trade_date, in days. The minimum valid lag is 0: negative values are quarantined as date-parser errors.
Median: 27 days. We display median, not mean, because disclosure lag is heavy-tailed: a small number of very-late filings (some over a year) drag the mean to 72 days, which misrepresents typical behavior.
"% past 45-day deadline" = count(rows where lag > 45) / total. Currently 17%.
31 rows with corrupted dates (lag > 1825 days, almost certainly date-parser errors from source filings) are excluded from all aggregations. They retain their ticker, politician, and amount data — only time-based aggregates skip them.
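
The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the site's actual pipeline: the function names, the row format, and the sample dates are all hypothetical, and it assumes "total" in the deadline percentage means the rows that survive the date filters.

```python
from datetime import date
from statistics import median

MAX_PLAUSIBLE_LAG_DAYS = 1825  # above this, the row is treated as a corrupted date
DEADLINE_DAYS = 45             # filing deadline used for the "% past deadline" stat


def summarize_lags(rows):
    """rows: iterable of (trade_date, disclosure_date) pairs (hypothetical shape)."""
    valid, excluded = [], 0
    for trade_date, disclosure_date in rows:
        lag = (disclosure_date - trade_date).days
        # Negative or implausibly large lags are set aside, not clamped.
        if lag < 0 or lag > MAX_PLAUSIBLE_LAG_DAYS:
            excluded += 1
            continue
        valid.append(lag)
    if not valid:
        raise ValueError("no rows with usable dates")
    late = sum(1 for lag in valid if lag > DEADLINE_DAYS)
    return {
        "median_days": median(valid),
        "pct_past_deadline": 100 * late / len(valid),
        "excluded": excluded,
    }


# Made-up sample rows: three usable lags (10, 27, 60 days) and two bad ones.
rows = [
    (date(2025, 1, 2), date(2025, 1, 12)),
    (date(2025, 1, 2), date(2025, 1, 29)),
    (date(2025, 1, 2), date(2025, 3, 3)),
    (date(2025, 1, 2), date(2024, 12, 30)),  # negative lag
    (date(2015, 1, 2), date(2025, 1, 2)),    # lag far above 1825 days
]
summary = summarize_lags(rows)
```

In this sample the median lands on 27 days and two rows are excluded. The excluded rows only drop out of the time-based aggregates; nothing here deletes them.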

How simulator returns are computed

Entry price: adjusted close on trade_date (or nearest prior trading day if trade_date was a market holiday/weekend).
Exit price: adjusted close on most recent trading day.
Adjusted close handles stock splits and dividend payments — so a 4-for-1 split doesn't show as a -75% drop.
Hypothetical return = (exit - entry) / entry × 100.
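Trade size: STOCK Act filings disclose ranges ($1,001-$15,000, $15,001-$50,000, etc.), not exact amounts. The simulator uses the midpoint of each range. The true trade size can sit well away from that midpoint: by up to about ±87% in the $1,001-$15,000 range and about ±54% in the $15,001-$50,000 range.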
We compare against the total return of SPY (an S&P 500 ETF) over the same window.
"If you copied this politician" assumes equal dollar weight per trade, entered on the disclosure date. The disclosure date is the earliest a public observer could have known about the trade, so using trade_date here would introduce lookahead bias.

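As a rough sketch of the arithmetic in this section (not the simulator's real code), the Python below computes the range midpoint, the per-trade hypothetical return, and an equal-dollar-weight comparison against SPY. The function names, the dict keys, and every price are hypothetical. It also assumes SPY's adjusted close is an acceptable stand-in for total return.

```python
def range_midpoint(low, high):
    """Assumed trade size for a disclosed dollar range."""
    return (low + high) / 2


def pct_return(entry, exit_price):
    """Hypothetical return in percent: (exit - entry) / entry x 100."""
    return (exit_price - entry) / entry * 100


def copy_summary(trades):
    """Equal dollar weight per trade, so the portfolio return is the plain
    mean of the per-trade returns. Each trade carries adjusted-close prices
    for the stock and for SPY at entry (disclosure date) and exit."""
    stock = [pct_return(t["entry"], t["exit"]) for t in trades]
    spy = [pct_return(t["spy_entry"], t["spy_exit"]) for t in trades]
    politician_pct = sum(stock) / len(stock)
    spy_pct = sum(spy) / len(spy)
    return {
        "politician_pct": politician_pct,
        "spy_pct": spy_pct,
        "excess_pct": politician_pct - spy_pct,
    }


# Made-up prices, for illustration only.
trades = [
    {"entry": 100.0, "exit": 120.0, "spy_entry": 400.0, "spy_exit": 440.0},
    {"entry": 50.0, "exit": 45.0, "spy_entry": 500.0, "spy_exit": 525.0},
]
result = copy_summary(trades)
size = range_midpoint(1001, 15000)
```

With these made-up prices the two trades return +20% and -10%, SPY returns +10% and +5% over the same windows, and the copied portfolio trails the benchmark by about 2.5 percentage points.
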
How conflict scoring works

Each politician's committee assignments are stored in committees.json (sourced primarily from clerk.house.gov).
Each committee has a set of sector jurisdictions (e.g. House Armed Services covers Defense, Aerospace, Industrials, Technology, Cybersecurity, Communications).
Each ticker has a GICS sector classification (sourced from Finnhub).
A trade is "conflicted" if the politician sits on a committee whose sectors include the ticker's GICS sector.
"High-Conflict Activity" cluster: 3+ politicians trade the same ticker within 7 days, AND at least one of them has a committee jurisdiction overlap with that ticker's sector.
Conflict scoring is mechanical, not editorial. We don't claim individual conflicts are unethical — we surface the overlap. Readers draw their own conclusions.
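
The overlap test and the cluster rule can be sketched as follows. Everything here is illustrative. The politicians, tickers, and sector labels are invented, the lookup tables stand in for committees.json and the Finnhub classification, and the sketch assumes the 7-day window is measured on trade dates and is inclusive at both ends.

```python
from datetime import date, timedelta

# Hypothetical stand-ins for committees.json and the sector lookup.
COMMITTEE_SECTORS = {
    "House Armed Services": {
        "Defense", "Aerospace", "Industrials",
        "Technology", "Cybersecurity", "Communications",
    },
}
POLITICIAN_COMMITTEES = {
    "Rep. A": ["House Armed Services"],
    "Rep. B": [],
    "Rep. C": [],
}
TICKER_SECTOR = {"XYZ": "Industrials", "QRS": "Health Care"}


def is_conflicted(politician, ticker):
    """True if any of the politician's committees covers the ticker's sector."""
    sector = TICKER_SECTOR.get(ticker)
    committees = POLITICIAN_COMMITTEES.get(politician, [])
    return any(sector in COMMITTEE_SECTORS.get(c, set()) for c in committees)


def high_conflict_clusters(trades, window_days=7, min_politicians=3):
    """trades: list of (politician, ticker, trade_date) tuples.
    Returns {ticker: sorted politicians} for the first qualifying window."""
    by_ticker = {}
    for politician, ticker, day in trades:
        by_ticker.setdefault(ticker, []).append((day, politician))
    clusters = {}
    for ticker, rows in by_ticker.items():
        rows.sort()
        for start_day, _ in rows:
            end_day = start_day + timedelta(days=window_days)
            members = {p for d, p in rows if start_day <= d <= end_day}
            if len(members) >= min_politicians and any(
                is_conflicted(p, ticker) for p in members
            ):
                clusters[ticker] = sorted(members)
                break
    return clusters


trades = [
    ("Rep. A", "XYZ", date(2026, 3, 2)),
    ("Rep. B", "XYZ", date(2026, 3, 4)),
    ("Rep. C", "XYZ", date(2026, 3, 8)),
    ("Rep. B", "QRS", date(2026, 3, 2)),
]
clusters = high_conflict_clusters(trades)
```

Here XYZ is flagged because three politicians traded it inside one window and Rep. A's committee covers its sector. QRS is not, since only one politician traded it. The check stays mechanical: it reports overlap and says nothing about intent.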

How the benchmark works

28 questions, hand-authored, with ground truth computed live from the D1 database at run time (not cached, not pre-baked).
Two conditions: cold (no tools, model answers from training data only) and MCP-augmented (model has access to our 12-tool MCP server).
Models tested: Claude Opus 4.6, Sonnet 4.6, GPT-5, Grok-4 (automated harness, last run 2026-04-17).
One additional row: Claude Opus 4.7 hand-fed via chat (manual, single run, not comparable to the automated rows — labeled as such).
Scoring: numeric tolerance (±5%) for number questions, Jaccard set overlap for list questions, LLM-as-judge (Claude Haiku) for free-text questions.
Results are reproducible: harness, questions, and scoring scripts are in the repo.
Disclosure lag stats are computed on records with a valid trade_date and disclosure_date (corrupt date entries are excluded).

Full leaderboard and per-category scores at /benchmark.
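
For readers who want a feel for the two deterministic scoring rules, here is a minimal sketch. It is not the harness from the repo: the function names are hypothetical, the handling of a zero ground truth and of two empty lists is an assumption, and the LLM-as-judge path is left out.

```python
def numeric_match(predicted, truth, tolerance=0.05):
    """True if predicted is within ±5% of the ground truth (relative error)."""
    if truth == 0:
        # Relative tolerance is undefined at zero; assume an exact match is required.
        return predicted == 0
    return abs(predicted - truth) <= tolerance * abs(truth)


def jaccard(predicted, truth):
    """Set overlap: |intersection| / |union|, ignoring order and duplicates."""
    a, b = set(predicted), set(truth)
    if not a and not b:
        return 1.0  # assumption: two empty lists count as a perfect match
    return len(a & b) / len(a | b)


close_enough = numeric_match(28, 27)   # off by 1 against a tolerance of 1.35
too_far = numeric_match(30, 27)        # off by 3 against a tolerance of 1.35
overlap = jaccard(["NVDA", "MSFT", "AAPL"], ["NVDA", "MSFT", "TSLA"])
```

In the list example, two of four distinct tickers are shared, so the overlap is 0.5. How a Jaccard score is turned into pass or fail is not specified on this page, so the sketch stops at the raw score.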

Limitations

Trade size is range-based, not exact (STOCK Act limitation).
Disclosure lag is bounded by what Congress files — late filings inflate the long tail.
Capitol Trades is the primary source; if their site changes format we go blind until we adapt.
Benchmark covers 28 questions; not exhaustive coverage of "everything an LLM could be asked about congressional trading."
Return calculations assume each position has been held since the trade. A politician may already have sold a position without our seeing it, because the sale can go undisclosed for up to 45 days.
Public API endpoints are rate-limited per IP to prevent abuse. See /mcp-docs for MCP-specific limits.

Reproducibility

Public dataset: every trade is visible on the site with its date, politician, ticker, amount range, and a source link to the original filing.
Open MCP server at /mcp (no auth, JSON-RPC).
Benchmark harness, questions, scoring scripts, and frozen result snapshots in the repo: https://github.com/freshcod3r/congress-trade-alerts.
All copy on the site that cites a number is traceable to a query against the public dataset.
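
Since the MCP server speaks JSON-RPC, a request body can be sketched like this. The helper name is hypothetical. `tools/list` is the method the Model Context Protocol defines for listing a server's tools. This page does not give the site's host, its tool names, or any handshake and header requirements, so the sketch only builds the message and does not send it.

```python
import json


def jsonrpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body as a string."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Body for asking an MCP server which tools it exposes.
body = jsonrpc_request("tools/list")
```

The resulting string would be POSTed to the /mcp endpoint by whatever HTTP client you prefer. See /mcp-docs for the limits that apply.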