Methodology
Every number on this site can be traced back to a source. This page is the trail. If a stat looks wrong, this is also the page to point at when you email us to argue about it.
Last updated: 2026-04-30 (UTC)
Data sources
- Primary: Capitol Trades (capitoltrades.com) — aggregates House and Senate STOCK Act filings, scraped every 30 minutes.
- Secondary (currently degraded): House Clerk PTR XML feed and Senate EFD search results. Both are upstream of Capitol Trades and are ingested for redundancy.
- Disclosure data is exactly what members of Congress file under the STOCK Act. We do not collect non-disclosed trades, private investments without ticker symbols (~15,902 rows in our DB), or trades by family members not legally required to be disclosed.
How disclosure lag is computed
- Lag = (disclosure_date - trade_date), clamped at 0 (negative values quarantined as date-parser errors).
- Median: 27 days. We display median, not mean, because disclosure lag is heavy-tailed: a small number of very-late filings (some over a year) drag the mean to 72 days, which misrepresents typical behavior.
- "% past 45-day deadline" = count(rows where lag > 45) / total. Currently 17%.
- 31 rows with corrupted dates (lag > 1825 days, almost certainly date-parser errors from source filings) are excluded from all aggregations. They retain their ticker, politician, and amount data — only time-based aggregates skip them.
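The lag rules above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual pipeline: the function names and the tuple-based row shape are hypothetical, and it assumes "quarantined" means negative-lag rows are dropped from time-based aggregates in the same way as rows with lag over 1825 days.

```python
from datetime import date
from statistics import mean, median
from typing import Iterable, Optional, Tuple

MAX_PLAUSIBLE_LAG_DAYS = 1825  # rows above this are treated as corrupt dates
DEADLINE_DAYS = 45             # STOCK Act disclosure deadline

def lag_days(trade_date: date, disclosure_date: date) -> Optional[int]:
    """Return the lag in days, or None if the row is excluded from aggregates."""
    lag = (disclosure_date - trade_date).days
    # Assumption: negative lags and implausibly long lags are both skipped.
    if lag < 0 or lag > MAX_PLAUSIBLE_LAG_DAYS:
        return None
    return lag

def lag_summary(rows: Iterable[Tuple[date, date]]) -> dict:
    """Summarize (trade_date, disclosure_date) pairs; needs at least one valid row."""
    lags = [lag for t, d in rows if (lag := lag_days(t, d)) is not None]
    return {
        "n": len(lags),
        "median": median(lags),
        "mean": mean(lags),
        "pct_past_deadline": 100 * sum(lag > DEADLINE_DAYS for lag in lags) / len(lags),
    }
```

On a small sample with one very late filing, the mean is pulled far above the median, which is the heavy-tail effect described above.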
How simulator returns are computed
- Entry price: adjusted close on trade_date (or nearest prior trading day if trade_date was a market holiday/weekend).
- Exit price: adjusted close on most recent trading day.
- Adjusted close handles stock splits and dividend payments — so a 4-for-1 split doesn't show as a -75% drop.
- Hypothetical return = (exit - entry) / entry × 100.
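- Trade size: STOCK Act filings disclose ranges ($1,001-$15,000, $15,001-$50,000, etc.), not exact amounts. The simulator uses the midpoint of each range. The true trade size can sit anywhere in the range, so it can differ from the midpoint by more than 50% (about ±87% for the $1,001-$15,000 band, about ±54% for $15,001-$50,000).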
- We compare against SPY (S&P 500 ETF) total return over the same window.
- "If you copied this politician" assumes equal dollar weight per trade, on the disclosure date. Disclosure date is the earliest a public observer could have known about the trade; using trade_date here would be lookahead bias.
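The arithmetic in this section can be sketched as below. It is an illustration under assumptions, not the simulator's code: all function names are hypothetical, prices are assumed to be adjusted closes already looked up by the caller, and the equal-weight copy return is taken as the simple average of per-trade returns measured from the disclosure-date price.

```python
from statistics import mean
from typing import Iterable, Tuple

def range_midpoint(low: float, high: float) -> float:
    """Midpoint of a disclosed amount range, used as the assumed trade size."""
    return (low + high) / 2

def pct_return(entry: float, exit_price: float) -> float:
    """Hypothetical return in percent: (exit - entry) / entry * 100."""
    return (exit_price - entry) / entry * 100

def excess_over_spy(entry: float, exit_price: float,
                    spy_entry: float, spy_exit: float) -> float:
    """Return of the position minus SPY's return over the same window."""
    return pct_return(entry, exit_price) - pct_return(spy_entry, spy_exit)

def equal_weight_copy_return(trades: Iterable[Tuple[float, float]]) -> float:
    """Average return across trades, each given equal dollar weight.

    Each trade is (price_on_disclosure_date, latest_price). Using the
    disclosure-date price avoids the lookahead bias described above.
    """
    return mean([pct_return(entry, latest) for entry, latest in trades])
```

For example, a position entered at 100 and now at 125 returned 25%; if SPY went from 400 to 440 over the same window (10%), the excess return is 15 percentage points.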
How conflict scoring works
- Each politician's committee assignments are stored in committees.json (with clerk.house.gov as the primary source).
- Each committee has a set of sector jurisdictions (e.g. House Armed Services covers Defense, Aerospace, Industrials, Technology, Cybersecurity, Communications).
- Each ticker has a GICS sector classification (sourced from Finnhub).
- A trade is "conflicted" if the politician sits on a committee whose sectors include the ticker's GICS sector.
- "High-Conflict Activity" cluster: 3+ politicians trade the same ticker within 7 days, AND at least one of them has a committee jurisdiction overlap with that ticker's sector.
- Conflict scoring is mechanical, not editorial. We don't claim individual conflicts are unethical — we surface the overlap. Readers draw their own conclusions.
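The overlap rule and the cluster rule can be sketched as follows. This is a hedged illustration only: the dictionary shapes, function names, and sample data layout are hypothetical (the real committees.json schema is not shown here), and "within 7 days" is assumed to mean all trades fall in a window of at most 7 days starting from the earliest trade in the cluster.

```python
from collections import defaultdict
from datetime import date, timedelta

def is_conflicted(politician, ticker, memberships, committee_sectors, ticker_sector):
    """True if any of the politician's committees covers the ticker's sector."""
    sector = ticker_sector.get(ticker)
    return any(
        sector in committee_sectors.get(committee, set())
        for committee in memberships.get(politician, [])
    )

def high_conflict_tickers(trades, memberships, committee_sectors, ticker_sector,
                          window_days=7, min_politicians=3):
    """Return tickers with a 'High-Conflict Activity' cluster.

    trades is a list of (politician, ticker, trade_date) tuples.
    """
    by_ticker = defaultdict(list)
    for politician, ticker, when in trades:
        by_ticker[ticker].append((when, politician))

    flagged = set()
    for ticker, events in by_ticker.items():
        events.sort()  # chronological order
        for i, (start, _) in enumerate(events):
            end = start + timedelta(days=window_days)
            in_window = {p for when, p in events[i:] if when <= end}
            if len(in_window) >= min_politicians and any(
                is_conflicted(p, ticker, memberships, committee_sectors, ticker_sector)
                for p in in_window
            ):
                flagged.add(ticker)
                break
    return flagged
```

Note that the check is purely a set-membership test: it says nothing about intent, which matches the "mechanical, not editorial" framing above.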
How the benchmark works
- 28 questions, hand-authored, with ground truth computed live from the D1 database at run time (not cached, not pre-baked).
- Two conditions: cold (no tools, model answers from training data only) and MCP-augmented (model has access to our 12-tool MCP server).
- Models tested: Claude Opus 4.6, Sonnet 4.6, GPT-5, Grok-4 (automated harness, last run 2026-04-17).
- One additional row: Claude Opus 4.7 hand-fed via chat (manual, single run, not comparable to the automated rows — labeled as such).
- Scoring: numeric tolerance (±5%) for number questions, Jaccard set overlap for list questions, LLM-as-judge (Claude Haiku) for free-text questions.
- Results are reproducible: harness, questions, and scoring scripts are in the repo.
- Disclosure lag stats computed on records with valid trade_date and disclosure_date (corrupt date entries excluded).
Full leaderboard and per-category scores at /benchmark.
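Two of the scoring rules above are simple enough to sketch directly. This is an illustrative version, not the harness in the repo: the function names are hypothetical, and it assumes the ±5% tolerance is relative to the ground-truth value (the text does not say whether it is relative or absolute).

```python
from typing import Iterable

def within_tolerance(answer: float, truth: float, tol: float = 0.05) -> bool:
    """Numeric questions: pass if the answer is within tol of the ground truth.

    Assumption: tolerance is relative to the truth; a truth of 0 requires
    an exact match, since a relative band around 0 has zero width.
    """
    if truth == 0:
        return answer == 0
    return abs(answer - truth) <= tol * abs(truth)

def jaccard(answer: Iterable[str], truth: Iterable[str]) -> float:
    """List questions: |A ∩ B| / |A ∪ B|, ignoring order and duplicates."""
    a, b = set(answer), set(truth)
    if not a and not b:
        return 1.0  # two empty lists agree completely
    return len(a & b) / len(a | b)
```

For example, answering 28 when the truth is 27 passes (off by about 3.7%), while a list answer sharing two of four distinct tickers with the truth scores 0.5.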
Limitations
- Trade size is range-based, not exact (STOCK Act limitation).
- Disclosure lag is bounded by what Congress files — late filings inflate the long tail.
- Capitol Trades is the primary source; if their site changes format we go blind until we adapt.
- Benchmark covers 28 questions; not exhaustive coverage of "everything an LLM could be asked about congressional trading."
- Return calculations assume each position has been held since the trade. A politician may have already sold a position, and we won't see the sale until it is disclosed (filers have up to 45 days).
- Public API endpoints are rate-limited per IP to prevent abuse. See /mcp-docs for MCP-specific limits.
Reproducibility
- Public dataset: every trade visible on the site with date, politician, ticker, amount range, source-link to the original filing.
- Open MCP server at /mcp (no auth, JSON-RPC).
- Benchmark harness, questions, scoring scripts, and frozen result snapshots in the repo: https://github.com/freshcod3r/congress-trade-alerts.
- All copy on the site that cites a number is traceable to a query against the public dataset.
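For readers who want to poke at the MCP server, a JSON-RPC 2.0 request body can be built as sketched below. This only constructs the payload; the helper name is hypothetical, the host is not specified on this page, and the server may require the MCP initialize handshake before it answers other methods. The tools/list method name comes from the MCP specification, not from this site's docs, so check /mcp-docs for the actual tool names and limits.

```python
import json
from typing import Any, Optional

def jsonrpc_request(method: str, params: Optional[dict] = None,
                    request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body as a JSON string."""
    body: dict[str, Any] = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return json.dumps(body)

# Body for asking an MCP server which tools it exposes; POST it to the
# /mcp endpoint with an HTTP client of your choice.
list_tools_body = jsonrpc_request("tools/list")
```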