Methodology
Every number on this site can be traced back to a source. This page is the trail. If a stat is wrong, this is also the page to point to when you email us to argue about it.
Last updated: 2026-04-30 (UTC)
Data sources
- Primary: Capitol Trades (capitoltrades.com) — aggregates House and Senate STOCK Act filings, scraped every 30 minutes.
- Secondary (currently degraded): House Clerk PTR XML feed and Senate EFD search results. Both are upstream of Capitol Trades and are ingested for redundancy.
- Disclosure data is exactly what Congress members file under the STOCK Act. We do not collect non-disclosed trades, private investments without ticker symbols (~15,902 rows in our DB), or trades by family members not legally required to be disclosed.
How disclosure lag is computed
- Lag = (disclosure_date - trade_date), clamped at 0 (negative values quarantined as date-parser errors).
- Median: 27 days. We display median, not mean, because disclosure lag is heavy-tailed: a small number of very-late filings (some over a year) drag the mean to 72 days, which misrepresents typical behavior.
- "% past 45-day deadline" = count(rows where lag > 45) / total. Currently 17%.
- 31 rows with corrupted dates (lag > 1825 days, almost certainly date-parser errors from source filings) are excluded from all aggregations. They retain their ticker, politician, and amount data; only time-based aggregates skip them.
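The lag rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the site's actual pipeline: the function name `summarize_lags` and the input shape (pairs of dates) are hypothetical, and negative lags are treated here as quarantined rows alongside the lag > 1825 rows.

```python
from datetime import date
from statistics import median

MAX_PLAUSIBLE_LAG_DAYS = 1825  # rows above this are treated as corrupted dates
DEADLINE_DAYS = 45             # STOCK Act disclosure deadline


def summarize_lags(rows):
    """rows: iterable of (trade_date, disclosure_date) pairs (hypothetical shape)."""
    valid, quarantined = [], 0
    for trade_date, disclosure_date in rows:
        lag = (disclosure_date - trade_date).days
        # Assumption: negative and implausibly large lags are both set aside
        # and skipped by every time-based aggregate.
        if lag < 0 or lag > MAX_PLAUSIBLE_LAG_DAYS:
            quarantined += 1
            continue
        valid.append(lag)
    if not valid:
        return {"median": None, "mean": None, "pct_late": None,
                "quarantined": quarantined}
    late = sum(1 for lag in valid if lag > DEADLINE_DAYS)
    return {
        "median": median(valid),
        "mean": sum(valid) / len(valid),
        "pct_late": 100 * late / len(valid),
        "quarantined": quarantined,
    }
```

Reporting the median alongside the mean makes the heavy tail visible: a handful of very late filings moves the mean far more than the median.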
How simulator returns are computed
- Entry price: adjusted close on trade_date (or nearest prior trading day if trade_date was a market holiday/weekend).
- Exit price: adjusted close on most recent trading day.
- Adjusted close handles stock splits and dividend payments, so a 4-for-1 split doesn't show as a -75% drop.
- Hypothetical return = (exit - entry) / entry × 100.
- Trade size: STOCK Act discloses ranges ($1,001-$15,000, $15,001-$50,000, etc.) not exact amounts. Simulator uses the midpoint of each range. True trade size can differ substantially from the midpoint: by up to about ±87% in the $1,001-$15,000 range and about ±54% in the $15,001-$50,000 range.
- We compare against SPY (S&P 500 ETF) total return over the same window.
- "If you copied this politician" assumes equal dollar weight per trade, on the disclosure date. Disclosure date is the earliest a public observer could have known about the trade; using trade_date here would be lookahead bias.
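A minimal sketch of the return arithmetic described above, assuming prices are already adjusted closes. The function names are hypothetical and chosen for illustration; the site's simulator may structure this differently.

```python
def range_midpoint(low: float, high: float) -> float:
    """Midpoint of a disclosed STOCK Act amount range."""
    return (low + high) / 2


def pct_return(entry: float, exit_price: float) -> float:
    """Hypothetical return in percent: (exit - entry) / entry * 100."""
    return (exit_price - entry) / entry * 100


def excess_over_spy(entry, exit_price, spy_entry, spy_exit) -> float:
    """Return of the position minus SPY's return over the same window."""
    return pct_return(entry, exit_price) - pct_return(spy_entry, spy_exit)


def copy_portfolio_return(trades) -> float:
    """Equal dollar weight per trade, so the portfolio return is the plain
    average of per-trade returns.

    trades: list of (entry, exit) adjusted closes, with the entry priced on
    the disclosure date to avoid lookahead bias.
    """
    returns = [pct_return(entry, exit_price) for entry, exit_price in trades]
    return sum(returns) / len(returns)
```

For example, `pct_return(100, 125)` gives 25.0, and `range_midpoint(1001, 15000)` gives 8000.5.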
How conflict scoring works
- Each politician's committee assignments are stored in committees.json (sourced from clerk.house.gov primary).
- Each committee has a set of sector jurisdictions (e.g. House Armed Services covers Defense, Aerospace, Industrials, Technology, Cybersecurity, Communications).
- Each ticker has a GICS sector classification (sourced from Finnhub).
- A trade is "conflicted" if the politician sits on a committee whose sectors include the ticker's GICS sector.
- "High-Conflict Activity" cluster: 3+ politicians trade the same ticker within 7 days, AND at least one of them has a committee jurisdiction overlap with that ticker's sector.
- Conflict scoring is mechanical, not editorial. We don't claim individual conflicts are unethical — we surface the overlap. Readers draw their own conclusions.
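The two rules above (per-trade conflict and the "High-Conflict Activity" cluster) can be sketched as follows. Everything here is an assumption made for illustration: the `Trade` fields, the two lookup dictionaries, and the reading of "within 7 days" as a window whose first and last trades are at most 7 days apart. Which date field the site clusters on is not specified, so the sketch uses a generic `day`.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Trade:
    politician: str
    ticker: str
    sector: str  # sector classification of the ticker
    day: date    # assumption: the date used for clustering


def is_conflicted(trade, committees_by_politician, sectors_by_committee):
    """True if any of the politician's committees covers the ticker's sector."""
    for committee in committees_by_politician.get(trade.politician, ()):
        if trade.sector in sectors_by_committee.get(committee, ()):
            return True
    return False


def high_conflict_tickers(trades, committees_by_politician, sectors_by_committee,
                          window_days=7, min_politicians=3):
    """Tickers traded by 3+ politicians within the window, with at least one
    of those trades carrying a committee overlap."""
    by_ticker = {}
    for trade in trades:
        by_ticker.setdefault(trade.ticker, []).append(trade)

    flagged = set()
    for ticker, group in by_ticker.items():
        group.sort(key=lambda t: t.day)
        for i, start in enumerate(group):
            # All trades from this start date up to window_days later.
            window = [t for t in group[i:]
                      if (t.day - start.day).days <= window_days]
            politicians = {t.politician for t in window}
            has_overlap = any(
                is_conflicted(t, committees_by_politician, sectors_by_committee)
                for t in window
            )
            if len(politicians) >= min_politicians and has_overlap:
                flagged.add(ticker)
                break
    return flagged
```

The check is purely mechanical, matching the text: it reports overlap and makes no judgment about intent.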
Committee jurisdiction overlap (Panel 4)
Panel 4 on the homepage — "trades within member's committee jurisdiction" — counts trades disclosed in the last 7 days where the trader sits on a committee whose jurisdiction sectors include the security's GICS sector. Same join logic as the High-Conflict Activity widget above, but applied per-trade rather than per-cluster, so the count answers "how many recent trades carry committee overlap" not "how many clusters."
- For every trade with a non-null ticker disclosed in the last 7 days, we resolve the ticker to a GICS sector (preferring trades.sector, falling back to a static map).
- We batch-pull the trader's committee assignments from politician_committees.
- The trade is counted if any of those committees has a jurisdiction sector that exactly matches the ticker's sector (case-insensitive, no adjacent-sector dilution). One trade counts once even when the trader sits on multiple matching committees.
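The three steps above could look roughly like this in Python. It is a sketch under assumptions, not the production query: the dictionary keys (`politician`, `ticker`, `sector`, `disclosure_date`), the in-memory lookups standing in for `trades.sector` and `politician_committees`, and the function name are all hypothetical.

```python
from datetime import date, timedelta


def count_overlap_trades(trades, committees_by_member, sectors_by_committee,
                         static_sector_map, today, window_days=7):
    """Count recent trades whose sector matches any committee of the trader."""
    cutoff = today - timedelta(days=window_days)
    count = 0
    for trade in trades:
        ticker = trade.get("ticker")
        if not ticker or trade["disclosure_date"] < cutoff:
            continue
        # Prefer the sector stored on the trade, fall back to the static map.
        sector = trade.get("sector") or static_sector_map.get(ticker)
        if not sector:
            continue
        # Collapse all of the member's committee sectors into one set, so a
        # trade counts once even if several committees match.
        member_sectors = {
            s.casefold()
            for committee in committees_by_member.get(trade["politician"], ())
            for s in sectors_by_committee.get(committee, ())
        }
        if sector.casefold() in member_sectors:  # exact, case-insensitive match
            count += 1
    return count
```

A member missing from `committees_by_member` contributes nothing to the count, which is why the resulting number is a floor.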
Coverage (read this before quoting the number)
Backfill from the @unitedstates/congress-legislators committee-membership feed plus a server-side legal-name alias mirror last ran 2026-05-06. Current coverage:
- 7 of 7 (100%) distinct members who disclosed trades with a ticker in the last 7 days are covered. The remaining client-side display-vs-legal mismatches (e.g. Lizzie/Elizabeth Fletcher, Felix Barry Moore, John P Ricketts, Ladda Tammy Duckworth, Nanette Barragan, Rafael E Cruz) are now mirrored into politician_committees under both forms via a small bioguide-keyed alias map in the backfill script.
- 203 of 380 (53%) distinct trade-name strings ever seen in our trades table are covered. The remaining 177 are almost entirely historical members who left Congress in 2018–2024 and rarely disclose now, so they don't affect the recent-window panel numbers.
- Coverage progression: 102 / 379 (27%) pre-backfill → 196 / 379 (52%) after the upstream backfill (CTA-17) → 203 / 380 (53%) after the legal-name alias mirror (CTA-19). Panel 4 count progression on the 7-day window: 11 → 23 → 31 (+182% from baseline) on the same query logic.
The methodology is unchanged — only coverage changes. The Panel 4 number remains a floor, not a ceiling: any uncovered member with genuine jurisdictional overlap is silently missed. We re-run the backfill whenever upstream membership data updates or new alias mismatches surface in trades.
How the benchmark works
- 28 questions, hand-authored, with ground truth computed live from the D1 database at run time (not cached, not pre-baked).
- Two conditions: cold (no tools, model answers from training data only) and MCP-augmented (model has access to our 12-tool MCP server).
- Models tested: Claude Opus 4.6, Sonnet 4.6, GPT-5, Grok-4 (automated harness, last run 2026-04-17).
- One additional row: Claude Opus 4.7 hand-fed via chat (manual, single run, not comparable to the automated rows — labeled as such).
- Scoring: numeric tolerance (±5%) for number questions, Jaccard set overlap for list questions, LLM-as-judge (Claude Haiku) for free-text questions.
- Results are reproducible: harness, questions, and scoring scripts are in the repo.
- Disclosure lag stats computed on records with valid trade_date and disclosure_date (corrupt date entries excluded).
Full leaderboard and per-category scores at /benchmark.
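The two deterministic scoring rules (numeric tolerance and Jaccard overlap) can be illustrated with a short sketch. The function names and the handling of edge cases (a ground truth of zero, two empty lists) are assumptions; the harness in the repo is the authoritative version. The LLM-as-judge path is not shown.

```python
def numeric_match(answer: float, truth: float, tolerance: float = 0.05) -> bool:
    """True if the answer is within ±5% of the ground truth."""
    if truth == 0:
        # Assumption: with a zero ground truth, only an exact zero passes.
        return answer == 0
    return abs(answer - truth) / abs(truth) <= tolerance


def jaccard(answer_items, truth_items) -> float:
    """Set overlap: |A ∩ B| / |A ∪ B|, ignoring order and duplicates."""
    a, b = set(answer_items), set(truth_items)
    if not a and not b:
        # Assumption: two empty lists count as a perfect match.
        return 1.0
    return len(a & b) / len(a | b)
```

For example, an answer of 28 against a ground truth of 27 passes the tolerance check (about 3.7% off), while 30 does not (about 11% off). Two three-item lists sharing two tickers score 2 / 4 = 0.5.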
Limitations
- Trade size is range-based, not exact (STOCK Act limitation).
- Disclosure lag is bounded by what Congress files — late filings inflate the long tail.
- Capitol Trades is the primary source; if their site changes format we go blind until we adapt.
- Benchmark covers 28 questions; not exhaustive coverage of "everything an LLM could be asked about congressional trading."
- Return calculations assume each position has been held since the trade. Real politicians may have sold positions we can't see yet if the sale hasn't been disclosed (45-day window).
- Public API endpoints are rate-limited per IP to prevent abuse. See /mcp-docs for MCP-specific limits.
Reproducibility
- Public dataset: every trade visible on the site with date, politician, ticker, amount range, source-link to the original filing.
- Open MCP server at /mcp (no auth, JSON-RPC).
- Benchmark harness, questions, scoring scripts, and frozen result snapshots in the repo: https://github.com/freshcod3s/congress-trade-alerts.
- All copy on the site that cites a number is traceable to a query against the public dataset.
Corrections
Found an error? Email us at congresstradealertsapp@gmail.com.