Last run: 2026-06-08 15:56 UTC · 5 models
How well do AI models know congressional trading?
Same 28 factual questions. Every major frontier model. Each runs twice: once from memory alone ("cold"), once with access to our MCP server as its only tool. Ground truth is computed live from our database at the moment we ask.
In this run, gpt-5 was the only model with scored answers: it scored 0% cold (0/26) and 38% (10/26) with our MCP server, the best MCP score across all models.
That gap is what an MCP server buys you: turning confident guesswork into live, verifiable data.
Methodology
Each model answers 28 questions twice: once with no tools (pure parametric knowledge), once with our MCP server wired up as its only tool. Ground truth is computed at run time by querying our D1 database directly, so the benchmark tests against current data, not a frozen snapshot.
Numeric answers are correct on an exact match; averages and percentages are also accepted within ±5%. List answers use set overlap (Jaccard index ≥ 0.8). Free-text answers are graded by Claude Haiku 4.5 as an LLM-as-judge.
A question is excluded from the denominator when the ground-truth query returns empty at run time (e.g. a not-yet-populated sector table) or when the model API fails after 3 retries, so a network blip or a missing data slice doesn't count against a model's accuracy. The scored denominator is shown next to each percentage. The full question set and scoring code are in the repo.
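The numeric and list rules above can be sketched as follows. This is a minimal illustration, not the repo's scoring code: the function names are hypothetical, the ±5% tolerance is assumed to be relative to the ground-truth value, and the case-insensitive name matching is an assumption.

```python
from typing import Iterable


def numeric_correct(answer: float, truth: float, is_average_or_pct: bool) -> bool:
    """Exact match, or (assumed relative) ±5% tolerance for averages/percentages."""
    if answer == truth:
        return True
    if is_average_or_pct:
        return abs(answer - truth) <= 0.05 * abs(truth)
    return False


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| on normalized strings (normalization is an assumption)."""
    sa = {x.strip().lower() for x in a}
    sb = {x.strip().lower() for x in b}
    if not sa and not sb:
        return 1.0  # two empty sets are treated as identical
    return len(sa & sb) / len(sa | sb)


def list_correct(answer: Iterable[str], truth: Iterable[str], threshold: float = 0.8) -> bool:
    """A list answer passes when its Jaccard index meets the threshold."""
    return jaccard(answer, truth) >= threshold
```

One consequence of the 0.8 threshold: for a five-item ground truth, a five-item answer with one wrong name has a Jaccard index of 4/6 ≈ 0.67 and fails, while a four-item answer with no wrong names scores 4/5 = 0.8 and passes.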
Leaderboard
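Sorted by cold-mode accuracy: how well each model does without tools. "MCP" is the same model and the same questions, but with access to our MCP server. Each percentage is followed by its correct/scored counts. A 0/0 entry means every question was excluded from that model's denominator (see Methodology), so its 0% reflects no scored answers rather than wrong ones.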
| # | Model | Cold | With MCP | Delta | Last run |
|---|---|---|---|---|---|
| 1 | claude-opus-4-6 | 0% (0/0) | 0% (0/0) | 0 pts | Jun 8, 2026 UTC |
| 2 | claude-sonnet-4-6 | 0% (0/0) | 0% (0/0) | 0 pts | Jun 8, 2026 UTC |
| 3 | gemini-2.5-pro | 0% (0/0) | 0% (0/0) | 0 pts | Jun 8, 2026 UTC |
| 4 | gpt-5 | 0% (0/26) | 38% (10/26) | +38 pts | Jun 8, 2026 UTC |
| 5 | grok-4 | 0% (0/0) | 0% (0/0) | 0 pts | Jun 8, 2026 UTC |
C/D MCP tool fixes deployed 2026-04-18 (UTC) (commit 203edcf, Worker version e0d54de0). Effect not yet measured — leaderboard above predates this deploy.
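The percentage and Delta columns follow from the correct/scored counts in each row. The sketch below reproduces the gpt-5 row; the helper names are hypothetical, and rounding to the nearest whole percent is an assumption inferred from the displayed values. Unlike the table, which displays 0% for a 0/0 row, this sketch returns `None` for an empty denominator.

```python
from typing import Optional, Tuple


def pct(correct: int, scored: int) -> Optional[int]:
    """Accuracy as a whole percent, or None when nothing was scored."""
    if scored == 0:
        return None
    return round(100 * correct / scored)


def delta_pts(cold: Tuple[int, int], mcp: Tuple[int, int]) -> Optional[int]:
    """Percentage-point gain from cold mode to MCP mode."""
    cold_pct = pct(*cold)
    mcp_pct = pct(*mcp)
    if cold_pct is None or mcp_pct is None:
        return None
    return mcp_pct - cold_pct


# gpt-5 row from the leaderboard: 0/26 cold, 10/26 with MCP.
print(pct(0, 26), pct(10, 26), delta_pts((0, 26), (10, 26)))  # 0 38 38
```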
Category breakdown (cold-mode accuracy)
Where models fall down without tools. Each cell shows correct/total for that category.
| Model | Aggregate | Chamber & Party | Member-Level | Ticker-Level | Committee-Level |
|---|---|---|---|---|---|
| claude-opus-4-6 | — | — | — | — | — |
| claude-sonnet-4-6 | — | — | — | — | — |
| gemini-2.5-pro | — | — | — | — | — |
| gpt-5 | 0% (0/3) | 0% (0/4) | 0% (0/11) | 0% (0/4) | 0% (0/4) |
| grok-4 | — | — | — | — | — |
Questions every AI got wrong (cold mode)
These are the questions where parametric knowledge fails universally — and where MCP tools earn their keep.
- top traders: How many stock trades has the most-active House member disclosed in the last 90 days? Correct answer from our data: 335
- top traders: List the top 5 members of the U.S. Congress by number of stock trades disclosed in the last 90 days. Correct answer from our data: Rohit Khanna, Michael McCaul, Gilbert Cisneros, April McClain Delaney, Richard Blumenthal
- member profile: How many total stock trades has Nancy Pelosi disclosed across her career? Correct answer from our data: 156
- member profile: What is Tommy Tuberville's average return percentage across all his disclosed stock trades? Correct answer from our data: 31.36
- committee: What percentage of the Senate Finance Committee has disclosed a stock trade in the last 90 days? Give a percentage number. Correct answer from our data: 8.1
- ticker: How many stock trades has Nancy Pelosi disclosed in the last 90 days? Correct answer from our data: 0
- aggregate: How many total stock trades has the U.S. Congress disclosed all-time in this tracked database? Correct answer from our data: 37,944
- aggregate: How many unique members of the U.S. Congress have disclosed at least one stock trade? Correct answer from our data: 395
- aggregate: What is the average disclosure lag in days for congressional stock trades (the gap between trade_date and disclosure_date)? Correct answer from our data: 71.4
- party chamber: What percentage of all disclosed congressional stock trades come from the Senate versus the House? Give two percentages. Correct answer from our data: Senate 17.7%, House 82.3%
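To make two of the aggregate definitions above concrete, here is a small sketch of how an average disclosure lag and a chamber split could be computed. The three rows are made-up sample data, not values from the database; `trade_date` and `disclosure_date` are the field names used in the question, while the `chamber` field and the function names are assumptions.

```python
from datetime import date

# Hypothetical sample rows, for illustration only.
trades = [
    {"chamber": "House", "trade_date": date(2026, 1, 5), "disclosure_date": date(2026, 2, 4)},
    {"chamber": "House", "trade_date": date(2026, 3, 1), "disclosure_date": date(2026, 4, 15)},
    {"chamber": "Senate", "trade_date": date(2026, 2, 10), "disclosure_date": date(2026, 3, 27)},
]


def average_disclosure_lag_days(rows):
    """Mean gap in days between trade_date and disclosure_date."""
    lags = [(r["disclosure_date"] - r["trade_date"]).days for r in rows]
    return sum(lags) / len(lags)


def chamber_share(rows):
    """Percentage of trades per chamber, rounded to one decimal place."""
    counts = {}
    for r in rows:
        counts[r["chamber"]] = counts.get(r["chamber"], 0) + 1
    return {c: round(100 * n / len(rows), 1) for c, n in counts.items()}


print(average_disclosure_lag_days(trades))  # 40.0
print(chamber_share(trades))  # {'House': 66.7, 'Senate': 33.3}
```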
Connect your AI to our data
One MCP endpoint. 12 tools. Same data that powers this benchmark. Works with Claude Desktop, Cursor, Claude Code, and any MCP-compatible client.