Plumb
B-

AI/ML research SOTA leaderboards

Papers with Code

Meta (Facebook AI / FAIR); founded by Robert Stojnic and Ross Taylor

Benchmark · Free to read

A free, ad-free, open-data leaderboard for AI research that nobody could pay to top. However, its benchmark scores were self-reported from papers rather than independently re-run, and Meta sunset the site in July 2025.

What it's really for: A free, openly licensed research SOTA tracker (now archived).

What our grade covers: The grade on this page covers only its state-of-the-art ML leaderboards by task, not everything the site does.

Scoring confidence: High. Checked against primary sources. We are confident in the facts and the grade here.

Operating since
2018 (8 years)
What it costs you
Free to read. The leaderboards were free to read.
How they make money
It made no money. It was a free, advertising-free public resource run by Meta AI, with all content openly licensed under CC-BY-SA.
What they do
It aggregated machine-learning papers with their open-source code and ranked methods on benchmark "state-of-the-art" (SOTA) leaderboards by task, dataset and metric.
What to watch for
Leaderboard scores were taken from what papers reported and entries were openly community-editable, so results were not independently re-tested or audited by the platform.
Composite score
2.90 / 5.00 → grade B-

How the grade was reached

Independence · 30% weight 2 / 5

Does the site take money from the very entities it ranks? Pay-for-placement, vendor-funded data, and affiliate commissions all pull this down. The less the ranking can be bought, the higher the score.

Evidence basis · 30% weight 3 / 5

What is the ranking actually built on? Hands-on testing scores highest, then verified first-hand reviews, then opinion or popularity surveys and self-reported figures, then pay-to-rank, which scores lowest.

Method transparency · 20% weight 4 / 5

Is the methodology published, specific, and reproducible? Can a reader see how a given rank was reached, or is it a black box?

Conflict disclosure · 10% weight 4 / 5

Are commercial relationships, sponsorships, and affiliate arrangements disclosed clearly and near the rankings themselves, rather than buried?

Manipulation resistance · 10% weight 2 / 5

How hard is it to game? Controls against fake reviews, solicited reviews, and vendor gaming raise this; an open box anyone can stuff lowers it.
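
The composite score appears to be a weighted average of the five criterion scores listed above. The page does not state the formula, so the sketch below assumes a plain weighted sum; the names `RUBRIC` and `composite_score` are hypothetical and used only for illustration. Under that assumption the arithmetic reproduces the 2.90 shown on this page. The mapping from a numeric score to a letter grade is not described here, so it is left out.

```python
import math

# Weights and scores copied from the rubric on this page.
# Each entry is (weight, score out of 5).
RUBRIC = {
    "Independence": (0.30, 2),
    "Evidence basis": (0.30, 3),
    "Method transparency": (0.20, 4),
    "Conflict disclosure": (0.10, 4),
    "Manipulation resistance": (0.10, 2),
}


def composite_score(rubric):
    """Return the weighted sum of criterion scores.

    Assumption: the composite is a simple weighted average,
    which requires the weights to sum to 1.
    """
    total_weight = sum(weight for weight, _ in rubric.values())
    if not math.isclose(total_weight, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return sum(weight * score for weight, score in rubric.values())


score = composite_score(RUBRIC)
print(f"{score:.2f} / 5.00")
```

Written out, the sum is 0.30 × 2 + 0.30 × 3 + 0.20 × 4 + 0.10 × 4 + 0.10 × 2 = 0.60 + 0.90 + 0.80 + 0.40 + 0.20 = 2.90. The two heaviest criteria (Independence and Evidence basis) account for 60% of the weight, which is why their scores of 2 and 3 hold the composite below 3 despite the two 4s.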
