The Blatsheet

Numbers, narratives, and unsolicited opinion about basketball statistics and visualization.

Filtering news

A player's name appears in an article. The crawler sees it, matches it, stores it. Now that article shows up on the player's news feed. Simple. Except the article isn't about them.

"Lakers finalise trade package centred around Anthony Davis" mentions LeBron James in paragraph four. A passing reference. Context, not subject. But the crawler doesn't know the difference between being the story and being mentioned in one.

The mention problem

Name-matching is binary. The name is either in the text or it isn't. What we actually need is relevance — is this article meaningfully about this player, or does it just reference them in passing?

A headline mention is strong signal. A first-paragraph mention is decent. A mention buried in paragraph six alongside fifteen other names is noise. But automating that distinction requires understanding article structure, not just scanning for strings.

What we do now

The current system uses headline fingerprinting and Jaccard similarity to deduplicate stories across sources. If ESPN and Yahoo both cover the same trade, only the version from the highest-priority source survives. That part works.
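A rough sketch of what Jaccard-based deduplication can look like. This is not the production code: the tokenizer, the 0.6 threshold, and the `priority` field (lower number wins) are all assumptions made for illustration.

```python
import re

def headline_tokens(headline):
    """Lowercase a headline and return its set of alphanumeric word tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", headline.lower()))

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def dedupe(articles, threshold=0.6):
    """Keep one article per story, preferring the lowest `priority` number.

    `threshold` and the `priority` field are hypothetical, not the site's
    actual settings.
    """
    kept = []
    for art in sorted(articles, key=lambda a: a["priority"]):
        toks = headline_tokens(art["headline"])
        # Keep the article only if it is not too similar to anything kept so far.
        if all(jaccard(toks, headline_tokens(k["headline"])) < threshold for k in kept):
            kept.append(art)
    return kept

articles = [
    {"source": "Yahoo", "priority": 2,
     "headline": "Lakers finalise trade package around Anthony Davis"},
    {"source": "ESPN", "priority": 1,
     "headline": "Lakers finalise trade package centred around Anthony Davis"},
    {"source": "Yahoo", "priority": 2,
     "headline": "Celtics sign veteran guard"},
]
survivors = dedupe(articles)
```

The two trade headlines share seven of eight distinct words, so only the higher-priority copy survives. The unrelated headline passes through untouched.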

What doesn't work well is the relevance filter. A player mentioned once in a 2,000-word roundup gets the same treatment as a player who is the sole subject of a profile piece. Both get stored. Both show up. The feed fills with tangential mentions that dilute the signal.

Where this goes

The honest answer is that filtering context — distinguishing subject from mention — remains an ongoing challenge. The options are positional weighting (headline > lead > body), mention density (one name in 200 words vs. one name in 2,000), or co-occurrence patterns (is this player mentioned alongside their team's activity, or just namedropped?).

None of these are clean. All of them are better than what we have, which is: if the name appears, the article counts. Working on it.
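For the curious, here is a minimal sketch of how positional weighting and mention density might combine into one score. Nothing here is implemented yet. The weights, the additive formula, and the function name `relevance_score` are assumptions for illustration only.

```python
def relevance_score(name, headline, paragraphs,
                    w_headline=3.0, w_lead=2.0, w_body=1.0):
    """Score how much an article is about `name`.

    Positional part: the weight of the most prominent place the name appears
    (headline > lead paragraph > body). Density part: mentions per 100 words
    of body text. Both the weights and the sum are hypothetical choices.
    """
    needle = name.lower()
    positional = 0.0
    if needle in headline.lower():
        positional = w_headline
    elif paragraphs and needle in paragraphs[0].lower():
        positional = w_lead
    elif any(needle in p.lower() for p in paragraphs[1:]):
        positional = w_body

    mentions = sum(p.lower().count(needle) for p in paragraphs)
    total_words = sum(len(p.split()) for p in paragraphs)
    density = 100.0 * mentions / total_words if total_words else 0.0
    return positional + density

headline = "Lakers finalise trade package centred around Anthony Davis"
paragraphs = [
    "Anthony Davis is on the move.",
    "The deal took weeks.",
    "Anthony Davis had asked for clarity.",
    "LeBron James was informed of the trade.",
]
davis = relevance_score("Anthony Davis", headline, paragraphs)
lebron = relevance_score("LeBron James", headline, paragraphs)
```

On this toy article the subject outscores the passing mention, which is the whole point. A real version would still need a cutoff, and picking that cutoff is the part that isn't clean.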

The player who never misses vs. the player who never plays

Why TCR and PGR tell different stories about the same season

Two numbers sit side by side on every player page. TCR. PGR. One measures what you did all season. The other measures what you did when you showed up. In most cases they track together. When they diverge, they reveal something the box score hides.

What they are

PGR is Per Game Rank. It takes a player's per-game averages across nine categories—points, rebounds, assists, steals, blocks, threes, turnovers, FG impact, FT impact—converts each to a z-score against the league, and ranks the composite. It answers: how good is this player on any given night?

TCR is Total Contribution Rank. Same nine categories, but using season totals instead of averages. Benchmarked against the top 170 players by minutes. It answers: how much has this player actually contributed to your fantasy team this year?
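A toy version of the idea, to make the averages-versus-totals difference concrete. This is a sketch under stated assumptions, not the site's actual formula: it sums z-scores with equal weight, flips the sign on turnovers, uses population standard deviation, uses three categories instead of nine, and skips the top-170-by-minutes benchmark. The players and numbers are made up.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize values against their own mean and population std dev."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]

def composite_rank(players, categories, negative=("turnovers",)):
    """Rank players (1 = best) by the sum of per-category z-scores.

    Categories listed in `negative` count against a player (assumption).
    """
    names = list(players)
    totals = {n: 0.0 for n in names}
    for cat in categories:
        sign = -1.0 if cat in negative else 1.0
        for n, z in zip(names, zscores([players[n][cat] for n in names])):
            totals[n] += sign * z
    ordered = sorted(names, key=totals.get, reverse=True)
    return {n: i + 1 for i, n in enumerate(ordered)}

categories = ["points", "blocks", "turnovers"]
per_game = {  # hypothetical per-game averages
    "A": {"points": 27.0, "blocks": 1.6, "turnovers": 1.8},
    "B": {"points": 9.0, "blocks": 1.8, "turnovers": 1.0},
    "C": {"points": 16.0, "blocks": 0.6, "turnovers": 2.5},
}
games = {"A": 30, "B": 75, "C": 70}  # hypothetical games played

# Season totals are just per-game averages scaled by games played.
totals = {n: {c: per_game[n][c] * games[n] for c in categories} for n in per_game}

pgr = composite_rank(per_game, categories)  # per-game style rank
tcr = composite_rank(totals, categories)    # season-total style rank
```

Player A is the best of the three on a per-game basis and B is second. On season totals they swap, because B played 75 games to A's 30.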

The gap between them is where the story lives.

The case for showing up

Jay Huff
IND · C · 28 years old
Games played: 75
Minutes/game: 20.8
Points: 9.3
Rebounds: 3.8
Assists: 1.3
Blocks: 1.8
FG%: 47.3%
FT%: 81.9%
TCR 56
PGR 133
Gap +77

Jay Huff is not a star. 9.3 points, 3.8 boards, 1.8 blocks in 21 minutes off the bench. His per-game rank says he is the 133rd best fantasy player in the league. That's barely rosterable.

His season rank says 56th. That puts him ahead of players you actually drafted. Players you traded for. Players you talk about. The reason is the first number on his card: 75 games played. Huff shows up. Game after game after game. Those 1.8 blocks per night compound into a season-long blocks total that outpaces players with better per-game numbers who sat out 20 games.

In a fantasy season that runs six months, availability is not just the best ability. It is the entire argument.

The case for not showing up

Kristaps Porziņģis
GSW · F-C · 30 years old
Games played: 28
Minutes/game: 23.9
Points: 17.3
Rebounds: 4.9
Assists: 2.6
Blocks: 1.3
FG%: 45.4%
FT%: 83.1%
PGR 48
TCR 254
Gap −206

Kristaps Porziņģis has played 28 games this season. When he plays, he is the 48th best fantasy player in basketball. But he has missed more games than he has played. His season total contribution ranks him 254th. That is not a roster spot. That is a memory.

This is the trap that PGR sets for you. Porziņģis looks like a top-50 player. He plays like a top-50 player. But he plays less than half the season, and fantasy does not give credit for games you could have played.

The outliers

The most dramatic gaps reveal the season's truths. Joel Embiid: PGR 14, TCR 136. When he plays he is elite. He has played 36 games. Jimmy Butler: PGR 16, TCR 122. Anthony Davis: PGR 31, TCR 278. Paul George: PGR 37, TCR 222. Talent does not expire. But it does depreciate when it sits on the bench in a suit.

Player                     GP   PGR   TCR    Gap
Joel Embiid                36    14   136   −122
Jimmy Butler III           38    16   122   −106
Anthony Davis              20    31   278   −247
Paul George                30    37   222   −185
Jay Huff                   75   133    56    +77
Nickeil Alexander-Walker   73    24    13    +11
Mikal Bridges              75    35    15    +20
Desmond Bane               74    28    17    +11

The bottom half of that table is the longevity argument. Mikal Bridges at PGR 35 is not exciting. At TCR 15 he is one of the most valuable fantasy assets in basketball, because he played 75 games and his contributions compounded quietly over six months.

What to do with this

If you play weekly head-to-head, PGR matters more on any given matchup. But if you are building a roster for the long season—or trading mid-year—TCR is the number that tells the truth. A player who gives you 80% of the value for 95% of the games will outscore a player who gives you 100% of the value for 40% of the games.
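The arithmetic behind that claim, as a quick sketch. It assumes season value is simply per-game value multiplied by the share of games played, which is a simplification. The function name is made up for the example.

```python
def season_value(per_game_value, share_of_games):
    """Season contribution as per-game value times share of games played."""
    return per_game_value * share_of_games

durable = season_value(0.80, 0.95)  # 80% of the value, 95% of the games
fragile = season_value(1.00, 0.40)  # 100% of the value, 40% of the games

print(f"durable: {durable:.2f}, fragile: {fragile:.2f}")  # durable: 0.76, fragile: 0.40
```

Under that assumption the durable player delivers nearly twice the season value of the better player who keeps sitting out.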

You can explore both ranks on the Players page, sorted and filtered however you like. The gap between the two columns is the season's real story.

The rule of thumb: a strong TCR with a mediocre PGR marks a player your league undervalues. A strong PGR with a weak TCR marks a player your league overvalues.

Removing the minimums

The NBA uses qualification thresholds for percentage stats. 300 field goals made for FG%. 82 threes made for 3P%. 125 free throws made for FT%. These numbers exist so a player who goes 2-for-2 from three does not lead the league at 100%.

Fair enough for a full season. 82 games. Months of data. The thresholds filter noise.

Shrink the window to January and those numbers collapse. Nobody hits 300 field goals in a month. The thresholds that protected signal become the thing that kills it. Every player fails qualification. The leaderboard empties.

What we changed

When a custom date range is active, qualification minimums are removed entirely. All players with games in the window are eligible. FG%, 3P%, FT% leaders reflect the actual sample. Small sample, yes. Misleading, possibly. But visible. The user chose the window. The data should respect it.

A quiet line appears below the stats table: Qualification minimums removed for custom date range. No modal. No tooltip. Just a fact.
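In code, the rule is about as small as it sounds. A minimal sketch, assuming player rows are plain dicts. The field names (`fgm`, `fg3m`, `ftm`, `games`) and function names are hypothetical, not the site's actual schema.

```python
# Full-season minimums from the thresholds described above: stat -> (made column, minimum)
FULL_SEASON_MINIMUMS = {
    "fg_pct": ("fgm", 300),
    "fg3_pct": ("fg3m", 82),
    "ft_pct": ("ftm", 125),
}

def eligible_players(players, stat, custom_range_active):
    """Return the players allowed on the leaderboard for a percentage stat."""
    if custom_range_active:
        # Custom window: no minimums, anyone with a game in the window counts.
        return [p for p in players if p["games"] > 0]
    made_key, minimum = FULL_SEASON_MINIMUMS[stat]
    return [p for p in players if p[made_key] >= minimum]

def footnote(custom_range_active):
    """The quiet line shown below the stats table, or nothing."""
    if custom_range_active:
        return "Qualification minimums removed for custom date range."
    return ""

players = [  # made-up rows
    {"name": "Starter", "games": 70, "fgm": 420, "fg3m": 110, "ftm": 200},
    {"name": "Specialist", "games": 12, "fgm": 30, "fg3m": 20, "ftm": 5},
    {"name": "Inactive", "games": 0, "fgm": 0, "fg3m": 0, "ftm": 0},
]
full_season = eligible_players(players, "fg_pct", custom_range_active=False)
custom = eligible_players(players, "fg_pct", custom_range_active=True)
```

With the full-season rule only the starter qualifies. With a custom range active, both players who appeared in a game are eligible and the footnote is shown.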

Why it matters

Thresholds are not universal constants. They are calibrated to a specific scope. Change the scope, recalibrate the threshold. In our case, recalibrating to zero was the honest answer.