Port of Context
Industry Analysis

Welcome to the Slop KPI Era: How Tokenmaxxing is Making AI Worse

Tokenmaxxing is a new productivity metric, and it's making AI output worse. Inside the Slop KPI Era and the context bloat problem nobody's measuring.

Misha Lanin
Head of Growth
"THE SLOP KPI ERA" in bold blue retro poster lettering beside a green dumpster overflowing with yellow star-shaped tokens, set against a financial stock chart background.

On a recent episode of the All-In Podcast, Jensen Huang, the CEO of NVIDIA—a company with a market cap equivalent to that of 1.5 Frances, 4 Polands, or 12 Finlands—said this: "If that $500,000 engineer did not consume at least $250,000 worth of tokens, I am going to be deeply alarmed."

At Meta, a smaller company worth just 4 Finlands, engineers have begun the process of "Tokenmaxxing." They've even started a leaderboard to see who's burning the most tokens. This is because it is unequivocally true that the more tokens you use, the more productive you're being.

For this same reason, it is widely recognized by connoisseurs that Everclear Neutral Grain Alcohol—a digestif that comes in at 95% ABV and also works well as a hospital-grade antiseptic—is the superior alcoholic beverage. By comparison, a bottle of 1937 Domaine de la Romanée-Conti, a red wine that survived 89 years, including all of World War II, in a temperature-controlled cave in Southern France, comes in at just 13% ABV. That makes the Everclear over 7 times better than the château whatever-it's-called.

Linus Torvalds, a man from Finland who also made something called "Linux," may disagree that more equals better. In a conversation that we found on YouTube, he had this to say about those who measure the quality of their engineers by the number of lines of code (LoC) they've written: "Anybody who thinks that's a valid metric is too stupid to work at a tech company." Perhaps the Finnish man has a point.

Imagine you're an engineer at Meta. Does that mean you can just log into Claude Code, fire up the extended-thinking Opus model, and use it to transcribe Tolstoy's War and Peace into Ancient Greek 80,000 times? Would that get you 61.6 billion tokens higher up the leaderboard than, say, the engineer in the next cubicle over—who actually paused to think about what they were doing that day?
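For the curious, the leaderboard math holds up under one loud assumption: that a single Greek transcription of War and Peace runs about 770,000 tokens. That figure is a back-of-the-envelope stand-in, not a measurement; real tokenizer counts for Greek text would differ.

```python
# Back-of-the-envelope check on the leaderboard math. The per-copy token
# count is an assumption, not a measured figure.

TOKENS_PER_COPY = 770_000   # assumed: one Ancient Greek War and Peace
COPIES = 80_000             # from the article's hypothetical

total_tokens = TOKENS_PER_COPY * COPIES
print(f"{total_tokens / 1e9:.1f} billion tokens")  # 61.6 billion tokens
```

At roughly $15 per million output tokens for a frontier model, that single stunt would also clear Jensen Huang's $250,000 bar many times over.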

If our industry is moving toward slop KPIs that reward consumption first, then who's the winner? Is it the Tolstoy fan or the engineer who paused to think? Well, if the goal is to burn half of a Silicon Valley salary on tokens, guess which one's getting the promotion.

The shift is clear: token consumption is now being measured as a proxy for productivity. But who's measuring the quality of what those tokens actually produced? We need a metric for that.

Introducing the Slop Index: The Evaluative Layer for the Tokenmaxxing Era

Here's how the Slop Index works:

  1. Acquire a human being. Instead of using tokens, the Slop Index requires 'neurons', which can only be found in the brain of a human being. The human being can combine its neurons into thoughts, which can then be used to complete a surprising number of tasks.

  2. Prompt the human being. "Use your thoughts to determine whether the output of this AI model is slop or not. Is it useful? Is it stupid? Is it hallucinating? Is it solving a problem? If so, did that problem ever even exist?" And so on.

  3. Receive the verdict. The human being then returns an evaluative output in the form of thoughts, determining: is it slop or not?

*A human being is a hardware requirement for the Slop Index.
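In the spirit of the methodology above, here is a minimal sketch of the Slop Index pipeline. Everything in it is hypothetical: the `HumanBeing` class is a placeholder stub, and its `think` heuristic is no substitute for step 1 (acquire an actual human being).

```python
# A minimal, tongue-in-cheek sketch of the Slop Index. The HumanBeing
# class and its `think` method are hypothetical placeholders; real
# neurons are a hardware requirement and are not included.

class HumanBeing:
    """Step 1: acquire a human being. This stub stands in for one."""

    def think(self, prompt: str) -> bool:
        # A real human combines neurons into thoughts. This placeholder
        # merely flags outputs that confess to being filler text.
        return "lorem ipsum" in prompt.lower()


def slop_index(ai_output: str, human: HumanBeing) -> bool:
    """Return True if the AI output is slop, per the human's verdict."""
    # Step 2: prompt the human being with the evaluation questions.
    prompt = (
        "Use your thoughts to determine whether the output of this AI "
        "model is slop or not. Is it useful? Is it stupid? Is it "
        "hallucinating? Is it solving a problem? If so, did that "
        "problem ever even exist?\n\n" + ai_output
    )
    # Step 3: receive the verdict as an evaluative output (a thought).
    return human.think(prompt)
```

Note the design constraint: unlike a token counter, this metric does not scale horizontally. Each verdict costs one human attention span.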

Context Bloat: The Illogical Conclusion to Tokenmaxxing

Slop KPIs and Tokenmaxxing have ushered in a strange new reality for developers, who are now rewarded for consumption rather than for the quality of their output. This industrial-scale waste campaign is even more absurd when you consider that most current AI workflows, by default, are already burning way more tokens than they need to.

When AI models connect to outside systems (a process called "tool calling"), they are often loaded up with giant bundles of instructions before they've even started the real task. Tell a model to connect directly to the Salesforce MCP server, for example, and it might inherit 50-plus of these tools, complete with detailed instructions on what they are and when to use them—even when most of them are irrelevant to the task at hand. It's like if, every time someone spoke to you, they began with: "Hi, my name is X, and I am going to say something right now. Also, I know how to talk, chew, swim, move my fingers, wiggle my toes, lift my left arm, lift my right leg..." This is what the industry calls context bloat. And while it pays dividends for the tokenmaxxer, it doesn't bode well for the quality of an AI model's output.
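To see how lopsided this gets, here is a rough sketch of the overhead from tool definitions riding along with every request. The tool names and descriptions are invented, and the four-characters-per-token rule of thumb is a crude stand-in for a real tokenizer, but the shape of the problem survives the approximation.

```python
# Rough sketch of context bloat from tool definitions. Tool names and
# descriptions are made up; "~4 characters per token" is a crude
# heuristic, not a real tokenizer.

TOOLS = [
    {
        "name": f"crm_tool_{i}",  # hypothetical tool name
        "description": (
            "Detailed instructions on what this tool is and when "
            "to use it, repeated in every single request."
        ),
    }
    for i in range(50)  # e.g. a server exposing 50-plus tools
]

def rough_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

# Tokens spent describing tools before any real work happens.
overhead = sum(
    rough_tokens(t["name"]) + rough_tokens(t["description"]) for t in TOOLS
)
task = "Look up the renewal date for one account."

print(f"Tool-definition overhead: {overhead} tokens")
print(f"Actual task: {rough_tokens(task)} tokens")
```

Under these (generous) assumptions, the boilerplate outweighs the actual request by well over an order of magnitude, and it is paid again on every single call.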

Even Anthropic, which literally makes its money on tokens, had this to say in a 2025 report: "Context must be treated as a finite resource with diminishing marginal returns" and "As the number of tokens in the context window increases, the model's ability to accurately recall information from that context decreases."

So, what we're left with isn't just a competition to performatively burn as many tokens as possible, but a race to the bottom in terms of the quality of what we're generating. And we'll be stuck in this twilight zone until the industry shifts to measuring output quality over consumption.

But, for now, the Slop Index may be a good place to start.