Claude Code Fixes a 5x Context Window Bug
This episode breaks down a compaction bug that made Claude Code treat Opus 4.7 like it had a 200K-token limit instead of 1M, causing long coding sessions to summarize far too early. It also covers the new default `high` effort setting for Pro and Max users on Opus 4.6 and Sonnet 4.6, and what that means for reasoning quality and token usage.
This show was created with Jellypod, the AI Podcast Studio.
Chapter 1
The 5x compaction bug hiding in plain sight
Lachlan Reed
Welcome to the show! I'm Lachlan Reed with James Turner. James, imagine you're deep in a refactor, you've got this lovely long Claude Code session going, and then somewhere around the 200K-token mark the thing suddenly starts acting like it's had a little bump on the head. [chuckles]
James Turner
[curious] That 200K number is the whole bug, right? Because Claude Code 2.1.117 was treating Opus 4.7 like it had a 200,000-token window, when Opus 4.7's native context is actually 1 MILLION.
Lachlan Reed
Exactly. Five times off. Bit of a shocker. So `/autocompact` could kick in at roughly 200K instead of letting the session stretch out toward 1M. Which means long sessions were getting compacted about five times too early.
James Turner
[matter-of-fact] And early compaction is not just cosmetic. Once the tool summarizes and compresses context, the model is no longer reading the raw thread in the same way. It's reading a reduced version sooner than it should.
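To put some rough numbers on what James is describing, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 90% trigger fraction is made up, and the real trigger logic inside Claude Code isn't discussed on the show. Only the two window sizes come from the episode.

```python
# Toy sketch of the accounting mismatch. The trigger fraction is a
# hypothetical stand-in; only the window sizes are from the episode.
ASSUMED_WINDOW = 200_000   # what Claude Code 2.1.117 budgeted for Opus 4.7
NATIVE_WINDOW = 1_000_000  # Opus 4.7's actual native context window
TRIGGER_FRACTION = 0.9     # assume "compact when ~90% full"

def autocompact_at(window_tokens: int, fraction: float = TRIGGER_FRACTION) -> int:
    """Token count at which autocompaction would fire for a given budget."""
    return int(window_tokens * fraction)

buggy = autocompact_at(ASSUMED_WINDOW)   # fires around 180K tokens
fixed = autocompact_at(NATIVE_WINDOW)    # fires around 900K tokens
print(fixed // buggy)                    # prints 5 -- the "five times off"
```

Whatever the real trigger fraction is, it cancels out of the ratio, which is why the session headroom ends up exactly 5x regardless.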
Lachlan Reed
Yeah, and that's the bit that stings. [pauses] I've had this happen in coding tools where you're halfway through a giant cleanup -- renamed three services, moved config, fixed tests -- and then the assistant starts proposing the OLD structure back at you. Not totally wrong, just wrong enough that you go, "Mate... were you not here ten minutes ago?"
James Turner
[responds quickly] "Were you not here ten minutes ago" is the perfect description. Because a premature summary doesn't look like a crash. It looks like confidence with missing memory.
Lachlan Reed
[laughs softly] Yes! That's why it feels like gaslighting. You know you already settled the dependency direction or the naming convention, and suddenly the tool's like, "Have you considered the exact thing we rejected 40 minutes back?" And now you're doubting your own notes.
James Turner
The token waste matters too. If compaction fires at 200K instead of near 1M, you're forcing the model to re-read summarized context much earlier and potentially more often. So you lose fidelity AND you may burn tokens re-establishing state.
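James's "potentially more often" point can be made concrete with a toy simulation. The chunk size, trigger fraction, and post-compaction summary size below are all invented assumptions; the point is only the shape of the result, not the specific counts.

```python
# Toy simulation: how many times compaction fires over one long session,
# under the buggy 200K budget versus the correct 1M budget.
# Chunk size, trigger fraction, and summary size are made-up assumptions.
def compaction_count(session_tokens: int, window: int,
                     trigger: float = 0.9, summary: int = 20_000) -> int:
    history, fired = 0, 0
    remaining = session_tokens
    step = 1_000                       # feed the session in 1K-token chunks
    while remaining > 0:
        history += step
        remaining -= step
        if history >= window * trigger:
            history = summary          # compaction swaps history for a summary
            fired += 1
    return fired

print(compaction_count(600_000, 200_000))    # buggy budget: compacts 3 times
print(compaction_count(600_000, 1_000_000))  # correct budget: never compacts
```

Under these assumptions, a 600K-token session gets summarized three separate times with the wrong budget and zero times with the right one: each summary is a fidelity loss, and each re-establishment of state afterward is extra token spend.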
Lachlan Reed
Right. It's like replacing a full workshop manual with a sticky note way too early. The sticky note might say "engine partly disassembled, check carburetor" -- useful, sure -- but if you're rebuilding the whole trail bike, that's not enough detail to trust with your knuckles.
James Turner
[skeptical] And because the mismatch was specific, this wasn't some vague "AI forgot stuff" complaint. It was a concrete accounting bug: Opus 4.7 had a 1M-token native window, but Claude Code was budgeting like it only had 200K.
Lachlan Reed
Which is honestly reassuring in a weird way. Bad, yes. But fixable bad. Not mystical bad. [exhales sharply]
Chapter 2
Why it mattered in actual developer workflows
James Turner
[excited] The practical effect of the fix is pretty straightforward: if you're on a premium plan using Opus 4.7, your sessions should now stay coherent for WAY longer before autocompaction triggers.
Lachlan Reed
And "way longer" here means the difference between a proper multi-hour work session and one that starts wobbling before lunch. That matters for debugging especially. If you've spent an hour narrowing a bug to, say, one race condition in a queue worker, losing the earlier elimination steps is brutal.
James Turner
The queue-worker example is good because the lost context isn't just facts. It's rejected paths. "We already ruled out Redis config." "We already confirmed the timeout isn't network." Those negatives are expensive to rediscover.
Lachlan Reed
Same with architecture planning. If you've spent pages agreeing on constraints -- legacy database stays, API shape can't break mobile, rollout has to be gradual -- and then autocompact fires early, the conversation changes shape. The model may still sound clever, but it's now reasoning from a compressed version of those constraints.
James Turner
That's the tension I keep coming back to: autocompact is useful when it's correct. I actually LIKE it as a feature. But when it fires early, it's kind of disastrous because nothing visibly explodes. The tool doesn't say "failure." It silently swaps a rich history for a thinner one.
Lachlan Reed
[reflective] Yep. A loud failure at least has the decency to ruin your afternoon honestly. A silent one just makes the work feel slippery.
James Turner
And this bug was narrow. Opus 4.6 and Sonnet 4.6 were never affected because they genuinely use 200K context windows. So 200K was correct there. The mismatch only hit Opus 4.7, where the native window is 1M.
Lachlan Reed
That distinction is worth underlining. If someone says, "Hang on, my Sonnet 4.6 session compacted around 200K and seemed fine," yeah -- that's because 200K is the real ceiling there. No funny business.
James Turner
[questioning tone] So this isn't "autocompact bad." It's "wrong context accounting bad."
Lachlan Reed
Bingo. If the speed limit on the road is 100 and your brakes kick in at 20, the brakes aren't evil. They're just firing at the wrong bloody time.
Chapter 3
The default effort bump to high
Lachlan Reed
There's another update tucked in here, and this one's more about answer quality than memory. On Opus 4.6 and Sonnet 4.6, Pro and Max subscribers now default to `high` effort instead of `medium`. No config change needed.
James Turner
And the key token there is `high`. Higher effort usually means slightly higher token usage per response, but also more deliberate reasoning before Claude answers. For code generation, debugging, architecture decisions -- that's often where the extra thinking pays off.
Lachlan Reed
I like that you said "slightly higher token usage" and not magic. Because there is a tradeoff. You are paying for more thought. The question is whether that should be the DEFAULT.
James Turner
I'm mostly yes on that. For serious dev workflows, defaulting to better reasoning feels right. Most people don't open Claude Code to get the fastest mediocre answer. They open it because a wrong migration plan or a shallow debugging path costs more than a few extra tokens.
Lachlan Reed
[skeptical] Mostly agree... but I can hear the indie devs grinding their teeth already. If you're doing lots of quick iterations, cost creep is real. Tiny per-response bumps add up, especially if you're in there all day poking at a big repo.
James Turner
Sure, but there's context on the context -- different context, sorry. [laughs] API key, Team, and Enterprise users were already on `high` by default since v2.1.94. So this isn't some totally new experiment. It's bringing Pro and Max into line with what another chunk of the ecosystem already had.
Lachlan Reed
That v2.1.94 detail matters. It tells you Anthropic had already decided `high` was the sane default for other user groups. So for Pro and Max, this is less a leap and more catching up.
James Turner
Exactly. And I think serious coding tasks benefit from more internal deliberation. If the model takes a beat and avoids a bogus refactor, that's worth more than shaving off a little token spend.
Lachlan Reed
Maybe. Though I still reckon defaults should fit the widest daily use, not just the heaviest one. Some people want a sharp pocketknife, not a whole workshop every time they ask for a regex.
James Turner
[chuckles] Fair. But when the tool is aimed at coding, I'd rather the default bias toward "thoughtful" than "cheap and hasty."
Chapter 4
What to check now and what defaults should mean
James Turner
So, practical stuff. Update with `npm install -g @anthropic-ai/claude-code@latest`, or just let auto-update do its thing. Then inside a session, use `/context` to confirm Claude Code is tracking the correct window size.
Lachlan Reed
That `/context` command is the bit I'd tell everyone to remember. Don't just trust vibes. Check the actual reported window. If you were manually disabling autocompact on Opus 4.7 as a workaround, you can re-enable it now because the bug is fixed and it should behave properly.
James Turner
And for effort settings, use `/effort` to inspect what's active. If you need tighter cost control, explicitly set `medium` for that session, or lock `defaultEffortLevel` in your project's `CLAUDE.md`.
Lachlan Reed
I love that there are knobs, but this whole update raises a bigger question for me. [reflective] When a tool quietly fixes a deep session-management bug and also changes a quality default, how much do you trust the defaults versus checking under the hood every single time?
James Turner
The phrase "quietly fixes" is the one that sticks with me. Because the best version of software is invisible -- stuff just works. But invisible changes also mean power users have to decide where blind trust ends.
Lachlan Reed
Yeah. I don't wanna babysit every setting like a nervous mechanic listening for a weird rattle. But I also don't wanna be the mug who assumes the dashboard is right while the engine light's been unplugged.
James Turner
[calm] Maybe that's the real modern developer skill now: not just using smart tools, but periodically auditing what they think the world looks like. Context window, effort level, compaction behavior -- the hidden state matters.
Lachlan Reed
And if hidden state matters, defaults stop being boring. They become a design philosophy. Are they there to protect most users, optimize quality, control cost, or reduce support tickets? Sometimes those are the same thing... sometimes not even close.
James Turner
[softly] Which is a pretty good note to leave hanging. Check the settings, sure -- but also ask what the defaults are trying to optimize for, and whether that's actually YOUR workflow.
Lachlan Reed
Nice one. Cheers for listening, folks.
