Claude Code used to have a warning that toggling thinking within a conversation would decrease performance:
> Changing thinking mode mid-conversation will increase latency and may reduce quality. For best results, set this at the start of a session.
Neither OpenAI nor Anthropic exposes raw thinking tokens anymore.
Claude Code redacts thinking by default (you can opt in to get Haiku-produced summaries at best), and OpenAI returns encrypted reasoning items.
Either way, first-party CLIs hold opaque thinking blobs that can't be manipulated or ported between providers without dropping them.
So cross-agent resume carries an inherent performance penalty: you keep the (visible) transcript but lose the reasoning.
I don't think I've ever /resumed a Claude Code session even once. What do people use that for? The way I use it is to make a change, maybe document the change, and then I'm done. New session.
I have like 15 concurrent sessions I leave up for weeks, 50% Codex 50% Claude Code, even though I know they work better with fresh context. Then again I also always have least 200 browser tabs up. I probably just have a mental illness.
Most of the time it's when I want to go back and have a skill made for future reuse, but with remote control I've had some sessions open for remote diagnostics and it just works better than starting from scratch - even having lessons learned to create memories and update Claude.md.
I know it's wasteful but often I've got a surplus of tokens and not enough of my time - so it's a trade off I've been fine with.
I'd use it if I hit the 5 hour quota mid-change and then came back later in the day in a new terminal (depending on the input/output ratio of my now un-cached context, of course).
Tooling like this is why I really want to build my own harness that can replace Claude Code, because I have been building a few different custom tools that would be nice as part of one single harness so I don't have to tweak configurations across all my different environments, projects and even OS' it gets tiresome, and Claude even has separate "memories" on different devices, making the experience even more inconsistent.
I've actually had the same itch and decided to give it a go ... So far I'm one year into the project, learned a ton and highly recommend to anyone who'd listen - try writing you own harness. It can be fun, it can be intoxicating, it can also be boring and mundane. However you'll learn so much along the way, even if you thought you already were well versed.
Interesting.
What kind of context usage does it have when switching between the two providers? Like is it smart about using the # tokens when you go from claude -> codex or vice versa for a conversation?
How does ctx "normalize" things across providers in the context window ( e.g. tool/mcp calls, sub-agent results)?
The way this works is that it stores workstreams and session state in a local SQLite DB, and links each ctx session to the exact local Claude Code and/or Codex raw session log it came from (also stored locally).
Prompt caching is done on the provider side. If you send two requests to a provider in short succession and the beginning of your second request is the same as your first (for example, because your second request is the continuation of an ongoing chat), the repeated tokens are much less expensive the second time.
Obviously, your tool does not provide this. But I think GP is undervaluing the UX advantages of having your conversation history.
Yes that's it. I actually just ask codex/claude code to look up the session id when I want to resume sessions cross harness, it's just jsonl files locally so it can access the full conversation history when needed.
Great callout about the prompt caching, this switch is going to burn subscription limits on Claude real real fast.
Unless the goal is to move from one provider to another and preserve all context 1:1. And I can’t seem to find a decent reason why you would want everything and not the TLDR + resulting work.
> Changing thinking mode mid-conversation will increase latency and may reduce quality. For best results, set this at the start of a session.
Neither OpenAI nor Anthropic exposes raw thinking tokens anymore.
Claude Code redacts thinking by default (you can opt in to get Haiku-produced summaries at best), and OpenAI returns encrypted reasoning items.
Either way, first-party CLIs hold opaque thinking blobs that can't be manipulated or ported between providers without dropping them. So cross-agent resume carries an inherent performance penalty: you keep the (visible) transcript but lose the reasoning.
I know it's wasteful but often I've got a surplus of tokens and not enough of my time - so it's a trade off I've been fine with.
But yeah, after the price hikes, it's inevitable that people will run open source harnesses
How does ctx "normalize" things across providers in the context window ( e.g. tool/mcp calls, sub-agent results)?
The way this works is that it stores workstreams and session state in a local SQLite DB, and links each ctx session to the exact local Claude Code and/or Codex raw session log it came from (also stored locally).
What do you mean by prompt caching?
Obviously, your tool does not provide this. But I think GP is undervaluing the UX advantages of having your conversation history.
Unless the goal is to move from one provider to another and preserve all context 1:1. And I can’t seem to find a decent reason why you would want everything and not the TLDR + resulting work.