It's Wednesday afternoon. Codex CLI suddenly returns rate limit exceeded while your PR is only half done. This is not a network glitch; it is OpenAI's dual rolling quotas for Codex in 2026. This article works through three steps (understand the mechanism, pick a fix, build a fallback stack) and lays out seven practical paths for when your weekly limit is exhausted.
1. Codex limits: which layer did you hit?
Many developers assume Codex has a single "message count" cap. In practice, Codex inside a ChatGPT subscription (CLI, IDE extension, cloud tasks) is governed by two independent rolling quotas. See OpenAI Codex Pricing for the official breakdown.
1.1 5-hour rolling window
Local CLI messages and cloud tasks share one 5-hour rolling window. The window does not reset at midnight—it rolls forward five hours from your first consumption in that period. Heavy single jobs—large repos, long agent runs, cloud offload—burn quota faster.
When you hit it: CLI shows rate limit errors, the IDE extension grays out send, cloud tasks queue or refuse. This is the layer developers hit most often day to day.
1.2 Weekly limit
Above the 5-hour window sits a rolling 7-day weekly quota that caps sustained high-intensity use across the week. Even if you conserve each 5-hour window, cumulative weekly workload can still drive Weekly to 0%.
When you hit it: global throttling despite leftover 5h capacity, or CLI showing Weekly 0%. That is the "weekly limit exhausted" scenario this article addresses.
Not the same as Context Window %
Some third-party plugins display context window usage as thousands of percent—that reflects current-session token occupancy, not 5h / Weekly quota. When troubleshooting, only trust metrics labeled 5h, Weekly, or Remaining.
1.3 Reading CLI / IDE percentages
Codex CLI and IDE extensions typically show something like:
Rate Limits Remaining: 5h 96%, Weekly 94%
The keyword is Remaining:
- Weekly 94% = 94% of weekly quota left, not 94% used
- 5h 12% = only 12% left in the current 5-hour window, about to hit the cap
- Only when you see 0% or explicit rate-limit errors should you reach for the fixes below
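A minimal sketch for reading that status line in a script, assuming the output looks exactly like the example line above. The function name `parse_remaining` and the regular expression are illustrative and not part of any Codex API; real CLI output may be formatted differently.

```python
import re

# Assumed format, copied from the example line in this article:
#   "Rate Limits Remaining: 5h 96%, Weekly 94%"
_PATTERN = re.compile(r"(5h|Weekly)\s+(\d+)%")


def parse_remaining(line: str) -> dict:
    """Return the remaining percentage for each quota layer found in the line."""
    return {name: int(pct) for name, pct in _PATTERN.findall(line)}


if __name__ == "__main__":
    print(parse_remaining("Rate Limits Remaining: 5h 96%, Weekly 94%"))
```

Both numbers are "remaining" values, so a result of 0 for either key is the signal to move on to the fixes.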
1.4 Plan tiers at a glance (2026.06)
OpenAI publishes message ranges rather than fixed integers, because one "message" can span orders of magnitude depending on repo size. The table below is from the pricing page, using GPT-5.3-Codex as the reference (approximate Local Messages per 5 hours):
| Plan | GPT-5.3-Codex / 5h (approx.) | GPT-5.4-mini / 5h (approx.) | Notes |
|---|---|---|---|
| ChatGPT Plus | 10–60 | 60–350 | Also subject to Weekly cap |
| Pro 5x | ~5× Plus | ~5× Plus | 2026 $100/month tier |
| Pro 20x | ~20× Plus | ~20× Plus | Heavy parallel workloads |
| Business (non-flex pricing) | Similar to Plus | Similar to Plus | Per-seat billing |
| API Key | No 5h / Weekly subscription windows | Per-token + rate limits | |
In early 2026 OpenAI reshaped Plus / Pro tiers: a new $100 Pro tier (5x), a higher tier (20x), and the option to buy Credits after subscription quota runs out instead of forcing an immediate upgrade. Check chatgpt.com/codex/pricing for current numbers.
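To make the multipliers concrete, here is a rough sketch that scales the Plus ranges from the table above. It assumes the "~5×" and "~20×" factors apply evenly to both ends of the range, which the pricing page does not guarantee, so treat the results as ballpark figures only.

```python
# Plus ranges per 5 hours, taken from the table above (approximate).
PLUS_RANGES = {
    "GPT-5.3-Codex": (10, 60),
    "GPT-5.4-mini": (60, 350),
}

# Approximate multipliers relative to Plus, as listed in the table.
MULTIPLIERS = {"Plus": 1, "Pro 5x": 5, "Pro 20x": 20}


def approx_range(plan: str, model: str) -> tuple:
    """Scale the Plus message range by the plan's approximate multiplier."""
    low, high = PLUS_RANGES[model]
    factor = MULTIPLIERS[plan]
    return (low * factor, high * factor)


if __name__ == "__main__":
    for plan in MULTIPLIERS:
        print(plan, approx_range(plan, "GPT-5.3-Codex"))
```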
2. Weekly limit exhausted: 7 fixes
The seven options below are ordered by cost from low to high. Try them top-down—don't jump to a plan upgrade first.
Fix 1: Confirm whether you hit Weekly or 5h
"Completely blocked" sometimes means only the 5-hour window is at zero while Weekly still has headroom—or the opposite. Run codex in the CLI and read the rate limit line, or check the VS Code Codex extension status bar.
- Only 5h 0% → wait for the 5-hour rollover, or use a reset token (Fix 3)
- Only Weekly 0% → wait for the 7-day rollover, or Credits / upgrade (Fixes 5, 6)
- Both at 0 → combine waiting + Credits, and plan a fallback stack (Fix 7)
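The three cases above can be written down as a small lookup. This is only a sketch of the article's own triage; the function name `suggest_fixes` and the returned labels are made up for illustration.

```python
def suggest_fixes(five_hour_left: int, weekly_left: int) -> list:
    """Map remaining percentages to the fixes this article recommends."""
    five_hour_out = five_hour_left <= 0
    weekly_out = weekly_left <= 0

    if five_hour_out and weekly_out:
        return ["Fix 2: wait", "Fix 5: Credits", "Fix 7: fallback stack"]
    if five_hour_out:
        return ["Fix 2: wait for 5-hour rollover", "Fix 3: reset token"]
    if weekly_out:
        return ["Fix 2: wait for 7-day rollover", "Fix 5: Credits", "Fix 6: upgrade"]
    return []  # neither layer is exhausted


if __name__ == "__main__":
    print(suggest_fixes(five_hour_left=0, weekly_left=40))
```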
Fix 2: Wait for natural rollover
The zero-cost option: stop launching new tasks and let time roll the window forward.
- 5-hour window: quota frees five hours after your earliest consumption in that window
- Weekly limit: rolls over a trailing 7-day window; it does not refresh at Monday 00:00
Works for: non-urgent bug fixes, periods when you can switch to code review or docs. Poor fit for: pre-CI nights, release windows—use Fixes 3–7 instead.
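To estimate how long the wait is, the sketch below models a rolling window in the simplest way consistent with the description above: each piece of usage stops counting exactly one window length after it happened. OpenAI does not publish the exact accounting, so this is an assumption, and `next_release` is a hypothetical helper name.

```python
from datetime import datetime, timedelta
from typing import Optional

FIVE_HOURS = timedelta(hours=5)
SEVEN_DAYS = timedelta(days=7)


def next_release(usage_times: list, now: datetime, window: timedelta) -> Optional[datetime]:
    """Return when the oldest usage still inside the window ages out."""
    active = [t for t in usage_times if now - window < t <= now]
    if not active:
        return None  # nothing is counted against this window right now
    return min(active) + window


if __name__ == "__main__":
    usage = [datetime(2026, 6, 17, 9, 0), datetime(2026, 6, 17, 11, 30)]
    now = datetime(2026, 6, 17, 13, 0)
    print(next_release(usage, now, FIVE_HOURS))  # 2026-06-17 14:00:00
    print(next_release(usage, now, SEVEN_DAYS))  # 2026-06-24 09:00:00
```

Under this model, work that started at 09:00 on a Wednesday begins freeing 5-hour capacity at 14:00 the same day and weekly capacity at 09:00 the following Wednesday, not at midnight or on Monday.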
Fix 3: Use a saved rate-limit reset token
In June 2026, OpenAI introduced saveable rate limit resets for Go / Plus / Pro / Business subscribers. Instead of resets only at fixed times, you receive a "reset coin" you can trigger manually to immediately restore the standard usage window.
- Eligible accounts received at least one free reset at launch
- Each token expires after 30 days
- Promotional periods may grant extra resets via referrals (see official announcements)
Note: The changelog mainly describes restoring the "standard usage window" and does not clearly state whether Weekly is cleared too. Treat it as 5-hour window emergency relief; Weekly still depends on rollover or Credits.
Fix 4: Switch to GPT-5.4-mini to reduce burn
On the same plan, GPT-5.4-mini carries a much higher message cap than full-size GPT-5.3-Codex (Plus tier: mini roughly 60–350 messages / 5h vs Codex roughly 10–60). Switch models in the CLI or IDE:
# Example: specify mini tier in a session that supports model switching
/model gpt-5.4-mini
Good for: single-file refactors, test completion, lint fixes, PR comment replies. Poor fit for: cross-module architecture migrations, complex concurrency bugs—mini may loop and waste quota on those.
Fix 5: Buy Credits to keep going
Since 2026, ChatGPT Plus and Pro users can purchase additional Credits after subscription quota is exhausted, without an immediate tier upgrade. Business / Enterprise flex-pricing workspaces can buy workspace credits.
Path: ChatGPT account settings → Usage / Billing → Buy Credits (UI labels may vary by region).
Good for: occasional sprint weeks (releases, hackathons) when you know you'll keep Codex next month. Poor fit for: hitting Weekly zero every month—that signals you need a higher tier or API mode (Fixes 6, 7).
Fix 6: Upgrade your subscription tier
If you drive Weekly to 0 every week, your plan and workload are mismatched. The 2026 tier structure roughly looks like:
- Plus: light daily use, intermittent coding
- Pro 5x (~$100/month): about 5× Plus Codex quota—fits full-time daily development
- Pro 20x: parallel multi-repo, multi-agent heavy users
Before upgrading, run the numbers: if monthly Credits spend already nears the higher-tier price gap, upgrading is usually simpler.
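The break-even check is simple arithmetic. In the sketch below the $100 figure is the Pro 5x price quoted in this article; the $20 Plus price is an assumption not stated here, so substitute whatever your invoice actually shows.

```python
def upgrade_is_cheaper(monthly_credits_spend: float,
                       current_plan_price: float,
                       higher_plan_price: float) -> bool:
    """True when Credits spend already covers the price gap to the higher tier."""
    price_gap = higher_plan_price - current_plan_price
    return monthly_credits_spend >= price_gap


if __name__ == "__main__":
    PLUS_PRICE = 20.0     # assumed Plus price, not taken from this article
    PRO_5X_PRICE = 100.0  # ~$100/month, as quoted above
    print(upgrade_is_cheaper(85.0, PLUS_PRICE, PRO_5X_PRICE))  # True
    print(upgrade_is_cheaper(30.0, PLUS_PRICE, PRO_5X_PRICE))  # False
```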
Fix 7: Switch to API Key or an alternate agent
When subscription quota becomes a hard ceiling, two paths:
A. OpenAI API Key mode—Codex supports API Key login. This mode has no ChatGPT 5h / Weekly windows; you pay per token and are bounded by account balance and RPM/TPM limits. GPT-5.3 Codex API pricing (2026.06): input ~$1.75 / million tokens, output ~$14 / million tokens.
B. Change toolchains—Claude Code, Cursor Agent, Gemini CLI, self-hosted LangGraph + API, and others each carry independent quota systems. Many teams run a Codex primary + Claude Code backup dual stack to avoid single-vendor throttling.
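For path A, the per-token prices quoted above translate into a per-job estimate as sketched below. The token counts in the example are invented for illustration; real agent runs vary widely, and prices should be rechecked against the current pricing page.

```python
# Approximate GPT-5.3 Codex API prices quoted in this article (2026.06).
INPUT_USD_PER_MILLION = 1.75
OUTPUT_USD_PER_MILLION = 14.0


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a job from its token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical agent run: 2M input tokens, 200k output tokens.
    print(f"{estimate_cost(2_000_000, 200_000):.2f}")  # 6.30
```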
Quick decision guide
- Need to merge the PR today → Fix 3 or 5
- Third Weekly hit this week → Fix 6 or 7
- Integrating into your product → Fix 7A (API Key)
- Apple platform work, overnight runs → Fix 7B + stable Cloud Mac (see Section 5)
3. Alternative APIs and tools
A weekly limit exhaustion is often the moment to reassess your toolchain. The table below compares quota model, agent shape, and typical cost (public information as of June 2026):
| Option | Quota model | Agent shape | Best for |
|---|---|---|---|
| Codex (ChatGPT subscription) | 5h + Weekly rolling | CLI / IDE / cloud tasks | Plus/Pro subscribers deep in OpenAI ecosystem |
| Codex (API Key) | Per-token, no weekly cap | Same surfaces, predictable billing | Team integration, CI pipelines |
| Claude Code | Pro/Max session quota or API tokens | Terminal agent + CLAUDE.md | Long-chain reasoning, multi-file refactors |
| Cursor Agent | Subscription requests + model surcharges | IDE-embedded | Daily coding + light agent work |
| Gemini CLI / API | Free tier + per-token | CLI / Google ecosystem | Multimodal, large-context RAG |
| DeepSeek API | Pure per-token, low cost | Requires your own agent framework | Chinese-market workloads, cost-sensitive teams |
For per-model pricing detail, see our 2026 LLM Pricing, Config, Performance & Who Should Use What. Switching tools does not mean switching execution environments—agents still need a stable macOS / Linux node to compile and test.
4. Prevention: stretch quota across the week
Fixing a limit once is easy; fixing it every week means changing how you work:
- Slice task granularity: one agent session, one clear goal ("fix flaky test" not "refactor the whole module")—fewer wasted round trips.
- Local first: if rg or LSP can solve it, don't hand it to the agent; reserve quota for cross-file reasoning.
- Tier models: default to mini; switch to full Codex only for hard problems.
- Don't abuse cloud tasks: Cloud Tasks and Local Messages share the 5h window—run locally when you can.
- Watch Remaining: when 5h < 20%, stop and leave wrap-up for the next window.
- Dual-stack backup: Codex primary, Claude Code or API as fallback; switch seamlessly when throttled.
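Several of these habits (tier models, watch Remaining, dual-stack backup) can be folded into one routing rule before each task. The sketch below is one possible policy, not a Codex feature; the function name, the route labels, and the idea of flagging a task as hard up front are all assumptions.

```python
LOW_WATERMARK = 20  # "5h < 20%" threshold from the list above


def plan_next_task(five_hour_left: int, weekly_left: int, hard_problem: bool) -> str:
    """Pick a route for the next task from the remaining quota percentages."""
    if five_hour_left <= 0 or weekly_left <= 0:
        return "fallback"    # throttled: use Claude Code or API key mode
    if five_hour_left < LOW_WATERMARK:
        return "wrap-up"     # finish up and leave new work for the next window
    return "codex-full" if hard_problem else "codex-mini"


if __name__ == "__main__":
    print(plan_next_task(five_hour_left=96, weekly_left=94, hard_problem=False))
```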
5. Execution node: quota restored, job still running
Quota solved, another failure mode remains: the execution environment. Codex cloud tasks, overnight local CLI compiles, Xcode UI tests—on a personal laptop, lid close, VPN jitter, or a full disk can leave the agent half done.
The pragmatic 2026 pattern: plan model quota and execution nodes separately. Subscription / API runs the brain; Cloud Mac runs the body—a dedicated macOS node online 24/7, long jobs in tmux, come back to review when Codex or Claude Code quota recovers.
This is a different kind of scarcity from model quota: what is scarce on the execution side is stable, predictable macOS compute. See The Model Arms Race Is Over—Why Mac Compute Nodes Are Suddenly Hard to Get.
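The tmux pattern mentioned in this section can be scripted so that a long job survives a dropped SSH session. The sketch below assumes tmux is installed on the node; the session name and the job command are placeholders.

```python
import shlex
import subprocess


def tmux_command(session: str, job: list) -> list:
    """Build the argv that starts `job` inside a detached tmux session."""
    # `tmux new-session -d -s NAME CMD` creates the session without attaching.
    return ["tmux", "new-session", "-d", "-s", session, shlex.join(job)]


def start_detached(session: str, job: list) -> None:
    """Launch the job; raises CalledProcessError if tmux reports a failure."""
    subprocess.run(tmux_command(session, job), check=True)


if __name__ == "__main__":
    # Placeholder job; replace with your own build or test command.
    print(tmux_command("nightly", ["make", "test"]))
```

Once the job is running, `tmux attach -t nightly` reconnects to it for review.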
FAQ
What's the difference between Codex Weekly and 5h limits?
The 5-hour window caps short bursts; the weekly limit caps cumulative use across the week. Both roll independently—either one at zero triggers throttling.
CLI shows Weekly 94%—almost gone or plenty left?
Remaining means left. 94% means 94% still available, not 94% used.
Can I buy Credits when the weekly limit is exhausted?
Yes. Plus / Pro can buy additional Credits; Business flex-pricing workspaces can buy workspace credits.
Does a saved reset token clear the weekly limit?
Not officially confirmed. Treat it as 5h emergency relief; Weekly still needs rollover or Credits.
Does switching to GPT-5.4-mini extend quota?
Yes. On the same plan, mini tier carries a significantly higher 5h message cap—good for lighter tasks.
Does API Key mode have a weekly limit?
No subscription 5h / Weekly windows—billing is per-token with rate limits.
Can Claude Code replace Codex?
Terminal agent experience is similar; quotas are independent. For iOS / macOS development, run long jobs on a stable Cloud Mac.
Conclusion
Codex weekly limit exhaustion is not a broken account—it reflects OpenAI separating "light trial use" from "full-time agent development" more sharply in 2026. Understand the 5h and Weekly layers, use reset tokens and mini tier wisely, buy Credits or upgrade when needed, and keep an API / Claude Code fallback—most teams can recover their rhythm within a week.
Quotas reset; release days don't wait. Parking long jobs on a node that stays online beats staying up for a rolling window.
Related reading
- 2026 LLM Pricing, Config, Performance & Who Should Use What
- The Model Arms Race Is Over—Why Mac Compute Nodes Are Suddenly Hard to Get
- AI Coding, Personal AI & Agent Architecture: The 2026 Developer Tool Triad
Last updated: June 22, 2026. Limit and pricing data from OpenAI Codex Pricing and June 2026 official changelog.