Why do GitHub Actions iOS CI builds swing fast and slow?

Most of the apparent variance comes from mixing cold builds (dependency change, cache clear, scheme swap) with warm builds in one P95. In the Shadow data, 86.6% of builds were warm: track warm P95 for daily merges and chart cold separately.

What are cold and warm builds in iOS CI?

Cold: the dependency set or cache root changed, so the job re-resolves, runs pod install, and recompiles widely. Warm: cache hit and incremental compile. When the same commit differs by 2–3× between runs, the runs were usually different build types.

Which xcodebuild phase is slowest on GitHub Actions?

Cold: pod install and resolve dominate. Warm: compile and link. Simulator tests or archive + signing also stretch wall-clock if in the same job.

Is macos-latest iOS CI slow because of CPU?

Most teams hit shared queue, cache miss from wiped workspaces, and disk I/O contention — not single-core power. Order: queue → cold/warm → cache → concurrency → hardware.

How should iOS CI P95 be measured?

Daily merge: warm P50/P95. Cold on its own line. Blending inflates tail and feels like 'always slow.'

What's the next step when iOS CI builds are slow?

Split cold and warm, then fix the CocoaPods, DerivedData, and SPM cache paths and keys. Still missing the target once cache is in place? Evaluate a dedicated Mac mini self-hosted runner.

iOS CI Builds Feel Slow? Why Xcode Builds Drag on GitHub Actions

TL;DR · See clearly, then fix

Across 187 PRs, most builds are warm—track warm P95 for day-to-day merge pain; don’t let cold runs inflate the number
Dependency bumps, cache wipes, and scheme switches trigger cold starts—a 2–3× wall-clock spread is normal
Even on warm builds, bad cache keys, disk contention, and tests/signing bundled into one job still make CI feel like a lottery
Rough order: queue → split cold/warm → cache → concurrency → hardware last (see the waterfall breakdown)

Full benchmark data → GitHub Actions optimization pillar · Benchmark

86.6% · share of Shadow samples that were warm builds

14:12 → 6:05 · warm P95 (macos-latest → dedicated M4)

2–3× · typical cold vs warm wall-clock gap

1. CI feels like a lottery: same commit, 2–3× spread

If you run iOS CI on GitHub Actions, these scenes probably look familiar:

Re-run the same commit and wall clock can differ by 2–3×
Dashboard P95 looks scary, but merge day-to-day doesn’t feel that bad
pod install finishes in thirty seconds one run, then feels stuck the next
Friday afternoon merge anxiety—“what if CI draws the slow ticket again?”

It’s not always raw CPU. We ran a 14-day Shadow dual-track with teams, and the more common story is: numbers got blended, and the environment wasn’t stable—cache didn’t stick, disk got contested, and CI felt like a lottery. Swapping chips usually comes much later.

This piece has one job: explain why builds swing between fast and slow. If the job is stuck waiting for a runner, see the queue guide. Cache YAML and buy-vs-rent math live in the cache deep dive and the ROI piece.

2. Cold and warm: don’t mix them in one pot

A classic trap: dump every build duration into one P95. A few cold starts stretch the tail—you read “CI is always slow,” but daily merge wasn’t that bad.

How we split · Shadow benchmark definitions

warm: dependencies unchanged, cache still there, same scheme—mostly incremental compile; this is “what merge feels like”
cold: lockfile changed, cache cleared, new target, or first run on a fresh runner—re-resolve, pod install, large recompile

For day-to-day SLA, track warm P50/P95; chart cold on its own line—don’t tie it to merge experience.
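To make the split concrete, here is a minimal sketch of per-type percentiles. It assumes each build appends one `build_type,wall_sec` row to a CSV; the `percentile` helper, the `all` shortcut, and the sample durations are illustrative and not part of the benchmark.

```shell
#!/bin/sh
# percentile TYPE PCT FILE
# Nearest-rank percentile of wall_sec for one build type.
# FILE rows are assumed to look like "warm,412" (build_type,wall_sec, no header).
# TYPE "all" is a hypothetical shortcut that blends every row.
percentile() {
  awk -F, -v t="$1" 't == "all" || $1 == t { print $2 }' "$3" |
    sort -n |
    awk -v p="$2" '
      { v[NR] = $1 }
      END {
        if (NR == 0) exit 1            # no samples of that type
        r = int((p * NR + 99) / 100)   # ceil(p/100 * N) for whole-number p
        if (r < 1) r = 1
        print v[r]
      }'
}

# Demo with made-up durations in seconds (not benchmark data).
metrics=$(mktemp)
printf '%s\n' \
  warm,300 warm,310 warm,320 warm,330 warm,340 \
  warm,350 warm,360 warm,370 warm,380 warm,390 \
  cold,900 cold,1000 > "$metrics"

echo "warm P50:    $(percentile warm 50 "$metrics")"   # 340
echo "warm P95:    $(percentile warm 95 "$metrics")"   # 390
echo "cold P95:    $(percentile cold 95 "$metrics")"   # 1000
echo "blended P95: $(percentile all 95 "$metrics")"    # 1000
rm -f "$metrics"
```

In this toy data two cold runs are enough to drag the blended P95 from 390 to 1000 seconds, even though ten of the twelve builds finished in under 400. That is the misreading the split is meant to prevent.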

2.1 What triggers a cold start

On macos-latest, cold runs show up more often—the workspace is usually wiped after each job, so cache rarely “lives” on disk:

Trigger	Typical extra time	Common log signals
`Podfile.lock` change	+3–8 min	`pod install`, Downloading dependencies
DerivedData miss / wiped	+5–15 min	CompileSwift, full .o rebuild
Scheme / target switch	+2–10 min	Different `xcodebuild -scheme`
SPM resolution change	+1–5 min	Resolve Package Graph
Xcode minor bump on macos-latest	First build +10–20 min	New SDK / module cache rebuild

Ship dependency bumps often and cold runs pile up—that’s not hardware aging, it’s a different kind of week. Split “Pod upgrade week” from “normal dev week” in reports and the numbers make sense.
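The lockfile triggers in the table can be spotted from the list of changed files. The sketch below is one way to do that, under two assumptions: `classify_changes` is a hypothetical helper name, and `Package.resolved` stands in for "SPM resolution change". Cache wipes, scheme switches, and Xcode bumps do not show up in a diff, so they need other signals such as cache-hit logs.

```shell
#!/bin/sh
# classify_changes: reads changed file paths on stdin, one per line.
# Prints "cold" when a dependency lockfile is among them, otherwise "warm".
# Only covers lockfile triggers; a wiped cache still looks "warm" here.
classify_changes() {
  if grep -Eq '(^|/)(Podfile\.lock|Package\.resolved)$'; then
    echo cold
  else
    echo warm
  fi
}

# Demo with hypothetical paths.
printf '%s\n' App/ContentView.swift Podfile.lock | classify_changes   # cold
printf '%s\n' App/ContentView.swift | classify_changes                # warm

# In a job with at least two commits of history, the input could come from:
#   git diff --name-only HEAD~1 HEAD | classify_changes
```

Tagging reports with this label is what lets "Pod upgrade week" and "normal dev week" land on separate lines.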

2.2 Why warm still wobbles

Even when everything is warm, wall clock can still swing ~30%. Usual suspects:

Cache miss: the key omits the architecture (arm64), isn't tied to the branch, or several jobs overwrite the same cache entry
macOS jobs pile up in the same org—disk and network drag each other
This run compiles main only; next run runs full unit + UI tests—different workload
Change surface matters: Pod source edits vs a SwiftUI preview tweak compile very differently
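For the cache-miss item, here is a rough sketch of a key that carries architecture, branch, and lockfile content. The `cache_key` helper and its argument order are assumptions for illustration. `cksum` is used only because it is available everywhere; in a workflow the same idea is more commonly written with the `hashFiles` expression and `runner.arch`.

```shell
#!/bin/sh
# cache_key PREFIX BRANCH LOCKFILE
# Builds a key such as "pods-arm64-main-1234567890" so that a change in
# architecture, branch, or lockfile content produces a different key.
cache_key() {
  # cksum on stdin prints "<crc> <bytes>"; keep only the checksum.
  ck_sum=$(cksum < "$3" | cut -d' ' -f1)
  printf '%s-%s-%s-%s\n' "$1" "$(uname -m)" "$2" "$ck_sum"
}

# Usage inside a job (GITHUB_REF_NAME is set by GitHub Actions):
#   cache_key pods "$GITHUB_REF_NAME" Podfile.lock
```

Two jobs on different branches or architectures then stop fighting over one entry, and a lockfile bump invalidates the key on purpose instead of by accident.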


3. Where the time actually goes

Timestamp each workflow step and wall clock usually breaks into five chunks. Below is a typical warm split (your project will vary):

iOS CI wall-clock five phases (warm · illustrative)

① checkout + env setup        ~0:30 – 1:30
② pod install / SPM resolve   ~0:30 – 2:00   (cold ↑↑)
③ xcodebuild compile+link     ~3:00 – 8:00   (change surface drives this)
④ tests (simulator / unit)    ~1:00 – 6:00   (optional, often underestimated)
⑤ archive + codesign          ~1:00 – 4:00   (release pipeline)

warm P50 common range: 6 – 14 minutes total

Practical move: use per-step timestamps or the shell's `time` command on pod install, xcodebuild build, and xcodebuild test separately. Step ② over five minutes? Think cold + cache. Step ③ all over the map? Check DerivedData and concurrency. Step ④ always slow? Split tests out or move the full suite to nightly.
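One way to get per-phase numbers without extra tooling is a small wrapper around each command. This is a sketch: `timed`, `TIMINGS_FILE`, and the workspace and scheme names in the usage comments are hypothetical.

```shell
#!/bin/sh
# timed PHASE CMD [ARGS...]
# Runs CMD, appends "PHASE,seconds" to $TIMINGS_FILE, and returns CMD's
# exit status so a failing build still fails the step.
TIMINGS_FILE=${TIMINGS_FILE:-phase_timings.csv}

timed() {
  t_phase=$1
  shift
  t_start=$(date +%s)
  t_status=0
  "$@" || t_status=$?          # keeps going under `set -e` to record the time
  t_end=$(date +%s)
  echo "$t_phase,$((t_end - t_start))" >> "$TIMINGS_FILE"
  return "$t_status"
}

# Usage in a run step (project names are placeholders):
#   timed pods  pod install
#   timed build xcodebuild build -workspace App.xcworkspace -scheme App
#   timed test  xcodebuild test  -workspace App.xcworkspace -scheme App
```

Resolution is one second, which is plenty when the phases being compared run for minutes.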

4. Tests and signing: hidden drag

A lot of “xcodebuild is slow” complaints turn out to be tests or signing counted in the same job:

Simulator cold boot: first launch on CI can add minutes without warmup—every job pays again
UI tests are an order of magnitude slower than unit tests; bundle them with compile and P95 gets ugly
Certs, Keychain, profile download—hosted runners often reconfigure every job
Archive and IPA are release-pipeline work—don’t mix them into PR validation metrics

Field note · Split PR and release

PR path: build + light tests; track warm P95.
TestFlight / release: separate workflow, separate metrics. Mix them and “how long does merge take?” never lines up.

5. 14-day Shadow benchmark

Dual-track data: macos-latest and a dedicated Mac mini M4 each ran 187 PR builds, with Xcode 16.2 and CocoaPods 1.15.2 aligned on both tracks. Full methodology → pillar Benchmark.

Category	Samples	Share	macos-latest P95	Dedicated M4 P95
warm build	162	86.6%	14:12	6:05
cold build	25	13.4%	19:40	11:20
Blended (easy to misread)	187	100%	~16:00+	~7:30+

Blend cold and warm into one P95 and day-to-day experience gets overstated by roughly 15–25%—easy to conclude “we need new hardware tomorrow.” Steadier approach: merge experience = warm; expect cold separately during dependency weeks.

On dedicated M4, warm still beats cold by a clear margin—giving DerivedData and Pods a fixed home pays off. How to wire cache → cache deep dive.
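The gap between the two P95 columns is easier to compare as a percentage. The sketch below does that arithmetic on the table's own mm:ss values; `to_sec` and `pct_change` are illustrative helper names.

```shell
#!/bin/sh
# to_sec MM:SS -> whole seconds
to_sec() {
  echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# pct_change OLD NEW (both MM:SS) -> signed percent change, one decimal
pct_change() {
  awk -v a="$(to_sec "$1")" -v b="$(to_sec "$2")" \
    'BEGIN { printf "%.1f\n", (b - a) * 100 / a }'
}

pct_change 14:12 6:05     # warm P95, macos-latest -> dedicated M4: -57.2
pct_change 19:40 11:20    # cold P95, macos-latest -> dedicated M4: -42.4
```

The warm figure matches the roughly −57% quoted in the FAQ. The cold figure is derived here from the table rows and is not a separately reported result.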

6. Fix order: from clarity to cache

Once queue pain is ruled out (or already small), work in this order—don’t jump to buying hardware:

Tag each build: cold or warm (Podfile.lock changed? cache hit?)
Day-to-day SLA on warm P95 only; cold on a weekly line or separate chart
Lock DerivedData, Pods, and SPM cache paths and keys — cache deep dive (in progress)
Don’t stack macOS jobs: 1–2 concurrent jobs per self-hosted box; on hosted runners avoid sharing one workspace across jobs
Split PR validation from release/signing so tests don’t pollute merge metrics
Cache in place and still missing the bar? Then talk M4 / M4 Pro — chip choice and engineering hours

Workflow snippet · tag warm / cold (illustrative)

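- name: Classify build type
  # Assumes actions/checkout used fetch-depth: 2 so that HEAD~1 exists
  run: |
    # Record the start time so the last step can compute wall clock
    echo "START=$(date +%s)" >> "$GITHUB_ENV"
    if git diff --name-only HEAD~1 HEAD | grep -q 'Podfile\.lock$'; then
      echo "BUILD_TYPE=cold" >> "$GITHUB_ENV"
    else
      echo "BUILD_TYPE=warm" >> "$GITHUB_ENV"
    fi

# ... build and test steps go here ...

- name: Record wall clock
  run: |
    # Append one CSV row per build: build_type,wall_sec
    echo "${BUILD_TYPE},$(( $(date +%s) - START ))" >> metrics.csv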

7. FAQ

Same commit, re-run differs 2–3×—normal?

On macos-latest, yes—one run cache hits clean, the next misses plus queue jitter, and you’re looking at two different build types. Compare pod install duration and cache hit logs before shopping for new chips.

How should we calculate P95?

For merge experience, use warm P95; ~30 samples over two weeks is a decent start. Track cold separately or count occurrences—blend them and procurement math skews.

`pod install` slow every time—is CocoaPods broken?

On hosted runners it’s usually that Pods has nowhere to live and cache keys aren’t tied to branch or architecture. On self-hosted runners, point the cache at a fixed disk path; warm runs under thirty seconds are common.

Will a Mac mini M4 kill variance?

In our benchmark warm P95 dropped sharply (−57%) and spread tightened (σ −40%), but without cold/warm split and cache design, dedicated hardware still spikes.

Slow from queue or from the build itself?

Check how long logs show Waiting for a runner at the top. Queue counts toward wall clock, not GitHub minute billing. Near-zero queue and still slow? That’s the build side this piece covers → next stop: cache.

Where to read next?

Cache YAML and key design → cache deep dive; buy vs rent and ROI → ROI piece and 500 builds/month: buy or rent?

8. Wrap-up

“Slow” iOS CI on GitHub Actions is rarely just “not enough CPU.” More often it’s a stack of:

Cold and warm blended into one P95
Hosted jobs wiped after each run—cache can’t stick
Tests, signing, and PR build crammed into one job

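Get warm P95 right first, then cache → concurrency → hardware. In the 14-day Shadow run, warm P95 was 14:12 on macos-latest and 6:05 on the dedicated M4, where DerivedData and Pods had a fixed home. Splitting the metrics and fixing cache still come first, and neither needs a purchase decision on day one.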