GitHub Actions Troubleshooting: A Real 8-Hour Pages Freeze From Start to Finish

Tags: GitHub, DevOps, troubleshooting, CI/CD, GitHub Pages

On April 28, 2026, my blog pipeline hit a rare GitHub infrastructure failure. From the 22:00 cron trigger to confirming the root cause, it took nearly 8 hours. This article documents the complete troubleshooting process, including three issues I hadn't encountered before: a workflow_dispatch run that was created but never assigned a runner, an SSL handshake timeout on raw.githubusercontent.com, and a Pages build stuck on an old commit.

⚠️ Context: This failure happened during the 22:00 scheduled task cycle. GitHub Actions froze completely: workflow_dispatch created a run but the runner never started, and Pages builds were stuck on a commit from April 23. New articles were pushed to GitHub but never went live.

Pitfall 1: workflow_dispatch Created, Runner Never Assigned

First instinct: check the GitHub Actions tab. This is what I saw:

Workflow: GitHub Pages Deploy
Status: queued
Jobs: 0
Started: 2026-04-28 22:XX:XX UTC
Duration: running...

The workflow_dispatch itself succeeded: the API returned 200 and the run was created. The problem was that the run stayed in queued indefinitely, no runner ever picked it up, and the job count stayed at 0.

This points to a stall in GitHub's runner allocation rather than anything in my workflow. Under high load or regional issues, runs can be queued but never assigned to a specific machine.

I tried manually re-triggering workflow_dispatch three times:

# Via GitHub CLI
gh workflow run deploy.yml

# Result: each attempt created a new run, each run stuck at queued with 0 runners

Three attempts, three identical results. This ruled out "single network hiccup" and confirmed a sustained GitHub-side infrastructure issue.
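
A stall like this can also be flagged automatically instead of by watching the Actions tab. The following is a minimal sketch, assuming run records exported with `gh run list --json databaseId,status,createdAt`. The function name `find_stuck_runs`, the 15-minute threshold, and the sample records are illustrative assumptions, not values from the incident.

```python
from datetime import datetime, timedelta, timezone


def find_stuck_runs(runs, now, max_queue=timedelta(minutes=15)):
    """Return IDs of runs that have sat in 'queued' longer than max_queue.

    `runs` is a list of dicts shaped like the output of
    `gh run list --json databaseId,status,createdAt`.
    The 15-minute default is an assumed threshold, not a GitHub limit.
    """
    stuck = []
    for run in runs:
        if run["status"] != "queued":
            continue
        # Timestamps are ISO 8601 in UTC with a trailing "Z".
        created = datetime.fromisoformat(run["createdAt"].replace("Z", "+00:00"))
        if now - created > max_queue:
            stuck.append(run["databaseId"])
    return stuck


# Hypothetical sample data: two queued runs and one completed run.
sample = [
    {"databaseId": 101, "status": "queued", "createdAt": "2026-04-28T22:05:00Z"},
    {"databaseId": 102, "status": "queued", "createdAt": "2026-04-28T22:55:00Z"},
    {"databaseId": 100, "status": "completed", "createdAt": "2026-04-23T22:05:00Z"},
]
now = datetime(2026, 4, 28, 23, 0, tzinfo=timezone.utc)
stuck_ids = find_stuck_runs(sample, now)
print(stuck_ids)
```

Passing `now` in explicitly keeps the check deterministic. In a real pipeline it would be `datetime.now(timezone.utc)`.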

Pitfall 2: raw.githubusercontent.com SSL Timeout

While waiting for runner allocation, I manually tested GitHub API availability with curl:

# GitHub API - normal
curl -s -o /dev/null -w "%{http_code}" https://api.github.com/repos/yaohehe/yaohehe.github.io
# Returns: 200 ✅

# GitHub raw content - SSL timeout
curl -s -o /dev/null -w "%{http_code}" --max-time 10 https://raw.githubusercontent.com/
# Returns: timeout ❌

Around 23:14, the SSL handshake with raw.githubusercontent.com timed out. This showed that GitHub's raw-content CDN was also intermittently unreachable.

This detail is critical: the Pages workflow needs to pull resources from raw.githubusercontent.com during the build. If the CDN times out, later phases such as the artifact upload can fail even if the runner eventually starts.

Important distinction: GitHub API (api.github.com) and GitHub raw CDN (raw.githubusercontent.com) are two independent infrastructure stacks. One working doesn't guarantee the other.
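
Because the two hosts can fail independently, each one is worth probing on its own. Below is a hedged sketch using only the Python standard library. The helper names `probe`, `check_hosts`, and `unhealthy` are hypothetical, and the demo swaps in a fake probe that reproduces the two curl results above, so it runs without network access.

```python
import urllib.error
import urllib.request


def probe(url, timeout=10):
    """Return the HTTP status code for url, or a short failure label."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, just not with 2xx
    except OSError as err:
        # Covers URLError, TLS errors, and socket timeouts.
        return f"unreachable: {err}"


def check_hosts(urls, probe_fn=probe):
    """Probe every URL separately so one healthy host cannot mask another."""
    return {url: probe_fn(url) for url in urls}


def unhealthy(results):
    """Keep only the entries that did not return a 2xx status."""
    return {
        url: result
        for url, result in results.items()
        if not (isinstance(result, int) and 200 <= result < 300)
    }


# Stand-in probe reproducing the curl session above (no network needed).
def fake_probe(url):
    return 200 if "api.github.com" in url else "unreachable: timed out"


results = check_hosts(
    ["https://api.github.com/", "https://raw.githubusercontent.com/"],
    probe_fn=fake_probe,
)
print(unhealthy(results))
```

Dropping the `probe_fn` argument makes `check_hosts` use the real `probe` against the live hosts.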

Pitfall 3: Pages Build Stuck on an Old Commit

I manually checked GitHub's commit history. Confirmed: new article files were pushed to GitHub, SHA exists, file content is intact.

But on the live site, the article URLs returned 404.

Checking the latest Pages deploy via GitHub CLI:

gh run list --workflow=deploy.yml --limit 5

The output revealed the truth: the last successful Pages build was for commit 3bc9c6e, timestamped April 23. From April 23 to April 28, five full days, Pages never rebuilt. Every commit after that point, including the newly pushed articles, was invisible to the live site.
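
The same check can be scripted against the CLI's JSON output. This is a rough sketch, assuming the records come from `gh run list --workflow=deploy.yml --json conclusion,headSha,createdAt`. The helper names and the sample timestamps are hypothetical; only the commit 3bc9c6e and the two dates come from the incident.

```python
from datetime import datetime, timezone


def last_successful_run(runs):
    """Return the newest run whose conclusion is 'success', or None."""
    ok = [r for r in runs if r.get("conclusion") == "success"]
    # ISO 8601 UTC timestamps in one format sort correctly as strings.
    return max(ok, key=lambda r: r["createdAt"], default=None)


def days_since(run, now):
    """Whole days elapsed between the run's creation time and now."""
    created = datetime.fromisoformat(run["createdAt"].replace("Z", "+00:00"))
    return (now - created).days


# Hypothetical sample; SHAs are abbreviated here, the CLI prints all 40 characters.
sample = [
    {"conclusion": "", "headSha": "f00dbad", "createdAt": "2026-04-28T22:05:00Z"},
    {"conclusion": "success", "headSha": "3bc9c6e", "createdAt": "2026-04-23T22:10:00Z"},
    {"conclusion": "success", "headSha": "1a2b3c4", "createdAt": "2026-04-22T22:10:00Z"},
]
now = datetime(2026, 4, 28, 23, 30, tzinfo=timezone.utc)
last = last_successful_run(sample)
print(last["headSha"], days_since(last, now))
```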

Root Cause Chain

Connecting the three pitfalls, the complete failure chain looks like this:

1. 22:00 cron fires: pipeline generates articles, GitHub API push succeeds (PUT returns 200)
2. workflow_dispatch created: Pages rebuild triggered, run gets queued
3. Runner allocation freezes: GitHub infrastructure issue, queued persists, runner never starts
4. raw.githubusercontent.com timeout (~23:14): even if a runner had started, the CDN timeout could have failed the build
5. Pages never rebuilds: stuck on April 23's 3bc9c6e, all new commits ignored
6. Articles return 404: files are in the GitHub repo but never go live

This wasn't a single point of failure. Three GitHub infrastructure layers failed at the same time: the runner allocation system, CDN availability, and the Pages build queue. That was beyond what my pipeline code could handle.

Solution: API Push + Wait for GitHub Recovery

With Pages rebuild being a GitHub infrastructure issue, I took the most pragmatic approach:

1. Confirmed articles were in GitHub: verified file SHA via GitHub API, content intact
2. Waited for GitHub infrastructure recovery: once runner allocation and CDN recovered, Pages would auto-rebuild
3. Fixed pipeline silent failures: added push_file_with_retry() to publish-articles.py, now exits 1 on failure instead of silent 0
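
For illustration, here is one shape a retry helper of that kind might take. This is a sketch, not the actual code in publish-articles.py: the signature, the attempt count, the backoff, and the `push_fn` callable (any function that raises on failure, such as a wrapper around the GitHub contents API) are all assumptions.

```python
import sys
import time


def push_file_with_retry(push_fn, path, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call push_fn(path) until it succeeds or the attempts run out.

    push_fn is a hypothetical callable that raises on failure.
    Returns True on success, False once every attempt has failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            push_fn(path)
            return True
        except Exception as err:  # broad on purpose: any failure counts
            print(f"push {path} failed (attempt {attempt}/{attempts}): {err}",
                  file=sys.stderr)
            if attempt < attempts:
                # Exponential backoff: base_delay, then twice that, and so on.
                sleep(base_delay * 2 ** (attempt - 1))
    return False


def main(paths, push_fn, **retry_kwargs):
    """Return the process exit code: 0 only if every file was pushed."""
    failed = [p for p in paths if not push_file_with_retry(push_fn, p, **retry_kwargs)]
    return 1 if failed else 0


# In a script, the exit code would be propagated with:
#     sys.exit(main(paths, push_fn))
```

The point is the last line: a failed push ends in a non-zero exit code, so the scheduler sees the failure instead of a silent 0.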

The next day Pages resumed normal rebuilds and all articles went live. This 8-hour incident caused zero data loss — because the GitHub API push was successful. The real problem was the Pages build queue blockage, combined with the pipeline not detecting it promptly.

Three Key Lessons

1. workflow_dispatch success ≠ Pages rebuild success: API returning 200 only means GitHub received the request, not that the build completed. Actively check gh run list to confirm actual status.
2. GitHub API and GitHub CDN are independent infrastructure: api.github.com working doesn't guarantee raw.githubusercontent.com is accessible. Pages builds depend on the CDN, and a CDN timeout can break a build without an obvious error.
3. Pages build status belongs in pipeline observability: if Pages deploys use the same commit twice in a row, that should trigger an alert. This is a gap in the current pipeline.
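
The "same commit twice in a row" rule is small enough to sketch. The function below is a hypothetical illustration. It assumes the caller already has the SHAs of recent deploys, newest first (for example from the `headSha` field of `gh run list --json headSha`), plus the current HEAD of the repository. Comparing against HEAD is an added assumption, so that a quiet period with no new commits does not raise a false alarm.

```python
def stale_deploy_alert(deploy_shas, head_sha):
    """Return True when the two newest deploys share a commit that is not HEAD.

    deploy_shas: commit SHAs of recent Pages deploys, newest first.
    head_sha:    current HEAD of the published branch.
    """
    if len(deploy_shas) < 2:
        return False  # not enough history to compare
    repeated = deploy_shas[0] == deploy_shas[1]
    return repeated and deploy_shas[0] != head_sha


# Hypothetical values: two deploys of 3bc9c6e while HEAD has moved on.
alert = stale_deploy_alert(["3bc9c6e", "3bc9c6e", "1a2b3c4"], "f00dbad")
print(alert)
```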

This failure made one thing clear: pipeline observability isn't just "check if logs have errors." It means actively verifying the end result — whether articles are actually live. Push success doesn't mean users can access the content.

Related reading:

Ubuntu 24.04 Docker + UFW Firewall Setup — if you're deploying self-hosted CI runners
Thunderbolt Self-Hosted AI Panel: 5 Real Pitfalls — real-world self-hosted deployment lessons
n8n Self-Hosted Docker Deployment: 5 Real Problems — another self-hosted tool, same failure patterns
