GitHub Availability Report: April 2026 — 10 Incidents, Including a 30% Scraping Attack

On May 14, GitHub published its April 2026 availability report. It was a rough month: 10 incidents across code search, Copilot, Pages, Codespaces, Actions, and more. Here’s what happened and what GitHub is doing about it.

Incident Summary

Date	Service	Duration	Impact
Apr 1	Code Search	8h 43m	100% query failure, full re-index needed
Apr 1	Audit Log	4m	4,297 API actors affected
Apr 9	Copilot Agent	4h 16m	~84% of new sessions delayed; queue times up to 54 min
Apr 13	Pages	39m	~17.5M HTTP 500 errors (12.8% peak)
Apr 16	Codespaces	3h 22m	~40% of VS Code starts failed
Apr 20	Code Scanning / Projects	15h 36m	New PRs not scanned; new issues missing from boards
Apr 22	Copilot Chat	3h 43m	Full unavailability, then regional recovery
Apr 23	Multi-service	1h 18m	Copilot, Webhooks, Git, Actions, and Deployments affected; 5-7% of overall traffic
Apr 27	Search (scraping attack)	6h 15m	Up to 65% of searches timed out across Issues, PRs, more
Apr 27+	Search continued	—	See above, same incident

The Big Ones

April 27: The Scraping Attack That Took Down Search

The most interesting incident was on April 27. Between 16:15 and 22:46 UTC, GitHub’s search services experienced severe degradation. The cause? A massive anonymous distributed scraping attack.

The attacker used 600,000+ unique IP addresses, with all requests including matching actor information. That made standard rate limiting ineffective, since each individual IP stayed below the threshold. This traffic made up 30% of the day’s total search traffic, concentrated within a 4-hour window. The load balancer tier saturated, causing up to 65% of searches to time out across Issues, Pull Requests, Projects, Repositories, Actions, Package Registry, and Dependabot Alerts.
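To get a feel for what "30% of the day’s traffic in a 4-hour window" means for load, here is a rough back-of-the-envelope calculation. It assumes legitimate search traffic is spread evenly over 24 hours, which the report does not claim and which real traffic rarely does, so treat the result as an order-of-magnitude estimate. The function name is ours, not GitHub’s.

```python
def window_load_multiplier(attack_share: float, window_hours: float,
                           day_hours: float = 24.0) -> float:
    """Ratio of the request rate inside the attack window to the normal rate.

    attack_share is the attack's fraction of the whole day's traffic.
    ASSUMPTION: legitimate traffic is uniform across the day.
    """
    legit_rate = (1.0 - attack_share) / day_hours  # daily share per hour
    attack_rate = attack_share / window_hours      # daily share per hour
    return (legit_rate + attack_rate) / legit_rate


# 30% of the day's search traffic, packed into 4 hours.
print(round(window_load_multiplier(0.30, 4.0), 2))  # 3.57
```

Under that assumption, search would have been handling roughly three and a half times its usual request rate for the duration of the window.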

GitHub’s response: scale the load balancer tier, block the traffic, improve connection handling, and implement new controls that can restrict anonymous traffic to protect registered users.
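Why a per-IP limit misses this kind of traffic, and why an aggregate cap on anonymous requests catches it, can be shown with a small simulation. This is a sketch under our own assumptions, not GitHub’s implementation: the limits, the scaled-down IP count, and the `evaluate_window` helper are all hypothetical.

```python
from collections import Counter

PER_IP_LIMIT = 30       # hypothetical: requests allowed per IP per window
ANON_BUDGET = 100_000   # hypothetical: total anonymous requests per window


def evaluate_window(requests, per_ip_limit=PER_IP_LIMIT, anon_budget=ANON_BUDGET):
    """Count rejections for one time window.

    requests: iterable of (ip, is_anonymous) tuples.
    Returns (blocked_by_ip_limit, blocked_by_anonymous_budget).
    """
    per_ip = Counter()
    anon_seen = 0
    blocked_by_ip = 0
    blocked_by_budget = 0
    for ip, is_anonymous in requests:
        per_ip[ip] += 1
        if per_ip[ip] > per_ip_limit:
            blocked_by_ip += 1
            continue
        if is_anonymous:
            anon_seen += 1
            if anon_seen > anon_budget:
                blocked_by_budget += 1
    return blocked_by_ip, blocked_by_budget


# Scaled-down attack: 60,000 anonymous IPs sending 5 requests each.
distributed = ((f"ip-{i}", True) for i in range(60_000) for _ in range(5))
by_ip, by_budget = evaluate_window(distributed)
print(by_ip, by_budget)  # 0 200000
```

No single address comes near the per-IP limit, so that check rejects nothing. The aggregate budget rejects everything past the first 100,000 anonymous requests, while signed-in traffic never counts against it.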

April 23: DNS Degradation Cascades Across Copilot, Webhooks, Git, Actions

A single-datacenter DNS infrastructure degradation cascaded into a multi-service incident affecting 5-7% of overall traffic. A recently introduced traffic-balancing mechanism caused DNS resolvers to begin failing under a specific load pattern. The impact spread across Copilot (~7% model request failures), Webhooks (elevated latency >3s), Git Operations (1.25% errors), Actions (workflow status delays ~8s), and Deployments (temporarily blocked).

The fix: restart DNS infrastructure. The takeaway: better DNS resilience, safer rollout procedures, and self-healing mechanisms for resolution failures.

April 9: Rate Limit Bug Cripples Copilot Agent

A bug in Copilot’s rate limiting logic applied a single global rate limit instead of a per-installation one. A coincidental 3-4x traffic surge from a client update exhausted that shared limit faster. About 84% of new agent sessions were delayed, with queue times hitting 54 minutes (normal: 15-40 seconds). A second wave on the same day was caused by a caching bug that persisted the rate-limited state.

Fix: per-installation credentials, disable faulty caching, and better monitoring.
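The difference between a global and a per-installation scope is easy to see in a toy limiter. The sketch below is illustrative only: `WindowLimiter`, the scope functions, and the limit of 50 are our own inventions, and the report does not describe how Copilot’s limiter is built.

```python
from collections import defaultdict


class WindowLimiter:
    """Fixed-window request counter keyed by a scope string."""

    def __init__(self, limit_per_scope: int):
        self.limit = limit_per_scope
        self.used = defaultdict(int)

    def allow(self, scope: str) -> bool:
        if self.used[scope] >= self.limit:
            return False
        self.used[scope] += 1
        return True


def scope_global(installation_id: str) -> str:
    return "global"  # the bug: every installation shares one bucket


def scope_per_installation(installation_id: str) -> str:
    return f"installation:{installation_id}"


def served(scope_fn, limit, traffic):
    """Return how many requests each installation got through."""
    limiter = WindowLimiter(limit)
    ok = defaultdict(int)
    for installation_id in traffic:
        if limiter.allow(scope_fn(installation_id)):
            ok[installation_id] += 1
    return dict(ok)


# One busy installation arrives first, then a quiet one.
traffic = ["busy"] * 100 + ["quiet"] * 5
print(served(scope_global, 50, traffic))            # {'busy': 50}
print(served(scope_per_installation, 50, traffic))  # {'busy': 50, 'quiet': 5}
```

With the global scope, the busy installation drains the shared bucket and the quiet one gets nothing. With per-installation scoping, each installation is limited only by its own usage.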

Notable Recurring Themes

Cascading failures from shared infrastructure — The April 23 DNS incident is the textbook example. One degraded datacenter component → multiple services affected. GitHub is working on better isolation.
Rate limiting scoping — Both the Copilot (Apr 9) and scraping (Apr 27) incidents involve rate limits being either too global or too easy to bypass.
Automation causing harm — The April 1 code search outage was triggered by an automated infrastructure change applied too aggressively. The April 13 Pages outage was caused by an automated DNS tool deleting a necessary record.
Detection gaps — Multiple incidents had detection delays of 40-53 minutes because monitoring didn’t classify the failure pattern as a risk (e.g., the scraping attack was only discovered while working on mitigation).

What GitHub Is Doing

The report lists specific follow-up actions for each incident. Common threads:

Stronger DNS resilience and multi-datacenter failover
More gradual rollouts with better health checks
Faster detection through improved monitoring and alerting
Better traffic isolation to prevent cascading impact
Rate limit hardening — both per-installation scoping and anonymous traffic controls
Fallback mechanisms for upstream service dependencies (Codespaces VS Code Server, Pages storage)
