In June, we experienced two incidents that resulted in degraded performance across GitHub services.
June 05 17:05 UTC (lasting 142 minutes)
On June 5, between 17:05 UTC and 19:27 UTC, the GitHub Issues service was degraded. During that time, events related to projects were not displayed on issue timelines. These events indicate when an issue was added to or removed from a project and when its status changed within a project. A misconfiguration of the service backing these events prevented the data from being loaded.
We determined the root cause to be a scheduled secret rotation that left one of the configured services using old, expired secrets. Specifically, as part of our continual improvement, we had an initiative to clean up, streamline, and simplify our service configurations for improved automation. A bug in the implementation introduced the misconfiguration that caused the degradation.
We mitigated the incident by remediating the service configuration, and we believe the simplified configuration will help avoid similar incidents in the future.
June 27 20:39 UTC (lasting 58 minutes)
On June 27, between 20:39 UTC and 21:37 UTC, the GitHub Migration service saw all in-progress migrations fail. Once the increased failures were detected, we paused new migrations so they could resume when the issue was mitigated. This resulted in longer migration times, but prevented further failures.
We attributed the root cause of this incident to an invalid infrastructure credential that required us to manually intervene.
Once the cause was identified, our first responders mitigated the incident, at which point we unpaused the queued migrations and continued processing them with the expected level of success.
To prevent similar incidents in the future, we are closing specific gaps in our monitoring and alerting for infrastructure credentials.
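As an illustration only, and not a description of GitHub's actual tooling, an expiry check for infrastructure credentials could look like the minimal Python sketch below. The `Credential` record, its field names, and the seven-day warning window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Credential:
    # Hypothetical record; the field names are illustrative only.
    name: str
    expires_at: datetime


def classify(cred, now, warn_within):
    """Return 'expired', 'expiring', or 'ok' for a single credential."""
    if cred.expires_at <= now:
        return "expired"
    if cred.expires_at - now <= warn_within:
        return "expiring"
    return "ok"


def credentials_needing_attention(creds, now, warn_within=timedelta(days=7)):
    """Return (name, status) pairs for credentials that should raise an alert."""
    flagged = []
    for cred in creds:
        status = classify(cred, now, warn_within)
        if status != "ok":
            flagged.append((cred.name, status))
    return flagged
```

Run on a schedule against an inventory of credentials, a check like this would surface a credential before it becomes invalid rather than after migrations start failing. Passing `now` explicitly keeps the function easy to test.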
Please follow our status page for real-time updates on status changes and post-incident recaps. To learn more about what we’re working on, check out the GitHub Engineering Blog.
The post GitHub Availability Report: June 2024 appeared first on The GitHub Blog.