Errors connecting to LaunchDarkly
Incident Report for LaunchDarkly
Postmortem

LaunchDarkly engineers noted at 8:23am PDT that error rates had increased across LaunchDarkly’s systems. We identified that our DNS registration had expired at 8:17am PDT, which caused widespread impact across launchdarkly.com properties. At 9:07am PDT our team made changes to the DNS system to remediate the issue, and within a few minutes, as customers' DNS caches were cleared, all monitoring returned to normal levels.

During the roughly 50-minute incident window, customers experienced the following impact:

  • Requests to our web properties such as www.launchdarkly.com, app.launchdarkly.com, blog.launchdarkly.com, and status.launchdarkly.com failed for some customers.
  • Some new connections from our SDKs failed. SDKs that connected before or after this window continued functioning normally. Impacted SDKs recovered automatically, since their default behavior is to keep retrying the connection with exponential backoff.

    • If an SDK had known flag state prior to the window, it would continue serving the last known good flag state.
    • If an SDK failed to initialize during the window (and did not fetch the flag state), it would serve fallback variations until re-establishing the connection to LaunchDarkly through automatic retry.
  • Some event data captured by SDKs was not recorded. Event data lost during the incident window affects data export, experimentation results, the Users page, flag statuses, flag insights charts, and the debugger.
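
The SDK behavior described in the list above (retrying with exponential backoff, serving the last known good flag state, and serving fallback variations when initialization never succeeded) can be sketched roughly as follows. This is an illustration only and not LaunchDarkly SDK code: the names `FlagClient`, `backoff_delays`, and `connect_with_retry`, as well as the delay values, are assumptions made for the example. Real SDKs typically also add random jitter to the delays, which is omitted here to keep the sketch deterministic.

```python
import time
from typing import Callable, Dict, Iterator, Optional


def backoff_delays(base: float = 1.0, cap: float = 30.0, attempts: int = 6) -> Iterator[float]:
    """Yield capped exponential delays: base, 2*base, 4*base, ... never above cap."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


class FlagClient:
    """Hypothetical client illustrating the behavior above; not the real SDK."""

    def __init__(self, fetch: Callable[[], Dict[str, bool]]):
        # fetch returns the full flag state or raises ConnectionError (assumed interface)
        self._fetch = fetch
        self._last_known: Optional[Dict[str, bool]] = None

    def try_connect(self) -> bool:
        """Attempt one fetch; on failure keep whatever state was already held."""
        try:
            self._last_known = self._fetch()
            return True
        except ConnectionError:
            return False

    def variation(self, key: str, fallback: bool) -> bool:
        """Serve last known good state, or the fallback if never initialized."""
        if self._last_known is None:
            return fallback
        return self._last_known.get(key, fallback)


def connect_with_retry(client: FlagClient,
                       sleep: Callable[[float], None] = time.sleep,
                       **backoff_args: float) -> bool:
    """Retry the connection, waiting an exponentially growing delay between attempts."""
    for delay in backoff_delays(**backoff_args):
        if client.try_connect():
            return True
        sleep(delay)
    return False
```

In this sketch, a client that never initialized returns the caller-supplied fallback from `variation`, while a client that connected once keeps returning its stored state even if later fetches fail.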

We have identified several process gaps that led to the DNS issue, and we are addressing them to ensure that a similar incident does not occur in the future.

Posted Jul 15, 2020 - 13:30 PDT

Resolved
All LaunchDarkly services have continued behaving normally since the prior status update. At this point we are confident that the incident is resolved.
Posted Jul 15, 2020 - 11:04 PDT
Monitoring
A fix has been applied and traffic is once again connecting to LaunchDarkly services. We believe we have resolved the root cause. We are continuing to monitor the situation.
Posted Jul 15, 2020 - 09:14 PDT
Investigating
We are encountering errors connecting to LaunchDarkly at the moment. Some customers may be unable to access app.launchdarkly.com or use our polling or streaming services. This started around 8:17am PDT.
Posted Jul 15, 2020 - 08:17 PDT
This incident affected: Flag Delivery Network (core functionality) (Server-side streaming API, Client-side streaming API) and Feature management (core functionality) (Flag targeting).