Elevated latencies and delays

Incident Report for LaunchDarkly

Resolved

This incident has been resolved.

One of our mitigation steps involved adding new IPs for stream.launchdarkly.com to our public IP list. Some customers may need to update IP allowlists in their firewalls or proxy servers in order for their services to continue establishing streaming connections from server side SDKs to LaunchDarkly without disruption. Please refer to documentation at https://docs.launchdarkly.com/home/advanced/public-ip-list for more information.

Refer to https://app.launchdarkly.com/api/v2/public-ip-list for complete list of public IPs.

Customers who switched from streaming to polling mode as a workaround are clear to revert back to streaming mode.

Posted Oct 21, 2025 - 03:00 PDT

Update

One of our mitigation steps involved adding new IPs for stream.launchdarkly.com to our public IP list. Some customers may need to update IP allowlists in their firewalls or proxy servers in order for their services to continue establishing streaming connections from server side SDKs to LaunchDarkly without disruption. Please refer to documentation at https://docs.launchdarkly.com/home/advanced/public-ip-list for more information.

Refer to https://app.launchdarkly.com/api/v2/public-ip-list for complete list of public IPs.

We will continue to actively monitor our services and provide updates if anything changes.

We recommend that customers who switched from streaming to polling mode as a workaround remain in polling mode for now. We will continue to provide updates to this recommendation.

We’ll provide another update within 60 minutes.

The following stable IPs were added:
- 52.22.11.124/32
- 98.90.74.184/32
- 44.214.199.141/32
- 54.158.1.193/32
- 52.20.244.244/32
- 3.222.86.128/32
- 3.209.231.150/32
- 98.87.97.132/32
- 54.243.249.198/32
- 52.205.29.16/32
- 52.200.155.176/32
- 72.44.54.239/32
- 44.193.41.212/32
- 44.193.145.213/32
- 3.230.174.47/32
- 34.193.141.46/32
- 54.145.215.104/32
- 54.83.149.69/32
- 54.167.133.6/32
- 98.86.214.67/32
- 3.210.111.117/32
- 44.198.65.246/32
- 3.223.193.186/32
- 54.164.149.203/32
- 52.202.164.129/32
- 54.211.161.195/32
- 52.44.175.163/32
- 54.87.94.27/32
- 34.196.162.28/32
- 3.229.200.95/32
- 34.206.243.165/32
- 44.198.216.81/32
- 98.85.64.100/32
- 34.193.205.73/32
- 54.82.179.12/32
- 35.169.61.114/32
- 3.225.212.129/32
- 44.214.230.241/32
- 44.197.94.28/32
- 54.225.42.164/32
- 3.232.151.250/32
- 98.88.212.98/32
- 44.206.106.7/32
- 44.219.171.95/32
- 54.81.117.83/32
- 3.212.29.247/32
- 52.207.48.173/32
- 52.21.24.75/32
- 44.209.163.213/32
- 3.212.26.71/32
- 3.232.245.239/32
- 44.214.85.107/32
- 54.85.9.44/32
- 3.212.63.158/32
- 44.214.25.250/32
- 34.225.52.183/32
- 54.144.244.40/32
- 13.216.151.182/32
- 34.205.184.16/32
- 54.243.39.147/32
- 52.21.118.82/32
- 44.208.247.20/32
- 44.209.6.233/32
- 98.85.24.70/32
- 52.206.193.249/32
- 52.203.145.124/32
- 34.207.21.226/32
- 52.6.144.34/32
- 3.221.55.92/32
- 54.160.1.221/32
- 54.236.171.5/32
- 3.210.143.243/32
- 18.204.254.23/32
- 34.224.206.32/32
- 54.152.40.39/32
- 52.201.30.87/32
- 98.86.87.228/32
- 52.70.143.213/32
- 34.199.166.40/32
- 54.225.71.167/32
- 100.26.67.253/32
- 13.219.10.149/32
- 52.203.44.182/32
- 3.215.17.57/32
- 3.217.93.49/32
- 3.215.154.205/32
- 3.224.166.159/32
- 44.205.194.1/32
- 54.162.82.157/32
- 54.175.84.251/32
- 54.211.58.167/32
- 52.22.199.197/32
- 35.169.162.188/32
- 44.205.162.192/32
- 54.224.162.1/32
- 50.16.48.228/32
- 52.203.187.144/32
- 52.22.34.71/32
- 52.44.226.138/32
- 35.169.87.104/32
- 50.17.142.209/32
- 34.226.53.28/32
- 50.16.209.122/32
- 54.173.173.176/32
- 54.197.143.76/32
- 52.45.14.195/32
- 54.84.144.50/32
- 52.205.140.231/32
- 52.1.64.188/32
- 23.22.17.50/32
- 44.213.219.16/32
- 54.211.63.220/32
- 34.236.195.69/32
- 100.29.106.41/32
- 107.20.48.118/32
- 107.22.84.205/32
- 107.23.47.163/32
- 174.129.120.2/32
- 174.129.25.155/32
- 18.204.101.179/32
- 18.207.77.1/32
- 18.214.59.159/32
- 3.208.63.99/32
- 3.209.142.240/32
- 3.210.8.83/32
- 3.211.0.174/32
- 3.211.171.106/32
- 3.211.40.100/32
- 3.211.78.169/32
- 3.212.153.172/32
- 3.212.215.241/32
- 3.212.69.145/32
- 3.215.132.92/32
- 3.215.85.74/32
- 3.217.156.217/32
- 3.217.33.194/32
- 3.222.172.85/32
- 3.225.49.136/32
- 3.226.201.70/32
- 3.232.113.99/32
- 3.81.156.201/32
- 3.94.227.253/32
- 34.192.228.56/32
- 34.196.53.78/32
- 34.197.220.63/32
- 34.197.229.208/32
- 34.198.5.248/32
- 34.205.180.137/32
- 34.206.142.57/32
- 34.225.210.63/32
- 34.225.44.159/32
- 34.232.120.176/32
- 34.235.101.237/32
- 34.237.149.109/32
- 34.237.7.234/32
- 35.153.62.144/32
- 35.171.42.112/32
- 35.172.28.29/32
- 35.175.51.91/32
- 44.193.160.19/32
- 44.193.176.64/32
- 44.193.192.114/32
- 44.195.178.165/32
- 44.205.130.196/32
- 44.205.142.202/32
- 44.205.242.41/32
- 44.207.32.19/32
- 44.208.215.105/32
- 44.210.2.163/32
- 44.221.72.252/32
- 44.223.189.67/32
- 50.16.53.115/32
- 52.0.20.18/32
- 52.1.126.54/32
- 52.20.44.107/32
- 52.200.10.183/32
- 52.201.19.0/32
- 52.202.18.147/32
- 52.205.199.141/32
- 52.205.74.149/32
- 52.206.123.108/32
- 52.21.16.31/32
- 52.22.120.141/32
- 52.22.75.64/32
- 52.23.189.51/32
- 52.3.131.52/32
- 52.3.164.32/32
- 52.3.203.3/32
- 52.4.17.19/32
- 52.55.197.16/32
- 52.6.134.5/32
- 52.7.81.224/32
- 54.147.67.241/32
- 54.156.155.61/32
- 54.158.114.255/32
- 54.158.201.166/32
- 54.167.202.203/32
- 54.235.4.229/32
- 54.243.165.178/32
- 54.243.220.97/32
- 54.243.227.67/32
- 54.243.238.143/32
- 54.243.34.157/32
- 54.243.54.147/32
- 54.243.58.248/32
- 54.243.79.193/32
- 54.80.39.21/32
- 54.81.213.212/32
- 54.84.21.101/32
- 54.84.245.230/32
- 98.82.52.30/32
- 98.82.55.107/32

Posted Oct 21, 2025 - 02:53 PDT

Update

One of our mitigation steps involved adding new IPs for stream.launchdarkly.com to our public IP list. Some customers may need to update the IP allowlists in their firewalls or proxy servers to ensure that their services can continue establishing streaming connections from server-side SDKs to LaunchDarkly without disruption. Approximately 88% of traffic to stream.launchdarkly.com will continue to be routed to existing stable IPs.

We are working with AWS to provide a list of additional stable IPs and will post another update as soon as they become available.

We will continue to actively monitor our services and provide updates if anything changes.

We recommend that customers who switched from streaming to polling mode as a workaround remain in polling mode for now. We will continue to provide updates to this recommendation.

We’ll provide another update within 60 minutes.

Posted Oct 21, 2025 - 02:19 PDT

Monitoring

Server-side streaming is healthy.

The load balancer upgrade, along with the addition of another load balancer, has restored our service to healthy levels.

We will continue to actively monitor our services and provide updates if anything changes.

We recommend that customers who switched from streaming to polling mode as a workaround remain in polling mode for now. We will continue to provide updates to this recommendation.

We’ll provide another update within 60 minutes.

Posted Oct 21, 2025 - 01:08 PDT

Update

We're seeing signs of recovery, reported error rates for server-side SDKs are dropping significantly.

The initial load balancer unit was upgrading and has begun handling traffic successfully. The additional load balancer is online and is beginning to handle traffic.

Customers may still experience delayed flag updates.

We'll provide another update within 60 minutes.

Posted Oct 21, 2025 - 00:14 PDT

Update

Server-side streaming API is still experiencing a Partial outage.

An additional load balancer has been brought online and is being configured to receive traffic. When we confirm that this is successful, we'll bring the other additional load balancer units online to handle the increased volume in traffic and restore service to our customers.

Customers may still experience timeouts and 5xx errors when connecting to the server-side SDK endpoints.

We'll provide another update within 60 minutes.

Posted Oct 20, 2025 - 23:53 PDT

Update

Server-side streaming API is still experiencing a Partial outage.

We are in the process of deploying additional load balancer units that are about to go online. We expect them to successfully handle the increased volume in traffic and restore service to our customers.

Customers may still experience timeouts and 5xx errors when connecting to the server-side SDK endpoints.

We'll provide another update within 60 minutes.

Posted Oct 20, 2025 - 23:00 PDT

Update

Server-side streaming API is still experiencing a Partial outage.

We're still working on creating additional load balancer units to distribute and handle the increased volume in traffic. AWS is providing active support to LaunchDarkly as we work to restore service to our customers.

Customers may still experience timeouts and 5xx errors when connecting to the server-side SDK endpoints.

We'll provide another update within 60 minutes.

Posted Oct 20, 2025 - 21:52 PDT

Update

Server-side streaming API is still experiencing a Partial outage and the reported error rates for server-side SDKs are reducing.

We've added an additional load balancer unit to distribute the traffic which is helping. Based on the volume of traffic, we're going to add five additional load balancer units to give our service enough capacity to handle it.

Customers may still experience timeouts and 5xx errors when connecting to the server-side SDK endpoints.

We'll provide another update within 60 minutes.

Posted Oct 20, 2025 - 21:06 PDT

Update

Server-side streaming API is still experiencing a Partial outage and the error rates for server-side SDKs are still high.

We've escalated the recovery process with our AWS technical support team to accelerate the redeployment of our ALB for SDK connections to restore service. They are updating our ALB load balance capacity units (LCU) to accommodate increased levels of inbound traffic to our platform.

Customers may still experience timeouts and 5xx errors when connecting to the server-side SDK endpoints.

We'll provide another update within 60 minutes.

Posted Oct 20, 2025 - 19:59 PDT

Update

Server-side streaming API is still experiencing a Partial outage and the error rates for server-side SDKs are still high. We're working with our AWS technical support team to accelerate the redeployment of our ALB for SDK connections to restore service.

As a temporary workaround, we recommend switching server-side SDK configs from streaming to polling.

Customers connecting their server-side SDKs directly to LD's streaming capabilities can reconfigure their SDKs to use polling to mitigate.

Node:
- Set LDOptions.stream to false
- https://launchdarkly.com/docs/sdk/features/config#expand-nodejs-server-side-code-sample
- https://launchdarkly.github.io/js-core/packages/sdk/server-node/docs/interfaces/LDOptions.html#stream

Python
- Set Config.stream to false
- https://launchdarkly.com/docs/sdk/features/config#expand-python-code-sample
- https://launchdarkly-python-sdk.readthedocs.io/en/latest/api-main.html#ldclient.config.Config.stream

Java:
- Use Components.pollingDataSource() instead of the default Components.streamingDataSource()
- https://launchdarkly.com/docs/sdk/features/config#expand-java-code-sample
- https://launchdarkly.github.io/java-core/lib/sdk/server/com/launchdarkly/sdk/server/LDConfig.Builder.html#dataSource-com.launchdarkly.sdk.server.subsystems.ComponentConfigurer-

.NET:
- create a builder with PollingDataSource(), change its properties with the methods of this class, and pass it to DataSource()
- https://launchdarkly.com/docs/sdk/features/config#expand-net-server-side-code-sample
- https://launchdarkly.github.io/dotnet-server-sdk/pkgs/sdk/server/api/LaunchDarkly.Sdk.Server.Integrations.PollingDataSourceBuilder.html

Enterprise customers connecting their server-side SDKs to a Relay Proxy cluster can reconfigure their Relay Proxy to be in Offline Mode to mitigate. https://launchdarkly.com/docs/sdk/relay-proxy/offline

We'll provide another update within 60 minutes.

Posted Oct 20, 2025 - 19:08 PDT

Update

Server-side streaming API is still experiencing a Partial outage in our main US region and we're continuing our efforts to restore service. We're redirecting traffic to an EU region to help distribute the load to healthy servers while we work to restore our primary region.

Customers connecting their server-side SDKs directly to LD’s streaming capabilities can reconfigure their SDKs to use polling to mitigate.

Node:
- Set LDOptions.stream to false
- https://launchdarkly.com/docs/sdk/features/config#expand-nodejs-server-side-code-sample
- https://launchdarkly.github.io/js-core/packages/sdk/server-node/docs/interfaces/LDOptions.html#stream

Python
- Set Config.stream to false
- https://launchdarkly.com/docs/sdk/features/config#expand-python-code-sample
- https://launchdarkly-python-sdk.readthedocs.io/en/latest/api-main.html#ldclient.config.Config.stream

Java:
- Use Components.pollingDataSource() instead of the default Components.streamingDataSource()
- https://launchdarkly.com/docs/sdk/features/config#expand-java-code-sample
- https://launchdarkly.github.io/java-core/lib/sdk/server/com/launchdarkly/sdk/server/LDConfig.Builder.html#dataSource-com.launchdarkly.sdk.server.subsystems.ComponentConfigurer-

.NET:
- create a builder with PollingDataSource(), change its properties with the methods of this class, and pass it to DataSource()
- https://launchdarkly.com/docs/sdk/features/config#expand-net-server-side-code-sample
- https://launchdarkly.github.io/dotnet-server-sdk/pkgs/sdk/server/api/LaunchDarkly.Sdk.Server.Integrations.PollingDataSourceBuilder.html

Enterprise customers connecting their server-side SDKs to a Relay Proxy cluster can reconfigure their Relay Proxy to be in Offline Mode to mitigate. https://launchdarkly.com/docs/sdk/relay-proxy/offline

We'll provide another update within 60 minutes.

Posted Oct 20, 2025 - 18:02 PDT

Update

Server-side streaming API is still experiencing a Partial outage and the error rates for server-side SDKs are still high. We're redeploying our ALB for SDK connections to restore service.

As a temporary workaround, we recommend switching server-side SDK configs from streaming to polling.

Error rates for client side streaming SDKs are low, but flag updates are still delayed.

All other service component are fully recovered and we've updated their status to Operational.

We will provide our next update within 60 minutes.

Posted Oct 20, 2025 - 16:42 PDT

Update

We're redeploying parts of our service to address the high error rates for client and server side SDK connections that we continue to see.

The EU and Federal LaunchDarkly instances continue to not be impacted by this incident at this time.

We will provide our next update within 60 minutes.

Posted Oct 20, 2025 - 16:18 PDT

Update

Server-side streaming connections continue to be impacted by this incident.

The event ingestion pipeline is fully functional again. This means that the following product areas are functional for all customers while data sent between Sunday Oct 19 11:45pm PT and Monday Oct 20 2:45pm PT may be unrecoverable:

- AI Configs Insights
- Contexts
- Data Export
- Error Monitoring
- Event Explorer
- Experimentation
- Flag Insights
- Guarded rollouts
- Live Events

Additionally, Observability functionality has recovered as mentioned in our previous update.

The EU and Federal LaunchDarkly instances continue to not be impacted by this incident at this time.

We will provide our next update within 30 minutes.

Posted Oct 20, 2025 - 15:28 PDT

Update

The LaunchDarkly web application is fully recovered for customer traffic. Flag Delivery traffic has been scaled back up to 100% and connection error rates are decreasing but non-zero. Active streaming connections should receive flag updates once successfully connected. If disconnected, these connections will automatically retry in accordance with our SDK behavior until being able to connect successfully.

We've currently enabled 7.5% of traffic for the event ingestion pipeline and will continue to enable it progressively. As of 1:40pm PT Observability data is successfully flowing again and we are catching up on data backlog. Observability data between 1:50am PT and 1:40pm PT is unrecoverable due to an outage in the ingest pipeline.

The EU and Federal LaunchDarkly instances continue to not be impacted by this incident at this time.

We will provide our next update within 60 minutes.

Posted Oct 20, 2025 - 14:55 PDT

Update

We've hit our target of healthy, stable nodes that are available for LaunchDarkly web application and are increasing traffic from 10% to 20%. We'll continue to monitor as we scale the web application back up.

Recovering the Flag Delivery service for all customers is our top priority. We're working on stabilizing the Flag Delivery Network.

We are beginning to progressively enable the event ingestion pipeline for the LaunchDarkly service.

The EU and Federal LaunchDarkly instances continue to not be impacted by this incident at this time.

We will provide our next update within 60 minutes.

Posted Oct 20, 2025 - 13:55 PDT

Update

The impacted AWS region continues to recover and make resources available which we are using to improve the availability of the LaunchDarkly platform. As we continue to recover and scale up, so do our customers. This increase in traffic is slowing our ability to reduce the impact of the outage.

For customers who are using the LaunchDarkly SDKs, we do not recommend making changes to your SDK configuration at this time as doing so will impact our ability to continue service during our recovery.

For Flag Delivery, server-side streaming is back online and no longer impacted by the incident for most customers. Customers using big segments or payload filtering are still impacted.

The EU and Federal LaunchDarkly instances continue to not be impacted by this incident at this time.

The event ingestion pipeline will remain disabled to limit the traffic volume within LaunchDarkly's services during our recovery.
We will provide our next update within 60 minutes.

Posted Oct 20, 2025 - 12:47 PDT

Update

We've made significant progress on our recovery from this incident. Our engineers are continuing to bring the LaunchDarkly web application into a healthy state and have more than tripled the number of healthy nodes to serve our customers. The status of many service components has been upgraded from Major outage to Partial Outage.

The following components are still experiencing a Major Outage:
- Experiment Results Processing
- Global Metrics
- Feature Management Context Processing
- Feature Management Data Export
- Feature Management Flag Usage Metric

The EU and Federal LaunchDarkly instances continue to not be impacted by this incident at this time.

The event ingestion pipeline will remain disabled to limit the traffic volume within LaunchDarkly's services during our recovery.

We will provide our next update within 30 minutes.

Posted Oct 20, 2025 - 12:02 PDT

Update

We continue to work towards recovering from this incident. We're actively working towards restoring the LaunchDarkly service into a healthy state. We now have 58% of the LaunchDarkly web application in a healthy state.

The EU and FedRAMP LaunchDarkly instances are not impacted by this incident.

While working towards a resolution for our customers, we disabled the event ingestion pipeline to limit the traffic volume within LaunchDarkly's services. This means that the following product areas have unrecoverable data loss:

- AI Configs Insights
- Contexts
- Data Export
- Error Monitoring
- Event Explorer
- Experimentation
- Flag Insights
- Guarded rollouts
- Live Events
- Observability

While recovering, there is continued impact to customers using our SDKs to connect to our Flag Delivery network.

Our engineers are continuing to recover our service in our main region. We will provide our next update within 30 minutes.

Posted Oct 20, 2025 - 11:28 PDT

Update

While we continue to resolve the ongoing impact, we want to clarify the ongoing impact to our Flag Delivery Network and SDKs:
- Customers using client-side or server-side SDKs should continue to see the last known flag values if a local cache exists, or fall back to in-code values.
- Customers using our Relay Proxy should continue to see last known flag values if a local cache exists.
- Customers using our Edge SDKs should continue to see last known flag values.

Additionally, our event ingestion pipeline is dropping events that power product features such as flag insights, experimentation, observability, and context indexing.

Posted Oct 20, 2025 - 09:41 PDT

Update

We're continuing to work on resolving the immediate impact from this incident. We're actively working on recovering within our AWS us-east-1 region while also working on options to move traffic to a healthier region.

Posted Oct 20, 2025 - 08:18 PDT

Update

We are continuing to work on a fix for this issue.

Posted Oct 20, 2025 - 06:55 PDT

Update

We are aware that our web app and API are experiencing high error rates due to scaling issues in AWS us-east-1 region.

Posted Oct 20, 2025 - 06:16 PDT

Update

We are still experiencing delays in flag updates and event ingestion pipeline, affecting experimentation, data export, flag status metrics and others.

Additionally we are experiencing elevated error rate on client side SDK streaming API in us-east-1 region due to scaling issues in that AWS region.

Posted Oct 20, 2025 - 06:06 PDT

Update

We are still experiencing delays in flag updates and event ingestion pipeline, affecting experimentation, data export, flag status metrics and others.

Additionally, observability data (session replays, errors, logs, and traces) has also been impacted starting ~1:50am PT.

Posted Oct 20, 2025 - 04:55 PDT

Update

We are seeing initial recovery for the following services

- Flag updates
- SDK requests for environments using Big Segments

We are monitoring for the recovery of the rest of the services.

Posted Oct 20, 2025 - 03:15 PDT

Update

We are continuing to work on the issue. Additionally impacted services

- Delayed flag updates to SDKs
- Dropped SDK events impacting Experimentation, Data Export

Posted Oct 20, 2025 - 01:54 PDT

Identified

We have identified issue with elevated error rates and event pipelines. Currently impacted services are

- SDK and Relay Proxy requests for environments using Big Segments in us-east-1 region
- Guarded rollouts
- Scheduled flag changes
- Experimentation
- Data export
- Flag usage metrics
- Emails and notifications
- Integrations web hooks

Posted Oct 20, 2025 - 01:00 PDT

Investigating

We are investigating elevated latencies and delays in multiple services including scheduled flag changes, flag updates and events processing. We will post updates as they are available.

Posted Oct 20, 2025 - 00:25 PDT

This incident affected: Flag Delivery Network (core functionality) (Server-side streaming API, Client-side streaming API, Polling API), Automations (non-core functionality) (Emails and notifications), Global (core functionality) (Authentication, Metrics, Web app (app.launchdarkly.com)), Feature management (non-core functionality) (Context processing, Data Export, Flag usage metrics), Feature management (core functionality) (Flag targeting, Segment management, Progressive rollouts, Release pipelines), and Experimentation (core functionality) (Experiment results processing).