Increased error rate and delays for usage metrics

Incident Report for LaunchDarkly

Resolved

The data backfill has been completed. The system is stable.
Posted Dec 23, 2021 - 21:19 PST

Update

The backfill of events is ongoing. The system is stable and is processing incoming events.
Posted Dec 20, 2021 - 10:01 PST

Update

System health and stability are confirmed, and backfill event ingestion has started. We will provide an update and an ETA for backfill completion.
Posted Dec 16, 2021 - 09:48 PST

Monitoring

The Usage Metrics service is in a healthy state again -- error rates associated with retrieving usage metrics have returned to normal. Event ingestion for usage metrics is still paused. We are monitoring the health of the service for a while longer before resuming ingestion.
Posted Dec 14, 2021 - 19:34 PST

Update

The Usage Metrics service is in a healthy state again -- error rates associated with retrieving usage metrics have returned to normal. Event ingestion for usage metrics is still paused. We are monitoring the health of the service for a while longer before resuming ingestion.
Posted Dec 14, 2021 - 17:52 PST

Identified

Ingestion of events for usage metrics was halted at 1:20 PM PT while we investigate this issue. Insights metrics and connection metrics are frozen as of that time. Once we have resolved the issue with ingesting usage metrics, we will process the delayed events and update insights and connection metrics.
Posted Dec 14, 2021 - 16:31 PST

Investigating

We are currently investigating this issue.
Posted Dec 14, 2021 - 14:04 PST
This incident affected: Feature management (non-core functionality) (Flag usage metrics).