We've identified the root cause as an inefficient query that was increasing load on our ES cluster. The cluster is now healthy, and API response times and error rates are back to normal. We're continuing to monitor the situation.
Posted Sep 02, 2016 - 15:11 PDT
We've seen a recurrence of the earlier incident, which is causing increased latency and error rates for user search. We're investigating again.