Date: 2026-06-17
Status: Resolved
Customer impact window: ~16:03–16:10 UTC (about 7 minutes)
During a planned upgrade of the infrastructure that powers our processing queues, a brief failover caused a short window where some envelopes could not be moved between processing stages. Hundreds of live envelopes failed during that window.
We have recovered every envelope that had already been signed. A smaller set that failed at creation time was not recovered automatically and needs to be submitted again. No signed documents or signature data were lost.
We upgraded the cache and queue infrastructure behind the API. The upgrade includes a failover from one node to another. The failover itself took only a few seconds, but client connections took a little longer to reconnect.
During that short window, the API could not hand envelopes to the processing queue, so those envelopes were marked as failed instead of continuing.
All other API operations worked as usual.
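The failures fall into two groups: envelopes that failed after they had already been signed, all of which we have recovered, and envelopes that failed at creation time, which were not recovered automatically and need to be submitted again.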
The API itself stayed up throughout. Only envelopes created or finalized in the ~7-minute window were affected.
The upgrade triggers a failover. While clients reconnected to the new node, the API briefly could not enqueue work. The API treated these short-lived enqueue errors as permanent and failed the affected envelopes, instead of retrying once the connection recovered.
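To illustrate the difference between treating an enqueue error as permanent and retrying it, here is a minimal sketch in Python. It is not our production code: the names `TransientEnqueueError` and `enqueue_with_retry`, the attempt count, and the delays are all assumptions chosen for the example.

```python
import time


class TransientEnqueueError(Exception):
    """Hypothetical error type for a short-lived queue connection problem."""


def enqueue_with_retry(enqueue, envelope_id, attempts=5, base_delay=0.2,
                       sleep=time.sleep):
    """Try to enqueue an envelope, retrying transient errors with backoff.

    `enqueue` is any callable that takes an envelope id and raises
    TransientEnqueueError while the queue is unreachable (an assumption
    made for this sketch). Returns True on success, False if every
    attempt failed.
    """
    for attempt in range(attempts):
        try:
            enqueue(envelope_id)
            return True
        except TransientEnqueueError:
            if attempt == attempts - 1:
                # Only now is the envelope a candidate for being marked failed.
                return False
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
    return False
```

With retries like these, an envelope would only be marked failed after the queue had stayed unreachable for the whole retry budget, rather than on the first error during a reconnect.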
We did not apply this change directly to production. The underlying change was rolled out to our staging environment first. There, we deliberately rehearsed the same node failover the upgrade performs and confirmed two things: the service stayed available, and the system reconnected and recovered on its own.
What the staging rehearsal did not reproduce was the specific behavior that hit production: an envelope being marked failed instead of retrying. That only happens when live envelopes are moving through the queue at the exact moment of the failover, and our staging rehearsal was not carrying live traffic. Closing that gap — both in how we enqueue work and in how we rehearse — is the main fix below.
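The gap between an idle rehearsal and one carrying traffic can be shown with a small simulation. This is a sketch under stated assumptions, not our test harness: `FailoverQueue`, `rehearse`, and the tick-based failover window are hypothetical names and mechanics invented for the example.

```python
class FailoverQueue:
    """Fake queue that rejects enqueues during a simulated failover window.

    Each enqueue call advances an internal tick; calls whose tick falls in
    [down_from, down_until) raise ConnectionError.
    """

    def __init__(self, down_from, down_until):
        self.down_from = down_from
        self.down_until = down_until
        self.tick = 0
        self.accepted = []

    def enqueue(self, envelope_id):
        now = self.tick
        self.tick += 1
        if self.down_from <= now < self.down_until:
            raise ConnectionError("node failing over")
        self.accepted.append(envelope_id)


def rehearse(queue, envelope_ids, max_attempts):
    """Push synthetic envelopes through the queue; return the ids marked failed."""
    failed = []
    for envelope_id in envelope_ids:
        for _ in range(max_attempts):
            try:
                queue.enqueue(envelope_id)
                break
            except ConnectionError:
                continue  # try again on the next tick
        else:
            # Every attempt hit the failover window.
            failed.append(envelope_id)
    return failed


# No traffic during the failover: nothing fails, so the problem stays hidden.
idle_failures = rehearse(FailoverQueue(3, 6), [], max_attempts=1)

# Traffic with a single attempt per envelope: the window produces failures.
no_retry_failures = rehearse(FailoverQueue(3, 6), list(range(10)), max_attempts=1)

# Same traffic with retries: envelopes ride out the window.
retry_failures = rehearse(FailoverQueue(3, 6), list(range(10)), max_attempts=5)
```

In this simulation the idle run reports no failures, the single-attempt run fails the envelopes that arrive during the window, and the run with retries fails none, which mirrors why a rehearsal needs traffic in flight to surface this kind of behavior.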
We are sorry for the disruption. If you have questions about a specific envelope, contact support and we will help.