How we shaved settlement latency to 142ms

Jun 18, 2026 · 8 min read · Maya Iverson, Staff Engineer

Six months ago a transfer between two Indigo Escrow accounts took just under two seconds. Today it takes 142 milliseconds — here is what changed under the hood.

Six months ago, a transfer between two Indigo Escrow accounts took just under two seconds end-to-end. Today the same operation completes in 142 milliseconds at the median. The improvement was not a single trick — it was a slow accumulation of small, deliberate moves across the ledger, the queue, and the API edge.

We started by measuring honestly. Most fintech latency dashboards show the time from request to response, which hides queueing delays and read-after-write inconsistency. We instrumented every hop — the inbound TLS handshake, validation, the row lock on the sender's account, the row lock on the recipient, the ledger insert, the webhook fan-out, and the client acknowledgement — and reported on the slowest of the three replicas. That single change made our P95 four times worse on paper, and four times more useful.
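
As a rough illustration of that measurement approach, here is a minimal Python sketch. The names `HopTimer`, `worst_replica_latency`, and `p95` are hypothetical and are not taken from our codebase. The sketch times each hop separately, collapses a request's per-replica timings to the slowest one, and computes the P95 over those worst-case values.

```python
import time
from contextlib import contextmanager
from statistics import quantiles


class HopTimer:
    """Records the wall-clock duration of each named hop in one request."""

    def __init__(self):
        self.hops = {}  # hop name -> seconds

    @contextmanager
    def hop(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the hop raised.
            self.hops[name] = time.perf_counter() - start

    def total(self):
        return sum(self.hops.values())


def worst_replica_latency(per_replica_ms):
    """Collapse one request's per-replica timings to the slowest replica."""
    return max(per_replica_ms)


def p95(samples_ms):
    """95th percentile: n=20 yields cut points every 5%, the last is P95."""
    return quantiles(samples_ms, n=20, method="inclusive")[-1]


# Hypothetical usage: each inner list is one request measured on three replicas.
timer = HopTimer()
with timer.hop("validation"):
    pass  # the real validation work would run here

requests_ms = [[100, 120, 400], [90, 95, 99], [110, 300, 130]]
worst_case = [worst_replica_latency(r) for r in requests_ms]
print(p95(worst_case))
```

Reporting the maximum across replicas rather than the mean is what makes the number look worse: one slow replica is enough to move it.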

The biggest win came from collapsing two separate transactions into one. The original design wrote a debit row, committed, then wrote a credit row in a second transaction, with a saga tying the two together. It was elegant on a whiteboard and a disaster in production: the saga coordinator was the bottleneck for every transfer in the system. We rewrote the core movement as a single atomic Postgres transaction with row-level locks held in a deterministic order. Because every transfer acquires its two locks in the same order, no two transfers can end up waiting on each other in a cycle. Deadlocks dropped to zero and the P50 fell by 600 milliseconds overnight.
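
The lock-ordering idea can be shown with an in-memory analogue. This is a sketch under assumptions, not our production handler: `Account` and `transfer` are hypothetical names, and a `threading.Lock` stands in for a Postgres row lock. In Postgres the corresponding step would be taking `SELECT ... FOR UPDATE` locks on both account rows in a fixed order inside one transaction.

```python
import threading


class Account:
    def __init__(self, account_id, balance_cents):
        self.id = account_id
        self.balance_cents = balance_cents
        self.lock = threading.Lock()  # stands in for a Postgres row lock


def transfer(sender, recipient, amount_cents):
    """Move funds atomically, locking both accounts in ascending id order."""
    if sender.id == recipient.id:
        raise ValueError("sender and recipient must differ")
    if amount_cents <= 0:
        raise ValueError("amount must be positive")

    # Deterministic order: the lower id is always locked first, regardless
    # of which side is sending. Opposing transfers cannot deadlock.
    first, second = sorted((sender, recipient), key=lambda a: a.id)
    with first.lock, second.lock:
        if sender.balance_cents < amount_cents:
            raise ValueError("insufficient funds")
        # Debit and credit happen under the same pair of locks, so no
        # other transfer can observe one without the other.
        sender.balance_cents -= amount_cents
        recipient.balance_cents += amount_cents


# Hypothetical usage: two threads transferring in opposite directions.
a, b = Account(1, 10_000), Account(2, 10_000)


def run(src, dst):
    for _ in range(1_000):
        transfer(src, dst, 1)


threads = [threading.Thread(target=run, args=(a, b)),
           threading.Thread(target=run, args=(b, a))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(a.balance_cents + b.balance_cents)
```

If each thread instead locked the sender first and the recipient second, the two threads above could each hold one lock while waiting for the other.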

We also stopped trusting the network. Every Indigo Escrow API endpoint now writes to a local edge ledger first and replicates to the canonical store asynchronously. For the user, the transfer is done the moment the edge commits. For our auditors, the replication lag is measured and bounded, and any divergence triggers an automatic freeze on the affected pair of accounts.
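
A simplified model of that edge-first flow might look like the following. Everything here is an assumption made for illustration: the `Ledger` and `EdgeNode` classes, the use of a pending-queue length as the lag bound, and the balance comparison used to detect divergence. The real replication protocol is not described in this post. Balances in the sketch are net movements starting from zero.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Ledger:
    balances: dict = field(default_factory=dict)

    def apply(self, sender, recipient, amount):
        self.balances[sender] = self.balances.get(sender, 0) - amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


class EdgeNode:
    def __init__(self, canonical, max_pending=1000):
        self.edge = Ledger()
        self.canonical = canonical
        self.pending = deque()  # committed at the edge, not yet replicated
        self.frozen = set()
        self.max_pending = max_pending  # assumed stand-in for a lag bound

    def transfer(self, sender, recipient, amount):
        if sender in self.frozen or recipient in self.frozen:
            raise RuntimeError("account pair is frozen")
        if len(self.pending) >= self.max_pending:
            raise RuntimeError("replication lag bound exceeded")
        # The caller sees success as soon as the edge ledger commits.
        self.edge.apply(sender, recipient, amount)
        self.pending.append((sender, recipient, amount))
        return "committed"

    def replicate(self):
        """Drain pending transfers into the canonical store."""
        while self.pending:
            self.canonical.apply(*self.pending.popleft())

    def reconcile(self, sender, recipient):
        """After replication, freeze the pair if the two ledgers disagree."""
        for acct in (sender, recipient):
            edge_value = self.edge.balances.get(acct, 0)
            canonical_value = self.canonical.balances.get(acct, 0)
            if edge_value != canonical_value:
                self.frozen.update((sender, recipient))
                return False
        return True


# Hypothetical usage
canonical = Ledger()
node = EdgeNode(canonical)
node.transfer("acct_a", "acct_b", 50)  # returns before canonical is updated
node.replicate()                       # would run in the background
print(node.reconcile("acct_a", "acct_b"))
```

In this model `replicate` is called synchronously to keep the example deterministic. A real system would run it on a background worker.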

Finally — and this is the unglamorous one — we deleted code. Forty percent of the latency in the old hot path came from middleware that nobody on the team could explain. We turned each layer off in staging one at a time and kept only the ones that broke something real. The final transfer handler is now 380 lines, down from 2,100, and it is the most boring file in the repo. That, more than anything, is how we got to 142 milliseconds.
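
The one-at-a-time ablation can be expressed as a small loop. This is a hypothetical sketch: `find_required_layers`, the layer names, and the `passes_checks` callback are illustrative and do not reflect our actual staging tooling.

```python
def find_required_layers(layers, passes_checks):
    """Disable one layer at a time; keep those whose removal breaks checks."""
    required = []
    for layer in layers:
        remaining = [other for other in layers if other != layer]
        if not passes_checks(remaining):
            required.append(layer)
    return required


# Hypothetical usage: the check stands in for a staging test run.
layers = ["auth", "legacy_tracing", "rate_limit", "header_rewrite"]


def passes_checks(active_layers):
    return "auth" in active_layers and "rate_limit" in active_layers


print(find_required_layers(layers, passes_checks))
```

One caveat of removing a single layer at a time is that it cannot detect two layers that are each redundant alone but needed as a pair, so the final trimmed stack should be run against the full checks as well.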