Post Mortem - 2025-11-04 - Acurast Canary chain stalled

Post Mortem

Date & Time

11.04.2025 at 2:30 GMT+2

Engineer

Andreas Gassmann, Mike Godenzi

Summary

The Acurast Canary chain stopped producing new blocks regularly for about 40 minutes. Block production then resumed, but blocks were not finalized. After about 2 hours, blocks were finalized again.

Status

Resolved

Root causes

The Acurast collators could not produce any new blocks because the Kusama relay chain nodes they were connected to were in a crash loop, exhibiting the behavior described in this GitHub issue.

The Kusama nodes were running on a version containing the bug detailed in the issue linked above. Restarting the nodes did not improve the situation.

After around 2 hours the Kusama relay nodes started to work properly again.

Trigger

The Kusama relay chain nodes used by the Acurast collators entered a crash loop because of a bug described in this GitHub issue.

Resolution

The affected relay chain nodes started to work again after around 2 hours. They have since all been updated to the latest version.

In addition, all Acurast collator nodes have been configured with additional relay chain nodes as fallbacks.
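
The fallback idea can be illustrated with a minimal sketch. This is not the collators' actual failover logic, which is not described here. The function name `pick_relay_endpoint`, the probe callback, and the endpoint URLs are all hypothetical and chosen only for illustration. The sketch tries relay chain endpoints in priority order and returns the first one whose health probe succeeds.

```python
from typing import Callable, Iterable, Optional


def pick_relay_endpoint(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint whose health probe succeeds, or None.

    `endpoints` is ordered by preference: primary first, fallbacks after.
    `is_healthy` is a caller-supplied probe (hypothetical), for example
    a function that opens a connection and checks that the node responds.
    """
    for url in endpoints:
        try:
            if is_healthy(url):
                return url
        except Exception:
            # A probe that raises (timeout, refused connection, crash-looping
            # node) is treated the same as an unhealthy endpoint.
            continue
    return None


# Hypothetical endpoints, for illustration only.
RELAY_ENDPOINTS = [
    "wss://relay-primary.example:9944",
    "wss://relay-fallback-1.example:9944",
    "wss://relay-fallback-2.example:9944",
]

if __name__ == "__main__":
    # Stub probe that simulates the primary node being down.
    def stub_probe(url: str) -> bool:
        return "primary" not in url

    print(pick_relay_endpoint(RELAY_ENDPOINTS, stub_probe))
```

With a single configured relay chain node, a crash loop on that node leaves the list exhausted and the function returns `None`, which corresponds to the situation in this incident. Each additional fallback entry gives the collator another node to try.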

Timeline

2:30:

  • block production halts

2:42:

  • initial triage of the issue
  • discovered that the Kusama relay chain nodes used by the Acurast collators had entered a crash loop
  • restarted the Kusama relay chain nodes
  • restarted all Acurast parachain nodes
  • blocks started to be produced again, but they were not being finalized
  • continued triage of the issue, as the chain still had not recovered

4:33:

  • Kusama relay chain nodes started to work again

Lessons learned

  • Keep Kusama nodes up to date.
  • Configure additional backup relay chain nodes for the Acurast collators.