Anatomy of Graceful Shutdown: Part 4

Part 4: Celery Graceful Shutdown Process

Part 1: Signals and Linux
Part 2: Containers and signals
Part 3: Graceful shutdown of K8S pods
Part 4: Celery Graceful Shutdown [you’re here]
Part 5: Prometheus Graceful Shutdown
Part 6: Other frameworks and libraries [WIP]

Change of the approach

We’ve had enough generic theory in the previous three articles, which covered the kernel, applications, container runtimes and high-level abstractions like K8S, so what’s next? I suggest changing the flow of the articles to an overview of popular backend systems and how they handle graceful shutdowns, so that we get a perspective on the topic in the wild. ...

June 30, 2025 · 10 min · 2015 words · Andrei Sviridov

Anatomy of Graceful Shutdown: Part 1

Part 1: Signals and Linux

Intro

There have been quite a lot of issues surrounding application shutdowns in my line of work: connections not being closed correctly, incoming requests being processed when they shouldn’t have been, and various quirks around how new deployments affect customers during busy hours. I’ve decided to familiarize myself more with the topic, and there’s quite a lot going on there. ...

March 3, 2024 · 25 min · 5230 words · Andrei Sviridov

Python Multiprocessing Quirks on MacOS.

Prelude

Currently, I’m working on a product built around a large monolithic Django application and a bunch of microservices around it. The codebase is huge and has tens of thousands of tests, which are normally run in parallel in the CI environment. The CPython and Django versions are a little bit stale (3.8 and 3.2 respectively). For local development, it’s usually enough to run a subset of tests in non-parallel mode or to wait for the whole suite to pass during the CI run, but for one specific use case I had to run the test suite in parallel locally. I was surprised to see a segmentation fault as the failure reason for a bunch of tests. ...

January 14, 2024 · 8 min · 1639 words · Andrei Sviridov