Recently, I needed to implement graceful shutdown in a message queue consumer that handles a large number of transactions in production and can’t be terminated immediately with a plain kill command. The implementation was tricky, but it was a great chance to solidify my concurrent programming foundations.
So, the basic idea is to use Unix (Unix-like and POSIX-compliant) signals, a form of IPC (inter-process communication). A user can notify the consumer through the kill() system call from another process, or with the command kill [PID]. By default, the kill command sends SIGTERM, the signal that requests program termination, to the target. We just follow the convention.
PHP has the PCNTL extension for working with signals. The idea is to install a signal handler up front that sets a global flag; the main loop then frees resources and exits once it detects the flag.
The difficulty is that a system-level signal interrupts the normal execution of the current process, which resumes after the signal handler returns. This concurrent flow of execution can lead to synchronization issues and race conditions, and such bugs may only hit the system, badly, after a long uptime.
A fairly simple implementation is shown below:
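A minimal sketch of that idea, assuming the pcntl and posix extensions are available on the CLI. To keep it runnable on its own, the AMQP consumer is replaced by a plain list of jobs, and the names $shouldStop and handleMessage() are illustrative rather than taken from any library. The script sends SIGTERM to itself in the middle of the second job to mimic an external kill:

```php
<?php
// Sketch only: a plain list of jobs stands in for the message queue.

$shouldStop = false;   // hypothetical global flag, set by the signal handler
$processed  = [];

// Install the handler up front; it only records that SIGTERM was received.
pcntl_signal(SIGTERM, function (int $signo) use (&$shouldStop): void {
    $shouldStop = true;
});

// Hypothetical message callback: SIGTERM is blocked during the critical section.
function handleMessage(string $job, array &$processed): void
{
    pcntl_sigprocmask(SIG_BLOCK, [SIGTERM]);
    try {
        if ($job === 'job-2') {
            // Simulate a kill arriving in the middle of the transaction.
            posix_kill(posix_getpid(), SIGTERM);
        }
        $processed[] = $job;   // the "transaction"
    } finally {
        pcntl_sigprocmask(SIG_UNBLOCK, [SIGTERM]);   // let the signal through again
    }
}

foreach (['job-1', 'job-2', 'job-3'] as $job) {
    handleMessage($job, $processed);
    pcntl_signal_dispatch();   // run any pending PHP-level handlers
    if ($shouldStop) {
        break;                 // free resources here, then exit
    }
}

echo 'processed: ', implode(', ', $processed), PHP_EOL;
```

Run from the command line, this should print `processed: job-1, job-2`: the second job completes despite the signal, and the loop stops before the third one starts.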
The callback function calls pcntl_sigprocmask() to block all signals, or just SIGTERM (depending on the scenario), during the critical section, and unblocks them afterwards. This prevents a kill signal from breaking a transaction midway.
Pending signals will not be processed until pcntl_signal_dispatch() is called. An alternative is to use declare(ticks=1) to check for signals periodically.
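The deferred dispatch is easy to observe in isolation. The following sketch (variable names are illustrative; it assumes the pcntl and posix extensions and that asynchronous signal handling has not been enabled) sends SIGTERM to the current process and checks the flag before and after the explicit dispatch:

```php
<?php
// Sketch: a PHP-level handler runs only when signals are dispatched.

$received = false;
pcntl_signal(SIGTERM, function (int $signo) use (&$received): void {
    $received = true;
});

posix_kill(posix_getpid(), SIGTERM);   // signal ourselves
$beforeDispatch = $received;           // handler has not run yet

pcntl_signal_dispatch();               // pending handlers run here
$afterDispatch = $received;

var_dump($beforeDispatch, $afterDispatch);
```

This should dump `bool(false)` followed by `bool(true)`. With declare(ticks=1) at the top of the file, the dispatch would instead happen automatically as statements execute, at the cost of a check on every tick.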
However, when there is no message to handle, the consumer sits idle waiting for a new one. If we signal the program at this point, there is a good chance of receiving the following warning:
The reason is that the $channel->wait() call eventually invokes the select() Unix function, which is interrupted when a signal arrives.
We could let the warning slip past, of course. However, for a robust application, it’s good practice to treat all PHP warnings as exceptions (as PHP frameworks such as Laravel and Swoole do).
Assume that the warning is thrown as an ErrorException when stream_select() is interrupted. Here is the remedy:
As there is no specific exception type for the stream_select() interruption, an unfriendly workaround is to check the message with a regular expression.
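A self-contained sketch of this remedy, under a few assumptions: a socket pair stands in for the AMQP connection, SIGALRM (scheduled with pcntl_alarm()) stands in for the external SIGTERM, and the error handler is a simplified version of what a framework would install. The regular expression relies on the operating system describing EINTR as "Interrupted system call", which is the case on Linux:

```php
<?php
// Sketch: turn warnings into ErrorException, then recognise an interrupted
// stream_select() by its message text.

set_error_handler(function (int $severity, string $message, string $file, int $line) {
    throw new ErrorException($message, 0, $severity, $file, $line);
});

$shouldStop = false;
pcntl_signal(SIGALRM, function (int $signo) use (&$shouldStop): void {
    $shouldStop = true;
});

[$a, $b] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
pcntl_alarm(1);   // deliver SIGALRM in about one second

$interrupted = false;
try {
    $read = [$a];
    $write = null;
    $except = null;
    stream_select($read, $write, $except, 5);   // idle wait, nothing to read
} catch (ErrorException $e) {
    // No dedicated exception type exists, so match on the message.
    if (!preg_match('/interrupted system call/i', $e->getMessage())) {
        throw $e;   // some other warning: do not swallow it
    }
    $interrupted = true;
}

pcntl_signal_dispatch();   // now the PHP-level handler sets the flag
var_dump($interrupted, $shouldStop);
```

Rethrowing anything that does not match keeps unrelated warnings visible instead of silently treating them as a shutdown request.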
Seems great! Now go get ready for launch.
Ehh, Houston, we still have a problem.
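A race condition exists between the while condition test and the wait invocation: if the signal arrives in that window, the flag is set only after it was checked, and the consumer blocks in wait() until the next message arrives.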
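A Unix-style solution is to use pselect(). pselect() takes a signal mask and is equivalent to atomically setting that mask, calling select(), and then restoring the original mask, so the signal can stay blocked everywhere except during the wait itself.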
Unfortunately, there is no pselect_wait() in PHP, so this approach is infeasible. We have no choice but to implement an inelegant mechanism by passing a timeout parameter to $channel->wait().
$channel->wait() will then time out periodically and throw an AMQPTimeoutException. We just ignore it and continue with the next loop iteration if no termination signal has been issued. Of course, as it’s a general timeout exception, it might be raised for other reasons, so further inspection is needed where that matters.
Also, if SIGTERM arrives just before the $channel->wait() call, there will be an unavoidable delay of up to the timeout, in seconds, before the normal exit.
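The timeout workaround and the delay it implies can be sketched without a broker. In the example below, WaitTimeoutException and waitForMessage() are hypothetical stand-ins for AMQPTimeoutException and $channel->wait(), built on stream_select() over a socket pair. On the second iteration the script signals itself right after the flag check, which is exactly the race window described above:

```php
<?php
// Sketch: a bounded wait lets the loop re-check the flag periodically.

class WaitTimeoutException extends RuntimeException {}

// Hypothetical replacement for $channel->wait($timeout).
function waitForMessage($stream, int $timeoutSec): string
{
    $read = [$stream];
    $write = null;
    $except = null;
    $ready = stream_select($read, $write, $except, $timeoutSec);
    if ($ready === 0) {
        throw new WaitTimeoutException('no message within timeout');
    }
    if ($ready === false) {
        throw new RuntimeException('wait failed or was interrupted');
    }
    return (string) fgets($stream);
}

$shouldStop = false;
pcntl_signal(SIGTERM, function (int $signo) use (&$shouldStop): void {
    $shouldStop = true;
});

[$consumer, $producer] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
fwrite($producer, "hello\n");   // one message waiting in the "queue"

$handled   = [];
$timeouts  = 0;
$iteration = 0;
while (!$shouldStop) {
    if (++$iteration === 2) {
        // Simulate SIGTERM landing after the flag check, before the wait.
        posix_kill(posix_getpid(), SIGTERM);
    }
    try {
        $handled[] = trim(waitForMessage($consumer, 1));
    } catch (WaitTimeoutException $e) {
        $timeouts++;   // nothing arrived; fall through and re-check the flag
    }
    pcntl_signal_dispatch();
}

var_dump($handled, $timeouts);
```

The first iteration handles the queued message. In the second, the signal has already been delivered by the time the wait starts, so nothing interrupts it; the loop only notices the flag after the one-second timeout expires. Without the timeout, that wait would block until the next message.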
Well, I implemented this feature based on several references, including one project that uses this mechanism. Still, I don’t think it is an elegant or perfect solution. If you have any better ideas, or spot any remaining problems, feel free to leave a comment below.