Debugging Busy Processes
Sometimes the output of php start.php status shows processes in the busy state, which means the process is currently handling a task. Normally the process returns to the idle state as soon as the task is finished, and this causes no problems. However, if a process stays busy and never returns to idle, something inside the process is probably blocking or stuck in an infinite loop. The methods below help locate the cause.
Locating the Problem with strace + lsof
1. Find the PID of the Busy Process
After running php start.php status, it shows the following:

The PIDs of the busy processes shown in the output are 11725 and 11748.
2. Trace the Process with strace
Choose one process pid (in this case, 11725) and run strace -ttp 11725, which shows:

You can see that the process keeps looping on the system call poll([{fd=16, events=...., which waits for a readable event on file descriptor 16. In other words, the process is waiting for data to arrive on that descriptor.
If strace prints no system calls, keep that terminal open, open a second terminal, and run kill -SIGALRM 11725 to send an alarm signal to the process. Then check whether the strace terminal reacts and whether the process is blocked in a particular system call. If there is still no output, the program is most likely stuck in an infinite loop in the business logic; see the "Other Reasons for Processes Being Busy for a Long Time" section below.
If the process is blocked in the epoll_wait or select system call, this is normal: the process is already idle.
3. Using lsof to Check Process Descriptors
Run lsof -nPp 11725 to display the following:

Descriptor 16 corresponds to the 16u record (the last line of the output). It shows that fd 16 is a TCP connection to the remote address 101.37.136.135:80, so the process is most likely requesting an HTTP resource. The repeated poll([{fd=16, events=.... calls are simply waiting for the HTTP server to return data, which explains why the process is busy.
Solution:
Once you know where the process is blocked, the problem is much easier to solve. In the example above, the investigation shows that the business logic is making a curl request and the URL is taking too long to return data, so the process keeps waiting. You can contact the provider of the URL to find out why it responds slowly. You should also add a timeout to the curl call, for example 2 seconds, so that a slow response cannot block the process for long (the process may still be busy for about 2 seconds while it waits).
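As a minimal sketch of the timeout idea, assuming PHP's curl extension is installed: the helper names build_timeout_options and fetch_with_timeout are hypothetical, and the 2-second defaults simply mirror the example in the text. CURLOPT_CONNECTTIMEOUT limits the connection phase and CURLOPT_TIMEOUT limits the whole request.

```php
<?php
// Build cURL options that bound how long a request may block the worker.
// Hypothetical helper; the 2-second defaults follow the example in the text.
function build_timeout_options(string $url, int $connectTimeout = 2, int $totalTimeout = 2): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $connectTimeout, // max seconds to establish the connection
        CURLOPT_TIMEOUT        => $totalTimeout,   // max seconds for the whole request
    ];
}

// Perform the request. Returns the response body, or null on a timeout
// or any other cURL error, so the caller never waits longer than the limits.
function fetch_with_timeout(string $url, int $connectTimeout = 2, int $totalTimeout = 2): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_timeout_options($url, $connectTimeout, $totalTimeout));
    $body = curl_exec($ch);
    if ($body === false) {
        // Log the reason (for a timeout, cURL reports that the operation timed out).
        error_log('curl request failed: ' . curl_error($ch));
        return null;
    }
    return $body;
}
```

Inside a callback, the business code would then call fetch_with_timeout() with the real URL and treat a null result as "upstream too slow or unavailable" instead of waiting on it.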
Other Reasons for Processes Being Busy for a Long Time
Blocking is not the only reason a process can stay busy. The following factors can also keep a process in the busy state.
1. Fatal Errors in Business Logic Causing Continuous Process Exit
Symptom: The system load is relatively high (the load average shown by status is 1 or higher), and the process's exit_count is large and keeps increasing.
Solution: Run Workerman in debug mode (php start.php start, without -d), check the errors reported by the business logic, and fix them.
2. Infinite Loops in Code
Symptom: In top, the busy process shows high CPU usage, and strace -ttp pid prints no system calls.
Solution: Refer to the article by Bird Brother (Laruence) on locating the problem with gdb and the PHP source code. The steps are summarized as follows:
- Use php -v to check the version.
- Download the source code of the corresponding PHP version.
- Run gdb --pid=<PID of the busy process>.
- Execute source <PHP source path>/.gdbinit.
- Use zbacktrace to print the stack trace.
The last step prints the call stack of the PHP code that is currently executing, which shows where the infinite loop is in the PHP code.
Note: If zbacktrace does not print the call stack, your PHP may not have been compiled with the -g parameter. In that case, recompile PHP and restart Workerman before debugging.
3. Continuously Adding Timers Without Deletion
The business logic keeps adding timers without ever deleting them, so the number of timers in the process grows without bound until the process spends all of its time running timers. For example, consider the following code:
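use Workerman\Worker;
use Workerman\Timer;

$worker = new Worker;
$worker->onConnect = function($con){
    // A new timer is created for every connection and is never deleted.
    Timer::add(10, function(){});
};
Worker::runAll();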
This code adds a timer every time a client connects, but nothing in the business code ever deletes it. Over time the timers accumulate in the process, until it is running timers all the time and stays busy.
The correct code:
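use Workerman\Worker;
use Workerman\Timer;

$worker = new Worker;
$worker->onConnect = function($con){
    // Store the timer ID on the connection so it can be deleted later.
    $con->timer_id = Timer::add(10, function(){});
};
$worker->onClose = function($con){
    // Delete the connection's timer when the connection closes.
    Timer::del($con->timer_id);
};
Worker::runAll();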
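To make the difference between the two versions concrete, here is a small self-contained simulation. It does not use Workerman: FakeTimerRegistry and simulate_connections are hypothetical stand-ins that only count registered timers, assuming each client connects once and later disconnects.

```php
<?php
// Hypothetical stand-in for a timer registry. This is NOT Workerman's Timer
// class; it only tracks which timers are currently registered.
final class FakeTimerRegistry
{
    /** @var array<int, callable> */
    private array $timers = [];
    private int $nextId = 1;

    // Register a callback and return its timer ID.
    public function add(callable $callback): int
    {
        $id = $this->nextId++;
        $this->timers[$id] = $callback;
        return $id;
    }

    // Remove a timer by ID.
    public function del(int $id): void
    {
        unset($this->timers[$id]);
    }

    // Number of timers still registered.
    public function count(): int
    {
        return count($this->timers);
    }
}

// Simulate $connections clients that each connect and then disconnect.
// Returns how many timers remain registered afterwards.
function simulate_connections(int $connections, bool $deleteOnClose): int
{
    $registry = new FakeTimerRegistry();
    for ($i = 0; $i < $connections; $i++) {
        $timerId = $registry->add(function () {}); // "onConnect": add a timer
        if ($deleteOnClose) {
            $registry->del($timerId);              // "onClose": delete it again
        }
    }
    return $registry->count();
}

$leaked  = simulate_connections(1000, false);
$cleaned = simulate_connections(1000, true);
echo "without delete: {$leaked} timers left\n"; // 1000
echo "with delete: {$cleaned} timers left\n";   // 0
```

In the first case every past connection leaves a timer behind, so the work the process does on each tick grows with the total number of connections it has ever served. In the second case it is bounded by the number of connections that are currently open.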