Finding the root cause of a hung Python test
How many times have you faced a test that hangs for some reason? And how many times have you wondered what the root cause was? 🤷‍♂️
Let’s say we can connect over SSH to the instance where the tests run and have enough permissions to debug there. What should we do next?
In this article, we will try to figure out what we can do in this case and how.
Let’s begin. I have a simple function that hangs.
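If you want something to experiment with, here is a minimal sketch of such a function. It is not the original snippet: the name wait_for_signal and the use of threading.Event are my assumptions, picked only because an event that nobody sets is an easy way to block forever.

```python
import threading


def wait_for_signal(timeout=None):
    """Wait on an event that nothing ever sets (hypothetical example)."""
    never_set = threading.Event()
    # With timeout=None this call blocks forever, which mimics a hung program.
    # With a timeout it gives up and returns False.
    return never_set.wait(timeout)


# Uncomment the next line to reproduce the hang, then attach a debugger:
# wait_for_signal()
```

The optional timeout is there only so the function can be tried safely; calling it without arguments is what gives us a process to debug.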
First, we need to find the PID of our process. I usually use htop, but the tool doesn’t matter here, so let's find it via ps -x | grep python

Now we know that our program is running and has PID 7183.
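If you would rather script this step, the same lookup can be sketched in Python. The helper names parse_pids and find_python_pids are hypothetical, and the parser assumes the usual five-column ps -x layout (PID, TTY, STAT, TIME, COMMAND), which may differ on your system.

```python
import subprocess


def parse_pids(ps_output, needle="python"):
    """Return PIDs of processes whose command line contains `needle`."""
    pids = []
    for line in ps_output.splitlines():
        # Assumed layout: PID TTY STAT TIME COMMAND (command may contain spaces).
        fields = line.split(None, 4)
        if len(fields) == 5 and fields[0].isdigit() and needle in fields[4]:
            pids.append(int(fields[0]))
    return pids


def find_python_pids():
    """Run `ps -x` and filter its output, like `ps -x | grep python`."""
    result = subprocess.run(
        ["ps", "-x"], capture_output=True, text=True, check=True
    )
    return parse_pids(result.stdout)
```

Note that the script running find_python_pids() is itself a Python process, so its own PID will usually show up in the result as well.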
Let's try to connect to this process with gdb. It’s a low-level debugger that can be useful if you experience low-level problems such as crashes or deadlocks.
By default, gdb doesn’t know about high-level information specific to the CPython interpreter, such as which Python function is currently executing, or what type or value a given Python object represented by a standard PyObject * pointer has. The Python extensions for gdb overcome this limitation.
I used several commands to install gdb and its Python extensions. After that, we can attach to the process:
sudo gdb -p 7183
Now that we are attached to the process, let’s check how many threads it has:
info threads

We have only a single thread, so let’s switch to it with thread 1
Once we are inside the thread, we can check the stack trace with bt

That is useful, but only if you know the C programming language. We are Python developers, though, and have already installed python-gdb, so we can use py-bt, which shows pretty much the same thing but at the Python level 😍

I guess it’s much better!
Other commands can be useful too: py-list, py-up, py-down, py-locals, py-print
But what about a test that hangs from time to time? Let’s try the same approach with pytest.
We run python -m pytest -lvv test_run.py and, as expected, our test hangs 😞

We can follow the same steps as above, except that this time we need to find the PID of the pytest process

sudo gdb -p 7240
info threads
thread 1
This time, let’s try another command, py-list

It shows us exactly where our test is stuck. To me, this is just amazing 🎉
If you have more threads than in this example, thread apply all py-list could be useful.
BTW: exiting gdb is pretty much the same as in Vim (q or quit) 😁
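When a hang shows up on a machine where you don't want an interactive session, the same command can be run in one shot. Below is a rough sketch that wraps gdb's -batch and -ex options. The function names are hypothetical, and it assumes gdb and the Python extensions are installed and that you have permission to attach (you may need to run it under sudo).

```python
import subprocess


def gdb_batch_command(pid, gdb_command="thread apply all py-list"):
    """Build a non-interactive gdb invocation for the given PID."""
    # -p attaches to the process, -batch exits after running the commands,
    # -ex passes a single gdb command to execute.
    return ["gdb", "-p", str(pid), "-batch", "-ex", gdb_command]


def dump_python_source(pid):
    """Attach to `pid`, run the command, and return gdb's output as text."""
    result = subprocess.run(
        gdb_batch_command(pid), capture_output=True, text=True
    )
    return result.stdout
```

For example, dump_python_source(7240) would attach to the pytest process from above, and passing gdb_command="thread apply all py-bt" to gdb_batch_command gives Python backtraces instead of source listings.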

So it seems we have figured out how to find the root cause of a hung test. Thanks for reading and good luck with your tests 👍