How to kill all processes using a given GPU?

I use the CUDA toolkit to perform some computations on my Nvidia GPUs. How can I kill all processes that use a given GPU at once, i.e. without having to manually type the PIDs after kill -9?

E.g. killing all processes using GPU 2:

Following the Unix philosophy, you have a tool that lists processes using a given GPU, and a tool that kills processes. Combine them using shell constructs and text processing tools.

For example, to kill all the processes using GPU 2, you can execute the following command:

kill $(nvidia-smi | awk '$2=="Processes:" {p=1} p && $2 == 2 && $3 > 0 {print $3}')

or

kill $(nvidia-smi -g 2 | awk '$2=="Processes:" {p=1} p && $3 > 0 {print $3}')
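The column that holds the PID differs between nvidia-smi versions (field 3 in the commands above, field 5 in layouts that add GI and CI columns). As a minimal sketch, assuming the process table always has a header row containing the words GPU and PID, the helper below looks up the PID column from that header instead of hard-coding it. The function name pids_on_gpu is hypothetical; it only parses text from stdin and kills nothing.

```shell
#!/bin/sh
# pids_on_gpu GPU_INDEX
# Hypothetical helper: reads `nvidia-smi` table output on stdin and prints
# the PIDs listed for GPU_INDEX, one per line.
pids_on_gpu() {
    awk -v gpu="$1" '
        # Header row of the process table: remember which field says "PID".
        # (The GPU summary table also has a row starting with "GPU", but it
        # contains no "PID" field, so col stays unset there.)
        $2 == "GPU" && !col {
            for (i = 3; i <= NF; i++) if ($i == "PID") col = i
            next
        }
        # Data rows: GPU index in field 2, an all-digit PID in the PID column.
        col && $2 == gpu && $col ~ /^[0-9]+$/ { print $col }
    '
}
```

It could then be combined with kill in the same way as above, for example kill $(nvidia-smi | pids_on_gpu 2). If no process matches, kill receives no arguments and prints a usage error.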

Maybe this is what you need:

kill -9 $(nvidia-smi | awk '$2 == "GPU" && $3 == "PID" {flag = 1} flag && $3 > 0 {print $2, $3}' | awk '$1 == 2 {print $2}')

For more complex conditions, change the condition in the second awk command. For example, the following command kills all processes that use GPUs 0 to 3 and have a PID greater than 1000:

kill -9 $(nvidia-smi | awk '$2 == "GPU" && $3 == "PID" {flag = 1} flag && $3 > 0 {print $2, $3}' | awk '$1 < 4 && $2 > 1000 {print $2}')

As you can see, kill -9 needs a list of PIDs to shut the processes down, and awk is used twice to find the PIDs to kill.

More specifically, the first awk command selects the lines after the "GPU PID Type Process Name" header line and prints each GPU id and PID, separated by a space. The second awk command filters on specific GPU ids or PIDs (in this case, all processes using GPU 2) and prints the matching PIDs. Finally, kill -9 kills the processes with those PIDs.
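The two stages described above can be sketched as two small functions, so each can be inspected on its own before anything is killed. This is only an illustration: the names gpu_pid_pairs and select_pids are hypothetical, and it assumes the same layout as the commands above (GPU id in field 2, PID in field 3).

```shell
#!/bin/sh
# Stage 1 (hypothetical helper): from `nvidia-smi` output on stdin,
# print "GPU PID" pairs, one pair per line.
gpu_pid_pairs() {
    awk '
        # Start collecting after the "GPU PID Type Process name" header.
        $2 == "GPU" && $3 == "PID" { flag = 1; next }
        # Keep only rows where both fields consist of digits.
        flag && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { print $2, $3 }
    '
}

# Stage 2 (hypothetical helper): select_pids LO HI [MIN_PID]
# Keep PIDs whose GPU id is within LO..HI and whose PID exceeds MIN_PID.
select_pids() {
    awk -v lo="$1" -v hi="$2" -v min_pid="${3:-0}" '
        $1 >= lo && $1 <= hi && $2 > min_pid { print $2 }
    '
}
```

For instance, nvidia-smi | gpu_pid_pairs | select_pids 0 3 1000 would print the PIDs targeted by the second command above, and nvidia-smi | gpu_pid_pairs | select_pids 2 2 those of the first. Reviewing that list before passing it to kill is a cheap safety check.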

Answered By: ruiyuan Lu

It looks like the answer by @Gilles ‘SO- stop being evil’ extracts the PID with $3 > 0, but that won't work on some versions of awk. My workaround is to check $3 + 0 == $3 instead, a test I borrowed from elsewhere. Here's the modified version:

kill $(nvidia-smi -g 2 | awk '$2=="Processes:" {p=1} p && $3 + 0 == $3 {print $3}')

Answered By: zingdle

lsof /dev/nvidia* | awk 'NR > 1 {print $2}' | sort -u | xargs -I {} kill {}

This worked for me.

In my case, the processes were not listed by nvidia-smi.
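The PID extraction from lsof can be sketched as a separate step, which makes it easy to review the list first. The function name pids_from_lsof is hypothetical, and the sketch assumes lsof's default table output, where the first line is a header and the PID is the second column.

```shell
#!/bin/sh
# pids_from_lsof (hypothetical helper): reads default `lsof` output on stdin
# and prints each PID once, in ascending numeric order.
pids_from_lsof() {
    # NR > 1 skips the "COMMAND PID USER ..." header line; a process that
    # holds several device files open appears on several lines, so the
    # PIDs are de-duplicated with sort -u.
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }' | sort -un
}
```

A possible use is lsof /dev/nvidia* | pids_from_lsof, followed by kill once the list looks right. Note that the list may include processes such as a display server that also hold the device files open, so it is worth checking before killing everything in it.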

Reference: https://stackoverflow.com/questions/4354257/can-i-stop-all-processes-using-cuda-in-linux-without-rebooting

Answered By: Shivam Kumar

This worked for me:

kill $(nvidia-smi -g 2 | awk '$5=="PID" {p=1; next} p && $5 ~ /^[0-9]+$/ {print $5}')

where -g selects the GPU whose processes should be killed and $5 is the PID column. Omit the -g argument to kill the processes on all GPUs.

The awk filter can be further refined by conditioning on the GPU memory usage: awk '$5=="PID" {p=1} p && $8+0 > 0 {print $5}', where $8 is the memory usage column (this assumes the process name contains no spaces, which would shift the columns).
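As a rough sketch of the memory-based filter, the helper below keeps only processes at or above a chosen threshold. The name pids_over_mem is hypothetical, and the sketch assumes the layout used in this answer (PID in field 5) with the memory usage, such as 700MiB, as the last field before the closing "|".

```shell
#!/bin/sh
# pids_over_mem MIN_MIB (hypothetical helper): from `nvidia-smi` output on
# stdin, print the PIDs whose GPU memory usage is at least MIN_MIB.
pids_over_mem() {
    awk -v min="$1" '
        # Start after the header row that names the PID column.
        $5 == "PID" { p = 1; next }
        # Memory is read from the field before the closing "|"; adding 0
        # makes awk use only the leading number of a value like "700MiB".
        p && $5 ~ /^[0-9]+$/ && $(NF - 1) + 0 >= min { print $5 }
    '
}
```

For example, nvidia-smi -g 2 | pids_over_mem 500 would list the processes on GPU 2 that use at least 500 MiB. Reading the memory from the end of the row rather than from $8 keeps the filter working when a process name contains spaces.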

Answered By: atmp