CentOS systemd places service subprocesses started with `sudo` in `user.slice` (instead of `system.slice`)
Subprocesses that are created using `sudo` within a systemd service are placed in the `user.slice`. This behavior has been observed only on CentOS 8 (x86_64 20230606) and not on Ubuntu 20.04.5 LTS, where such subprocesses are placed in the `system.slice` instead.
This is a serious problem: the main service process resides in one cgroup, while the subprocesses created with `sudo` end up in another. As a result, `systemctl stop` invoked on the service only stops the main service process and not the subprocesses created with `sudo`, so they are left hanging.
Reproduction steps:
- Log in as a non-root user (that has `sudo` permissions):
$ ssh centos@{ADDR}
- Create a bash script `script.sh` with the following content:
#!/bin/bash
sudo sleep 1000
- Make the script executable:
$ chmod +x script.sh
- Run the script as a systemd service:
$ sudo systemd-run ./script.sh
Running as unit: run-rf4fe2865d42240c1a95f1ed497575494.service
- Check the status:
$ sudo systemctl status run-rf4fe2865d42240c1a95f1ed497575494.service
● run-rf4fe2865d42240c1a95f1ed497575494.service - /home/centos/./script.sh
Loaded: loaded (/run/systemd/transient/run-rf4fe2865d42240c1a95f1ed497575494.service; transient)
Transient: yes
Active: active (running) since Wed 2023-11-15 10:01:49 UTC; 1min 1s ago
Main PID: 10587 (script.sh)
Tasks: 1 (limit: 97480)
Memory: 1.7M
CGroup: /system.slice/run-rf4fe2865d42240c1a95f1ed497575494.service
└─10587 /bin/bash /home/centos/./script.sh
You can see that neither the `sudo` nor the `sleep` subprocess is listed in the resulting cgroup. Note the main PID, 10587.
- Check that the `sleep` processes exist:
$ ps aux | grep sleep
root 10588 0.0 0.0 333172 8140 ? S 10:01 0:00 sudo sleep 1000
root 10591 0.0 0.0 217092 848 ? S 10:01 0:00 sleep 1000
centos 10687 0.0 0.0 221940 1164 pts/0 S+ 10:03 0:00 grep --color=auto sleep
Stopping the created systemd service will not kill these two `sudo` and `sleep` processes (PIDs 10588 and 10591, respectively).
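For example, a quick check (a hypothetical session; the unit name and PIDs are the ones from the outputs above and will differ on your machine):
$ sudo systemctl stop run-rf4fe2865d42240c1a95f1ed497575494.service
$ ps -p 10588,10591 -o pid,user,cmd   # both processes are still listed after the stop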
- Check the systemd slice to which these processes belong:
- The main systemd service process (10587, see `systemctl status` above):
$ cat /proc/10587/cgroup | grep name=
1:name=systemd:/system.slice/run-rf4fe2865d42240c1a95f1ed497575494.service
- The `sudo` subprocess (10588):
$ cat /proc/10588/cgroup | grep name=
1:name=systemd:/user.slice/user-0.slice/session-c37.scope
- The `sleep` subprocess (10591):
$ cat /proc/10591/cgroup | grep name=
1:name=systemd:/user.slice/user-0.slice/session-c37.scope
You can observe that the main script process belongs to the `system.slice`, while the subprocesses belong to the `user.slice`.
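The same split is visible in the cgroup tree itself (a quick sanity check, assuming `systemd-cgls` is available, which it is by default on both distributions):
$ systemd-cgls /system.slice/run-rf4fe2865d42240c1a95f1ed497575494.service
$ systemd-cgls /user.slice/user-0.slice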
On Ubuntu 20.04.5 LTS, the exact same steps lead to all the subprocesses being placed in the same cgroup, under the `system.slice`:
$ sudo systemd-run ./script.sh
Running as unit: run-r09332b0da31a4dd198286a87a917e55f.service
ubuntu@ip-10-0-0-92:~$ sudo systemctl status run-r09332b0da31a4dd198286a87a917e55f.service
● run-r09332b0da31a4dd198286a87a917e55f.service - /home/ubuntu/./script.sh
Loaded: loaded (/run/systemd/transient/run-r09332b0da31a4dd198286a87a917e55f.service; transient)
Transient: yes
Active: active (running) since Wed 2023-11-15 10:06:55 UTC; 7s ago
Main PID: 1729 (script.sh)
Tasks: 3 (limit: 18627)
Memory: 1.4M
CGroup: /system.slice/run-r09332b0da31a4dd198286a87a917e55f.service
├─1729 /bin/bash /home/ubuntu/./script.sh
├─1730 sudo sleep 1000
└─1731 sleep 1000
So stopping this service kills all the subprocesses.
So what is going on in CentOS? How can I make systemd there behave the same way it does on Ubuntu? Can I force the subprocesses to be placed in the same cgroup under `system.slice`?
I checked the systemd parameters for the created service on CentOS and they are effectively identical to Ubuntu's. In particular, the `Slice` parameter is set on CentOS to `system.slice`:
$ sudo systemctl show run-rf4fe2865d42240c1a95f1ed497575494.service | grep Slice=
Slice=system.slice
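You can also query the unit's control group directly via the `ControlGroup` property (the expected output below simply matches the `CGroup:` line in the status output above):
$ sudo systemctl show run-rf4fe2865d42240c1a95f1ed497575494.service -p ControlGroup
ControlGroup=/system.slice/run-rf4fe2865d42240c1a95f1ed497575494.service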
Systemd service configuration does not take over child process creation, nor does it move child processes anywhere. That's done by the PAM configuration of `sudo` itself, which unnecessarily invokes `pam_systemd` (perhaps deliberately, to achieve exactly the result that you don't want, or perhaps just in an attempt to have XDG_RUNTIME_DIR set for the target user when people try to run GUI apps as root? I don't know).
The PAM module creates a systemd-logind session, and this causes logind to move the invoking process into the "session" cgroup, in much the same way that your own shell process gets moved out of `sshd.service` or `getty@.service` when you log in.
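You can confirm this on the CentOS machine: while the service is running, logind reports a session for root that was opened by `sudo` (session c37 in the cgroup paths above; the session ID will differ on your system):
$ loginctl list-sessions
$ loginctl session-status c37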
Edit `/etc/pam.d/sudo` to disable the use of this module. Take care not to disable it for other PAM configurations, however: if the file includes `/etc/pam.d/system-auth` (which is also used by console and SSH logins), you'll want to keep `pam_systemd` there; instead you might have to insert a `pam_succeed_if` rule to trick PAM into skipping it:
session [success=1 default=ignore] pam_succeed_if.so service = sudo
session whatever pam_systemd.so
("success=1" skips 1 next module if pam_succeed_if returns PAM_SUCCESS, while "default=ignore" causes all other results to be disregarded.)