CentOS systemd places service subprocesses started with `sudo` in `user.slice` (instead of `system.slice`)

Subprocesses that are created using sudo within a systemd service are placed in the user.slice. This behavior has been observed only on CentOS 8 (x86_64 20230606) and not on Ubuntu 20.04.5 LTS, where such subprocesses are placed in the system.slice instead.

This is a big problem, because the main service process ends up in one cgroup while the subprocesses created with sudo end up in another. As a result, systemctl stop invoked on the service only stops the main service process and not the subprocesses created with sudo, which are left running.

Reproduction steps:

  1. Log in as a non-root user (one that has sudo permissions):
$ ssh centos@{ADDR}
  2. Create a bash script script.sh with the following content:
#!/bin/bash
sudo sleep 1000
  3. Make the script executable:
$ chmod +x script.sh
  4. Run the script as a systemd service:
$ sudo systemd-run ./script.sh
Running as unit: run-r16ed2fef94d442b5800035653bcbbe01.service
  5. Check the status:
$ sudo systemctl status run-rf4fe2865d42240c1a95f1ed497575494.service
● run-rf4fe2865d42240c1a95f1ed497575494.service - /home/centos/./script.sh
   Loaded: loaded (/run/systemd/transient/run-rf4fe2865d42240c1a95f1ed497575494.service; transient)
Transient: yes
   Active: active (running) since Wed 2023-11-15 10:01:49 UTC; 1min 1s ago
 Main PID: 10587 (script.sh)
    Tasks: 1 (limit: 97480)
   Memory: 1.7M
   CGroup: /system.slice/run-rf4fe2865d42240c1a95f1ed497575494.service
           └─10587 /bin/bash /home/centos/./script.sh

You can see that neither the sudo nor the sleep subprocess is listed in the resulting cgroup.
Note PID 10587.

  6. Check that the sleep processes exist:
$ ps aux | grep sleep
root       10588  0.0  0.0 333172  8140 ?        S    10:01   0:00 sudo sleep 1000
root       10591  0.0  0.0 217092   848 ?        S    10:01   0:00 sleep 1000
centos     10687  0.0  0.0 221940  1164 pts/0    S+   10:03   0:00 grep --color=auto sleep

Stopping the created systemd service will not kill the sudo and sleep processes (PIDs 10588 and 10591, respectively); a verification sketch follows the reproduction steps.

  7. Check the systemd slice to which these processes belong:
  • The main systemd service process (10587, see systemctl status above):
$ cat /proc/10587/cgroup | grep name=
1:name=systemd:/system.slice/run-rf4fe2865d42240c1a95f1ed497575494.service
  • The sudo subprocess (10588):
$ cat /proc/10588/cgroup | grep name=
1:name=systemd:/user.slice/user-0.slice/session-c37.scope
  • The sleep subprocess (10591):
$ cat /proc/10591/cgroup | grep name=
1:name=systemd:/user.slice/user-0.slice/session-c37.scope
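
For a broader view of where everything ended up, systemd-cgls can print the cgroup tree of each top-level slice (this is just a convenience check; it shows the same information as the /proc/<pid>/cgroup files above):

$ systemd-cgls /system.slice
$ systemd-cgls /user.slice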

You can observe that the main script process belongs to the system.slice, while the subprocesses belong to the user.slice.
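
To confirm the practical consequence, you can stop the transient unit and check whether the sudo and sleep processes survived. This is only a sketch: the unit name and PIDs are the ones from the run above and must be replaced with your own.

$ sudo systemctl stop run-rf4fe2865d42240c1a95f1ed497575494.service
$ ps -o pid,ppid,cgroup -p 10588,10591

As described above, on CentOS both PIDs are still there after the stop, because systemctl stop only signals processes inside the unit's own cgroup.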

On Ubuntu 20.04.5 LTS, the exact same steps lead to all the subprocesses being placed in the same cgroup, under system.slice:

$ sudo systemd-run ./script.sh
Running as unit: run-r09332b0da31a4dd198286a87a917e55f.service
ubuntu@ip-10-0-0-92:~$ sudo systemctl status run-r09332b0da31a4dd198286a87a917e55f.service
● run-r09332b0da31a4dd198286a87a917e55f.service - /home/ubuntu/./script.sh
     Loaded: loaded (/run/systemd/transient/run-r09332b0da31a4dd198286a87a917e55f.service; transient)
  Transient: yes
     Active: active (running) since Wed 2023-11-15 10:06:55 UTC; 7s ago
   Main PID: 1729 (script.sh)
      Tasks: 3 (limit: 18627)
     Memory: 1.4M
     CGroup: /system.slice/run-r09332b0da31a4dd198286a87a917e55f.service
             ├─1729 /bin/bash /home/ubuntu/./script.sh
             ├─1730 sudo sleep 1000
             └─1731 sleep 1000

So stopping this service kills all the subprocesses.

So what is going on in CentOS? How can I make systemd there behave the same way it does on Ubuntu? Can I force the subprocesses to be placed in the same cgroup, under system.slice?

I checked the systemd parameters for the created service in CentOS and they are effectively identical to Ubuntu's. In particular, the Slice parameter is set to system.slice on CentOS as well:

$ sudo systemctl show run-rf4fe2865d42240c1a95f1ed497575494.service | grep Slice=
Slice=system.slice
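
For completeness, the unit properties that decide what systemctl stop actually kills can be inspected the same way; KillMode defaults to control-group, which means only processes inside the unit's own cgroup are signalled (again using the unit name from the run above):

$ sudo systemctl show run-rf4fe2865d42240c1a95f1ed497575494.service -p KillMode -p Slice -p ControlGroup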
Asked By: flashy


Systemd service configuration does not take over child process creation, nor does it move child processes anywhere. That is done by the PAM configuration of sudo itself, which unnecessarily invokes pam_systemd (perhaps deliberately, to achieve exactly the result that you don't want, or perhaps just in an attempt to have XDG_RUNTIME_DIR set for the target user when people try to run GUI apps as root? I don't know).
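
A quick way to see where pam_systemd enters the picture on a particular machine is to grep the PAM configuration; as noted below, on CentOS it is usually pulled in indirectly via an included file such as /etc/pam.d/system-auth rather than listed in /etc/pam.d/sudo itself:

$ grep -R pam_systemd /etc/pam.d/
$ cat /etc/pam.d/sudo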

The PAM module creates a systemd-logind session, and this causes logind to move the invoking process into a session scope under user.slice, in much the same way that your own shell process gets moved out of sshd.service or getty@.service when you log in.
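
You can watch this happen with loginctl: while the sudo sleep from the reproduction steps is running, an extra logind session shows up for root, and its scope matches the session-c37.scope path seen in /proc/<pid>/cgroup above (the session ID differs on every run, so adjust it):

$ loginctl list-sessions
$ loginctl session-status c37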

Edit /etc/pam.d/sudo to disable the use of this module. Take care not to disable it for other PAM configurations, however: if the file includes /etc/pam.d/system-auth (which is also used by console and SSH logins), you'll want to keep pam_systemd there; instead, you might have to insert a pam_succeed_if rule to trick PAM into skipping it:

session [success=1 default=ignore] pam_succeed_if.so service = sudo
session whatever                   pam_systemd.so

("success=1" skips 1 next module if pam_succeed_if returns PAM_SUCCESS, while "default=ignore" causes all other results to be disregarded.)

Answered By: u1686_grawity