Discussion:
Controlling user processes with systemd+cgroups
Benjamin Rose
2015-09-03 18:57:09 UTC
Hello all,

I am in an academic environment here, and a lot of poor code gets written
and then run. Memory leaks are a constant problem. So on RHEL6, I used
cgconfig and cgred to create three cgroup partitions:

/students: 80 CPU ticks, 80% of available memory total
/staff: 10 CPU ticks
/system: 10 CPU ticks

Everyone in the groups "grad", "ugrad", "visitor", etc. was put into
/students. Anyone in @staff was put into the /staff cgroup, and everything
else went into /system.

The main goal was that no single user can hog all of the memory and kill
off system/sysadmin processes. I believe I have achieved this with systemd:

[***@example ~]# cat /etc/systemd/system/system.slice.d/systemcpureserve.conf
[Slice]
CPUShares=10
[***@example ~]# cat /etc/systemd/system/user.slice.d/limitusers.conf
[Slice]
CPUShares=80
MemoryLimit=14G

But my cgred setup had another benefit: no single user could use more than
50% of the available memory, so one user was less likely to kill other
people's processes when the OOM killer was invoked. This was in addition to
the group constraint. Said another way - I don't want one bad apple user in
user.slice to be able to kill off every other process in user.slice. I also
want anyone in the group @staff to be exempt from these memory limits, as in
my RHEL6 setup. It was nice to be able to SSH in to an otherwise overloaded
machine as myself and see what was going on. With the solution above, root
is put into user.slice and is bound by the same resource limits as the
students.

As far as I can tell, systemd-logind when included through PAM, only
makes a cgroup like "user-<uid>" under the user slice. But I am looking
to make this based not only on user ID, but also group ID. Is there any
way to achieve all of this within systemd? I guess there is an option of
doing something like this:

[***@prometheus system]# cat /etc/systemd/system/user-0.slice
[Slice]
CPUShares=10
MemoryLimit=16G

But I'm not even sure that would work (since the user.slice limit is only
14G), and I would presumably need to create one of these for every UID of
every sysadmin on our network. That still leaves the question of how to make
sure that each person can never use more than 50% of total system memory,
while still reserving 10%-20% of the overall resources for system/sysadmin
functions.

Any help would be appreciated!

Thanks,
Ben
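
The per-UID idea raised above can be scripted rather than maintained by hand. The following is only a minimal sketch under stated assumptions: the helper names (write_user_limit, limit_group_members), the drop-in file name 50-memory.conf, the UNIT_DIR variable and the 8G value are all hypothetical and do not come from the thread. It writes into a scratch directory by default; pointing UNIT_DIR at /etc/systemd/system and running "systemctl daemon-reload" afterwards would be needed for systemd to see the files. As far as I know, cgroup memory limits apply hierarchically, so a per-user value larger than the limit on user.slice would not lift the parent's cap.

```shell
#!/bin/sh
# Sketch only. UNIT_DIR defaults to a scratch directory so nothing under
# /etc is touched; the UID and the 8G value are illustrative.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"

# write_user_limit <uid> <memory-limit>
# Creates a drop-in that sets MemoryLimit= on user-<uid>.slice.
write_user_limit() {
    dropin_dir="$UNIT_DIR/user-$1.slice.d"
    mkdir -p "$dropin_dir"
    printf '[Slice]\nMemoryLimit=%s\n' "$2" > "$dropin_dir/50-memory.conf"
}

# limit_group_members <group> <memory-limit>
# Writes one drop-in per member listed in the group entry. Users who only
# have the group as their primary group are not listed there and are skipped.
limit_group_members() {
    members=$(getent group "$1" | cut -d: -f4 | tr ',' ' ')
    for user in $members; do
        write_user_limit "$(id -u "$user")" "$2"
    done
}

# Example: cap the user with UID 1000 and show the generated drop-in.
write_user_limit 1000 8G
cat "$UNIT_DIR/user-1000.slice.d/50-memory.conf"
```

This only automates per-user limits. It does not create a shared limit for a group of users, which is the part the replies below describe as unsupported.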
Lennart Poettering
2015-09-06 14:14:13 UTC
Post by Benjamin Rose
As far as I can tell, systemd-logind when included through PAM, only makes a
cgroup like "user-<uid>" under the user slice. But I am looking to make this
based not only on user ID, but also group ID. Is there any way to achieve
all of this within systemd?
This is currently not possible, but certainly something we'd like to
cover eventually. The way we'd like to see this implemented someday
though is through an extensible user database that actually would
allow us to attach slice information to a user directly.

Currently, because we have no nice way to store for each user which
slice it shall be attached to, we attach all logged-in users to
the same "user.slice". In an ideal world, where the user database is
synchronized from an LDAP server, the slice information belongs on
the LDAP server as well. However, there's no commonly accepted
implementation and API for this on Linux which we could use to query
such an additional user field from logind.

Ultimately our goal is that you build your tree of slices, and then
freely attach users, services, containers, VMs to these slices at the
places you want them. You can already do that nicely for services and
containers (at least for nspawn containers), but for users this is
really missing.

Sorry,

Lennart
--
Lennart Poettering, Red Hat
Michał Zegan
2015-09-06 15:05:04 UTC
Well, actually I believe you could mess with unit configuration
overrides, couldn't you?
I was experimenting once by giving the user test 1% of cpu using cgroup
controls.
Lennart Poettering
2015-09-06 15:25:29 UTC
Post by Michał Zegan
Well, actually I believe you could mess with unit configuration overrides,
couldn't you?
I was experimenting once by giving the user test 1% of cpu using cgroup
controls.
Well, you can of course configure limits on individual sessions,
per-user and for all users combined, by using "systemctl set-property"
on the session scope unit, the per-user slice unit, or the
"user.slice" unit, respectively. However, what Benjamin tries to do
(as far as I understood it) is to introduce groups that combine the
resources of multiple users into one, and can have limits on
them. Now, the slice concept would allow that, but there's no nice way
to tell logind to place specific users in specific slices, they all
end up below user.slice and that's it.

Lennart
--
Lennart Poettering, Red Hat
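
The three levels mentioned in the message above (one session, one user, all users combined) could be exercised roughly as follows. This is a hedged sketch, not a recommended configuration: the session ID 3, the UID 1000 and every limit value are made-up examples, and the run/set_limits wrappers and the APPLY variable are hypothetical helpers added so that the script only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands, and executes them only when APPLY=1.
# Session ID, UID and limit values below are illustrative assumptions.
run() {
    echo "$*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

set_limits() {
    # A single login session (the scope unit logind creates per session)
    run systemctl set-property session-3.scope CPUShares=100
    # Everything one user runs, across all of that user's sessions
    run systemctl set-property user-1000.slice MemoryLimit=8G
    # All logged-in users combined
    run systemctl set-property user.slice MemoryLimit=14G
}

set_limits
```

None of these commands groups several users under a common limit; each one targets a unit that logind already creates below user.slice.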
Michał Zegan
2015-09-06 15:44:27 UTC
if you would override the slice in the unit file using override files it
would not work?
Also not sure, I think I understand the question as "how to create
cgroup per user group" but well, I may understand it wrong.
Lennart Poettering
2015-09-06 15:53:38 UTC
Post by Michał Zegan
if you would override the slice in the unit file using override files it
would not work?
No, it won't. Currently systemd hardcodes the slice and does not allow
it to be overridden.
Post by Michał Zegan
Also not sure, I think I understand the question as "how to create cgroup
per user group" but well, I may understand it wrong.
Well, yes, precisely: the question was about being able to have
explicit cgroups (in systemd that would be slices) for specific
groupings of users, and we don't support that right now.

Lennart
--
Lennart Poettering, Red Hat
Michal Koutný
2018-02-08 11:30:42 UTC
Hello systemd-devel.
Post by Lennart Poettering
Ultimately our goal is that you build your tree of slices, and then
freely attach users, services, containers, VMs to these slices at the
places you want them. You can already do that nicely for services and
containers (at least for nspawn containers), but for users this is
really missing.
The missing piece is thus where to store user->slice mapping (not
necessarily injective as it is now). Then it would be possible to apply
limits _shared in groups_ of users.

Lennart also suggested that such information could live in the user
database, although there is no standard way to obtain it. AFAIK this still
applies, and IMO it may take longer to change than is comfortable. (Please
enlighten me on whatever I might be missing in this regard.)

I gave some thought to alternatives. They basically rely on the GID:
information that can already be obtained from a user database in a
standard way and that covers most of the missing use cases.

Variant 1 -- Slice.GroupId= property.

Admins would create unit files for slices specifying this option, and
users who are members of any of the listed groups would have their user
sessions placed into the given slice instead of user-$UID.slice.


Variant 2 -- group mode

This would allow admins to switch how user slices are created. By
switching into group mode (e.g. via a pam_systemd or logind option), user
sessions would be put into group-$GID-$UID.slice, and cgroup
configuration would then be applied to the respective group-$GID.slice units.

What are your thoughts on that? Do any other alternatives come to
mind? Would any of the variants eventually be acceptable for inclusion?

Thanks,
Michal

[1]
https://lists.freedesktop.org/archives/systemd-devel/2015-September/034131.html
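
The naming in Variant 2 relies on an existing systemd convention: dashes in a slice name encode its position in the slice tree, so foo-bar.slice is a child of foo.slice. The sketch below only illustrates how a proposed name such as group-100-1000.slice would nest. The function name slice_ancestors and the GID/UID values are hypothetical, and "group mode" itself is a proposal from the message above, not an existing option.

```shell
#!/bin/sh
# slice_ancestors <name>.slice
# Prints the chain of slices implied by the dashes in a slice name,
# from the outermost parent down to the slice itself.
slice_ancestors() {
    name="${1%.slice}"
    prefix=""
    old_ifs=$IFS
    IFS=-
    for part in $name; do
        prefix="${prefix:+$prefix-}$part"
        echo "$prefix.slice"
    done
    IFS=$old_ifs
}

# Example with made-up IDs: GID 100, UID 1000.
slice_ancestors group-100-1000.slice
# prints:
#   group.slice
#   group-100.slice
#   group-100-1000.slice
```

Under that convention, a limit set on group-100.slice would cover every group-100-$UID.slice below it, which is the shared-per-group behaviour the proposal is after.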