leftover machines
Johannes Ernst
2018-03-24 00:58:36 UTC
After running a bunch of systemd-nspawn containers, I am left with a few that seem to be empty and run nothing, but that refuse to die or be killed. They did their useful work and ran correctly; the problem seems to occur on poweroff. What might be going on here?

This is:
* x86_64
* systemd 238.0-3 (from Arch)
* btrfs

% machinectl
t-23-235311-webapptest container systemd-nspawn - - -
t-23-235806-webapptest container systemd-nspawn - - -
t-24-000300-webapptest container systemd-nspawn - - -
t-24-001033-webapptest container systemd-nspawn - - -
t-24-002455-webapptest container systemd-nspawn - - -

% machinectl status t-24-002455-webapptest
Since: Sat 2018-03-24 00:24:55 UTC; 20min ago
Leader: 22947
Service: systemd-nspawn; class container
Root: /build/x86_64/workarea/repository/dev/x86_64/uncompressed-images/.#machine.ubos_dev_x86_64-container_20180323-234823.tardir53dc09a3ef78c32c
Iface: 13
Unit: machine-t\x2d24\x2d002455\x2dwebapptest.scope

Mar 24 00:24:55 ubos-pc systemd[1]: Started Container t-24-002455-webapptest.

The path to root does not exist (any more). All that’s left is the sibling lock file. btrfs does not list a relevant subvol.

Is Leader supposed to be a process ID? If so, such a process does not exist either.
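
For reference, this is roughly how I checked, as a minimal sketch that assumes Leader really is a PID. The helper names leader_alive and check_machine are made up for illustration; the only machinectl call used is "machinectl show -p Leader NAME", which prints a line of the form "Leader=<pid>".

```shell
#!/bin/sh
# Sketch only: report whether the Leader PID of a registered machine
# still exists. leader_alive and check_machine are hypothetical names.

# Succeeds if a process with the given PID currently exists.
# kill -0 sends no signal and only tests for existence; the /proc check
# covers processes we are not permitted to signal (Linux only).
leader_alive() {
    [ -n "$1" ] || return 1
    kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}

# Prints whether the leader process of one machine is still around.
check_machine() {
    # Output looks like "Leader=22947"; strip the "Leader=" prefix.
    leader=$(machinectl show -p Leader "$1") || return 1
    leader=${leader#Leader=}
    if leader_alive "$leader"; then
        echo "$1: leader $leader is running"
    else
        echo "$1: leader $leader is gone"
    fi
}
```

Calling check_machine t-24-002455-webapptest is how I would test one of the leftover machines listed above.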

% machinectl poweroff t-24-002455-webapptest
Could not kill machine: No such process

% machinectl terminate t-24-002455-webapptest
(says nothing, but does not do anything either)

% systemd-cgls
(does not show anything in machine.slice for any of the leftover machines)

Reboot makes them go away, but when I run the same scripts again, sooner or later I have leftover containers again.

There is nothing relevant in the journal I can see either.

This exact setup used to work just fine. I doubt I made any meaningful changes other than OS upgrades, including systemd.

Life cycle:
1. systemd-nspawn --boot --ephemeral --network-veth --machine=t-xxx-webapptest --directory '/build/x86_64/workarea/repository/dev/x86_64/uncompressed-images/ubos_dev_x86_64-container_LATEST.tardir' --bind '/tmp/da6G8OYszu:/UBOS-STAFF' --system-call-filter=set_tls
2. System boots, I ssh in, and do stuff, and when done:
3. machinectl poweroff 't-xxx-webapptest'

Any ideas?