Post by Zbigniew Jędrzejewski-Szmek Post by Lennart Poettering Post by Patrick Häcker Post by Zbigniew Jędrzejewski-Szmek Post by Lennart Poettering
However, this one appears bogus to me. Is there any such software
around that really does this? And if so, this really appears weird to
me to support. Delaying shutdown for more than 30min is just wrong.
Isn't this what the various "download updates and reboot" gnome-y
things are doing?
At least unattended-upgrades from Debian/Ubuntu/... can be configured to
install updates on shutdown (without any special mode or something). And,
yes, this can run for more than 30 minutes, which I could already observe in
its default mode (installing during normal system activities), so I see no
reason why this should not happen when configured to install during shutdown.
The reason is that unattended-upgrades can basically update the whole
distribution to the next version, which naturally can take a lot of time.
It's questionable if this is a sane setup, but I can think of setups where
this might be useful, e.g. having two identically configured servers for
redundancy reasons where one server would be enough. Then it might make sense
to update one system during shutdown while the other one takes over. This has
the advantage that normally running servers either have the old or the new
state, but never some intermediate state during the update. The shutdown time
does not really matter in this case and a watchdog killing the system
wouldn't be welcome. But all in all this seems like an exotic use case.
Is "unattended-upgrades" a package of its own? If so, I'd probably ask
the packagers to include drop-ins for reboot.target to override the
timeout. That way, as soon as you install it the shutdown timeouts are
I'm a long-running job, I need more time, but everything is still fine. This
type of status should require periodic pings, watchdog style. Let's say that
the backup job run during shutdown hangs because there's no network: we want
to shut down at some point anyway.
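
To make the watchdog-style idea above concrete, here is a minimal sketch of a deadline that a job has to keep extending with pings. It is not an existing systemd interface; the class name, the method names and the numbers are made up for illustration, and time is passed in explicitly so the logic is easy to follow.

```python
class ExtendableDeadline:
    """Deadline that a long-running job must keep extending with pings.

    Hypothetical illustration only, not a systemd API.
    """

    def __init__(self, base_timeout, ping_interval, now=0.0):
        self.ping_interval = ping_interval
        # Without any ping, the job is given up on after base_timeout.
        self.expires_at = now + base_timeout

    def expired(self, now):
        return now >= self.expires_at

    def ping(self, now):
        """Record an "I need more time, everything is fine" ping."""
        if self.expired(now):
            return False  # too late, the deadline has already passed
        # Each ping buys one more interval; it never shortens the deadline.
        self.expires_at = max(self.expires_at, now + self.ping_interval)
        return True


deadline = ExtendableDeadline(base_timeout=90, ping_interval=30)
deadline.ping(now=80)    # job is alive and asks for more time
deadline.ping(now=105)   # last ping before the network hangs
print(deadline.expired(now=120))
print(deadline.expired(now=140))
```

A job that keeps pinging can run as long as it needs, while a job that hangs silently (the backup without network) loses its extension one interval after its last ping, and shutdown can proceed.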
So, we always had per-unit timeouts in place, and they are opt-out
(with the exception of Type=oneshot services where they are
opt-in). Hence adding a second level of opt-out timeouts doesn't
sound particularly attractive to me.
The reason I added the system-wide startup/shutdown timeouts was
really to be a safety net, so that the individual per-unit timeouts
and the opted-out exceptions don't add up beyond bounds.
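
As a rough illustration of the "safety net" argument, the sketch below computes a worst-case bound on shutdown time. It assumes, pessimistically, that units are stopped one after another, which is a simplification; the function name and the example numbers are made up, with 1800 seconds standing in for the 30min limit mentioned earlier in the thread.

```python
import math


def worst_case_shutdown(unit_timeouts, overall_timeout):
    """Upper bound (seconds) on shutdown time if units stop serially.

    unit_timeouts: per-unit stop timeouts in seconds; None means the unit
    opted out of its timeout. Hypothetical helper for illustration.
    """
    # An opted-out unit may block forever, so it counts as infinity.
    total = sum(math.inf if t is None else t for t in unit_timeouts)
    # The system-wide timeout caps whatever the per-unit timeouts add up to.
    return min(total, overall_timeout)


print(worst_case_shutdown([90, 90], 1800))        # per-unit timeouts dominate
print(worst_case_shutdown([90] * 30, 1800))       # many units add up past the cap
print(worst_case_shutdown([90, None], 1800))      # one opt-out, cap still applies
```

Without the overall timeout, the second case would be bounded only by the sum of the per-unit timeouts, and the third case would have no bound at all.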
The specific use case I had for this (beyond the obvious one in
embedded/HA setups) was my Lenovo Yoga laptop. It has the power button
on the outside, where it is reachable even when the laptop is closed (this
is due to the fact that it is convertible into tablet mode, where the
button needs to stay accessible). Now, if the system is suspended with
the lid closed, and the power button is accidentally hit because the
laptop was stuffed in a backpack, then since a couple of versions ago
this is not a problem: after a short while the system will suspend
again. However, if the machine is powered off with the lid closed and
the power button is hit, the machine currently powers up, then boots
until the LUKS prompt is reached and then just hangs there, forever,
heating up my backpack, so that when I finally unpack it the battery
is completely empty. I figure this is not only an issue with laptops
like the Yoga but in general with all kinds of devices.
Now, the question is what we can do about this:
a) we could move logind into early boot. This has multiple problems
though: it would need to track system state as gettys on other ttys
should only be started in multi-user mode, not in early boot. Also,
the behaviour would probably not be ideal: I think it would be
preferable if the system shuts down rather than suspend if we hang
b) specifically do something about LUKS prompt timeouts: when a very
long timeout is hit for essential devices we could simply turn off
the machine again. This would fix my immediate problem, but I am
not sure I like it too much, I think other hangs should really be
c) we can come up with a scheme that explicitly excludes fsck, selinux
relabel and so on from the overall-timeout. Sounds messy and
non-obvious given that they all have individual timeouts
anyway... Two layers of opting out of timeouts sounds suspicious?
Any other ideas?
Lennart Poettering, Red Hat