Discussion:
[PATCH] Avoid reloading services when shutting down
(too old to reply)
Martin Pitt
2015-02-03 16:26:35 UTC
Permalink
Hey all,

I'm currently reviewing our Debian patches for systemd, and came
across this one which sounds important for other distributions, too.
This was reported and fixed two years ago in
https://bugs.debian.org/635777 which has all the details and logs, but
the summary is:

Distributions have quite a lot of "run scripts in this .d/ directory
if stuff happens"; e. g. the ISC DHCP client has
/etc/dhcp/dhclient-{enter,exit}-hooks.d/, Debian's ifupdown has
/etc/network/if-{up,down}.d/, and of course init.d scripts themselves
also occasionally call "service foo reload" and similar. It can happen
that when requesting a shutdown, a script of the above kind reloads or
restarts another service. In this case, postfix wants to reload itself
after a network interface goes up or down; during runtime that works
fine, but if it happens during shutdown, that "systemctl reload
postfix" will cause the entire shutdown to stall for 90s (the default
timeout).

Michael Stapelberg wrote the attached patch which fixes this by
disallowing unit reloads/restarts when final.target is queued. This is
admittedly not very elegant as it hardcodes the "final.target" name
(and also, for upstream the comment should be adjusted of course), but
it does fix the issue. I can still reproduce the problem with 218 in a
VM.

Is this something which we can fix upstream in a more elegant manner?

Thanks,

Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Lennart Poettering
2015-02-03 16:29:32 UTC
Permalink
Post by Martin Pitt
Hey all,
I'm currently reviewing our Debian patches for systemd, and came
across this one which sounds important for other distributions, too.
This was reported and fixed two years ago in
https://bugs.debian.org/635777 which has all the details and logs, but
Distributions have quite a lot of "run scripts in this .d/ directory
if stuff happens"; e. g. the ISC DHCP client has
/etc/dhcp/dhclient-{enter,exit}-hooks.d/, Debian's ifupdown has
/etc/network/if-{up,down}.d/, and of course init.d scripts themselves
also occasionally call "service foo reload" and similar. It can happen
that when requesting a shutdown, a script of the above kind reloads or
restarts another service. In this case, postfix wants to reload itself
after a network interface goes up or down; during runtime that works
fine, but if it happens during shutdown, that "systemctl reload
postfix" will cause the entire shutdown to stall for 90s (the default
timeout).
Hmm, why precisely does this stall for 90s?

Isn't this a case where people should just use "--no-block"?

Lennart
--
Lennart Poettering, Red Hat
Martin Pitt
2015-02-03 17:01:56 UTC
Permalink
Post by Lennart Poettering
Hmm, why precisely does this stall for 90s?
The current transaction has final.target and all other jobs which need
to be shut down. One of these now trigger "systemctl reload
postfix.service", but that reload isn't going to actually run in the
same transaction but in the next one. OTOH systemctl reload
waits for the reloading to finish, thus we have a deadlock.
Post by Lennart Poettering
Isn't this a case where people should just use "--no-block"?
Kind of. Not using this is the right thing while the machine is
running, so that the reload is actually done after calling systemctl
reload, and you can go on using postfix or whatever. --no-block should
help during shutdown, or early boot (same principal bug, but slightly
different patch, see http://bugs.debian.org/624599).

So every time you call reload you'd have to check whether or not you
are in early boot/during shutdown, or in the running system, and
conditionally use --no-block. However, as such scripts should never
call systemctl directly, but "service foo reload" (to work with other
init systems or chroot), it would be also possible to move that check
there, and conditionally add --no-block. It would just be another
thing that every distro has to re-discover :-)

Thanks,

Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Lennart Poettering
2015-02-03 17:10:29 UTC
Permalink
Post by Martin Pitt
Post by Lennart Poettering
Hmm, why precisely does this stall for 90s?
The current transaction has final.target and all other jobs which need
to be shut down. One of these now trigger "systemctl reload
postfix.service", but that reload isn't going to actually run in the
same transaction but in the next one. OTOH systemctl reload
waits for the reloading to finish, thus we have a deadlock.
Post by Lennart Poettering
Isn't this a case where people should just use "--no-block"?
Kind of. Not using this is the right thing while the machine is
running, so that the reload is actually done after calling systemctl
reload, and you can go on using postfix or whatever. --no-block should
help during shutdown, or early boot (same principal bug, but slightly
different patch, see http://bugs.debian.org/624599).
So every time you call reload you'd have to check whether or not you
are in early boot/during shutdown, or in the running system, and
conditionally use --no-block. However, as such scripts should never
call systemctl directly, but "service foo reload" (to work with other
init systems or chroot), it would be also possible to move that check
there, and conditionally add --no-block. It would just be another
thing that every distro has to re-discover :-)
I am very strongly against adding hacky work-arounds like this to PID
1. The deadlocks are deadlocks, and implying --no-block if we are in
shutdown mode is a pretty drastic hack I think that special cases one
peculiar case way too much.

My recommendation would be to stick this into your /usr/bin/service
compat glue. Use the state string "systemctl is-system-running"
outputs to check if you are shutting down. If so, add --no-block to
the params you pass to systemctl.

Another option might be to pass --job-mode=ignore-dependencies instead
of --no-block, which was created for usecases like this, even though
it is frickin' ugly...

Lennart
--
Lennart Poettering, Red Hat
Martin Pitt
2015-02-03 19:36:01 UTC
Permalink
Post by Lennart Poettering
I am very strongly against adding hacky work-arounds like this to PID
Yeah, indeed. This is why I asked for a more elegant approach, and
indeed the --no-block or --job-mode=ignore-dependencies sound like
slightly better approaches to this. I'll test these more thoroughly
tomorrow, thanks for pointing out!
Post by Lennart Poettering
1. The deadlocks are deadlocks, and implying --no-block if we are in
shutdown mode is a pretty drastic hack I think that special cases one
peculiar case way too much.
Right, the problem is of course more generic. Any set of jobs (i. e. a
transaction) which causes (maybe through some indirection) one of its
job members to get restarted/reloaded will suffer from this deadlock
problem, AFAIUI. Booting and shutdown are therefore mostly affected by
this as pretty much every other time there the pending transactions
are only very small.
Post by Lennart Poettering
My recommendation would be to stick this into your /usr/bin/service
compat glue. Use the state string "systemctl is-system-running"
outputs to check if you are shutting down. If so, add --no-block to
the params you pass to systemctl.
That actually sounds like just what's needed here. is-system-running
will also neatly cover the corresponding case on bootup.
Post by Lennart Poettering
Another option might be to pass --job-mode=ignore-dependencies instead
of --no-block, which was created for usecases like this, even though
it is frickin' ugly...
For reload that should be fairly okay, as reload will quickly fail if
the unit isn't actually running, and if it is running its dependencies
already ought to be satisfied.

I'll look into both, thanks for the hints!

Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Michael Biebl
2015-02-03 19:50:39 UTC
Permalink
Post by Martin Pitt
Post by Lennart Poettering
I am very strongly against adding hacky work-arounds like this to PID
Yeah, indeed. This is why I asked for a more elegant approach, and
indeed the --no-block or --job-mode=ignore-dependencies sound like
slightly better approaches to this. I'll test these more thoroughly
tomorrow, thanks for pointing out!
The current patch we ship in Debian is admittedly not the nicest one,
but it solves a very real issue which affects a lot of non-trivial
setups.
Post by Martin Pitt
Post by Lennart Poettering
1. The deadlocks are deadlocks, and implying --no-block if we are in
shutdown mode is a pretty drastic hack I think that special cases one
peculiar case way too much.
Right, the problem is of course more generic. Any set of jobs (i. e. a
transaction) which causes (maybe through some indirection) one of its
job members to get restarted/reloaded will suffer from this deadlock
problem, AFAIUI. Booting and shutdown are therefore mostly affected by
this as pretty much every other time there the pending transactions
are only very small.
Post by Lennart Poettering
My recommendation would be to stick this into your /usr/bin/service
compat glue. Use the state string "systemctl is-system-running"
outputs to check if you are shutting down. If so, add --no-block to
the params you pass to systemctl.
That actually sounds like just what's needed here. is-system-running
will also neatly cover the corresponding case on bootup.
Post by Lennart Poettering
Another option might be to pass --job-mode=ignore-dependencies instead
of --no-block, which was created for usecases like this, even though
it is frickin' ugly...
For reload that should be fairly okay, as reload will quickly fail if
the unit isn't actually running, and if it is running its dependencies
already ought to be satisfied.
I'll look into both, thanks for the hints!
Keep in mind though, that you'll need to not only update "service",
but also invoke-rc.d and the lsb-hook, which is used when someone
calls /etc/init.d/foo directly.

It's also magnitudes less efficient to spawn external binaries to get
the system state then doing this check from within systemd itself.

So I'm not really convinced that shifting the responsibility here to
the scripts themselves is the better approach.


Michael
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
Lennart Poettering
2015-02-03 20:52:35 UTC
Permalink
Post by Michael Biebl
Post by Martin Pitt
Post by Lennart Poettering
Another option might be to pass --job-mode=ignore-dependencies instead
of --no-block, which was created for usecases like this, even though
it is frickin' ugly...
For reload that should be fairly okay, as reload will quickly fail if
the unit isn't actually running, and if it is running its dependencies
already ought to be satisfied.
I'll look into both, thanks for the hints!
Keep in mind though, that you'll need to not only update "service",
but also invoke-rc.d and the lsb-hook, which is used when someone
calls /etc/init.d/foo directly.
It's also magnitudes less efficient to spawn external binaries to get
the system state then doing this check from within systemd itself.
So I'm not really convinced that shifting the responsibility here to
the scripts themselves is the better approach.
But note that this way you alter *all* queued jobs that way,
regardless if they are created with the assumptions of sysv behaviour
or if they were created in code that understands systemd's semantics,
and actually cares for the correct ordering..

I'd strongly recommend finding local solutions to the problems at hand
here, rather than changing behaviour for all other non-sysv code as
well...

Lennart
--
Lennart Poettering, Red Hat
Michael Biebl
2015-02-03 20:58:09 UTC
Permalink
Post by Lennart Poettering
But note that this way you alter *all* queued jobs that way,
regardless if they are created with the assumptions of sysv behaviour
or if they were created in code that understands systemd's semantics,
and actually cares for the correct ordering..
The patch does not alter any ordering, it simply throws away
reload/restart requests, which would be pointless anyway.
Post by Lennart Poettering
I'd strongly recommend finding local solutions to the problems at hand
here, rather than changing behaviour for all other non-sysv code as
well...
Maybe this is an misunderstanding, but this issue is not sysv specific at all.
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
Lennart Poettering
2015-02-03 21:10:14 UTC
Permalink
Post by Michael Biebl
Post by Lennart Poettering
But note that this way you alter *all* queued jobs that way,
regardless if they are created with the assumptions of sysv behaviour
or if they were created in code that understands systemd's semantics,
and actually cares for the correct ordering..
The patch does not alter any ordering, it simply throws away
reload/restart requests, which would be pointless anyway.
Well, OK, "cares for execution of the jobs", then.
Post by Michael Biebl
Post by Lennart Poettering
I'd strongly recommend finding local solutions to the problems at hand
here, rather than changing behaviour for all other non-sysv code as
well...
Maybe this is an misunderstanding, but this issue is not sysv specific at all.
I don't see how this would apply to non-sysv code. I mean, code that
is written with systemd semantics in mind should be able to issue a
service reload during any time it wants to, if it keeps the ordering
issues in mind. For example, if I have a service that has
DefaultDependencies=no (and hence ordered against nothing at all by
default), and I want to issue systemctl reload on it, knowing that
this cannot really deadlock, since there are no deps that could cause
deadlocks, then i should be able to do so. With your patch you turn
these reloads into NOPs...

Lennart
--
Lennart Poettering, Red Hat
Michael Biebl
2015-02-03 21:22:04 UTC
Permalink
Post by Lennart Poettering
I don't see how this would apply to non-sysv code. I mean, code that
is written with systemd semantics in mind should be able to issue a
service reload during any time it wants to, if it keeps the ordering
issues in mind. For example, if I have a service that has
DefaultDependencies=no (and hence ordered against nothing at all by
default), and I want to issue systemctl reload on it, knowing that
this cannot really deadlock, since there are no deps that could cause
deadlocks, then i should be able to do so. With your patch you turn
these reloads into NOPs...
During shutdown (and early boot), yes. But why would you want to
schedule a restart or reload during shutdown?
Do you have a real-world example for that?

The thing is, you have to be extra careful to never, ever call a
restart/reload from such hook scripts. If those are triggered via
service or systemctl on a native or SysV script doesn't make a
difference.
It's simply to easy to cause a dead lock this way, and I think systemd
should try much harder to avoid such situations.
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
Lennart Poettering
2015-02-03 21:38:16 UTC
Permalink
Post by Michael Biebl
Post by Lennart Poettering
I don't see how this would apply to non-sysv code. I mean, code that
is written with systemd semantics in mind should be able to issue a
service reload during any time it wants to, if it keeps the ordering
issues in mind. For example, if I have a service that has
DefaultDependencies=no (and hence ordered against nothing at all by
default), and I want to issue systemctl reload on it, knowing that
this cannot really deadlock, since there are no deps that could cause
deadlocks, then i should be able to do so. With your patch you turn
these reloads into NOPs...
During shutdown (and early boot), yes. But why would you want to
schedule a restart or reload during shutdown?
Well, we try to keep reloads out of the default codepaths, but it's
easy to construct cases where people might want to deviate fromt the
default codepaths.
Post by Michael Biebl
Do you have a real-world example for that?
Consider systemd-journald.service. It's a service with
DefaultDependencies=no. We don't terminate it as part of the normal
shutdown, we leave that to the final killing spree, so that we have
logging for as long as possible.

Now, people might want to hack something up that changes journald
configuration to forward logs to kmsg during the last part of
shutdown, so that they can see it in netconsole or so. SO they write
their little service that patches journald.conf and restarts journald,
and this is done during shutdown from a normal service's ExecStop=
line. This normally works fine, since journald is not ordered against
anything that matters. But it doesn't work on Debian, because on
Debian there's no way anymore to restart journald as soon as shutdown
commenced...

While I just made this scenario up I think it's actually quite
realistic, and I think it's a valid thing for admins to do....
Post by Michael Biebl
The thing is, you have to be extra careful to never, ever call a
restart/reload from such hook scripts. If those are triggered via
service or systemctl on a native or SysV script doesn't make a
difference.
It is completely fine to enqueue restarts and reloads from hook
scripts. However the emphasis is on *enqueue*. The issue is that you
block on it, you shouldn't do that.

On Fedora, iscsi is reloaded from an NM callout. If you ask me that's
frickin' ugly, but it *is* supported as long as iscsi's reload is
queued asynchronously instead of requested synchronously. In Fedora,
the logic to make this async sits on the client side of things, it's
not enforced by the engine in PID 1.

The same really applies to Debian too...
Post by Michael Biebl
It's simply to easy to cause a dead lock this way, and I think systemd
should try much harder to avoid such situations.
Well, it would be great if we could solve the halting problem. But we
can't.

I mean, I am all ears regarding adding deadlock detection code. But I
am really not convinced that breaking the engine for *everybody* just
because *some* clients are weird is an option.

Lennart
--
Lennart Poettering, Red Hat
Michael Biebl
2015-02-03 22:03:01 UTC
Permalink
Post by Lennart Poettering
Post by Michael Biebl
Post by Lennart Poettering
I don't see how this would apply to non-sysv code. I mean, code that
is written with systemd semantics in mind should be able to issue a
service reload during any time it wants to, if it keeps the ordering
issues in mind. For example, if I have a service that has
DefaultDependencies=no (and hence ordered against nothing at all by
default), and I want to issue systemctl reload on it, knowing that
this cannot really deadlock, since there are no deps that could cause
deadlocks, then i should be able to do so. With your patch you turn
these reloads into NOPs...
During shutdown (and early boot), yes. But why would you want to
schedule a restart or reload during shutdown?
Well, we try to keep reloads out of the default codepaths, but it's
easy to construct cases where people might want to deviate fromt the
default codepaths.
Post by Michael Biebl
Do you have a real-world example for that?
Consider systemd-journald.service. It's a service with
DefaultDependencies=no. We don't terminate it as part of the normal
shutdown, we leave that to the final killing spree, so that we have
logging for as long as possible.
Now, people might want to hack something up that changes journald
configuration to forward logs to kmsg during the last part of
shutdown, so that they can see it in netconsole or so. SO they write
their little service that patches journald.conf and restarts journald,
and this is done during shutdown from a normal service's ExecStop=
line. This normally works fine, since journald is not ordered against
anything that matters. But it doesn't work on Debian, because on
Debian there's no way anymore to restart journald as soon as shutdown
commenced...
While I just made this scenario up I think it's actually quite
realistic, and I think it's a valid thing for admins to do....
Well, we could easily check if DefaultDependencies=yes in this case.
Actually, this is already what we do for the boot case. [1]

So even with your made-up example, it would work.
Post by Lennart Poettering
Post by Michael Biebl
The thing is, you have to be extra careful to never, ever call a
restart/reload from such hook scripts. If those are triggered via
service or systemctl on a native or SysV script doesn't make a
difference.
It is completely fine to enqueue restarts and reloads from hook
scripts. However the emphasis is on *enqueue*. The issue is that you
block on it, you shouldn't do that.
On Fedora, iscsi is reloaded from an NM callout. If you ask me that's
frickin' ugly, but it *is* supported as long as iscsi's reload is
queued asynchronously instead of requested synchronously. In Fedora,
the logic to make this async sits on the client side of things, it's
not enforced by the engine in PID 1.
The same really applies to Debian too...
Post by Michael Biebl
It's simply to easy to cause a dead lock this way, and I think systemd
should try much harder to avoid such situations.
Well, it would be great if we could solve the halting problem. But we
can't.
I mean, I am all ears regarding adding deadlock detection code. But I
am really not convinced that breaking the engine for *everybody* just
because *some* clients are weird is an option.
Calling it "breaking the engine for everybody" is hyperbole.

That said, do you have better ideas how we could avoid having systemd
easily being dead-locked on shutdown?
I'm all for solving it in a nicer way. But simply throwing the hands
in the air and saying, not our problem, doesn't help.
Things like that hurt our users badly. systemd should always try to
get into a usable state, i.e. to boot into state where one can log in,
or shutdown/reboot successfully.
As much as I like systemd, it's really easy to break it currently and
it would be awesome if it was more robust in such situations.


Michael


[1] http://anonscm.debian.org/cgit/pkg-systemd/systemd.git/tree/debian/patches/Avoid-reload-and-re-start-requests-during-early-boot.patch
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
Lennart Poettering
2015-02-03 23:14:54 UTC
Permalink
Post by Michael Biebl
Post by Lennart Poettering
While I just made this scenario up I think it's actually quite
realistic, and I think it's a valid thing for admins to do....
Well, we could easily check if DefaultDependencies=yes in this case.
Actually, this is already what we do for the boot case. [1]
So even with your made-up example, it would work.
Post by Lennart Poettering
Post by Michael Biebl
The thing is, you have to be extra careful to never, ever call a
restart/reload from such hook scripts. If those are triggered via
service or systemctl on a native or SysV script doesn't make a
difference.
It is completely fine to enqueue restarts and reloads from hook
scripts. However the emphasis is on *enqueue*. The issue is that you
block on it, you shouldn't do that.
On Fedora, iscsi is reloaded from an NM callout. If you ask me that's
frickin' ugly, but it *is* supported as long as iscsi's reload is
queued asynchronously instead of requested synchronously. In Fedora,
the logic to make this async sits on the client side of things, it's
not enforced by the engine in PID 1.
The same really applies to Debian too...
Post by Michael Biebl
It's simply to easy to cause a dead lock this way, and I think systemd
should try much harder to avoid such situations.
Well, it would be great if we could solve the halting problem. But we
can't.
I mean, I am all ears regarding adding deadlock detection code. But I
am really not convinced that breaking the engine for *everybody* just
because *some* clients are weird is an option.
Calling it "breaking the engine for everybody" is hyperbole.
That said, do you have better ideas how we could avoid having systemd
easily being dead-locked on shutdown?
I'm all for solving it in a nicer way. But simply throwing the hands
in the air and saying, not our problem, doesn't help.
I made a clear recommendation: whenever commands are converted from
sysv operations into systemctl operations, then add --no-block or
--job-mode=ignore-deps to the systemctl command line, after checking
that you are in startup or shutdown mode. Why wouldn't that suffice?

Lennart
--
Lennart Poettering, Red Hat
Lennart Poettering
2015-02-03 20:40:37 UTC
Permalink
Post by Martin Pitt
Post by Lennart Poettering
I am very strongly against adding hacky work-arounds like this to PID
Yeah, indeed. This is why I asked for a more elegant approach, and
indeed the --no-block or --job-mode=ignore-dependencies sound like
slightly better approaches to this. I'll test these more thoroughly
tomorrow, thanks for pointing out!
Post by Lennart Poettering
1. The deadlocks are deadlocks, and implying --no-block if we are in
shutdown mode is a pretty drastic hack I think that special cases one
peculiar case way too much.
Right, the problem is of course more generic. Any set of jobs (i. e. a
transaction) which causes (maybe through some indirection) one of its
job members to get restarted/reloaded will suffer from this deadlock
problem, AFAIUI. Booting and shutdown are therefore mostly affected by
this as pretty much every other time there the pending transactions
are only very small.
It's really about synchronous waiting on jobs. If you synchronously
wait for completion of jobs that are ordered against the job your are
part of yourself, then things will deadlock. This can happen in many
contexts. You can fix this by:

a) changing from synchronous to asynchronous operation

or

b) removing the ordering (either individually for the job, or
generically forever in the unit)

Now, regardless which option you choose it's always a good idea to
keep this change as local as possible. Altering the state engine for
all operations is the worst solution...

Lennart
--
Lennart Poettering, Red Hat
Martin Pitt
2015-02-04 07:56:36 UTC
Permalink
Hello all,

I changed the Subject: to make this thread a bit easier to find.
Post by Lennart Poettering
It's really about synchronous waiting on jobs. If you synchronously
wait for completion of jobs that are ordered against the job your are
part of yourself, then things will deadlock.
Indeed. The problem is that if you reload e. g. postfix from a DHCP or
"network up/down" hook, such a script doesn't have the slightest idea
whether it was run because the network changed at runtime (i. e. udev
event or the user just selected a new network) or whether it happens
as part of a systemd transaction (boot/shutdown). In the former case
you do want to block, in the latter case you mustn't.

FTR, I'm currently debugging a similar issue on
https://launchpad.net/bugs/1417010 which isn't caught by the current
two Debian patches, so we need a more generic solution anyway.
Post by Lennart Poettering
Now, regardless which option you choose it's always a good idea to
keep this change as local as possible. Altering the state engine for
all operations is the worst solution...
Well, it's a problem which can happen in a lot of scenarios and isn't
specific to which kind of service or hook script you have, so what's
"local" is actually quite hard to define here.

I agree with Michael that involving a lot of shell commands which we
then have to copy to lots of places (and find these places at all) is
also not the best solution. So perhaps we could have some middle
ground here and make systemctl a bit more clever?

- Don't enqueue a reload if the service to be reloaded isn't running.
E. g. postfix.service "inactive/dead" in
https://bugs.debian.org/635777 or smbd.service "start/waiting" in
https://launchpad.net/bugs/1417010. This would completely avoid
the deadlock in most situations already, and doesn't change the
semantics for working use cases either, so this should even be
applicable for upstream?

And/or

- systemctl reload/restart could imply --no-block if the service is
already enqueued in the current transaction. That would avoid this
deadlock situation in more cases.

With that the remaining deadlock case would be trying to reload an
already running service which isn't affected by the current
transaction, but we haven't seen that in practice yet.

If you don't want this upstream, I'd keep it as a patch in Debian. But
I can't really imagine that this wouldn't happen in Fedora or other
distros? I mean, things like the ISC DHCP hooks aren't a Debianism,
and a lot of existing software wasn't written with this "be careful on
service reloads and guess whether you need --no-block" approach in
mind, as it has never been a problem with other init systems.

Thanks,

Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Lennart Poettering
2015-02-04 12:27:26 UTC
Permalink
Post by Martin Pitt
Post by Lennart Poettering
It's really about synchronous waiting on jobs. If you synchronously
wait for completion of jobs that are ordered against the job your are
part of yourself, then things will deadlock.
Indeed. The problem is that if you reload e. g. postfix from a DHCP or
"network up/down" hook, such a script doesn't have the slightest idea
whether it was run because the network changed at runtime (i. e. udev
event or the user just selected a new network) or whether it happens
as part of a systemd transaction (boot/shutdown). In the former case
you do want to block, in the latter case you mustn't.
I don't see why you'd want to block in either case. Networking is
asynchronous anyway, there's really no point in blocking here. What
does this give you?

Modern code, that talks directly to systemctl, that uses hook scripts,
should always use --no-block for these cases.

That said, non-crap code does not rely on hook scripts like this
anyway, and just subscribes to rtnl on its own.
Post by Martin Pitt
Post by Lennart Poettering
Now, regardless which option you choose it's always a good idea to
keep this change as local as possible. Altering the state engine for
all operations is the worst solution...
Well, it's a problem which can happen in a lot of scenarios and isn't
specific to which kind of service or hook script you have, so what's
"local" is actually quite hard to define here.
I agree with Michael that involving a lot of shell commands which we
then have to copy to lots of places (and find these places at all) is
also not the best solution. So perhaps we could have some middle
ground here and make systemctl a bit more clever?
Hmm? I don't follow here at all?

I mean, you must have some code in Debian that forwards old sysv
restart/reload requests to systemctl. That's probably going to be in
/usr/sbin/service and maybe a few other places. Why can't you just add
the --no-block logic there, conditionalized to shutdown/reboot?
Post by Martin Pitt
- Don't enqueue a reload if the service to be reloaded isn't running.
E. g. postfix.service "inactive/dead" in
https://bugs.debian.org/635777 or smbd.service "start/waiting" in
https://launchpad.net/bugs/1417010. This would completely avoid
the deadlock in most situations already, and doesn't change the
semantics for working use cases either, so this should even be
applicable for upstream?
No, this would open up the door for races. The guarantee we give
around blocking operations, is that by the time they return the
operation or an equivalent has been executed. More specifically, if
you start a service, and it is in "starting", and then issue a
"reload" or "restart", and it returns you *know* that the
configuration that was on disk at the time you issued the "reload" or
"restart" -- or a newer one -- is in place. If you'd suppress the
reload/restart in this case, then you will not get that guarantee,
because the configuration ultimately loaded might be the one from the
time the daemon was first put into starting mode at.

Or in other words: if a script does this:

# sed -i -e 's/something/somethingelse/' /etc/mydaemon.conf
# systemctl restart mydaemon.service

Then we *must* give the guarantee that when the second command returns
the change just made to /etc/mydaemon.conf is in place, and not a
previous config loaded. And we must guarantee that in any case,
regardless if the daemon is about to start, already started, reloaded
or anything else...

(Now, this example is about restarts, but reload is pretty much the
same, except that many people use async ExecReload= commands (for
example send a signal), which will break this guarantee, but that's on
them then, we recommend synchronous reload operations, to make this
race-free).
Post by Martin Pitt
And/or
- systemctl reload/restart could imply --no-block if the service is
already enqueued in the current transaction. That would avoid this
deadlock situation in more cases.
Opens the same race, as described above.

Also: *why* for heaven's sake? Just add --no-block when you are
running from these contexts, and all is good. I really don't get why
you don't want to do that.
Post by Martin Pitt
With that the remaining deadlock case would be trying to reload an
already running service which isn't affected by the current
transaction, but we haven't seen that in practice yet.
If you don't want this upstream, I'd keep it as a patch in Debian. But
I can't really imagine that this wouldn't happen in Fedora or other
distros? I mean, things like the ISC DHCP hooks aren't a Debianism,
and a lot of existing software wasn't written with this "be careful on
service reloads and guess whether you need --no-block" approach in
mind, as it has never been a problem with other init systems.
On Fedora, these hook scripts use --no-block correctly, and I am
completely puzzled why you don't want this on Debian!

Please, can you elaborate what your issue with --no-block is?

Lennart
--
Lennart Poettering, Red Hat
Martin Pitt
2015-02-04 14:25:26 UTC
Permalink
Post by Lennart Poettering
Post by Martin Pitt
Post by Lennart Poettering
It's really about synchronous waiting on jobs. If you synchronously
wait for completion of jobs that are ordered against the job your are
part of yourself, then things will deadlock.
Indeed. The problem is that if you reload e. g. postfix from a DHCP or
"network up/down" hook, such a script doesn't have the slightest idea
whether it was run because the network changed at runtime (i. e. udev
event or the user just selected a new network) or whether it happens
as part of a systemd transaction (boot/shutdown). In the former case
you do want to block, in the latter case you mustn't.
I don't see why you'd want to block in either case. Networking is
asynchronous anyway, there's really no point in blocking here. What
does this give you?
So far, if a hook or shell script calls e. g. "service foo restart",
it can count on the service actually being reloaded after it finishes,
and thus you can then interact with it immediately. That's not
important for many scripts, but we can't just always make "service foo
reload" async by using --no-block, as that would break compatibility
with scripts which do rely on that behaviour. So we do need to
conditionalize it to avoid the deadlocks under systemd.
Post by Lennart Poettering
Modern code, that talks directly to systemctl, that uses hook scripts,
should always use --no-block for these cases.
Yeah, for sure. We just don't have that luxury easily, as first these
scripts don't call systemctl but "service" (for the Debianists: or
invoke-rc.d, but same difference), it will take some time to find
all these occurrences, and people will hate us for having to add stuff
like

if (running under systemd)
systemctl --no-block reload foo
else
service foo reload

as that special case appears rather pointless for someone who hasn't
dealt with the details of systemd job transactions. That's not to say
that it shouldn't happen (perhaps adding a --no-block option to
service), but that takes time.
Post by Lennart Poettering
I mean, you must have some code in Debian that forwards old sysv
restart/reload requests to systemctl. That's probably going to be in
/usr/sbin/service and maybe a few other places. Why can't you just add
the --no-block logic there, conditionalized to shutdown/reboot?
Well, we can; it's just as much an unreliable hack as the two existing
patches :-) and I was wondering if we can't solve this in a more
generic/distro agnostic way.
Post by Lennart Poettering
Post by Martin Pitt
- Don't enqueue a reload if the service to be reloaded isn't running.
E. g. postfix.service "inactive/dead" in
https://bugs.debian.org/635777 or smbd.service "start/waiting" in
https://launchpad.net/bugs/1417010. This would completely avoid
the deadlock in most situations already, and doesn't change the
semantics for working use cases either, so this should even be
applicable for upstream?
No, this would open up the door for races. The guarantee we give
around blocking operations, is that by the time they return the
operation or an equivalent has been executed. More specifically, if
you start a service, and it is in "starting", and then issue a
"reload" or "restart", and it returns you *know* that the
configuration that was on disk at the time you issued the "reload" or
"restart" -- or a newer one -- is in place. If you'd suppress the
reload/restart in this case, then you will not get that guarantee,
because the configuration ultimately loaded might be the one from the
time the daemon was first put into starting mode at.
Hm, I don't quite understand this. If you reload a service which isn't
currently running, systemctl will fail. It's just currently going to
enqueue the reload request, and executing the job will fail due to the
"not active" check in unit_reload(). The problem with that is just
that systemctl doesn't fail immediately with "the unit is not active",
but blocks on the queued request. So if you don't have pending jobs,
the net result is the same, but with pending jobs you can easily get a
deadlock.
Post by Lennart Poettering
# sed -i -e 's/something/somethingelse/' /etc/mydaemon.conf
# systemctl restart mydaemon.service
Then we *must* give the guarantee that when the second command returns
the change just made to /etc/mydaemon.conf is in place, and not a
previous config loaded.
Yes, that's fine. I only said "reload" above, as that's what
DHCP/network scripts usually do.
Post by Lennart Poettering
(Now, this example is about restarts, but reload is pretty much the
same
reload on an inactive unit will fail, restart will start it.
Post by Lennart Poettering
Also: *why* for heaven's sake? Just add --no-block when you are
running from these contexts, and all is good. I really don't get why
you don't want to do that.
Well, I do, but special-casing that on boot/shutdown isn't enough (you
can have other bigger transactions which lock up); and it's rather
expensive to do this via multiple CLI tools and D-BUS calls rather
than in the manager itself. So I was pondering how to get this effect
much more efficiently in manager.c instead of the shell wrappers
calling systemctl.
Post by Lennart Poettering
On Fedora, these hook scripts use --no-block correctly, and I am
completely puzzled why you don't want this on Debian!
Mostly because we don't currently have the luxury of having only
systemd :-)

Anyway, I think your stanza on this is clear. We'll keep the hacks on
the Debian side, if that's not a problem for other distros.

Thanks for the discussion and the hint with --no-block!

Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Lennart Poettering
2015-02-04 15:38:38 UTC
Permalink
Post by Martin Pitt
Post by Lennart Poettering
Post by Martin Pitt
Post by Lennart Poettering
It's really about synchronous waiting on jobs. If you synchronously
wait for completion of jobs that are ordered against the job your are
part of yourself, then things will deadlock.
Indeed. The problem is that if you reload e. g. postfix from a DHCP or
"network up/down" hook, such a script doesn't have the slightest idea
whether it was run because the network changed at runtime (i. e. udev
event or the user just selected a new network) or whether it happens
as part of a systemd transaction (boot/shutdown). In the former case
you do want to block, in the latter case you mustn't.
I don't see why you'd want to block in either case. Networking is
asynchronous anyway, there's really no point in blocking here. What
does this give you?
So far, if a hook or shell script calls e. g. "service foo restart",
it can count on the service actually being reloaded after it finishes,
and thus you can then interact with it immediately. That's not
important for many scripts, but we can't just always make "service foo
reload" async by using --no-block, as that would break compatibility
with scripts which do rely on that behaviour. So we do need to
conditionalize it to avoid the deadlocks under systemd.
Sure, I can only recommend again: in the the glue code that calls out
to "systemctl" from "service", you can add the code to use --no-block
or --job-mode=ignore-dependencies , if you notice you are in shutdown
mode...
Post by Martin Pitt
Post by Lennart Poettering
Modern code, that talks directly to systemctl, that uses hook scripts,
should always use --no-block for these cases.
Yeah, for sure. We just don't have that luxury easily, as first these
scripts don't call systemctl but "service" (for the Debianists: or
invoke-rc.d, but same difference), it will take some time to find
all these occurrences, and people will hate us for having to add stuff
like
if (running under systemd)
systemctl --no-block reload foo
else
service foo reload
I am not proposing anything like that.

Your /usr/sbin/service binary translates sysv requests into systemctl
requests, right? In that code, you can easily impliy --no-block or
--job-mode=ignore-dependencies under the right circumstances.

There's *no* reason to patch that into all scripts, just patch that
into your "service" binary...
Post by Martin Pitt
Post by Lennart Poettering
I mean, you must have some code in Debian that forwards old sysv
restart/reload requests to systemctl. That's probably going to be in
/usr/sbin/service and maybe a few other places. Why can't you just add
the --no-block logic there, conditionalized to shutdown/reboot?
Well, we can; it's just as much an unreliable hack as the two existing
patches :-) and I was wondering if we can't solve this in a more
generic/distro agnostic way.
No it's not as broken. Because if you use "systemctl" then, you will
always get the strict and right systemd semantics. Only your sysv glue
code will degrade things to be closer to sysv then.

/usr/bin/service is distro-specifc, there's no way around that. Some
distros have it as C binary, others as shell script, all with slightly
different semantics. If you make it translate sysv scripts into
systemctl invocation, then that's precisely where you should add in
--no-block or --job-mode=ignore-dependencies.
Post by Martin Pitt
Post by Lennart Poettering
Post by Martin Pitt
- Don't enqueue a reload if the service to be reloaded isn't running.
E. g. postfix.service "inactive/dead" in
https://bugs.debian.org/635777 or smbd.service "start/waiting" in
https://launchpad.net/bugs/1417010. This would completely avoid
the deadlock in most situations already, and doesn't change the
semantics for working use cases either, so this should even be
applicable for upstream?
No, this would open up the door for races. The guarantee we give
around blocking operations, is that by the time they return the
operation or an equivalent has been executed. More specifically, if
you start a service, and it is in "starting", and then issue a
"reload" or "restart", and it returns you *know* that the
configuration that was on disk at the time you issued the "reload" or
"restart" -- or a newer one -- is in place. If you'd suppress the
reload/restart in this case, then you will not get that guarantee,
because the configuration ultimately loaded might be the one from the
time the daemon was first put into starting mode at.
Hm, I don't quite understand this. If you reload a service which isn't
currently running, systemctl will fail. It's just currently going to
enqueue the reload request, and executing the job will fail due to the
"not active" check in unit_reload(). The problem with that is just
that systemctl doesn't fail immediately with "the unit is not active",
but blocks on the queued request. So if you don't have pending jobs,
the net result is the same, but with pending jobs you can easily get a
deadlock.
Again, if a "start" job is already queued and in progress of being
dispatched, we need to queue the reload job, to get the right
guarantess.

It's not that hard to see, is it?
Post by Martin Pitt
Post by Lennart Poettering
# sed -i -e 's/something/somethingelse/' /etc/mydaemon.conf
# systemctl restart mydaemon.service
Then we *must* give the guarantee that when the second command returns
the change just made to /etc/mydaemon.conf is in place, and not a
previous config loaded.
Yes, that's fine. I only said "reload" above, as that's what
DHCP/network scripts usually do.
Post by Lennart Poettering
(Now, this example is about restarts, but reload is pretty much the
same
reload on an inactive unit will fail, restart will start it.
Yes, but again: regardless if the unit is currently starting, already
started or currently reloading, we must properly enqueue a new job, to
give the right guarantess.

We must execute the jobs in order. If there's start job in progress,
or a reload job, and we queue a reload job, then we need to wait for
the start job or reload job to finish, to begin with the new reload
job. Otherwise you cannot give the guarantee I pointed out above.
Post by Martin Pitt
Post by Lennart Poettering
Also: *why* for heaven's sake? Just add --no-block when you are
running from these contexts, and all is good. I really don't get why
you don't want to do that.
Well, I do, but special-casing that on boot/shutdown isn't enough (you
can have other bigger transactions which lock up); and it's rather
expensive to do this via multiple CLI tools and D-BUS calls rather
than in the manager itself. So I was pondering how to get this effect
much more efficiently in manager.c instead of the shell wrappers
calling systemctl.
Well, universal deadlock detection is not possible. That would mean
solving the halting problem, which is impossible.

I mean, if you want precise sysv semantics you can just use
--job-mode=ignore-dependencies always, since sysv ignore all deps too
when sysv scripts are involved.

And two bus calls instead of one, what's the problem with that? You
are not winning speed records with shell in the loop anyway, and two
or one bus call are entirely irrelevant.
Post by Martin Pitt
Post by Lennart Poettering
On Fedora, these hook scripts use --no-block correctly, and I am
completely puzzled why you don't want this on Debian!
Mostly because we don't currently have the luxury of having only
systemd :-)
Dude! Come on!

As I keep repeating: please add this to the sysv compat glue, not to
PID 1! Add it to your service command, and all is good.
Post by Martin Pitt
Anyway, I think your stanza on this is clear. We'll keep the hacks on
the Debian side, if that's not a problem for other distros.
I'd strongly encourage you to work around this on client side in the
sysv glue, not breaking the guarantess that systemd-aware code needs
when issuing commands.
Post by Martin Pitt
Thanks for the discussion and the hint with --no-block!
Please, think about this. Fix this in the sysv client code once, don't
break proper systemd clients forever.

If you break the engine like this this will reflect back on us
upstream. On Fedora and everywhere else this:

# sed -i -e s/foobar/waldo/ /etc/mydaemon.conf
# systemctl reload mydaemon.conf

will have the excepected effect of making sure that the config changes
made to mydaemon.conf are in effect when systemctl reload. With your
hack however that guarantee is lost when the daemon is still in the
process of being started or being reloaded.

You are not helping anyone with this, you are just breaking things.

I really don't udnerstand why you don't want to change the
/usr/sbin/service code. Why precisely are you unwilling to touch that
code?

Lennart
--
Lennart Poettering, Red Hat
Martin Pitt
2015-02-04 16:45:20 UTC
Permalink
Post by Lennart Poettering
Sure, I can only recommend again: in the the glue code that calls out
to "systemctl" from "service", you can add the code to use --no-block
or --job-mode=ignore-dependencies , if you notice you are in shutdown
mode...
Yeah, I agree that given all the options that's the best heuristics
for now.
Post by Lennart Poettering
Post by Martin Pitt
Post by Lennart Poettering
Modern code, that talks directly to systemctl, that uses hook scripts,
should always use --no-block for these cases.
Yeah, for sure. We just don't have that luxury easily, as first these
scripts don't call systemctl but "service" (for the Debianists: or
invoke-rc.d, but same difference), it will take some time to find
all these occurrences, and people will hate us for having to add stuff
like
if (running under systemd)
systemctl --no-block reload foo
else
service foo reload
I am not proposing anything like that.
Right, this was just a reply to "modern code that talks directly to
systemctl", and we can't do that while we support more than one init
system. Sorry for the misunderstanding.
Post by Lennart Poettering
Post by Martin Pitt
Post by Lennart Poettering
Post by Martin Pitt
- Don't enqueue a reload if the service to be reloaded isn't running.
E. g. postfix.service "inactive/dead" in
https://bugs.debian.org/635777 or smbd.service "start/waiting" in
https://launchpad.net/bugs/1417010. This would completely avoid
the deadlock in most situations already, and doesn't change the
semantics for working use cases either, so this should even be
applicable for upstream?
For the record, this was already discussed here:
http://lists.freedesktop.org/archives/systemd-devel/2014-July/021457.html
continuation:
http://lists.freedesktop.org/archives/systemd-devel/2014-August/022048.html
Post by Lennart Poettering
I mean, if you want precise sysv semantics you can just use
--job-mode=ignore-dependencies always, since sysv ignore all deps too
when sysv scripts are involved.
"Always" sounds rather unsafe to me, as we would totally ignore a
unit's Requires/Wants= then. Doing it if !systemctl is-system-running
sounds ok.
Post by Lennart Poettering
I'd strongly encourage you to work around this on client side in the
sysv glue, not breaking the guarantess that systemd-aware code needs
when issuing commands.
Yes, I prefer that as well. (That's what I meant with "distro side"..)

Thanks,

Martin
--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Uoti Urpala
2015-02-04 18:19:47 UTC
Permalink
Post by Lennart Poettering
Post by Martin Pitt
Post by Lennart Poettering
Post by Martin Pitt
- Don't enqueue a reload if the service to be reloaded isn't running.
E. g. postfix.service "inactive/dead" in
https://bugs.debian.org/635777 or smbd.service "start/waiting" in
https://launchpad.net/bugs/1417010. This would completely avoid
the deadlock in most situations already, and doesn't change the
semantics for working use cases either, so this should even be
applicable for upstream?
No, this would open up the door for races. The guarantee we give
around blocking operations, is that by the time they return the
operation or an equivalent has been executed. More specifically, if
you start a service, and it is in "starting", and then issue a
"reload" or "restart", and it returns you *know* that the
configuration that was on disk at the time you issued the "reload" or
"restart" -- or a newer one -- is in place. If you'd suppress the
reload/restart in this case, then you will not get that guarantee,
because the configuration ultimately loaded might be the one from the
time the daemon was first put into starting mode at.
You're missing an essential point here: there's a distinction between
skipping reloads for services which have not not been dispatched, and
skipping reloads for services for which startup code is already running
(and may be using existing configuration) but which have not reached
full "running" status yet.

The former is the correct behavior (but currently handled wrong by
systemd!), and never causes races. Only the latter can cause races like
described above.

Fixing the systemd semantics should fix most of the bootup deadlock
cases. This is not a "sysv workaround" or anything like that. The
current systemd semantics are wrong and undesirable for new code,
regardless of any legacy compatibility issues. Fixing them would give
semantics that are more logically correct and work better in practice.

IIRC the smbd.service case was just a buggy circular service definition
and can not be reasonably fixed by any systemd-side semantics change; if
I remember that correctly, it should not be used as an example in any
discussion of systemd-side changes.
Post by Lennart Poettering
Post by Martin Pitt
Hm, I don't quite understand this. If you reload a service which isn't
currently running, systemctl will fail. It's just currently going to
enqueue the reload request, and executing the job will fail due to the
"not active" check in unit_reload(). The problem with that is just
that systemctl doesn't fail immediately with "the unit is not active",
but blocks on the queued request. So if you don't have pending jobs,
the net result is the same, but with pending jobs you can easily get a
deadlock.
Again, if a "start" job is already queued and in progress of being
dispatched, we need to queue the reload job, to get the right
guarantess.
It's not that hard to see, is it?
You're correct about the "in progress of being dispatched" case, but the
problem case is when systemd incorrectly blocks when no client side code
using old configuration has been actually dispatched yet (only a queued
start inside systemd). The latter systemd misbehavior causes the
deadlocks. The "in progress of being dispatched" case typically does not
cause deadlocks, because the already running startup will normally
finish without blocking on anything, and then the new reload can run, so
there's no lock.
Post by Lennart Poettering
We must execute the jobs in order. If there's start job in progress,
or a reload job, and we queue a reload job, then we need to wait for
the start job or reload job to finish, to begin with the new reload
job. Otherwise you cannot give the guarantee I pointed out above.
Again, that's literally correct but talking about the wrong case. The
relevant case is the one where there ISN'T a "job in progress", only a
queued one.


This was discussed before, last mail being:
http://lists.freedesktop.org/archives/systemd-devel/2014-October/024612.html
(no replies to that one).
Lennart Poettering
2015-02-04 18:36:09 UTC
Permalink
Post by Uoti Urpala
Post by Lennart Poettering
Post by Martin Pitt
- Don't enqueue a reload if the service to be reloaded isn't running.
E. g. postfix.service "inactive/dead" in
https://bugs.debian.org/635777 or smbd.service "start/waiting" in
https://launchpad.net/bugs/1417010. This would completely avoid
the deadlock in most situations already, and doesn't change the
semantics for working use cases either, so this should even be
applicable for upstream?
No, this would open up the door for races. The guarantee we give
around blocking operations, is that by the time they return the
operation or an equivalent has been executed. More specifically, if
you start a service, and it is in "starting", and then issue a
"reload" or "restart", and it returns you *know* that the
configuration that was on disk at the time you issued the "reload" or
"restart" -- or a newer one -- is in place. If you'd suppress the
reload/restart in this case, then you will not get that guarantee,
because the configuration ultimately loaded might be the one from the
time the daemon was first put into starting mode at.
You're missing an essential point here: there's a distinction between
skipping reloads for services which have not not been dispatched, and
skipping reloads for services for which startup code is already running
(and may be using existing configuration) but which have not reached
full "running" status yet.
The former is the correct behavior (but currently handled wrong by
systemd!), and never causes races. Only the latter can cause races like
described above.
These two cases aren't that different. If somebody pushes an
additional job into the queue that wants to run before the reload but
after the service is up you cannot ot flush out the reload just
because the service has not started yet...

Whether you change config in your current context, or you do so from a
new unit's context is no difference: we cannot move anything that is
supposed to happen after that change before it, and we cannot remove it
either...

There are some forms of coalescing possible, but we already do all of
the ones that are safe...
Post by Uoti Urpala
Fixing the systemd semantics should fix most of the bootup deadlock
cases. This is not a "sysv workaround" or anything like that. The
current systemd semantics are wrong and undesirable for new code,
regardless of any legacy compatibility issues. Fixing them would give
semantics that are more logically correct and work better in
practice.
No, totally not. THe current semantics give the necessary guarantees
that changing a config file from any context you like or queing a file
config change from any config you like, and then queuing a reload will
take effect, regardless if there's a job for the unit already queued,
running or anything else.

Lennart
--
Lennart Poettering, Red Hat
Uoti Urpala
2015-02-04 20:10:02 UTC
Permalink
Post by Lennart Poettering
Post by Uoti Urpala
You're missing an essential point here: there's a distinction between
skipping reloads for services which have not not been dispatched, and
skipping reloads for services for which startup code is already running
(and may be using existing configuration) but which have not reached
full "running" status yet.
The former is the correct behavior (but currently handled wrong by
systemd!), and never causes races. Only the latter can cause races like
described above.
These two cases aren't that different. If somebody pushes an
additional job into the queue that wants to run before the reload but
after the service is up you cannot ot flush out the reload just
because the service has not started yet...
I cannot parse what you're trying to say here, if it's anything
meaningful. Your "wants to run before the reload" sounds like you're
talking about guaranteeing that a reload NOT happen before something
else runs, but that would be nonsense - the "guarantee" would guarantee
nothing semantically relevant (if systemd only starts executing the
service binary *after* the reload has been queued, it cannot use any
pre-reload-order config at any point; there's no "guaranteed to use old
config" guarantee of any form possible!).
Post by Lennart Poettering
Whether you change config in your current context, or you do so from a
new unit's context is no difference: we cannot move anything that is
supposed to happen after that change before it, and we cannot remove it
either...
If no code from a service is currently running, it's already guaranteed
that every request issued to the service in the future will use the new
config (no old state exists, and any newly started process will
obviously load the new config). Thus the requirements for a reload are
already fulfilled; the operation is complete, and there is nothing more
to do. Unnecessary waiting only causes deadlocks for no benefit
whatsoever.
Post by Lennart Poettering
There are some forms of coalescing possible, but we already do all of
the ones that are safe...
This is not exactly "coalescing" - it's just immediately returning
success if there is no service code running (either in "running" state
or in startup state where a process already exists and could have read
the old config before it was changed).

Removing the current incorrect blocking and returning success
immediately is 100% safe, in the following strictly defined sense:
All requests handled by the service after "systemctl reload" has
returned will use a version of config equal or newer than the one that
was in effect when the reload call was started.

If you still want claim that removing the blocking would not be safe,
please try to construct a sequence of operations where such non-blocking
behavior would lead to failure (failure defined as: the service
processes a request using configuration older than what existed when
"reload" was requested). I'm confident that it is impossible to
construct such a counterexample.
Lennart Poettering
2015-02-04 20:57:36 UTC
Permalink
Post by Uoti Urpala
Post by Lennart Poettering
Post by Uoti Urpala
You're missing an essential point here: there's a distinction between
skipping reloads for services which have not not been dispatched, and
skipping reloads for services for which startup code is already running
(and may be using existing configuration) but which have not reached
full "running" status yet.
The former is the correct behavior (but currently handled wrong by
systemd!), and never causes races. Only the latter can cause races like
described above.
These two cases aren't that different. If somebody pushes an
additional job into the queue that wants to run before the reload but
after the service is up you cannot ot flush out the reload just
because the service has not started yet...
I cannot parse what you're trying to say here, if it's anything
meaningful.
No, usually what I am babbling is not meaningful at all...
Post by Uoti Urpala
Your "wants to run before the reload" sounds like you're
talking about guaranteeing that a reload NOT happen before something
else runs, but that would be nonsense - the "guarantee" would guarantee
nothing semantically relevant (if systemd only starts executing the
service binary *after* the reload has been queued, it cannot use any
pre-reload-order config at any point; there's no "guaranteed to use old
config" guarantee of any form possible!).
OK, let's try this again, with an example:

a) you have one service mydaemon.service

b) you have a preparation service called
mydaemon-convert-config.service that takes config from somewhere,
converts it into a suitable format for mydaemon.service's binary

Now, you change that config that is located somewhere, issue a restart
request for m-c-c.s, issue a reload request for mydaemon.service.

Now, something like this should always have the result that your
config change is applied to mydaemon.service. Regardless if
mydaemon.service's start was queued, is already started or is
currently being started. You are suggesting that the reload can
suppressed when a start is already enqueued, but that's really not the
case, because you first have to run m-c-c.s, before you can reload...

Lennart
--
Lennart Poettering, Red Hat
Reindl Harald
2015-02-04 21:25:24 UTC
Permalink
Post by Lennart Poettering
a) you have one service mydaemon.service
b) you have a preparation service called
mydaemon-convert-config.service that takes config from somewhere,
converts it into a suitable format for mydaemon.service's binary
Now, you change that config that is located somewhere, issue a restart
request for m-c-c.s, issue a reload request for mydaemon.service.
Now, something like this should always have the result that your
config change is applied to mydaemon.service. Regardless if
mydaemon.service's start was queued, is already started or is
currently being started. You are suggesting that the reload can
suppressed when a start is already enqueued
which is true

the config change *after* issue restart should not affect the already
pending (for whatever reason) restart because this is *unpredictable*
bahvior - if i expect that config change to get effective i would issue
"systemctl reload" *before* the restart command

if you say that is expected behavior than all the warnings about "you
need to enter systemctl deamon-reload because the unit file changed" are
pointless and you could just reload automagically all the time
Lennart Poettering
2015-02-04 21:31:16 UTC
Permalink
Post by Reindl Harald
Post by Lennart Poettering
a) you have one service mydaemon.service
b) you have a preparation service called
mydaemon-convert-config.service that takes config from somewhere,
converts it into a suitable format for mydaemon.service's binary
Now, you change that config that is located somewhere, issue a restart
request for m-c-c.s, issue a reload request for mydaemon.service.
Now, something like this should always have the result that your
config change is applied to mydaemon.service. Regardless if
mydaemon.service's start was queued, is already started or is
currently being started. You are suggesting that the reload can
suppressed when a start is already enqueued
which is true
the config change *after* issue restart should not affect the already
pending (for whatever reason) restart because this is *unpredictable*
bahvior - if i expect that config change to get effective i would issue
"systemctl reload" *before* the restart command
We don't make guarantees like that. This is UNIX, we have no
transaction file system.

We do make guarantees about that if you changed your config and issue
a reload, we will not drop that reload. We do not make guarantees
whether changing your config concurrently with the daemon confuses the
daemon or not. That's between the user and the daemon, we are not involved.

Lennart
--
Lennart Poettering, Red Hat
Reindl Harald
2015-02-04 21:42:36 UTC
Permalink
Post by Lennart Poettering
Post by Reindl Harald
Post by Lennart Poettering
a) you have one service mydaemon.service
b) you have a preparation service called
mydaemon-convert-config.service that takes config from somewhere,
converts it into a suitable format for mydaemon.service's binary
Now, you change that config that is located somewhere, issue a restart
request for m-c-c.s, issue a reload request for mydaemon.service.
Now, something like this should always have the result that your
config change is applied to mydaemon.service. Regardless if
mydaemon.service's start was queued, is already started or is
currently being started. You are suggesting that the reload can
suppressed when a start is already enqueued
which is true
the config change *after* issue restart should not affect the already
pending (for whatever reason) restart because this is *unpredictable*
bahvior - if i expect that config change to get effective i would issue
"systemctl reload" *before* the restart command
We don't make guarantees like that. This is UNIX, we have no
transaction file system.
We do make guarantees about that if you changed your config and issue
a reload, we will not drop that reload. We do not make guarantees
whether changing your config concurrently with the daemon confuses the
daemon or not. That's between the user and the daemon, we are not involved
i know that all but *you said* above "like this should always have the
result that your config change is applied to mydaemon.service.
Regardless if mydaemon.service's start was queued, is already started or
is currently being started"

that is all i referred to

no - my config change shoul *not* get applied when the daemon "is
currently being started" because that is aksing for luck and race
conditions and leads in unpredictable behavior

if you (systemd) know that the system is about shut down there is no
point in reload services at that time
Uoti Urpala
2015-02-04 21:48:32 UTC
Permalink
Post by Lennart Poettering
a) you have one service mydaemon.service
b) you have a preparation service called
mydaemon-convert-config.service that takes config from somewhere,
converts it into a suitable format for mydaemon.service's binary
Now, you change that config that is located somewhere, issue a restart
request for m-c-c.s, issue a reload request for mydaemon.service.
Now, something like this should always have the result that your
config change is applied to mydaemon.service. Regardless if
mydaemon.service's start was queued, is already started or is
currently being started. You are suggesting that the reload can
suppressed when a start is already enqueued, but that's really not the
case, because you first have to run m-c-c.s, before you can reload...
I do not see why that would cause any problems with removing the
blocking.

If you mean literally running "systemctl restart
mydaemon-convert-config.service; systemctl reload mydaemon.service" then
this should still work fine - the first "restart" will block until the
operation is complete and new config exists, and then the "reload"
guarantees that no old config is in use. However, I don't see why you'd
include the part about creating the new configuration via
mydaemon-convert-config.service in this case - the new configuration
already exists before any "reload" functionality is invoked, so why the
irrelevant complication of creating that configuration via another
service? It seems you are implicitly assuming some kind of parallel
execution of the restart and the reload?

If you mean something like "systemctl restart --no-block
mydaemon-convert-config.service; systemctl reload mydaemon.service", I
don't see why you'd ever /expect/ this to work with any reload semantics
- isn't this clear user error, and will be racy with current systemd
code just as much as the proposed fix? There are no ordering constraints
here, any more than there would be about starting two services and
expecting the first request to be started first. And in any case I'd
consider the semantics of reload to be "switch to configuration equal or
newer than what existed *when the reload was requested*", without any
guarantees that changes from operations queued but not finished before
calling reload would be taken into account.

So unless I completely misunderstood your example, it seems that this
does NOT demonstrate any problems with removing the blocking.
Uoti Urpala
2015-03-10 18:29:00 UTC
Permalink
Post by Uoti Urpala
Post by Lennart Poettering
currently being started. You are suggesting that the reload can
suppressed when a start is already enqueued, but that's really not the
case, because you first have to run m-c-c.s, before you can reload...
If you mean literally running "systemctl restart
...
Post by Uoti Urpala
So unless I completely misunderstood your example, it seems that this
does NOT demonstrate any problems with removing the blocking.
Discussion seems to have died again. How to proceed with fixing this? Is
there anything more I can clarify about why the current behavior (that
is, when service startup is queued, have "reload" requests block until
the service is up) is wrong and why the fix would be valid?
Lennart Poettering
2015-04-27 16:07:47 UTC
Permalink
On Wed, 04.02.15 23:48, Uoti Urpala (***@pp1.inet.fi) wrote:

Sorry for the late reply,
Post by Uoti Urpala
Post by Lennart Poettering
a) you have one service mydaemon.service
b) you have a preparation service called
mydaemon-convert-config.service that takes config from somewhere,
converts it into a suitable format for mydaemon.service's binary
Now, you change that config that is located somewhere, issue a restart
request for m-c-c.s, issue a reload request for mydaemon.service.
Now, something like this should always have the result that your
config change is applied to mydaemon.service. Regardless if
mydaemon.service's start was queued, is already started or is
currently being started. You are suggesting that the reload can
suppressed when a start is already enqueued, but that's really not the
case, because you first have to run m-c-c.s, before you can reload...
I do not see why that would cause any problems with removing the
blocking.
If you mean literally running "systemctl restart
mydaemon-convert-config.service; systemctl reload mydaemon.service" then
this should still work fine - the first "restart" will block until the
operation is complete and new config exists, and then the "reload"
guarantees that no old config is in use.
No, the commands you suggest above don't just enqueue the operations,
they enqueue them and then wait for them to finish. Of course, if you
synchronously wait for them to finish then all races are gone, but
this really should work without that, so that things can be enqueued
and work correctly.
Post by Uoti Urpala
However, I don't see why you'd
include the part about creating the new configuration via
mydaemon-convert-config.service in this case - the new configuration
already exists before any "reload" functionality is invoked, so why the
irrelevant complication of creating that configuration via another
service? It seems you are implicitly assuming some kind of parallel
execution of the restart and the reload?
Well, this is an example of the way people do this, and yes, i was
talking about "enqueuing", and that really means just that: it won't
be the client anymore that controls execution order, but it is solely
the queue and its semantics that enforce how things are run and what
guarantees are made.
Post by Uoti Urpala
If you mean something like "systemctl restart --no-block
mydaemon-convert-config.service; systemctl reload mydaemon.service", I
don't see why you'd ever /expect/ this to work with any reload semantics
- isn't this clear user error, and will be racy with current systemd
code just as much as the proposed fix?
Yupp, this is what I mean. (though I'd actually specify the --no-block
in the second command too, though this doesn't make much of a
difference...)

I am pretty sure that enqueueing these two commands should be sufficient
to get the config that was written out by the first service to be in
effect in the second service.
Post by Uoti Urpala
There are no ordering constraints
here, any more than there would be about starting two services and
expecting the first request to be started first.
hmm? if you start two services, and they are ordered against each
other, then yes, the second service should only be started after the
first one completed startup.
Post by Uoti Urpala
And in any case I'd consider the semantics of reload to be "switch
to configuration equal or newer than what existed *when the reload
was requested*", without any guarantees that changes from operations
queued but not finished before calling reload would be taken into
account.
The queue is really a work queue, and the After= and Before= deps
dictate how the work can be parallelized or needs to be serialized. As
such if i have 5 jobs enqueued that depend on each other, i need to
make sure they are executed in the order specified and can operateon
the results of the previous job.

I hope this makes sense...

Lennart
--
Lennart Poettering, Red Hat
Uoti Urpala
2015-05-01 20:35:51 UTC
Permalink
Post by Lennart Poettering
Post by Uoti Urpala
If you mean something like "systemctl restart --no-block
mydaemon-convert-config.service; systemctl reload mydaemon.service", I
don't see why you'd ever /expect/ this to work with any reload semantics
- isn't this clear user error, and will be racy with current systemd
code just as much as the proposed fix?
Yupp, this is what I mean. (though I'd actually specify the --no-block
in the second command too, though this doesn't make much of a
difference...)
Post by Uoti Urpala
And in any case I'd consider the semantics of reload to be "switch
to configuration equal or newer than what existed *when the reload
was requested*", without any guarantees that changes from operations
queued but not finished before calling reload would be taken into
account.
The queue is really a work queue, and the After= and Before= deps
dictate how the work can be parallelized or needs to be serialized. As
such if i have 5 jobs enqueued that depend on each other, i need to
make sure they are executed in the order specified and can operateon
the results of the previous job.
I hope this makes sense...
After those clarifications I believe I now understand what kind of
example case you meant, and it does now seem a meaningful case to
consider; however, I still think that you're wrong, as your example case
turns out to work fine and is not actually a counterexample to the kind
of changes I was talking about.

If I understood correctly, you're talking about a case where service B
has "After=A.service", both A and B have queued jobs where the B job is
a reload, and the queued job for A might change the configuration for B
(so the reload needs to happen after that); and you're worried that
immediately returning success for the reload could create a violation of
the "after job A" requirement. Is this reload property of "After"
documented anywhere? The code does seem to apply it to reloads, but
systemd.unit documentation only starts about start/stop. Anyway, when
you consider what actually happens with my suggested change, it turns
out that even these "After" semantics for reload still work.

The situation where my changes would result in different behavior is
when B has a start job queued, but no code for B is running yet, and you
request a reload for B; current code waits for the start of B before the
reload is considered complete, whereas my change makes the reload return
immediate success. This does not actually change the semantics above:
the only difference is when the reload operation is CONSIDERED COMPLETE,
there is NO difference in what operations are actually run or in which
order! [1] Current code merges RELOAD to existing START and returns
success for reload after START has completed, whereas my change returns
success immediately; but both run exactly the same START operation with
the same ordering constraints, which already ensure that it happens
after A.service (START already has the ordering constraints from
"After="; merging the RELOAD to START does not add any additional
ordering that START would not already have had).

[1] So this difference only really matters when something blocks to wait
until the reload completes.

Continue reading on narkive:
Loading...