Discussion:
[systemd-devel] Socket activated SSHD service showing up as a failure when the client connection fails
Roger Qiu
2014-07-17 02:40:59 UTC
Hello everybody,

I recently discovered that when using a socket activated SSHD service on
NixOS, it will show up as a failure on `sudo systemctl status` when the
client fails the connection.

The details are in this issue: https://github.com/NixOS/nixpkgs/issues/3279

Basically all I need to do is telnet to my VM and of course fail the
protocol, then run `sudo systemctl status` on the VM, and see 1 extra
failure.

Is this correct behaviour for a service to be considered a failure just
because the client fails the connection protocol?

Thanks,
Roger
Zbigniew Jędrzejewski-Szmek
2014-07-17 02:51:13 UTC
Post by Roger Qiu
Is this correct behaviour for a service to be considered a failure
just because the client fails the connection protocol?
sshd chooses to exit with a failure code in this case. What we should really
do is add a SuccessExitStatus= setting to the ***@.service so that systemd
ignores that code. Is it some specific value?

Zbyszek
Roger Qiu
2014-07-17 02:53:01 UTC
Hello,

This is the log of the status codes:

```
● ***@3-10.0.2.15:22-10.0.2.2:51014.service - SSH Daemon (10.0.2.2:51014)
Loaded: loaded (/nix/store/wr8r8jrj204q3i0v4vfav8m63ssnv8w1-unit/***@.service)
Active: failed (Result: exit-code) since Thu 2014-07-17 02:24:01 UTC; 2min 21s ago
Process: 3102 ExecStart=/nix/store/2wc50fcn54axkg2kk71jm2r5h0w5rbh6-openssh-6.6p1/sbin/sshd -i -f /nix/store/ai2a554az21b5zhd1kamcznbim4gd924-sshd_config (code=exited, status=255)
Process: 3100 ExecStartPre=/nix/store/i5wnidc4707k3pgcbhyjq3qb4ajgyx5n-unit-script/bin/sshd@-pre-start (code=exited, status=0/SUCCESS)
Main PID: 3102 (code=exited, status=255)

Jul 17 02:23:55 matrix-node systemd[1]: Started SSH Daemon (10.0.2.2:51014).
Jul 17 02:24:01 matrix-node systemd[1]: ***@3-10.0.2.15:22-10.0.2.2:51014.service: main process exited, code=exited, status=255/n/a
Jul 17 02:24:01 matrix-node systemd[1]: Unit ***@3-10.0.2.15:22-10.0.2.2:51014.service entered failed state.
```

Perhaps it's 255?

Thanks,
Roger
Zbigniew Jędrzejewski-Szmek
2014-07-17 02:55:36 UTC
Post by Roger Qiu
Perhaps it's 255?
That looks like -1, but whatever. Does it work if
you add SuccessExitStatus=255?

Zbyszek
Roger Qiu
2014-07-17 03:37:12 UTC
You mean something like this:

```
$ cat /nix/store/wr8r8jrj204q3i0v4vfav8m63ssnv8w1-unit/***@.service
[Unit]
Description=SSH Daemon

[Service]
Environment="LD_LIBRARY_PATH=/nix/store/sk3017fxx2vhj69saizk5634y3w6hqhn-systemd-212/lib"
Environment="LOCALE_ARCHIVE=/nix/store/5jxnz56h6ya1wy0dvcll65vkkx9679g4-glibc-locales-2.19/lib/locale/locale-archive"
Environment="PATH=/nix/store/2wc50fcn54axkg2kk71jm2r5h0w5rbh6-openssh-6.6p1/bin:/nix/store/qlg0baw29r7fj4isc2x47c05d8q7br2j-gawk-4.1.0/bin:/nix/store/xdswdgz11w9lm9n9bzb6spl7dj45s2x2-coreutils-8.21/bin:/nix/store/sls52f6fkpqfs0kj4w0k525v57zrsz9x-findutils-4.4.2/bin:/nix/store/937i66g191shz0hwzll61vmi0zvy3gwy-gnugrep-2.14/bin:/nix/store/93wabgii15q4d74fqgcpfka1qy42igr4-gnused-4.2.2/bin:/nix/store/sk3017fxx2vhj69saizk5634y3w6hqhn-systemd-212/bin:/nix/store/2wc50fcn54axkg2kk71jm2r5h0w5rbh6-openssh-6.6p1/sbin:/nix/store/qlg0baw29r7fj4isc2x47c05d8q7br2j-gawk-4.1.0/sbin:/nix/store/xdswdgz11w9lm9n9bzb6spl7dj45s2x2-coreutils-8.21/sbin:/nix/store/sls52f6fkpqfs0kj4w0k525v57zrsz9x-findutils-4.4.2/sbin:/nix/store/937i66g191shz0hwzll61vmi0zvy3gwy-gnugrep-2.14/sbin:/nix/store/93wabgii15q4d74fqgcpfka1qy42igr4-gnused-4.2.2/sbin:/nix/store/sk3017fxx2vhj69saizk5634y3w6hqhn-systemd-212/sbin"


X-StopIfChanged=false
ExecStart=/nix/store/2wc50fcn54axkg2kk71jm2r5h0w5rbh6-openssh-6.6p1/sbin/sshd -i -f /nix/store/ai2a554az21b5zhd1kamcznbim4gd924-sshd_config
ExecStartPre=/nix/store/i5wnidc4707k3pgcbhyjq3qb4ajgyx5n-unit-script/bin/sshd@-pre-start
KillMode=process
StandardInput=socket
SuccessExitStatus=255
```

If I add that in, what about the normal exit status of "0"? Would that
override the successful exit code of 0?

Thanks,
Roger
Zbigniew Jędrzejewski-Szmek
2014-07-17 11:48:28 UTC
Post by Roger Qiu
SuccessExitStatus=255
Yes.
Post by Roger Qiu
If I add that in, what about the normal exit status of "0"? Would
that override the successful exit code of 0?
No, it's always in addition to code 0 and signals SIGHUP, SIGINT, SIGTERM, and SIGPIPE.
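
A minimal sketch of the relevant [Service] lines, for illustration only. The unit name sshd@.service is an assumption (the name is masked in the output above), and /usr/sbin/sshd is a placeholder for the real Nix store path:

```ini
# sshd@.service (unit name assumed)
[Service]
# Placeholder path; substitute the real sshd binary and options.
ExecStart=/usr/sbin/sshd -i
StandardInput=socket
# Added to the built-in success set (exit code 0 and the signals
# SIGHUP, SIGINT, SIGTERM, SIGPIPE); it does not replace that set.
SuccessExitStatus=255
```

With this setting any exit with status 255 counts as success, whatever caused it, so it cannot tell a client aborting the handshake apart from another condition that makes sshd exit with the same code.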

Zbyszek
Roger Qiu
2014-07-17 03:49:14 UTC
I've googled around and saw that the 255 exit code comes up a lot, but most
resources talk about "ssh", not necessarily "sshd". If we ignore
code 255, is it possible we're also ignoring some other real errors, and
not just the client failing the connection? Basically I would like sshd
to report an error only if it is indeed an error on the host's side, not
the client's side.
Lennart Poettering
2014-08-14 22:33:16 UTC
Post by Roger Qiu
I've googled around and saw that the 255 exit code comes up a lot, but
most resources talk about "ssh", not necessarily "sshd". If we
ignore code 255, is it possible we're also ignoring some other real
errors, and not just the client failing the connection? Basically I
would like sshd to report an error only if it is indeed an error on
the host's side, not the client's side.
In general I'd recommend using ExecStart=-/usr/sbin/sshd...,
i.e. with the "-" between the = and the /. This tells systemd to
completely ignore the exit/failure status of the process.

The reason for this is that if an sshd instance fails, it stays
around in the failed state. Since you use per-connection instances, this
might give an attacker the chance to create tons of failed
services, until systemd refuses to start any more.

I'd hence recommend always ignoring errors for services that are
instantiated in theoretically unbounded numbers. You'll still get them
reported in the logs, but I'd recommend not letting them put a service
into the "failed" state.
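
To illustrate the "-" prefix, here is a sketch of a per-connection template unit. The unit name sshd@.service and the sshd arguments are assumptions for the example, not taken from any distribution's shipped unit:

```ini
# sshd@.service (unit name assumed)
[Service]
# The leading "-" makes systemd treat a non-zero exit of this command
# as success, so the instance does not end up in the "failed" state.
ExecStart=-/usr/sbin/sshd -i
StandardInput=socket
```

Unlike SuccessExitStatus=255, which whitelists a single code, the prefix covers every exit status of that command.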

I hope that makes some sense,

Lennart
--
Lennart Poettering, Red Hat