Discussion:
No maintenance shell on local fs failure (/bin/sh: no such file or directory)
Frank Steiner
2015-11-23 13:22:30 UTC
Hi,

on one of my servers I have an LVM volume that, due to an error in my
lvm.conf, couldn't be mounted on boot. But I didn't get a maintenance shell
as I usually do on other servers; the output just stopped after complaining
with
[ TIME ] Timed out waiting for device dev-mapper-exportraidcu...tcurry2.device.
[DEPEND] Dependency failed for /export.
[DEPEND] Dependency failed for Local File Systems.
...

So I debugged this and was able to capture the output from a serial console.
It seems that the rootfs vanishes at some point, because I see messages
like these:

About to execute: /bin/echo -e 'Welcome to emergency mode! After logging in, type "journalctl -xb" to view\nsystem logs, "systemctl reboot" to reboot, "systemctl default" to try again\nto boot into default mode.'
Forked /bin/echo as 4290
Failed at step CHDIR spawning /bin/echo: No such file or directory
...

and the same for /bin/sh etc. But earlier in the log all kinds of
commands are started from /bin and /usr/bin. I wasn't able to find
a message in the log indicating why the rootfs seems to disappear.
I attach the output, hoping that someone might be able to get a clue
what's going wrong. The fstab looks like this:

/dev/sda2 swap swap defaults 0 0
/dev/sda1 / btrfs defaults 0 0
/dev/mapper/exportraidcurry-exportcurry2 /export btrfs defaults 0 0
/dev/sda5 /tmp btrfs defaults 0 0
/dev/sda6 /var btrfs defaults 0 0

where, for testing, I just added the "2" to ...-exportcurry so that
the logical volume doesn't exist.

I'd be very thankful for any hint pointing me in the right direction.
At the moment I really have no clue what's happening :-(

cu,
Frank
--
Dipl.-Inform. Frank Steiner Web: http://www.bio.ifi.lmu.de/~steiner/
Lehrstuhl f. Bioinformatik Mail: http://www.bio.ifi.lmu.de/~steiner/m/
LMU, Amalienstr. 17 Phone: +49 89 2180-4049
80333 Muenchen, Germany Fax: +49 89 2180-99-4049
* You can only understand recursion once you have understood recursion. *
Frank Steiner
2016-01-15 10:40:29 UTC
In case someone else runs into this:
Post by Frank Steiner
Hi,
on one of my servers I have an LVM volume that, due to an error in my
lvm.conf, couldn't be mounted on boot. But I didn't get a maintenance shell
as I usually do on other servers; the output just stopped after complaining
with
[ TIME ] Timed out waiting for device dev-mapper-exportraidcu...tcurry2.device.
[DEPEND] Dependency failed for /export.
[DEPEND] Dependency failed for Local File Systems.
...
So I debugged this and was able to capture the output from a serial console.
It seems that the rootfs vanishes at some point, because I see such messages:
About to execute: /bin/echo -e 'Welcome to emergency mode! After logging in, type "journalctl -xb" to view\nsystem logs, "systemctl reboot" to reboot, "systemctl default" to try again\nto boot into default mode.'
Forked /bin/echo as 4290
Failed at step CHDIR spawning /bin/echo: No such file or directory
...
This didn't mean that /bin/echo was not found, but the CHDIR failed!
That was caused by /root being a symbolic link to /local/home/root
(we strictly separate any user data including root from the system, so
that we can reformat / on reinstalling or upgrading the system).
But /usr/lib/systemd/system/emergency.service contains:

[Service]
Environment=HOME=/root
WorkingDirectory=/root
...

and because /root was a link pointing to nowhere (/local is not mounted
when starting the emergency service, only /) the chdir failed.

I solved this by changing the lines to
Environment=HOME=/
WorkingDirectory=/

Although we could change /root to an empty directory and use a bind mount,
the links are safer at the moment due to several problems with bind mounts,
including IBM's Tivoli not being able to handle them correctly.

I don't think that having /root should be a mandatory requirement
for starting an emergency shell, so maybe the service file should
indeed be changed in the official release?

cu,
Frank
Reindl Harald
2016-01-15 10:52:54 UTC
Post by Frank Steiner
This didn't mean that /bin/echo was not found, but the CHDIR failed!
That was caused by /root being a symbolic link to /local/home/root
(we strictly separate any user data including root from the system, so
that we can reformat / on reinstalling or upgrading the system).
that is a BIG MISTAKE, and there is a good reason that by default the
home directory of root is *not* below /home: /home is typically its
own partition

the reason is to avoid the problems you had and to make sure that root's
environment is available even when all disks but the rootfs are failing
Post by Frank Steiner
[Service]
Environment=HOME=/root
WorkingDirectory=/root
...
and because /root was a link pointing to nowhere (/local is not mounted
when starting the emergency service, only /) the chdir failed.
I solved this by changing the lines to
Environment=HOME=/
WorkingDirectory=/
Andrei Borzenkov
2016-01-15 11:18:36 UTC
Post by Frank Steiner
This didn't mean that /bin/echo was not found, but the CHDIR failed!
That was caused by /root being a symbolic link to /local/home/root
(we strictly separate any user data including root from the system, so
that we can reformat / on reinstalling or upgrading the system).
Post by Reindl Harald
that is a BIG MISTAKE, and there is a good reason that by default the
home directory of root is *not* below /home: /home is typically its
own partition
While this is true, the concern is valid. We really want to minimize
the chances that the emergency service can fail, and it really does not
need to be in /root at this point. So I would vote for changing it to / as well.
Post by Reindl Harald
the reason is to avoid the problems you had and to make sure that root's
environment is available even when all disks but the rootfs are failing
Well, what about rm -rf /root, or /root being removed as a result of
fsck'ing /? Is that enough reason to fail the emergency service?
Jonathan de Boyne Pollard
2016-01-16 09:52:16 UTC
About to execute: /bin/echo -e 'Welcome to emergency mode! After logging in, ...
Forked /bin/echo as 4290
Failed at step CHDIR spawning /bin/echo: No such file or directory
[Service]
Environment=HOME=/root
WorkingDirectory=/root
...
ExecStartPre=-/bin/echo -e 'Welcome to emergency mode! After logging in, ...
ExecStart=-/bin/sh -c "/sbin/sulogin; /usr/bin/systemctl --fail --no-block default"
...
[...] That was caused by /root being a symbolic link to
/local/home/root [...]
[...] So I would vote for changing it to / as well.
/bin/echo certainly doesn't care what its working directory is.

When it comes to the later ExecStart, which hasn't yet been reached in
the scenario at hand, the convention is that the "userinit" part of the
procedure (i.e. the part that sets up environment variables like
HOME and changes the working directory) is the responsibility of the
login program. The @SULOGIN@ that is Miquel van Smoorenburg's
sulogin (as exemplified above) is no exception to this.

It itself reads the home directory out of the password database,
changes directory to it, and sets the HOME environment variable to it,
without any need for the invoking systemd unit to do so, and with less
drastic failure modes in the absence of root's home directory or when
    if (chdir(pwd->pw_dir) < 0) {
        if (chdir("/") < 0)
            fprintf(stderr, "sulogin: change of working directory failed: %m\n\r");
    }
Lennart Poettering
2016-01-17 16:19:38 UTC
Post by Frank Steiner
This didn't mean that /bin/echo was not found, but the CHDIR failed!
That was caused by /root being a symbolic link to /local/home/root
(we strictly separate any user data including root from the system, so
that we can reformat / on reinstalling or upgrading the system).
Well, it's a really bad idea to move the root user's home directory
out of the root dir, and we don't really support that. /root is
hard-coded as the root user's home directory in a variety of places in
systemd and other packages, hence you'll run into problems with this
not just here but at many other places too. Having /root on an
external partition in particular is doomed, precisely because it
creates the problems you just ran into. There's a reason why Linux
distributions place root's home directory at /root, rather than
/home/root, after all...

That said, we probably shouldn't choke on this in this case, hence
I'll prep a patch that changes WorkingDirectory=/root for those
services to WorkingDirectory=-/root, i.e. make the non-existence of
the specified working directory non-fatal.

Lennart
--
Lennart Poettering, Red Hat