Discussion:
[systemd-devel] DHCP6 client failing when /etc is mounted as overlayfs
Alessandro Tagliapietra
2021-06-01 07:06:44 UTC
Hello everyone,

I'm using yocto to create a custom linux image for a raspberry pi.
We have an "agent" that writes /etc/systemd/network/20-eth.network when the
final user wants to have a static IP address and we remove the file when
they switch back to DHCP.

After creating/deleting the file above we run `networkctl reload &&
networkctl reconfigure eth0`.
We mount the overlayfs with a custom .mount unit.
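
For readers trying to reproduce the setup, the agent's behaviour described above might look roughly like the sketch below. The function names, the NETWORK_DIR variable and the contents of the .network file are assumptions for illustration; the original post does not show them. It also assumes some other, lower-priority .network file enables DHCP once the static file is removed.

```shell
# Hypothetical reconstruction of the agent's logic; not the actual agent code.
NETWORK_DIR="${NETWORK_DIR:-/etc/systemd/network}"

# Write the static configuration.
# $1 = address with prefix length, $2 = gateway (illustrative inputs).
set_static() {
    cat > "$NETWORK_DIR/20-eth.network" <<EOF
[Match]
Name=eth0

[Network]
Address=$1
Gateway=$2
EOF
}

# Drop the static configuration so the interface falls back to DHCP.
set_dhcp() {
    rm -f "$NETWORK_DIR/20-eth.network"
}

# Ask networkd to pick up the change (the commands quoted in the post).
apply_network() {
    networkctl reload && networkctl reconfigure eth0
}
```

A call such as `set_static 192.0.2.10/24 192.0.2.1 && apply_network` would then correspond to the "switch to static" case.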

We've noticed that DHCP works fine if systemd-networkd starts before we
mount the overlayfs but it doesn't if systemd-networkd is
restarted/reconfigured after the folder is mounted or started after the
overlay .mount unit.

Every interface DHCP fails with:

DHCPv6 CLIENT: Failed to set DUID-EN: No medium found
eth0: DHCP6 CLIENT: Failed to set DUID: No medium found

We're using systemd 246 (246.9+) and we're planning on upgrading to 247.6
soon.
Is this expected? Is it somehow incompatible with overlayfs?
Or maybe it's not related to the /etc filesystem and it's something else?

Any help is really appreciated.

Thanks

--
Alessandro Tagliapietra
Mantas Mikulėnas
2021-06-01 07:12:50 UTC
On Tue, Jun 1, 2021 at 10:07 AM Alessandro Tagliapietra <
Post by Alessandro Tagliapietra
Hello everyone,
I'm using yocto to create a custom linux image for a raspberry pi.
We have an "agent" that writes /etc/systemd/network/20-eth.network when
the final user wants to have a static IP address and we remove the file
when they switch back to DHCP.
After creating/deleting the file above we run `networkctl reload &&
networkctl reconfigure eth0`.
We mount the overlayfs with a custom .mount unit.
We've noticed that DHCP works fine if systemd-networkd starts before we
mount the overlayfs but it doesn't if systemd-networkd is
restarted/reconfigured after the folder is mounted or started after the
overlay .mount unit.
DHCPv6 CLIENT: Failed to set DUID-EN: No medium found
eth0: DHCP6 CLIENT: Failed to set DUID: No medium found
My guess is that it's related to /etc/machine-id somehow becoming
inaccessible, since networkd's DUID-EN (DUIDType=vendor) is based on that.
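
One way to test this guess is to check whether /etc/machine-id is readable and holds a well-formed ID, both before and after the overlay is mounted. The helper below is a minimal sketch; the function name is hypothetical, and it only checks for the 32 lowercase hexadecimal characters that machine-id(5) describes.

```shell
# Hypothetical helper: succeed only if the file holds a well-formed machine ID.
machine_id_ok() {
    file="${1:-/etc/machine-id}"
    # An unreadable or missing file counts as a failure.
    [ -r "$file" ] || return 1
    # The ID is expected on the first line: 32 lowercase hex characters.
    id="$(head -n 1 "$file")"
    printf '%s\n' "$id" | grep -Eq '^[0-9a-f]{32}$'
}
```

Running `machine_id_ok || echo "machine-id missing or empty"` after the overlay mount would show whether the file has gone empty from networkd's point of view.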
--
Mantas Mikulėnas
Alessandro Tagliapietra
2021-06-01 16:42:31 UTC
Thanks for helping Mantas,

What I saw is:
- before first boot /etc/machine-id is empty (and I think that's expected)
- right after boot, /etc/machine-id isn't writable because the root fs is
mounted as readonly from fstab
- after the /etc overlay is mounted /etc/machine-id should still be the
one from the underlying filesystem and at this point is also writable,
however it's still empty

During boot I see:

[ 3.577477] systemd[1]: Initializing machine ID from random generator.
[ 3.584284] systemd[1]: Installed transient /etc/machine-id file.

however /etc/machine-id shouldn't be writable at that point, what should I
do? Make our overlay mount unit depend on whatever service is generating
machine-id and make sure our mount happens before the generation of
machine-id?

Thanks

--
Alessandro Tagliapietra
Mantas Mikulėnas
2021-06-01 20:09:10 UTC
On Tue, Jun 1, 2021, 19:42 Alessandro Tagliapietra <
Post by Alessandro Tagliapietra
Thanks for helping Mantas,
- before first boot /etc/machine-id is empty (and I think that's expected)
- right after boot, /etc/machine-id isn't writable because the root fs is
mounted as readonly from fstab
- after the /etc overlay is mounted /etc/machine-id should still be the
one from the underlying filesystem and at this point is also writable,
however it's still empty
[ 3.577477] systemd[1]: Initializing machine ID from random generator.
[ 3.584284] systemd[1]: Installed transient /etc/machine-id file.
however /etc/machine-id shouldn't be writable at that point, what should I
do?
If it's not writable at that point, systemd will *mount* a temporary
writable file on top of it, and will generate an ID that's temporary for
that boot.

It's possible that your overlay goes on top of that and provides its own
empty machine-id file again...

Post by Alessandro Tagliapietra
Make our overlay mount unit depend on whatever service is generating
machine-id and make sure our mount happens before the generation of
machine-id?
That might work, and would allow the machine-id and thus the DUID to be
persistent.

As an alternative you could tell networkd to use DUID-LLT (?), which
doesn't need the machine-id and just uses the MAC address, but there may be
other things which use the machine-id anyway...
Alessandro Tagliapietra
2021-06-01 20:55:56 UTC
Post by Mantas Mikulėnas
If it's not writable at that point, systemd will *mount* a temporary writable file on top of it, and will generate an ID that's temporary for that boot.
I've ended up copying /etc/machine-id into our overlay's etc directory
before mounting the actual overlay; this way, after mounting /etc, the
machine-id also persists across reboots.
I think systemd still regenerates the machine-id on boot because the
overlay isn't mounted yet, so a more long term solution would be to
use an initramfs to mount the overlay before system starts (unless
there are other ways to do so).
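
A minimal sketch of that workaround follows. The function name and the idea of seeding the file only when the upper directory does not already have one are assumptions on my part, made so that the first ID copied is the one that persists; the paths are whatever the overlay .mount unit uses.

```shell
# Hypothetical helper: copy the current machine ID into the overlay's
# upper directory before the overlay is mounted over /etc.
# $1 = source machine-id file, $2 = overlay upper directory.
seed_machine_id() {
    src="$1"
    upper="$2"
    # Nothing useful to copy if the source is missing or empty.
    [ -s "$src" ] || return 1
    mkdir -p "$upper" || return 1
    # Assumption: keep an ID seeded on an earlier boot instead of replacing it.
    if [ -s "$upper/machine-id" ]; then
        return 0
    fi
    cp "$src" "$upper/machine-id"
}
```

It would run before the overlay mount, for example as `seed_machine_id /etc/machine-id "$UPPER"`, where `$UPPER` stands for the upper directory of the overlay.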
Thanks
--
Alessandro Tagliapietra
Alessandro Tagliapietra
2021-06-01 21:32:43 UTC
It seems that while DHCP6 doesn't return any error now and the DUID is
the same after reboot and after network restart by our agent:

DHCP6 Client DUID: DUID-EN/Vendor:0000ab118bf885a2ab7335a00000

it never gets a DHCP6 IP after network restart and the state stays in
"configuring".

Logs only show

DHCPv6 lease lost
IPv6 successfully enabled

and only a reboot fixes the problem. I think I'll probably just move
on and remove ipv6 support from the kernel (as it seems that changing
the default and all values from sysctl doesn't apply to networkd).

Thank you for helping
--
Alessandro Tagliapietra

Lennart Poettering
2021-06-02 12:57:54 UTC
Post by Alessandro Tagliapietra
Thanks for helping Mantas,
- before first boot /etc/machine-id is empty (and I think that's expected)
- right after boot, /etc/machine-id isn't writable because the root fs is
mounted as readonly from fstab
- after the /etc overlay is mounted /etc/machine-id should still be the
one from the underlying filesystem and at this point is also writable,
however it's still empty
[ 3.577477] systemd[1]: Initializing machine ID from random generator.
[ 3.584284] systemd[1]: Installed transient /etc/machine-id file.
however /etc/machine-id shouldn't be writable at that point, what should I
do? Make our overlay mount unit depend on whatever service is generating
machine-id and make sure our mount happens before the generation of
machine-id?
The assumption is that the machine-id is accessible and remains stable
during the entire system uptime, once the host PID 1 has initialized
(i.e. after transitioning from the initrd). Apps should be able to rely
on the machine ID just working and being safe to cache.

If you replace /etc/ with a different file system during runtime,
that's OK as long as that file remains accessible throughout.

Note that if /etc/machine-id is empty at boot and /etc is read-only, PID 1
will generate a transient machine ID and write it to a file in /run,
which it then bind mounts over /etc/machine-id, so that it appears
there unconditionally. If you now replace /etc with your own overlayfs
you need to make sure to cover this bind mount too. Note that the
lower layers of an overlayfs refer to the specified top-level mount
points only: a lower layer is not the whole tree of mounts but only the
mount you explicitly list.

This means you probably want to prepare your overlayfs at some
temporary location first, then bind mount the existing bind mount that
is /etc/machine-id over the overlayfs at the same place, and then move
the whole overlayfs into place at /etc. That way /etc/ is suddenly
replaced by your overlayfs but /etc/machine-id will remain accessible in a
stable way continuously.
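
The three steps above can be sketched as a shell function. This is an illustration under stated assumptions, not a tested recipe: the function names, the DRY_RUN switch and the directory arguments are hypothetical, and by default the commands are only printed. One caveat from mount(8): moving a mount that resides under a shared mount is not supported, so depending on mount propagation the final step may need to run from a private mount (or early, e.g. in an initrd).

```shell
# Sketch only. With DRY_RUN=1 (the default) commands are printed, not run.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = overlay upper dir, $2 = overlay work dir, $3 = temporary staging dir
# (all three are hypothetical paths chosen by the caller).
stage_etc_overlay() {
    upper="$1"; work="$2"; stage="$3"
    # 1. Assemble the overlay at a temporary location, not on /etc itself.
    run mkdir -p "$stage"
    run mount -t overlay overlay \
        -o "lowerdir=/etc,upperdir=$upper,workdir=$work" "$stage"
    # 2. Bind mount the existing /etc/machine-id (possibly PID 1's transient
    #    bind mount) to the same place inside the staged tree.
    run mount --bind /etc/machine-id "$stage/machine-id"
    # 3. Move the staged tree, including the bind mount, onto /etc.
    run mount --move "$stage" /etc
}
```

Calling `stage_etc_overlay /data/etc-upper /data/etc-work /run/etc-stage` in dry-run mode prints the four commands in order, which makes it easy to review them before setting `DRY_RUN=0`.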

Note that /etc/machine-id is used by various parts of systemd. DHCP
stuff is just one case. Logging uses it too, and plenty of other
stuff. Hence, you really should follow the documented behaviour of
machine-id, because if you don't then things will break all over the
place.

Please see machine-id(5) for details about the file.

Lennart

--
Lennart Poettering, Berlin
