timesyncd may try to sync when system is not really online
Alvin Šipraga
2021-03-16 14:07:25 UTC

systemd-timesyncd listens for changes in network state and will start a
timesync when it determines that the system is online. It does this via
network_is_online(), which just queries the global carrier and address
state advertised by systemd-networkd.

systemd-networkd determines the global carrier and address state by
simply iterating through all links and taking the maximum of the link
carrier and address states.
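
As a rough illustration of that aggregation (not networkd's actual
code), here is a small Python sketch. The Link class, the interface
names and the reduced set of address states are assumptions made for the
example; only the "take the maximum over all links" rule comes from the
description above.

```python
from dataclasses import dataclass
from enum import IntEnum


class AddressState(IntEnum):
    # Simplified model, ordered from "least online" to "most online".
    OFF = 0
    DEGRADED = 1
    ROUTABLE = 2


@dataclass
class Link:
    name: str  # hypothetical interface name
    address_state: AddressState


def global_address_state(links):
    """Return the maximum address state over all links (OFF if none)."""
    return max((link.address_state for link in links),
               default=AddressState.OFF)


links = [
    Link("wlan0", AddressState.ROUTABLE),  # AP with a static address
    Link("eth0", AddressState.OFF),        # uplink, no address yet
]
print(global_address_state(links).name)  # ROUTABLE
```

A single routable link is enough to make the global state routable, no
matter what the other links look like.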

However, a routable link does not necessarily mean that the system is
online. One example is with wireless access point interfaces: an AP is
set up with hostapd and networkd configures a static IP address on that
interface, making it routable. By the above, this causes timesyncd to
attempt a timesync that is doomed to fail.

This logic causes us trouble because the spurious attempts make us hit
timesyncd's internal rate limiter. After a non-AP interface gets an IP
address and we are truly "online", timesyncd will try to start a
timesync, but may defer it for up to 30 seconds due to rate limiting.
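
To make the effect concrete, here is a minimal Python sketch of a
windowed rate limiter. It is not timesyncd's implementation: the
RateLimiter class, the interval and burst values, and the fixed deferral
are all assumptions for illustration. Only the 30-second upper bound is
taken from the behaviour described above.

```python
from dataclasses import dataclass


@dataclass
class RateLimiter:
    interval: float  # window length in seconds (assumed value)
    burst: int       # attempts allowed per window (assumed value)
    window_start: float = 0.0
    count: int = 0

    def allow(self, now: float) -> bool:
        """Return True if an attempt at time `now` is within the limit."""
        if self.count == 0 or now - self.window_start >= self.interval:
            # Start a fresh window.
            self.window_start = now
            self.count = 0
        if self.count < self.burst:
            self.count += 1
            return True
        return False


RETRY_DELAY = 30.0  # seconds; the sketch always defers by the upper bound


def next_attempt_time(limiter: RateLimiter, now: float) -> float:
    """Time at which a sync triggered at `now` would actually start."""
    return now if limiter.allow(now) else now + RETRY_DELAY


limiter = RateLimiter(interval=10.0, burst=2)
print(next_attempt_time(limiter, 0.0))  # 0.0  (spurious, AP is routable)
print(next_attempt_time(limiter, 1.0))  # 1.0  (spurious again)
print(next_attempt_time(limiter, 2.0))  # 32.0 (truly online, but deferred)
```

The doomed attempts use up the budget, so the first attempt that could
actually succeed is the one that gets deferred.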

Ideally, timesyncd should be able to distinguish between the two cases
such that it can ignore the state of the AP interface and avoid spurious
timesync attempts.

I can think of a few solutions to this problem but I wanted to consult
the mailing list to see if anybody has some better ideas. Here are some
proposals, in order of descending preference:

1. Add a new [Link] option NotForOnline= (or similar) to
systemd.network(5) to indicate that this link being routable does not
imply that the system is online. This could also implicitly set
RequiredForOnline=no if desired. Then:

i) Parse the above option in network_is_online() to exclude certain
links; or
ii) parse the above option only in timesyncd. Note that the only user
of network_is_online() is timesyncd anyway.

2. Add a new [Link] option IgnoreLinkState= (or similar) to
systemd.network(5) to indicate to systemd-networkd that this link's
carrier and address state should not be factored into the calculation of
global carrier and address state.

3. Same as (1) above, but rather than adding a new option, just parse

4. Something like the above, but set the ignored or allowed interfaces
in timesyncd.conf.

5. Make the rate limiter user-configurable so that a user can disable or
tune it accordingly.
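
The idea common to proposals (1) and (2) is to leave flagged links out
of the aggregation. A minimal Python sketch, assuming a hypothetical
not_for_online attribute that mirrors the proposed NotForOnline= option,
and treating "routable" as the online threshold for simplicity:

```python
from dataclasses import dataclass
from enum import IntEnum


class AddressState(IntEnum):
    # Simplified model, ordered from "least online" to "most online".
    OFF = 0
    DEGRADED = 1
    ROUTABLE = 2


@dataclass
class Link:
    name: str  # hypothetical interface name
    address_state: AddressState
    not_for_online: bool = False  # hypothetical flag, proposed above


def is_online(links):
    """Online only if a link that is not excluded is routable."""
    states = [link.address_state for link in links
              if not link.not_for_online]
    return max(states, default=AddressState.OFF) >= AddressState.ROUTABLE


ap = Link("wlan0", AddressState.ROUTABLE, not_for_online=True)
uplink = Link("eth0", AddressState.OFF)
print(is_online([ap, uplink]))  # False: the AP is ignored

uplink.address_state = AddressState.ROUTABLE
print(is_online([ap, uplink]))  # True: the uplink is routable
```

Whether this filter lives in networkd's global state, in
network_is_online(), or only in timesyncd is exactly what distinguishes
the proposals.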

Options (1) and (2) seem to be the only decent solutions IMO. But I'm a
bit cautious of (2) because I don't know what else might depend on
networkd's global state.

Happy to hear other ideas on this. Could be that we are also doing
something wrong and there is already a knob to twist. I can provide
further details if necessary.

I'll prepare a patch once I've waited some time for any feedback here.

Kind regards,