Discussion:
Is sd_journal_send non-blocking?
Florian Weimer
2017-02-06 10:12:45 UTC
The manual page I've got does not say it is, but the implementation does
not wait for acknowledgment from the journal. The implementation does
not block explicitly, but I think the sendmsg call can block until the
receiver queue is empty.

The background for this question is that we have a feature request for a
non-blocking logging interface. I wonder if the journal fits this
requirement.

I assume that with “non-blocking”, the feature request submitter means
that the function does not block indefinitely, say due to a service
process outage. Memory allocations can take a fairly long time as well
(due to paging), but I think that doesn't count here.

Thanks,
Florian
Lennart Poettering
2017-02-06 11:36:01 UTC
Post by Florian Weimer
> The manual page I've got does not say it is, but the implementation does not
> wait for acknowledgment from the journal. The implementation does not block
> explicitly, but I think the sendmsg call can block until the receiver queue
> is empty.
>
> The background for this question is that we have a feature request for a
> non-blocking logging interface. I wonder if the journal fits this
> requirement.
>
> I assume that with “non-blocking”, the feature request submitter means that
> the function does not block indefinitely, say due to a service process
> outage. Memory allocations can take a fairly long time as well (due to
> paging), but I think that doesn't count here.
The socket we use is blocking, but we try to increase the socket
buffer to 8M, to push the point where we start to block further out. But
that only works with sufficient privileges.

So yes, we are always blocking, we don't throw data away.
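
The buffer increase described above can be sketched roughly as follows. This is a minimal illustration and not the libsystemd code: the helper names `try_increase_sndbuf` and `get_sndbuf` are made up for this example. It assumes Linux, where `SO_SNDBUFFORCE` needs CAP_NET_ADMIN and a plain `SO_SNDBUF` request is capped by the `net.core.wmem_max` sysctl.

```c
#define _GNU_SOURCE
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: try to raise the send buffer of fd to `bytes`.
 * Returns 0 if the privileged request was accepted, 1 if only the
 * unprivileged request was accepted (the kernel may cap the value),
 * and -1 if both requests failed. */
int try_increase_sndbuf(int fd, int bytes) {
#ifdef SO_SNDBUFFORCE
    /* Privileged path: ignores the wmem_max cap, needs CAP_NET_ADMIN. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUFFORCE, &bytes, sizeof bytes) == 0)
        return 0;
#endif
    /* Unprivileged path: the kernel silently caps the request. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof bytes) == 0)
        return 1;
    return -1;
}

/* Hypothetical helper: read back the send buffer size the kernel reports,
 * or -1 on error. */
int get_sndbuf(int fd) {
    int value = 0;
    socklen_t len = sizeof value;
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &value, &len) < 0)
        return -1;
    return value;
}
```

Calling `try_increase_sndbuf(fd, 8 * 1024 * 1024)` and then `get_sndbuf(fd)` shows how much of the requested 8M an unprivileged process actually gets.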

I'd be willing to take a patch however, that adds a call
sd_journal_set_block_timeout() or so, that takes a time value we pass
to SO_SNDTIMEO for the logging socket. This would permit clients to
precisely control how long they want us to wait before we give up. And
in the case where a zero timeout is set we'd instead set O_NONBLOCK,
thus making logging entirely non-blocking.

But of course, you can't have it both ways. Pick one of "don't lose
messages" and "never block". Right now we always opt for the former,
with such a function call you could give that up and opt for the
latter.
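
To make the proposal concrete, here is a rough sketch of what such a call might do to a socket. `sd_journal_set_block_timeout()` is only a suggested name in this thread, so the helpers below (`set_block_timeout`, `send_or_drop`) are hypothetical and operate on an arbitrary socket fd rather than on the journal's internal one.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: a zero timeout switches the socket to O_NONBLOCK,
 * any other value is passed to SO_SNDTIMEO. Returns 0 on success, -1 on
 * error. */
int set_block_timeout(int fd, uint64_t usec) {
    if (usec == 0) {
        int flags = fcntl(fd, F_GETFL, 0);
        if (flags < 0)
            return -1;
        return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
    }
    struct timeval tv = {
        .tv_sec = (time_t)(usec / 1000000),
        .tv_usec = (suseconds_t)(usec % 1000000),
    };
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv) < 0 ? -1 : 0;
}

/* Hypothetical helper: send one datagram. Returns 1 if it was sent, 0 if
 * it was dropped because the send would have blocked or timed out, and -1
 * on any other error. This is the "never block" side of the trade-off:
 * a return value of 0 means the message is lost. */
int send_or_drop(int fd, const void *buf, size_t len) {
    ssize_t n;
    do {
        n = send(fd, buf, len, MSG_NOSIGNAL);
    } while (n < 0 && errno == EINTR);
    if (n >= 0)
        return 1;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;
    return -1;
}
```

With a zero timeout, a caller that keeps sending while nobody reads will eventually see `send_or_drop` return 0 instead of stalling.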

Lennart
--
Lennart Poettering, Red Hat
Florian Weimer
2017-02-06 12:25:29 UTC
Post by Lennart Poettering
> Post by Florian Weimer
> > The manual page I've got does not say it is, but the implementation does not
> > wait for acknowledgment from the journal. The implementation does not block
> > explicitly, but I think the sendmsg call can block until the receiver queue
> > is empty.
> >
> > The background for this question is that we have a feature request for a
> > non-blocking logging interface. I wonder if the journal fits this
> > requirement.
> >
> > I assume that with “non-blocking”, the feature request submitter means that
> > the function does not block indefinitely, say due to a service process
> > outage. Memory allocations can take a fairly long time as well (due to
> > paging), but I think that doesn't count here.
> The socket we use is blocking, but we try to increase the socket
> buffer to 8M, to move the point where we start to block out late. But
> that only works with sufficient privileges.
The sender buffer size is currently a limit on the datagram size. The
data goes straight into the receiver's buffer, so it doesn't have any
impact on blocking behavior.
Post by Lennart Poettering
> So yes, we are always blocking, we don't throw data away.
Sure. I was wondering if indefinite blocking is considered a critical
service failure and if there is a watchdog which would catch a stuck
journal daemon.

(Hanging syslog servers are apparently a fairly common problem, and this
is where the RFE originally came from.)
Post by Lennart Poettering
> I'd be willing to take a patch however, that adds a call
> sd_journal_set_block_timeout() or so, that takes a time value we pass
> to SO_SNDTIMEO for the logging socket. This would permit clients to
> precisely control how long they want us to wait before we give up. And
> in the case where a zero timeout is set we'd instead set O_NONBLOCK,
> thus making logging entirely non-blocking.
This would help with discarding data. It would not help those who want
to integrate logging into an event loop, which is probably a bad idea
anyway, but I'm not quite sure yet what the purpose of non-blocking
logging is. Perhaps there is a desire to apply some back-pressure to
reduce the rate at which logging messages are generated. But in
general, this merely introduces deadlocks.

Thanks,
Florian
Lennart Poettering
2017-02-06 12:31:16 UTC
Post by Florian Weimer
> Post by Lennart Poettering
> > The socket we use is blocking, but we try to increase the socket
> > buffer to 8M, to move the point where we start to block out late. But
> > that only works with sufficient privileges.
> The sender buffer size is currently a limit on the datagram size. The data
> goes straight into the receiver's buffer, so it doesn't have any impact on
> blocking behavior.
Well, the receiver buffer is large anyway, it's journald on the other
side, and that is privileged.
Post by Florian Weimer
> Post by Lennart Poettering
> > So yes, we are always blocking, we don't throw data away.
> Sure. I was wondering if indefinite blocking is considered a critical
> service failure and if there is a watchdog which would catch a stuck journal
> daemon.
The watchdog logic is enabled for journald, and it's at 3min by
default, if that's what you are asking?
Post by Florian Weimer
> Post by Lennart Poettering
> > I'd be willing to take a patch however, that adds a call
> > sd_journal_set_block_timeout() or so, that takes a time value we pass
> > to SO_SNDTIMEO for the logging socket. This would permit clients to
> > precisely control how long they want us to wait before we give up. And
> > in the case where a zero timeout is set we'd instead set O_NONBLOCK,
> > thus making logging entirely non-blocking.
> This would help with discarding data. It would not help those who want to
> integrate logging into an event loop. Which is probably bad idea anyway,
> but I'm not quite sure yet what the purpose of non-blocking logging is.
> Perhaps there is a desire to apply some back-pressure to reduce the rate at
> which logging messages are generated. But in general, this merely
> introduces deadlocks.
We could expose the logging socket too, so that people could poll() on
it. But uh, I seriously doubt this is actually a good idea...

Lennart
--
Lennart Poettering, Red Hat