Discussion:
possible message leak in bus->wqueue ?
Add Reply
eshark
2015-08-07 09:18:15 UTC
Reply
Permalink
Raw Message
Hi, all
If some message went into bus->wqueue, and failed to run ioctl(KDBUS_CMD_SEND) and returned r < 0,
I found that this message will remain in the bus->wqueue. If the peer is killed for some reason, this message will fail to be sent and remain in the wqueu for ever.


Because in dispatch_wqueue() , when bus_write_message() return r <0, dispatch_wqueue() will simply return this "r " into the caller.
And the wqueue is invisible to user application, so user application also cannot remove this message to handle this error case.


I wonder whether this is a problem, and if yes, should we remove this message in dispatch_wqueue() when r < 0 ?




Thanks a lot !




Li Cheng
cee1
2015-08-07 10:07:18 UTC
Reply
Permalink
Raw Message
Post by eshark
Hi, all
If some message went into bus->wqueue, and failed to run
ioctl(KDBUS_CMD_SEND) and returned r < 0,
I found that this message will remain in the bus->wqueue. If the peer is
killed for some reason, this message will fail to be sent and remain in the
wqueu for ever.
Because in dispatch_wqueue() , when bus_write_message() return r <0,
dispatch_wqueue() will simply return this "r " into the caller.
And the wqueue is invisible to user application, so user application also
cannot remove this message to handle this error case.
I wonder whether this is a problem, and if yes, should we remove this
message in dispatch_wqueue() when r < 0 ?
I've the same question.

E.g.

dispatch_wqueue()
bus_write_message()
bus_kernel_write_message()

"""
r = ioctl(bus->output_fd, KDBUS_CMD_SEND, &cmd);
if (r < 0) {
...
else if (errno == ENXIO || errno == ESRCH) {
...
if (m->header->type == SD_BUS_MESSAGE_METHOD_CALL)
sd_bus_error_setf(&error,
SD_BUS_ERROR_SERVICE_UNKNOWN, "Destination %s not known",
m->destination);
else {
log_debug("Could not deliver message
to %s as destination is not known. Ignoring.", m->destination);
return 0;
}
}
"""

If A __returns__ a result to B, but B has already died (After sending
a "method call" message):

1. It will return ENXIO or ESRCH, right?
2. dispatch_wqueue(), bus_write_message() and
bus_kernel_write_message() returns 0
3. Next time dispatch_wqueue() called, it will retry, but never
succeed - so, deadlocked?
--
Regards,

- cee1
David Herrmann
2015-08-24 10:26:23 UTC
Reply
Permalink
Raw Message
Hi
Post by cee1
Post by eshark
Hi, all
If some message went into bus->wqueue, and failed to run
ioctl(KDBUS_CMD_SEND) and returned r < 0,
I found that this message will remain in the bus->wqueue. If the peer is
killed for some reason, this message will fail to be sent and remain in the
wqueu for ever.
Because in dispatch_wqueue() , when bus_write_message() return r <0,
dispatch_wqueue() will simply return this "r " into the caller.
And the wqueue is invisible to user application, so user application also
cannot remove this message to handle this error case.
I wonder whether this is a problem, and if yes, should we remove this
message in dispatch_wqueue() when r < 0 ?
I've the same question.
E.g.
dispatch_wqueue()
bus_write_message()
bus_kernel_write_message()
"""
r = ioctl(bus->output_fd, KDBUS_CMD_SEND, &cmd);
if (r < 0) {
...
else if (errno == ENXIO || errno == ESRCH) {
...
if (m->header->type == SD_BUS_MESSAGE_METHOD_CALL)
sd_bus_error_setf(&error,
SD_BUS_ERROR_SERVICE_UNKNOWN, "Destination %s not known",
m->destination);
else {
log_debug("Could not deliver message
to %s as destination is not known. Ignoring.", m->destination);
return 0;
This probably needs to be "return 1;". Lennart, any comments?
Post by cee1
}
}
"""
If A __returns__ a result to B, but B has already died (After sending
1. It will return ENXIO or ESRCH, right?
It returns EBADSLT.
Post by cee1
2. dispatch_wqueue(), bus_write_message() and
bus_kernel_write_message() returns 0
3. Next time dispatch_wqueue() called, it will retry, but never
succeed - so, deadlocked?
Thanks
David
eshark
2015-11-26 04:27:36 UTC
Reply
Permalink
Raw Message
Hi, all,

It seems that the exec_spawn() will return 0 if fork() fails, because that
return log_unit_error_errno(params->unit_id, errno, "Failed to fork: %m");
will return -r eventually. And here r = exec_context_load_environment() , which has
exited successfully.

This may lead big trouble to the caller of exec_spawn(). For example, mount_spawn(), which
also calls exec_spawn(), will not goto fail but run continuelly in this case. Then the following unit_watch_pid()
will fail at assert(pid >= 1);.

I also commit a patch to fix this problem, please help to review.

Thanks a lot£¡

¡¡Li Cheng
Lennart Poettering
2015-11-27 13:18:08 UTC
Reply
Permalink
Raw Message
Post by eshark
Hi, all,
It seems that the exec_spawn() will return 0 if fork() fails, because that
return log_unit_error_errno(params->unit_id, errno, "Failed to fork: %m");
will return -r eventually. And here r = exec_context_load_environment() , which has
exited successfully.
This may lead big trouble to the caller of exec_spawn(). For example, mount_spawn(), which
also calls exec_spawn(), will not goto fail but run continuelly in this case. Then the following unit_watch_pid()
will fail at assert(pid >= 1);.
I also commit a patch to fix this problem, please help to review.
Indeed, this is a bug! Patch looks great! I turned this now into a
github PR (https://github.com/systemd/systemd/pull/2048), because
that's how we prefer submissions these days. Will merge as soon as the
CI likes it too.

Thanks a lot!

Lennart
--
Lennart Poettering, Red Hat
eshark
2015-12-06 09:34:26 UTC
Reply
Permalink
Raw Message
Hi, All


         In the sd-bus.c, there are four conditions for calling the bus_enter_closing, as is: 

                        if (r == -ENOTCONN || r == -ECONNRESET || r == -EPIPE || r == -ESHUTDOWN) {
                                   bus_enter_closing(bus);
                                   return -ECONNRESET;
                           }
   
         While working with kdbus in the kernel, the kdbus return -EPIPE  to the remote connection while the local connection dies, and sometimes the kdbus
also return -ECONNRESET when the local connection dies. 
  
         Thus, it seems unreasonable for the remote connection to enter BUS_CLOSING state when it gets -EPIPE or sometimes -ECONNRESET.


         If you agree with the above issues,  which is the better choice, to modify the conditions in the sd-bus.c , or to modify the errno in the kdbus kernel?

Because the kdbus kernel always return -EPIPE to the remote connection , so it is reasonable to remove the EPIPE from the sd-bus.c ?
And becasue the kdbus kernel sometimes returen -ECONNRESET to the remote connection when the local connection dies, so it is reasonable to give a new errno for this case? such as EPERM or EHOSTUNREACH..


B.R.
Li Cheng


   
eshark
2015-12-09 02:40:36 UTC
Reply
Permalink
Raw Message
Hi, All


         In the sd-bus.c, there are four conditions for calling the bus_enter_closing, as is: 

                        if (r == -ENOTCONN || r == -ECONNRESET || r == -EPIPE || r == -ESHUTDOWN) {
                                   bus_enter_closing(bus);
                                   return -ECONNRESET;
                           }
   
         While working with kdbus in the kernel, the kdbus return -EPIPE  to the remote connection while the local connection dies, and sometimes the kdbus
also return -ECONNRESET when the local connection dies. 
  
         Thus, it seems unreasonable for the remote connection to enter BUS_CLOSING state when it gets -EPIPE or sometimes -ECONNRESET.


         If you agree with the above issues,  which is the better choice, to modify the conditions in the sd-bus.c , or to modify the errno in the kdbus kernel?

Because the kdbus kernel always return -EPIPE to the remote connection , so it is reasonable to remove the EPIPE from the sd-bus.c ?
And becasue the kdbus kernel sometimes returen -ECONNRESET to the remote connection when the local connection dies, so it is reasonable to give a new errno for this case? such as EPERM or EHOSTUNREACH..


B.R.
Li Cheng


   
eshark
2017-12-29 09:19:35 UTC
Reply
Permalink
Raw Message
Hi, All
I tried to test the socket activation by a simple foobar.socket and foobar.service, which are as the following:
foobar.socket:
[Socket]
ListenStream=/dev/socket/foobar


[Install]
WantedBy=sockets.target


foobar.service:
[Service]
Type=simple
ExecStart=/usr/bin/test-socket
Restart=no


I also wrote a simple program to connect to /dev/socket/foobar , in order to activitate the foobar.service.
When I ran the program, the foobar.service was started by systemd , and the foobar.socket changed from 'listening' state to 'running' state.
All works OK as expected, but when I killed the test-socket, it was started again by the systemd, even if I didn't run my program.
And from the system journal logs , I found that
"
Line 2035: 31,29604,571630004,-;systemd[1]: foobar.socket got notified about service death (failed permanently: no)
Line 2038: 31,29605,571630065,-;systemd[1]: foobar.socket changed running -> listening
Line 2050: 31,29609,571632385,-;systemd[1]: Incoming traffic on foobar.socket
Line 2056: 28,29611,571633087,-;systemd[1]: Cannot add dependency job for unit systemd-bus-proxyd.socket, ignoring: Unit systemd-bus-proxyd.socket failed to load: No such file or directory.
Line 2056: 28,29611,571633087,-;systemd[1]: Cannot add dependency job for unit systemd-bus-proxyd.socket, ignoring: Unit systemd-bus-proxyd.socket failed to load: No such file or directory.
Line 2065: 31,29614,571633544,-;systemd[1]: foobar.socket changed listening -> running
"
It seems that immediately after the death of foobar.service, some unknown incoming traffic on foobar.socket made the foobar.service started again by the systemd .
Could anyone give me some suggestion that who connected to the foobar.socket ? Any idea about how to debug this problem is very appreciated.


Thanks a lot.

eshark
2015-11-27 02:51:18 UTC
Reply
Permalink
Raw Message
Hi, all,

It seems that the exec_spawn() will return 0 if fork() fails, because that
return log_unit_error_errno(params->unit_id, errno, "Failed to fork: %m");
will return -r eventually. And here r = exec_context_load_environment() , which has
exited successfully.

This may lead big trouble to the caller of exec_spawn(). For example, mount_spawn(), which
also calls exec_spawn(), will not goto fail but run continuelly in this case. Then the following unit_watch_pid()
will fail at assert(pid >= 1);.

I also commit a patch to fix this problem, please help to review.

Thanks a lot£¡

¡¡Li Cheng
Loading...