Discussion:
Robust getpid caching via MADV_WIPEONFORK?
Josh Triplett
2017-11-22 20:34:04 UTC
[Bcced to other potentially interested folks.]

As discussed in various places, glibc removed the getpid() cache in
2.25, since it was not robust against all possible ways to fork a
process.

Linux 4.14 added MADV_WIPEONFORK, which robustly ensures that a page
gets wiped to zero on any possible process fork. The commit message
lists the following among its use cases:

- systemd/pulseaudio API checks (fail after fork) (replacing a getpid
  check, which is too slow without a PID cache)

Given that, I wanted to start a thread about the idea of making
getpid() caching, and for that matter other potential uses of
pthread_atfork(), robust using MADV_WIPEONFORK when available.

I don't necessarily want to advocate this; rather, since it seems likely
that other applications may wish to do things like this, I wanted to
collect some information and discuss whether it makes sense or not.

I wrote a simple test program, attached (warning: quick hack), that
benchmarks the stock getpid() versus a trivial cached version of
getpid() using MADV_WIPEONFORK (marked as "noinline" to simulate
providing it as a library function), just to get a rough idea. This
produced the following numbers on my system:

mmap: 3292.000000 ns
madvise: 3273.000000 ns
uncached getpid: 100000000 calls in 4569540458.000 ns; 45.695 ns/call
cached getpid: 100000000 calls in 132862952.000 ns; 1.329 ns/call

That's a significant speedup per call, but the savings only pay off once
the one-time mmap and madvise cost has been amortized: the program must
call getpid() more than ~150 times, or ~75 times if the separate mmap
can be avoided.
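
For anyone wanting to redo the break-even estimate with their own
measurements, the arithmetic is just the setup cost divided by the
per-call savings. A minimal sketch (break_even_calls is a hypothetical
helper, not part of the test program):

```c
/* Number of getpid() calls after which the cached version has paid
 * back its one-time setup cost. All arguments are in nanoseconds. */
double break_even_calls(double setup_ns, double uncached_ns, double cached_ns)
{
    return setup_ns / (uncached_ns - cached_ns);
}
```

With the numbers above, break_even_calls(3292.0 + 3273.0, 45.695, 1.329)
comes to roughly 148 calls, and break_even_calls(3273.0, 45.695, 1.329)
to roughly 74.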

Given that, I don't think it makes sense for glibc to take this approach
in the standard getpid(). For any program that doesn't call getpid()
extensively, this seems like a pessimization, and many such programs can
do such caching themselves without worrying about an unexpected fork().
I think only specialized library code would ever want to do this.

Hopefully these numbers will help anyone looking to implement such
caching in their own code.

- Josh Triplett
